Exam AI-102 topic 1 question 69 discussion

Actual exam question from Microsoft's AI-102

Question #: 69
Topic #: 1

You are building an internet-based training solution. The solution requires that a user's camera and microphone remain enabled.

You need to monitor a video stream of the user and detect when the user asks an instructor a question. The solution must minimize development effort.

What should you include in the solution?

A. speech-to-text in the Azure AI Speech service
B. language detection in Azure AI Language Service
C. the Face service in Azure AI Vision
D. object detection in Azure AI Custom Vision

Suggested Answer: A

by audlindr at March 2, 2024, 6:47 p.m.

Comments

syupwsh

4 months, 3 weeks ago

Selected Answer: A

Speech-to-text in the Azure AI Speech service is designed to convert spoken language into text, which is essential for detecting when a user asks a question during a video stream. This service can process the audio captured by the user's microphone, transcribe the spoken words in real time, and allow the system to recognize when a question is being asked with minimal development effort. A is the answer.

upvoted 2 times
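
Editor's sketch: the comment above describes transcribing speech and then recognizing when a question is being asked. The second step could look roughly like the following, assuming the transcript arrives as punctuated sentences. The names `INTERROGATIVES`, `looks_like_question` and `flag_questions` are hypothetical helpers written for this illustration, not part of any Azure SDK, and the heuristic is deliberately crude.

```python
import re

# Assumption: a small set of English words that commonly open a question.
INTERROGATIVES = {
    "who", "what", "when", "where", "why", "how", "which",
    "can", "could", "would", "should", "will", "may",
    "is", "are", "do", "does", "did",
}


def looks_like_question(utterance: str) -> bool:
    """Return True if a transcribed utterance is probably a question."""
    text = utterance.strip()
    if not text:
        return False
    # A punctuated transcript usually ends questions with "?".
    if text.endswith("?"):
        return True
    # Fallback for unpunctuated text: check the first word.
    words = re.findall(r"[a-z']+", text.lower())
    return bool(words) and words[0] in INTERROGATIVES


def flag_questions(utterances):
    """Keep only the utterances that look like questions."""
    return [u for u in utterances if looks_like_question(u)]
```

A call such as `flag_questions(["Hello everyone.", "What is a tensor?"])` keeps only the second utterance. A production system would likely need something more robust than this, since statements can start with words like "is" or "can" in other contexts.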

MostafaAbdellahAhmed

10 months ago

A. Speech-to-text in the Azure AI Speech service.

Explanation: Speech-to-text in the Azure AI Speech service can transcribe spoken language into text in real time, enabling the detection of questions when users speak. This approach minimizes development effort by directly converting speech to text, allowing easy identification of questions.

upvoted 3 times

SAMBIT

11 months, 3 weeks ago

Definitely it's not A. That's a bunker

upvoted 1 times

HaraTadahisa

1 year ago

Selected Answer: A

A is the correct answer.

upvoted 1 times

Belicova

1 year ago

Go with D. From Copilot: To monitor a video stream of the user and detect when the user asks an instructor a question while minimizing development effort, consider using object detection. Specifically, you can leverage existing models or frameworks (such as YOLOv3) to detect people in real time from the video stream. Once you identify a person asking a question, you can trigger further actions or alerts. This approach avoids the complexity of speech-to-text or language detection and focuses on the specific task at hand. Therefore, go with D. object detection in Azure AI Custom Vision!

upvoted 4 times

anto69

1 year ago

Selected Answer: A

To minimize effort: A is enough

upvoted 1 times

reiwanotora

1 year, 1 month ago

Selected Answer: A

The user's camera and microphone remain enabled, so A is right.

upvoted 2 times

anntv252

1 year, 2 months ago

Selected Answer: A

Because the user's camera and microphone remain enabled, the Azure AI Speech service is the recommended choice.

upvoted 1 times

Barry123456

1 year, 2 months ago

It says video stream. It doesn't say the video stream has audio. I deal with video-only streams all day. Don't assume.

upvoted 2 times

sivapolam90

1 year, 2 months ago

Selected Answer: A

A. speech-to-text in the Azure AI Speech service

upvoted 1 times

Murtuza

1 year, 2 months ago

Selected Answer: A

The best option for this scenario would be A. speech-to-text in the Azure AI Speech service. This service can transcribe the user’s spoken words into written text, which can then be analyzed to detect when a question is being asked. This would be more efficient and direct for detecting questions in a video stream, compared to the other options which focus on language detection, face recognition, and object detection. These other services might not be as effective for this specific use-case.

upvoted 2 times

NullVoider_0

1 year, 3 months ago

Selected Answer: A

A. speech-to-text in the Azure AI Speech service. This service can transcribe the spoken words into text in real time, which can then be analyzed to detect questions. It’s an efficient way to monitor for specific verbal cues or keywords that indicate a question is being asked, without the need for extensive programming or manual review. This approach minimizes development effort while providing a robust solution for the requirement.

upvoted 1 times

Murtuza

1 year, 3 months ago

The correct CHOICE is C. I made a silly typo but my explanations are right on point.

upvoted 3 times

Murtuza

1 year, 3 months ago

The other options are not directly relevant to detecting user questions in a video stream:

Speech-to-text (Option A): Converts spoken language into text. While useful for transcribing audio, it doesn’t directly address identifying user questions.

Language detection (Option B): Determines the language of text. It’s not specifically designed for monitoring video streams or detecting questions.

Object detection (Option D): Identifies objects within images, but it’s not suitable for detecting user interactions or questions.

Therefore, Option C (the Face service in Azure AI Vision) is the most appropriate choice for your scenario.

upvoted 3 times

Murtuza

1 year, 3 months ago

Selected Answer: A

Face Service (Azure AI Vision): The Face service provides facial recognition capabilities, which can be used to identify when a user is facing the camera (e.g., looking at the instructor). By analyzing facial features, expressions, and head movements, you can detect when a user is likely to be asking a question. This approach minimizes development effort because it directly addresses the requirement of monitoring the video stream for user interactions.

upvoted 1 times

Mehe323

1 year, 3 months ago

The user can talk, but it doesn't have to be a question. I think the focus should be on detecting whether something is a question or not and for that, you need speech to text first. Face doesn't make sense as identifying questions is not the purpose of that service: 'The Azure AI Face service provides AI algorithms that detect, recognize, and analyze human faces in images. Facial recognition software is important in many different scenarios, such as identification, touchless access control, and face blurring for privacy.' https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-identity

upvoted 4 times

chandiochan

1 year, 3 months ago

Selected Answer: A

Speech-to-text in the Azure AI Speech service. This service can transcribe spoken words into written text in real time, allowing you to monitor the audio for specific triggers, like questions, which can then be further processed or flagged for response. This solution is efficient and requires minimal development effort for integrating audio streaming and speech recognition capabilities.

upvoted 2 times
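
Editor's sketch: the comment above describes monitoring transcribed audio for triggers and flagging them. One way this might be wired to the Azure Speech SDK for Python (`azure-cognitiveservices-speech`) is shown below. It assumes the audio reaches a Python process through the default microphone, that the service returns punctuated text, and that a trailing "?" is a good enough trigger. `make_recognized_handler` and `start_monitoring` are hypothetical names; the key and region are placeholders the caller must supply.

```python
from types import SimpleNamespace


def make_recognized_handler(on_question):
    """Build a callback for the recognizer's `recognized` event."""
    def handle(evt):
        text = (evt.result.text or "").strip()
        # Assumption: a trailing "?" in the transcript marks a question.
        if text.endswith("?"):
            on_question(text)
    return handle


def start_monitoring(key: str, region: str, on_question):
    """Start continuous recognition and flag questions as they arrive."""
    # Imported lazily so the handler can be tried without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    recognizer.recognized.connect(make_recognized_handler(on_question))
    recognizer.start_continuous_recognition()
    # The caller ends the session with recognizer.stop_continuous_recognition().
    return recognizer


# Dry run with stand-in events, so no Azure resource is needed.
flagged = []
handler = make_recognized_handler(flagged.append)
handler(SimpleNamespace(result=SimpleNamespace(text="Could you explain that again?")))
handler(SimpleNamespace(result=SimpleNamespace(text="Thanks, that helps.")))
```

After the dry run, `flagged` holds only the first utterance. Keeping the handler separate from the SDK setup is a design choice made here so the flagging logic can be checked on its own.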

AlviraTony

1 year, 3 months ago

[ChatGPT] A. Speech-to-text in the Azure AI Speech service. Explanation: Speech-to-text functionality can convert spoken words into text, allowing you to analyze the content of the speech. By using speech-to-text, you can transcribe the user's spoken questions and then analyze the text to detect if a question is being asked to the instructor. This option aligns with the requirement to monitor the user's speech in real-time without significant development effort.

upvoted 1 times
