Exam AWS Certified Machine Learning - Specialty topic 1 question 197 discussion

A company has a podcast platform that has thousands of users. The company has implemented an anomaly detection algorithm to detect low podcast engagement based on a 10-minute running window of user events such as listening, pausing, and exiting the podcast. A machine learning (ML) specialist is designing the data ingestion of these events with the knowledge that the event payload needs some small transformations before inference.

How should the ML specialist design the data ingestion to meet these requirements with the LEAST operational overhead?

  • A. Ingest event data by using a GraphQL API in AWS AppSync. Store the data in an Amazon DynamoDB table. Use DynamoDB Streams to call an AWS Lambda function to transform the most recent 10 minutes of data before inference.
  • B. Ingest event data by using Amazon Kinesis Data Streams. Store the data in Amazon S3 by using Amazon Kinesis Data Firehose. Use AWS Glue to transform the most recent 10 minutes of data before inference.
  • C. Ingest event data by using Amazon Kinesis Data Streams. Use an Amazon Kinesis Data Analytics for Apache Flink application to transform the most recent 10 minutes of data before inference.
  • D. Ingest event data by using Amazon Managed Streaming for Apache Kafka (Amazon MSK). Use an AWS Lambda function to transform the most recent 10 minutes of data before inference.
Suggested Answer: C

Comments

wjohnny
Highly Voted 1 year, 11 months ago
Selected Answer: B
B, although Kinesis Data Firehose could have been used directly instead of Kinesis Data Streams.
upvoted 7 times
...
dunhill
Highly Voted 1 year, 11 months ago
I think the answer is B.
upvoted 6 times
...
MultiCloudIronMan
Most Recent 2 weeks ago
Selected Answer: C
This has the least latency
upvoted 1 times
...
72cc81d
3 months, 2 weeks ago
Selected Answer: C
Moving window, and fewer components.
upvoted 1 times
...
Gmishra
7 months, 1 week ago
Selected Answer: B
C doesn't say how to store the data in Amazon S3.
upvoted 2 times
...
AIWave
8 months, 3 weeks ago
Selected Answer: B
With Amazon Kinesis Data Analytics for Apache Flink, the ML specialist needs to manage the scaling and resource allocation for the Flink application, including determining the appropriate number of processing units (KPUs) and handling scaling based on the incoming data volume. This requires monitoring and adjusting resources as needed, adding to the operational overhead.
upvoted 2 times
...
akgarg00
11 months, 3 weeks ago
Selected Answer: C
C is the correct answer. B is workable but is not a good fit for the small transformations the question calls for.
upvoted 1 times
...
geoan13
1 year ago
C. Amazon Managed Service for Apache Flink was previously known as Amazon Kinesis Data Analytics for Apache Flink. It lets you process and analyze streaming data, including performing transformations on the stream. As for B, there is no need for an extra service such as AWS Glue.
upvoted 1 times
...
DimLam
1 year ago
Selected Answer: C
I would choose C. We need to run the detection on a running window, whereas B only lets us operate on the latest 10 minutes of data each time the job runs. If we choose B, we also need to decide how frequently to run the Glue job, and that involves an orchestration tool. C, by contrast, works in real time, so we don't need an orchestration tool to move the window. Based on this, I would go with C, as it has less overhead.
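To make the running-window point concrete, here is a minimal plain-Python sketch (not Flink code) of a window that moves with every incoming event. The `Event` fields, the `RunningWindow` class, and the `engagement_features` transformation are all hypothetical names chosen for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass

WINDOW_SECONDS = 600  # the 10-minute running window from the question


@dataclass
class Event:
    user_id: str
    action: str       # e.g. "listen", "pause", "exit" (hypothetical values)
    timestamp: float  # seconds since epoch


class RunningWindow:
    """Keeps only the events from the last `window_seconds` seconds."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._events = deque()

    def add(self, event):
        """Append an event, evict expired ones, return the current window."""
        self._events.append(event)
        cutoff = event.timestamp - self.window_seconds
        # Events at or before the cutoff have left the window.
        while self._events and self._events[0].timestamp <= cutoff:
            self._events.popleft()
        return list(self._events)


def engagement_features(events):
    """Small transformation before inference: normalize and count actions."""
    counts = {"listen": 0, "pause": 0, "exit": 0}
    for e in events:
        action = e.action.strip().lower()
        if action in counts:
            counts[action] += 1
    return counts
```

Because the window is updated on every event, nothing has to be scheduled to "move" it, which is the difference from a periodically triggered batch job.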
upvoted 1 times
...
backbencher2022
1 year, 1 month ago
Selected Answer: C
Would choose C, as the transformation required is minimal and could easily be achieved with KDA (a Flink job).
upvoted 1 times
...
loict
1 year, 2 months ago
Selected Answer: B
Not sure between B & C.
A. NO - too many moving parts
B. YES - clean & elegant
C. YES - works as well in batch mode
D. NO - MSK is outdated
upvoted 2 times
...
Shenannigan
1 year, 2 months ago
Selected Answer: C
https://aws.amazon.com/blogs/architecture/realtime-in-stream-inference-kinesis-sagemaker-flink/
upvoted 2 times
...
Mickey321
1 year, 2 months ago
Selected Answer: C
Amazon Kinesis Data Streams is a fully managed real-time streaming service that can be used to ingest large amounts of data from multiple sources. This makes it a good choice for ingesting the event data from the podcast platform. Amazon Kinesis Data Analytics for Apache Flink is a fully managed service that can be used to process streaming data using Apache Flink. Apache Flink is a popular streaming processing framework that is known for its scalability and fault tolerance. This makes it a good choice for transforming the event data before inference.
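As a rough illustration of the ingestion side, the sketch below only builds the arguments for a Kinesis `PutRecord` call and does not contact AWS. The stream name `podcast-events`, the event fields, and the helper `build_put_record_request` are assumptions made for this example, not anything stated in the question.

```python
import json

STREAM_NAME = "podcast-events"  # hypothetical stream name


def build_put_record_request(event, stream_name=STREAM_NAME):
    """Build keyword arguments for a Kinesis PutRecord call.

    Using the user ID as the partition key keeps one user's events
    on the same shard, so they stay in order for that user.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": event["user_id"],
    }
```

With boto3 installed and credentials configured, the result could be sent with `boto3.client("kinesis").put_record(**request)`.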
upvoted 2 times
...
ADVIT
1 year, 4 months ago
Selected Answer: B
It's B. "LEAST operational overhead": C has more operational overhead.
upvoted 2 times
DimLam
1 year ago
No, that's not true. To run B we need an orchestrator to run the Glue job frequently, whereas C runs continuously. So C has no orchestration step.
upvoted 1 times
...
...
dkx
1 year, 6 months ago
Selected Answer: C
Flink distributes the data across one or more stream partitions, and user-defined operators can transform the data stream.
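The idea can be sketched in plain Python. This is not the Flink API: `partition_for`, `key_by`, and `map_operator` are hypothetical helpers that only mimic the concept of routing records to partitions by key and applying a user-defined function to each record.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Pick a partition from a stable hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def key_by(events, key_field, num_partitions):
    """Route each event to a partition based on its key field."""
    partitions = defaultdict(list)
    for event in events:
        index = partition_for(event[key_field], num_partitions)
        partitions[index].append(event)
    return dict(partitions)


def map_operator(partitions, fn):
    """Apply a user-defined transformation to every event, per partition."""
    return {p: [fn(e) for e in evs] for p, evs in partitions.items()}
```

All events with the same key land in the same partition, so a per-user window can be maintained without coordinating across partitions.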
upvoted 2 times
...
blanco750
1 year, 8 months ago
Selected Answer: B
B is the answer: least management overhead.
upvoted 4 times
blanco750
1 year, 8 months ago
And for C you have to author and build your Apache Flink application, which is extra work.
upvoted 1 times
DimLam
1 year ago
And for Glue you need to write a SQL or Spark script. Extra work. Ah, yes, and for B you need to create an orchestrator to run the ETL jobs frequently.
upvoted 1 times
...
...
...
pan_b
1 year, 8 months ago
Selected Answer: C
Answer should be C. https://aws.amazon.com/blogs/architecture/realtime-in-stream-inference-kinesis-sagemaker-flink/
upvoted 3 times
dkx
1 year, 6 months ago
Flink distributes the data across one or more stream partitions, and user-defined operators can transform the data stream.
upvoted 1 times
...
...