Exam AWS Certified Machine Learning - Specialty All Questions

View all questions & answers for the AWS Certified Machine Learning - Specialty exam

Exam AWS Certified Machine Learning - Specialty topic 1 question 197 discussion

Exam question from Amazon's AWS Certified Machine Learning - Specialty

Question #: 197
Topic #: 1

A company has a podcast platform that has thousands of users. The company has implemented an anomaly detection algorithm to detect low podcast engagement based on a 10-minute running window of user events such as listening, pausing, and exiting the podcast. A machine learning (ML) specialist is designing the data ingestion of these events with the knowledge that the event payload needs some small transformations before inference.

How should the ML specialist design the data ingestion to meet these requirements with the LEAST operational overhead?

A. Ingest event data by using a GraphQL API in AWS AppSync. Store the data in an Amazon DynamoDB table. Use DynamoDB Streams to call an AWS Lambda function to transform the most recent 10 minutes of data before inference.
B. Ingest event data by using Amazon Kinesis Data Streams. Store the data in Amazon S3 by using Amazon Kinesis Data Firehose. Use AWS Glue to transform the most recent 10 minutes of data before inference.
C. Ingest event data by using Amazon Kinesis Data Streams. Use an Amazon Kinesis Data Analytics for Apache Flink application to transform the most recent 10 minutes of data before inference.
D. Ingest event data by using Amazon Managed Streaming for Apache Kafka (Amazon MSK). Use an AWS Lambda function to transform the most recent 10 minutes of data before inference.

Suggested Answer: C

by dunhill at Nov. 28, 2022, 6:20 p.m.

Comments

wjohnny

Highly Voted 2 years, 4 months ago

Selected Answer: B

B, but it would be possible to use Kinesis Data Firehose directly instead of Kinesis Data Streams.

upvoted 7 times

...

dunhill

Highly Voted 2 years, 5 months ago

I think the answer is B.

upvoted 6 times

...

MultiCloudIronMan

Most Recent 5 months, 4 weeks ago

Selected Answer: C

This has the least latency

upvoted 1 times

...

72cc81d

9 months ago

Selected Answer: C

Moving window, and fewer components.

upvoted 1 times

...

Gmishra

1 year ago

Selected Answer: B

C doesn't say how to store the data in Amazon S3.

upvoted 2 times

...

AIWave

1 year, 2 months ago

Selected Answer: B

With Amazon Kinesis Data Analytics for Apache Flink, the ML specialist needs to manage the scaling and resource allocation for the Flink application, including determining the appropriate number of processing units (KPUs) and handling scaling based on the incoming data volume. This requires monitoring and adjusting resources as needed, adding to the operational overhead.

upvoted 2 times

...

akgarg00

1 year, 5 months ago

Selected Answer: C

C is the correct answer. B is workable but is not a good fit for the small transformation required in the question.

upvoted 1 times

...

geoan13

1 year, 5 months ago

C. Amazon Managed Service for Apache Flink, previously known as Amazon Kinesis Data Analytics for Apache Flink, lets you process and analyze streaming data and apply transformations to it. B: there is no need for an extra service such as AWS Glue.

upvoted 1 times

...

DimLam

1 year, 6 months ago

Selected Answer: C

I would choose C. We need to run our detection on a running window, whereas B only lets us operate on the latest 10 minutes of data. If we choose B, we also need to decide how frequently to run the Glue job, which involves an orchestration tool. C, on the other hand, works in real time, and we don't need an orchestration tool to move the window. Based on this, I would go with C, as it has less overhead.

upvoted 1 times
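To make the "running window" point concrete, here is a minimal pure-Python sketch of a window that moves with every incoming event. It is not Flink code and not the platform's actual implementation: the `Event` fields, the `RunningWindow` class, and the assumption that events arrive in timestamp order are all hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass

WINDOW_SECONDS = 10 * 60  # the 10-minute running window from the question


@dataclass
class Event:
    user_id: str      # hypothetical field name
    action: str       # e.g. "listen", "pause", "exit"
    timestamp: float  # seconds; any monotonically increasing clock works


class RunningWindow:
    """Keeps only the events from the most recent window_seconds."""

    def __init__(self, window_seconds: float = WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.events = deque()

    def add(self, event: Event) -> None:
        # Assumption: events arrive in timestamp order (a simplification).
        self.events.append(event)
        cutoff = event.timestamp - self.window_seconds
        # Evict everything that has fallen out of the window.
        while self.events and self.events[0].timestamp <= cutoff:
            self.events.popleft()

    def action_counts(self) -> dict:
        # Simple per-action tally over the events still in the window.
        counts = {}
        for e in self.events:
            counts[e.action] = counts.get(e.action, 0) + 1
        return counts


window = RunningWindow()
window.add(Event("u1", "listen", 0))
window.add(Event("u1", "pause", 300))
window.add(Event("u2", "exit", 700))  # evicts the event at t=0
print(window.action_counts())  # {'pause': 1, 'exit': 1}
```

The window advances as a side effect of each arriving event, so nothing external has to schedule "the next 10 minutes".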

...

backbencher2022

1 year, 6 months ago

Selected Answer: C

Would choose C, as the transformation required is minimal and could easily be achieved with KDA (a Flink job).

upvoted 1 times

...

loict

1 year, 7 months ago

Selected Answer: B

Not sure between B & C.
A. NO - too many moving parts
B. YES - clean & elegant
C. YES - works as well in batch mode
D. NO - MSK is outdated

upvoted 2 times

...

Shenannigan

1 year, 7 months ago

Selected Answer: C

https://aws.amazon.com/blogs/architecture/realtime-in-stream-inference-kinesis-sagemaker-flink/

upvoted 2 times

...

Mickey321

1 year, 8 months ago

Selected Answer: C

Amazon Kinesis Data Streams is a fully managed real-time streaming service that can be used to ingest large amounts of data from multiple sources. This makes it a good choice for ingesting the event data from the podcast platform. Amazon Kinesis Data Analytics for Apache Flink is a fully managed service that can be used to process streaming data using Apache Flink. Apache Flink is a popular streaming processing framework that is known for its scalability and fault tolerance. This makes it a good choice for transforming the event data before inference.

upvoted 2 times

...

ADVIT

1 year, 9 months ago

Selected Answer: B

It's B: "LEAST operational overhead". C has more operational overhead.

upvoted 2 times

DimLam

1 year, 6 months ago

No, that's not true. To run B, we need an orchestrator to run the Glue job frequently, whereas C runs constantly. So C has no orchestration step.

upvoted 1 times
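The orchestration argument can be illustrated with a little arithmetic. The sketch below assumes a hypothetical batch job scheduled at fixed intervals (runs at 0, `interval`, 2 × `interval`, and so on) and computes how long an event waits before the next run picks it up. It ignores Firehose buffering and job start-up time, so real delays would be larger. The function names are made up for this example.

```python
import math


def next_run(event_time: float, interval: float) -> float:
    """First scheduled run at or after event_time (runs at 0, interval, 2*interval, ...)."""
    return math.ceil(event_time / interval) * interval


def wait_until_processed(event_time: float, interval: float) -> float:
    """Seconds an event sits unprocessed until the next scheduled batch run."""
    return next_run(event_time, interval) - event_time


# An event at t=61s with a job scheduled every 5 minutes waits 239 seconds.
print(wait_until_processed(61, 300))  # 239
```

A continuously running stream job has no such schedule to choose or maintain, which is the overhead being compared here.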

...

dkx

1 year, 11 months ago

Selected Answer: C

Flink distributes the data across one or more stream partitions, and user-defined operators can transform the data stream.

upvoted 2 times
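As a rough illustration of "operators transforming a stream", the following pure-Python sketch chains two generator functions, each applied to one record at a time. This only mimics the idea and is not the Flink API. The payload field names (`userId`, `action`, `ts`) are assumptions, since the question does not specify the event format.

```python
import json
from typing import Iterable, Iterator


def parse(records: Iterable[str]) -> Iterator[dict]:
    """Operator 1: decode each raw JSON record."""
    for raw in records:
        yield json.loads(raw)


def normalize(events: Iterable[dict]) -> Iterator[dict]:
    """Operator 2: a small per-event transformation before inference."""
    for e in events:
        yield {
            "user_id": str(e["userId"]),            # hypothetical field
            "action": e["action"].strip().lower(),  # tidy up the action label
            "ts": float(e["ts"]),
        }


raw_stream = ['{"userId": 7, "action": " PAUSE ", "ts": 1700000000}']
transformed = list(normalize(parse(raw_stream)))
print(transformed)
```

Because each operator handles one record as it arrives, the transformation never needs the data to be landed in storage first.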

...

blanco750

2 years, 1 month ago

Selected Answer: B

B is the answer: least management overhead.

upvoted 4 times

blanco750

2 years, 1 month ago

And for C you have to author and build your Apache Flink application, which is extra work.

upvoted 1 times

DimLam

1 year, 6 months ago

And for Glue you need to write a SQL or Spark script. Extra work. Ah, yes, and for B you need to create an orchestrator to run the ETL jobs frequently.

upvoted 1 times

...

pan_b

2 years, 1 month ago

Selected Answer: C

Answer should be C. https://aws.amazon.com/blogs/architecture/realtime-in-stream-inference-kinesis-sagemaker-flink/

upvoted 3 times

dkx

1 year, 11 months ago

Flink distributes the data across one or more stream partitions, and user-defined operators can transform the data stream.

upvoted 1 times

...
