Exam AWS Certified Machine Learning - Specialty topic 1 question 289 discussion

A data engineer is preparing a dataset that a retail company will use to predict the number of visitors to stores. The data engineer created an Amazon S3 bucket. The engineer subscribed the S3 bucket to an AWS Data Exchange data product for general economic indicators. The data engineer wants to join the economic indicator data to an existing table in Amazon Athena to merge with the business data. All these transformations must finish running in 30-60 minutes.

Which solution will meet these requirements MOST cost-effectively?

  • A. Configure the AWS Data Exchange product as a producer for an Amazon Kinesis data stream. Use an Amazon Kinesis Data Firehose delivery stream to transfer the data to Amazon S3. Run an AWS Glue job that will merge the existing business data with the Athena table. Write the result set back to Amazon S3.
  • B. Use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function. Program the Lambda function to use Amazon SageMaker Data Wrangler to merge the existing business data with the Athena table. Write the result set back to Amazon S3.
  • C. Use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function. Program the Lambda function to run an AWS Glue job that will merge the existing business data with the Athena table. Write the results back to Amazon S3.
  • D. Provision an Amazon Redshift cluster. Subscribe to the AWS Data Exchange product and use the product to create an Amazon Redshift table. Merge the data in Amazon Redshift. Write the results back to Amazon S3.
Suggested Answer: C
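
For readers who want to see what option C looks like in practice, here is a minimal sketch of the Lambda handler that reacts to the S3 event notification and starts the Glue job. The job name merge-economic-indicators and the argument names are hypothetical placeholders; only the S3 event shape and the boto3 start_job_run call are the real API.

```python
import boto3

glue = boto3.client("glue")

# Hypothetical Glue job name; the job itself would be created separately
# and would contain the actual merge logic against the Athena table.
GLUE_JOB_NAME = "merge-economic-indicators"


def lambda_handler(event, context):
    """Triggered by an S3 event on the AWS Data Exchange bucket."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Start the Glue job asynchronously and hand it the newly delivered
        # object. The Lambda returns in seconds, well inside its 15-minute
        # limit, while the Glue job can run for the full 30-60 minute window.
        response = glue.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={
                "--source_bucket": bucket,
                "--source_key": key,
            },
        )
        print(f"Started Glue job run {response['JobRunId']} for s3://{bucket}/{key}")

    return {"status": "ok"}
```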

Comments

AIWave
7 months, 2 weeks ago
Selected Answer: C
A: Kinesis adds unnecessary cost and complexity and will add latency. B: Data Wrangler is better suited for data prep and feature engineering, not for merging. C: Serverless, so cost-effective; the S3 trigger fires immediately, the Lambda stays well within its 15-minute window, and Glue is built for exactly these use cases. D: Costly setup and maintenance.
upvoted 1 times
vkbajoria
7 months, 3 weeks ago
Selected Answer: C
The hint is the 30-to-60-minute window: Lambda caps out at 15 minutes, so it should only kick off the job rather than do the merge itself. Plus, Glue provides a lot of built-in functionality that makes the merge process much easier.
upvoted 1 times
Stokvisss
7 months, 4 weeks ago
Selected Answer: C
A is not needed: Kinesis serves no purpose here. B is possible, but Data Wrangler is more expensive than C. C is serverless and cost-optimized, so C is correct. D is obviously too expensive.
upvoted 2 times
Alice1234
8 months, 2 weeks ago
C. Use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function. Program the Lambda function to run an AWS Glue job that will merge the existing business data with the Athena table. Write the results back to Amazon S3. This solution avoids the need for continuous data streams or provisioning a persistent database cluster, which can incur higher costs. AWS Lambda can trigger cost-effective, short-duration tasks, and AWS Glue is a managed ETL service that can handle the data transformation and merging efficiently. The integration with Amazon S3 and Athena also aligns with the existing data flow and tools.
upvoted 1 times
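
To make the comment above more concrete, here is a rough sketch of what the Glue job started by the Lambda function could look like. The database retail_db, table store_visits, join columns, file format, and output bucket are all hypothetical placeholders chosen for illustration; only the awsglue/PySpark calls are the real API, and a real job would follow the actual schemas of the Athena table and the Data Exchange files.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import Join
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Job arguments passed by the Lambda function (names are assumptions).
args = getResolvedOptions(sys.argv, ["JOB_NAME", "source_bucket", "source_key"])

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Economic indicator file that AWS Data Exchange delivered to the S3 bucket
# (CSV is an assumption; the actual product defines the format).
indicators = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": [f"s3://{args['source_bucket']}/{args['source_key']}"]},
    format="csv",
    format_options={"withHeader": True},
)

# Existing business table that Athena queries through the Glue Data Catalog.
business = glue_context.create_dynamic_frame.from_catalog(
    database="retail_db",
    table_name="store_visits",
)

# Join on a shared key (a date column is assumed here) and write the merged
# result back to S3, where Athena can query it.
merged = Join.apply(business, indicators, "visit_date", "indicator_date")
glue_context.write_dynamic_frame.from_options(
    frame=merged,
    connection_type="s3",
    connection_options={"path": "s3://example-results-bucket/merged/"},
    format="parquet",
)

job.commit()
```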
kyuhuck
8 months, 2 weeks ago
Selected Answer: B
Correction, C -> B. The most cost-effective solution is to use an S3 event to trigger a Lambda function that uses SageMaker Data Wrangler to merge the data. This solution avoids the need to provision and manage any additional resources, such as Kinesis streams, Firehose delivery streams, Glue jobs, or Redshift clusters. SageMaker Data Wrangler provides a visual interface to import, prepare, transform, and analyze data from various sources, including AWS Data Exchange products. It can also export the data preparation workflow to a Python script that can be executed by a Lambda function. This solution can meet the time requirement of 30-60 minutes, depending on the size and complexity of the data. References: Using Amazon S3 Event Notifications; Prepare ML Data with Amazon SageMaker Data Wrangler; AWS Lambda Function.
upvoted 1 times
kyuhuck
8 months, 2 weeks ago
Selected Answer: C
The most cost-effective and straightforward solution is C: use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function, program the Lambda function to run an AWS Glue job that merges the existing business data with the Athena table, and write the results back to Amazon S3. This approach leverages the serverless architecture of AWS, minimizing operational overhead and cost while ensuring the transformations can be completed within the desired timeframe.
upvoted 2 times
Community vote distribution: A (35%), C (25%), B (20%), Other