Exam AWS Certified Machine Learning - Specialty topic 1 question 289 discussion

A data engineer is preparing a dataset that a retail company will use to predict the number of visitors to stores. The data engineer created an Amazon S3 bucket. The engineer subscribed the S3 bucket to an AWS Data Exchange data product for general economic indicators. The data engineer wants to join the economic indicator data to an existing table in Amazon Athena to merge with the business data. All these transformations must finish running in 30-60 minutes.

Which solution will meet these requirements MOST cost-effectively?

  • A. Configure the AWS Data Exchange product as a producer for an Amazon Kinesis data stream. Use an Amazon Kinesis Data Firehose delivery stream to transfer the data to Amazon S3. Run an AWS Glue job that will merge the existing business data with the Athena table. Write the result set back to Amazon S3.
  • B. Use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function. Program the Lambda function to use Amazon SageMaker Data Wrangler to merge the existing business data with the Athena table. Write the result set back to Amazon S3.
  • C. Use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function. Program the Lambda function to run an AWS Glue job that will merge the existing business data with the Athena table. Write the results back to Amazon S3.
  • D. Provision an Amazon Redshift cluster. Subscribe to the AWS Data Exchange product and use the product to create an Amazon Redshift table. Merge the data in Amazon Redshift. Write the results back to Amazon S3.
Suggested Answer: C
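
For readers who want to see what option C looks like in practice, here is a minimal sketch of the Lambda handler that reacts to the S3 event notification and starts the Glue job. The job name merge-economic-indicators and the argument names are hypothetical placeholders; only the S3 event shape and the boto3 start_job_run call are the real API.

```python
import boto3

glue = boto3.client("glue")

# Hypothetical Glue job name; the job itself would be created separately
# and would contain the actual merge logic against the Athena table.
GLUE_JOB_NAME = "merge-economic-indicators"


def lambda_handler(event, context):
    """Triggered by an S3 event on the AWS Data Exchange bucket."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Start the Glue job asynchronously and hand it the newly delivered
        # object. The Lambda returns in seconds, well inside its 15-minute
        # limit, while the Glue job can run for the full 30-60 minute window.
        response = glue.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments={
                "--source_bucket": bucket,
                "--source_key": key,
            },
        )
        print(f"Started Glue job run {response['JobRunId']} for s3://{bucket}/{key}")

    return {"status": "ok"}
```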

Comments

AIWave
7 months, 2 weeks ago
Selected Answer: C
A: Kinesis adds unnecessary cost and complexity and will add latency. B: Data Wrangler is better suited for data prep and feature engineering, not for merging. C: Serverless, so cost-effective; the S3 trigger fires immediately, the Lambda stays well within its 15-minute window, and Glue is built for exactly these use cases. D: Costly setup and maintenance.
upvoted 1 times
vkbajoria
7 months, 3 weeks ago
Selected Answer: C
The hint is the 30-to-60-minute window: Lambda caps out at 15 minutes, so it should only kick off the job rather than do the merge itself. Plus, Glue provides a lot of built-in functionality that makes the merge process much easier.
upvoted 1 times
Stokvisss
7 months, 4 weeks ago
Selected Answer: C
A is not needed: Kinesis serves no purpose here. B is possible, but Data Wrangler is more expensive than C. C is serverless and cost-optimized, so C is correct. D is obviously too expensive.
upvoted 2 times
Alice1234
8 months, 2 weeks ago
C. Use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function. Program the Lambda function to run an AWS Glue job that will merge the existing business data with the Athena table. Write the results back to Amazon S3. This solution avoids the need for continuous data streams or provisioning a persistent database cluster, which can incur higher costs. AWS Lambda can trigger cost-effective, short-duration tasks, and AWS Glue is a managed ETL service that can handle the data transformation and merging efficiently. The integration with Amazon S3 and Athena also aligns with the existing data flow and tools.
upvoted 1 times
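
To make the comment above more concrete, here is a rough sketch of what the Glue job started by the Lambda function could look like. The database retail_db, table store_visits, join columns, file format, and output bucket are all hypothetical placeholders chosen for illustration; only the awsglue/PySpark calls are the real API, and a real job would follow the actual schemas of the Athena table and the Data Exchange files.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import Join
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Job arguments passed by the Lambda function (names are assumptions).
args = getResolvedOptions(sys.argv, ["JOB_NAME", "source_bucket", "source_key"])

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Economic indicator file that AWS Data Exchange delivered to the S3 bucket
# (CSV is an assumption; the actual product defines the format).
indicators = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": [f"s3://{args['source_bucket']}/{args['source_key']}"]},
    format="csv",
    format_options={"withHeader": True},
)

# Existing business table that Athena queries through the Glue Data Catalog.
business = glue_context.create_dynamic_frame.from_catalog(
    database="retail_db",
    table_name="store_visits",
)

# Join on a shared key (a date column is assumed here) and write the merged
# result back to S3, where Athena can query it.
merged = Join.apply(business, indicators, "visit_date", "indicator_date")
glue_context.write_dynamic_frame.from_options(
    frame=merged,
    connection_type="s3",
    connection_options={"path": "s3://example-results-bucket/merged/"},
    format="parquet",
)

job.commit()
```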
kyuhuck
8 months, 2 weeks ago
Selected Answer: B
Correction, C -> B. The most cost-effective solution is to use an S3 event to trigger a Lambda function that uses SageMaker Data Wrangler to merge the data. This solution avoids the need to provision and manage any additional resources, such as Kinesis streams, Firehose delivery streams, Glue jobs, or Redshift clusters. SageMaker Data Wrangler provides a visual interface to import, prepare, transform, and analyze data from various sources, including AWS Data Exchange products. It can also export the data preparation workflow to a Python script that can be executed by a Lambda function. This solution can meet the time requirement of 30-60 minutes, depending on the size and complexity of the data. References: Using Amazon S3 Event Notifications; Prepare ML Data with Amazon SageMaker Data Wrangler; AWS Lambda Function.
upvoted 1 times
kyuhuck
8 months, 2 weeks ago
Selected Answer: C
The most cost-effective and straightforward solution is C: use an S3 event on the AWS Data Exchange S3 bucket to invoke an AWS Lambda function, program the Lambda function to run an AWS Glue job that merges the existing business data with the Athena table, and write the results back to Amazon S3. This approach leverages the serverless architecture of AWS, minimizing operational overhead and cost while ensuring the transformations can be completed within the desired timeframe.
upvoted 2 times
Community vote distribution: A (35%), C (25%), B (20%), Other