Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 35 discussion

A company uses Amazon S3 to store semi-structured data in a transactional data lake. Some of the data files are small, but other data files are tens of terabytes.
A data engineer must perform a change data capture (CDC) operation to identify changed data from the data source. The data source sends a full snapshot as a JSON file every day and ingests the changed data into the data lake.
Which solution will capture the changed data MOST cost-effectively?

  • A. Create an AWS Lambda function to identify the changes between the previous data and the current data. Configure the Lambda function to ingest the changes into the data lake.
  • B. Ingest the data into Amazon RDS for MySQL. Use AWS Database Migration Service (AWS DMS) to write the changed data to the data lake.
  • C. Use an open source data lake format to merge the data source with the S3 data lake to insert the new data and update the existing data.
  • D. Ingest the data into an Amazon Aurora MySQL DB instance that runs Aurora Serverless. Use AWS Database Migration Service (AWS DMS) to write the changed data to the data lake.
Suggested Answer: C
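
To make the merge in option C concrete, here is a minimal pure-Python sketch of a keyed upsert. It is only a toy model of the semantics: open table formats such as Apache Iceberg or Delta Lake perform the equivalent operation on files in S3 through their own merge/upsert features. The `id` key, the sample rows, and the `merge_snapshot` helper are all hypothetical.

```python
# Toy model of a keyed upsert (merge). Rows are dicts and "id" is an
# assumed primary key; a real table format does this at the file level.
def merge_snapshot(table, snapshot, key="id"):
    """Return (merged_rows, inserted_keys, updated_keys)."""
    merged = {row[key]: row for row in table}
    inserted, updated = [], []
    for row in snapshot:
        k = row[key]
        if k not in merged:
            inserted.append(k)      # new key: insert the row
            merged[k] = row
        elif merged[k] != row:
            updated.append(k)       # existing key, changed values: update
            merged[k] = row
        # unchanged rows are left alone
    return list(merged.values()), inserted, updated


lake = [{"id": 1, "city": "Rome"}, {"id": 2, "city": "Oslo"}]
daily = [
    {"id": 1, "city": "Rome"},
    {"id": 2, "city": "Bergen"},
    {"id": 3, "city": "Lima"},
]
merged, inserted, updated = merge_snapshot(lake, daily)
print(inserted, updated)  # prints: [3] [2]
```

Only the rows that are new or changed are written, which is why no separate database or replication service is needed to derive the change set.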

Comments

GiorgioGss
Highly Voted 1 year, 1 month ago
Selected Answer: C
https://aws.amazon.com/blogs/big-data/implement-a-cdc-based-upsert-in-a-data-lake-using-apache-iceberg-and-aws-glue/
upvoted 7 times
...
plutonash
Most Recent 3 months, 1 week ago
Selected Answer: A
Generally, AWS questions never prefer another solution over an AWS service, so even if C could be better, the answer is A.
upvoted 1 times
Juan_pc
2 days, 5 hours ago
But some files are tens of terabytes, and Lambda has a maximum execution time of 15 minutes, which might not be enough time to process data that large.
upvoted 1 times
...
...
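
A rough back-of-the-envelope check of the 15-minute point above, assuming a single 10 TB snapshot (the question only says "tens of terabytes", so the size is an illustrative assumption) processed by one invocation:

```python
LAMBDA_TIMEOUT_S = 15 * 60        # 15-minute execution limit, in seconds
file_size_bytes = 10 * 10**12     # assumed example size: 10 TB (decimal)

# Throughput one invocation would have to sustain to read the file once.
required_gb_per_s = file_size_bytes / LAMBDA_TIMEOUT_S / 10**9
print(f"{required_gb_per_s:.1f} GB/s sustained")  # prints: 11.1 GB/s sustained
```

That is just to read the file once, before any parsing or comparison against the previous snapshot.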
influxy
8 months, 2 weeks ago
https://aws.amazon.com/blogs/big-data/choosing-an-open-table-format-for-your-transactional-data-lake-on-aws/
upvoted 1 times
...
FunkyFresco
11 months ago
Selected Answer: C
I'll go with Delta or something like that. The answer is C.
upvoted 2 times
...
certplan
1 year, 1 month ago
Relative to cost, these docs support option C:
https://docs.aws.amazon.com/AmazonS3/latest/dev/Welcome.html
https://aws.amazon.com/blogs/big-data/
https://docs.aws.amazon.com/glue/latest/dg/welcome.html
https://docs.aws.amazon.com/emr/
These docs explain why the other options are not correct:
https://aws.amazon.com/lambda/pricing/
https://aws.amazon.com/rds/pricing/
https://aws.amazon.com/dms/pricing/
upvoted 2 times
...
damaldon
1 year, 1 month ago
Answer: D. You can migrate data from any MySQL-compatible database (MySQL, MariaDB, or Amazon Aurora MySQL) using AWS Database Migration Service. https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MySQL.html
upvoted 1 times
Juan_pc
2 days, 5 hours ago
D is not the most cost-effective solution.
upvoted 1 times
...
GiorgioGss
1 year, 1 month ago
"other data files are tens of terabytes" - good luck with DMS on that :) I think it's C
upvoted 6 times
...
...
[Removed]
1 year, 3 months ago
Selected Answer: C
This is a tricky one. Although option A seems like the best choice since it uses an AWS service, I believe using Delta/Iceberg APIs would be easier than writing custom code on Lambda
upvoted 4 times
Houyon
1 year, 2 months ago
If all the files were small, I believe it would be a great idea. However, you wouldn't be able to compare large files with Lambda because of its memory/capacity and runtime constraints.
upvoted 4 times
...
...
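
For small files, the comparison described in option A might look like the sketch below. It parses both snapshots fully into memory, which is exactly what stops working once a snapshot is larger than the memory a Lambda function can be given. The `id` key, the sample records, and the `diff_snapshots` helper are hypothetical.

```python
import json


def diff_snapshots(previous_json, current_json, key="id"):
    """Compare two full JSON snapshots held entirely in memory."""
    prev = {r[key]: r for r in json.loads(previous_json)}
    curr = {r[key]: r for r in json.loads(current_json)}
    return {
        "inserted": sorted(k for k in curr if k not in prev),
        "updated": sorted(k for k in curr if k in prev and curr[k] != prev[k]),
        "deleted": sorted(k for k in prev if k not in curr),
    }


yesterday = '[{"id": 1, "qty": 5}, {"id": 2, "qty": 7}, {"id": 4, "qty": 1}]'
today = '[{"id": 1, "qty": 5}, {"id": 2, "qty": 9}, {"id": 3, "qty": 2}]'
changes = diff_snapshots(yesterday, today)
print(changes)  # prints: {'inserted': [3], 'updated': [2], 'deleted': [4]}
```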