Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 46 discussion

A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucket/prefix/year=2023/month=01/day=01.
A data engineer must ensure that the AWS Glue Data Catalog synchronizes with the S3 storage when the company adds new partitions to the bucket.
Which solution will meet these requirements with the LEAST latency?

  • A. Schedule an AWS Glue crawler to run every morning.
  • B. Manually run the AWS Glue CreatePartition API twice each day.
  • C. Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call.
  • D. Run the MSCK REPAIR TABLE command from the AWS Glue console.
Suggested Answer: C

Comments

rralucard_
Highly Voted 8 months, 3 weeks ago
Selected Answer: C
Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call. This approach ensures that the Data Catalog is updated as soon as new data is written to S3, providing the least latency in reflecting new partitions.
upvoted 8 times
Tester_TKK
6 days ago
Hey, did you have some of the ExamTopics questions in the exam?
upvoted 1 times
...
...
pypelyncar
Most Recent 4 months, 2 weeks ago
Selected Answer: C
By embedding the Boto3 create_partition API call within the code that writes data to S3, you achieve near real-time synchronization. The Data Catalog is updated immediately after a new partition is created in S3.
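A rough sketch of what that could look like, assuming the Glue table already exists and the writer knows which year/month/day it just wrote. The helper names and the database, table, bucket, and prefix arguments are hypothetical; `glue` is expected to be a Boto3 Glue client (`boto3.client("glue")`), and `get_table` / `create_partition` are the real client calls.

```python
import copy


def partition_values_from_key(key):
    """Pull the year/month/day values out of a Hive-style S3 object key."""
    values = {}
    for part in key.split("/"):
        name, sep, value = part.partition("=")
        if sep and name in ("year", "month", "day"):
            values[name] = value
    return [values[k] for k in ("year", "month", "day")]


def register_partition(glue, database, table, bucket, prefix, values):
    """Register one partition in the Data Catalog right after the S3 write.

    `glue` is assumed to be boto3.client("glue"); the other arguments are
    placeholders for your own names.
    """
    # Reuse the table's storage descriptor so format and SerDe settings match.
    table_sd = glue.get_table(DatabaseName=database, Name=table)["Table"]["StorageDescriptor"]
    sd = copy.deepcopy(table_sd)
    year, month, day = values
    sd["Location"] = f"s3://{bucket}/{prefix}/year={year}/month={month}/day={day}/"
    glue.create_partition(
        DatabaseName=database,
        TableName=table,
        PartitionInput={"Values": list(values), "StorageDescriptor": sd},
    )
    return sd["Location"]
```

The writer would call `register_partition(...)` once per new partition, directly after its S3 put succeeds. The sketch does not handle the case where the partition already exists, so a real job would need to catch that error or check first.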
upvoted 4 times
...
tgv
4 months, 3 weeks ago
Selected Answer: C
The explanation could be more precise regarding the interaction with Amazon S3 and AWS Glue. The key point is that the process should be triggered immediately when new data is added to S3. This can be achieved through event-driven architecture, which indeed makes the solution intuitive and efficient.
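One possible reading of "event-driven" here is an S3 event notification that invokes a Lambda function, which is an assumption and not what option C literally describes. Under that assumption, the sketch below shows only the parsing step: turning the S3 event payload into the distinct partitions that would need registering. The function name is hypothetical; the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields are the standard S3 notification layout, and object keys arrive URL-encoded (so `=` shows up as `%3D`).

```python
from urllib.parse import unquote_plus

PARTITION_KEYS = ("year", "month", "day")


def partitions_from_s3_event(event):
    """Return sorted (values, location) pairs for partitions seen in an S3 event."""
    found = set()
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 notifications are URL-encoded, so decode before parsing.
        key = unquote_plus(record["s3"]["object"]["key"])
        segments = key.split("/")[:-1]  # drop the object file name
        parsed = dict(s.split("=", 1) for s in segments if "=" in s)
        if all(k in parsed for k in PARTITION_KEYS):
            values = tuple(parsed[k] for k in PARTITION_KEYS)
            # Partition location ends at the day= segment.
            last = max(i for i, s in enumerate(segments) if s.startswith("day="))
            location = f"s3://{bucket}/" + "/".join(segments[: last + 1]) + "/"
            found.add((values, location))
    return sorted(found)
```

A Lambda handler would loop over the returned pairs and make one Glue `create_partition` call per pair. Several objects landing in the same partition collapse to a single entry, which avoids repeated calls for the same day.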
upvoted 2 times
...
valuedate
5 months ago
Selected Answer: C
Add the partition after writing the data to S3.
upvoted 1 times
...
DevoteamAnalytix
5 months, 1 week ago
Selected Answer: D
It's about "synchronizing AWS Glue Data Catalog with S3". So for me it's D - using MSCK REPAIR TABLE for existing S3 partitions (https://docs.aws.amazon.com/athena/latest/ug/msck-repair-table.html)
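For comparison, the linked page documents MSCK REPAIR TABLE as an Athena statement, so in practice it is submitted as a query. The sketch below, with hypothetical helper names, builds that statement next to a targeted ALTER TABLE ADD PARTITION and submits either through the Athena `start_query_execution` call. It assumes `athena` is `boto3.client("athena")`, that the partition columns are strings, and that the database and output location are your own.

```python
def msck_repair_sql(table):
    """Statement that rescans the whole table location for Hive-style partitions."""
    return f"MSCK REPAIR TABLE `{table}`"


def add_partition_sql(table, year, month, day, location):
    """Statement that registers exactly one known partition."""
    return (
        f"ALTER TABLE `{table}` ADD IF NOT EXISTS "
        f"PARTITION (year='{year}', month='{month}', day='{day}') "
        f"LOCATION '{location}'"
    )


def run_athena(athena, sql, database, output_location):
    """Submit a statement to Athena and return the query execution ID.

    `athena` is assumed to be boto3.client("athena").
    """
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

Because MSCK REPAIR TABLE scans the table's whole S3 location, its run time grows with the number of partitions, and someone still has to trigger it. That is the latency trade-off against registering the single new partition at write time.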
upvoted 1 times
megadba
5 months, 1 week ago
Least latency
upvoted 1 times
...
...
okechi
6 months, 2 weeks ago
The answer is D
upvoted 2 times
...
GiorgioGss
7 months, 1 week ago
Selected Answer: C
It's pure event-driven so... C
upvoted 1 times
...
atu1789
9 months ago
Selected Answer: C
C. Least latency
upvoted 1 times
...