Exam AWS Certified Data Engineer - Associate DEA-C01 All Questions

View all questions & answers for the AWS Certified Data Engineer - Associate DEA-C01 exam

Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 46 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 46
Topic #: 1

A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucket/prefix/year=2023/month=01/day=01.
A data engineer must ensure that the AWS Glue Data Catalog synchronizes with the S3 storage when the company adds new partitions to the bucket.
Which solution will meet these requirements with the LEAST latency?

A. Schedule an AWS Glue crawler to run every morning.
B. Manually run the AWS Glue CreatePartition API twice each day.
C. Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call.
D. Run the MSCK REPAIR TABLE command from the AWS Glue console.

Suggested Answer: C 🗳️

by atu1789 at Jan. 29, 2024, 12:30 a.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

rralucard_

Highly Voted 8 months, 3 weeks ago

Selected Answer: C

Use code that writes data to Amazon S3 to invoke the Boto3 AWS Glue create_partition API call. This approach ensures that the Data Catalog is updated as soon as new data is written to S3, providing the least latency in reflecting new partitions.

upvoted 8 times

Tester_TKK

6 days ago

Hey, did you get any of the ExamTopics questions on the exam?

upvoted 1 times

pypelyncar

Most Recent 4 months, 2 weeks ago

Selected Answer: C

By embedding the Boto3 create_partition API call within the code that writes data to S3, you achieve near real-time synchronization. The Data Catalog is updated immediately after a new partition is created in S3.

upvoted 4 times
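
A minimal sketch of what option C could look like inside the writer code, assuming Python and Boto3. The helper names `partition_values` and `register_partition` are hypothetical, and the `glue` argument is expected to be a client such as `boto3.client("glue")`. The sketch copies the table's storage descriptor and points it at the new partition location before calling `create_partition`; it assumes the partition segments in the path appear in the same order as the table's partition keys.

```python
import copy


def partition_values(location: str) -> list[str]:
    """Return Hive-style partition values from an S3 location.

    Example input: 's3://bucket/prefix/year=2023/month=01/day=01'
    """
    segments = location.rstrip("/").split("/")
    return [seg.split("=", 1)[1] for seg in segments if "=" in seg]


def register_partition(glue, database: str, table: str, location: str) -> dict:
    """Add one partition to the Data Catalog right after the data is written."""
    # Reuse the table's storage descriptor (format, SerDe, columns) and
    # point it at the new partition's location.
    table_def = glue.get_table(DatabaseName=database, Name=table)["Table"]
    descriptor = copy.deepcopy(table_def["StorageDescriptor"])
    descriptor["Location"] = location

    partition_input = {
        "Values": partition_values(location),
        "StorageDescriptor": descriptor,
    }
    # A production writer would likely also handle
    # glue.exceptions.AlreadyExistsException for partitions written twice.
    glue.create_partition(
        DatabaseName=database, TableName=table, PartitionInput=partition_input
    )
    return partition_input
```

The writer would call something like `register_partition(glue, "lake_db", "events", "s3://bucket/prefix/year=2023/month=01/day=01")` immediately after its S3 write succeeds; `lake_db` and `events` are placeholder names.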

tgv

4 months, 3 weeks ago

Selected Answer: C

The explanation could be more precise regarding the interaction with Amazon S3 and AWS Glue. The key point is that the process should be triggered immediately when new data is added to S3. This can be achieved through event-driven architecture, which indeed makes the solution intuitive and efficient.

upvoted 2 times
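
One possible reading of the "event-driven" remark is an S3 event notification that triggers a function, which then registers the partition. This goes beyond option C as worded (where the writer itself calls the API), so treat it as an assumption. The sketch below only shows the first step: deriving partition locations from an S3 `ObjectCreated` event payload. The function name is hypothetical; note that object keys arrive URL-encoded in S3 event notifications, so `=` shows up as `%3D`.

```python
from urllib.parse import unquote_plus


def partition_locations_from_event(event: dict) -> list[str]:
    """Collect the distinct Hive-style partition locations in an S3 event."""
    locations = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Drop the file name and keep the "directory" part of the key.
        prefix, _, _ = key.rpartition("/")
        if "=" not in prefix:
            continue  # not under a key=value partition path
        location = f"s3://{bucket}/{prefix}"
        if location not in locations:
            locations.append(location)
    return locations
```

Each returned location could then be passed to a Glue `create_partition` call.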

valuedate

5 months ago

Selected Answer: C

Add the partition after writing the data to S3.

upvoted 1 times

DevoteamAnalytix

5 months, 1 week ago

Selected Answer: D

It's about "synchronizing AWS Glue Data Catalog with S3". So for me it's D - using MSCK REPAIR TABLE for existing S3 partitions (https://docs.aws.amazon.com/athena/latest/ug/msck-repair-table.html)

upvoted 1 times

megadba

5 months, 1 week ago

Least latency

upvoted 1 times
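
For comparison, a hedged sketch of how option D's command is typically issued. `MSCK REPAIR TABLE` is an Athena DDL statement (the linked page is in the Athena user guide), so this sketch submits it through an Athena client such as `boto3.client("athena")`. The function name and the argument values are hypothetical. The statement scans the table location for partitions missing from the catalog rather than adding one known partition, which is the latency concern raised in the reply above.

```python
def repair_table(athena, database: str, table: str, output_location: str) -> str:
    """Submit MSCK REPAIR TABLE to Athena and return the query execution ID."""
    response = athena.start_query_execution(
        QueryString=f"MSCK REPAIR TABLE {table}",
        QueryExecutionContext={"Database": database},
        # Athena needs an S3 location for query results.
        ResultConfiguration={"OutputLocation": output_location},
    )
    # The query runs asynchronously; a caller would poll get_query_execution
    # with this ID to find out when the repair has finished.
    return response["QueryExecutionId"]
```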

okechi

6 months, 2 weeks ago

The answer is D

upvoted 2 times

GiorgioGss

7 months, 1 week ago

Selected Answer: C

It's pure event-driven so... C

upvoted 1 times

atu1789

9 months ago

Selected Answer: C

C. Least latency

upvoted 1 times