Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 135 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 135
Topic #: 1


A data engineer is using an AWS Glue crawler to catalog data that is in an Amazon S3 bucket. The S3 bucket contains both .csv and .json files. The data engineer configured the crawler to exclude the .json files from the catalog.

When the data engineer runs queries in Amazon Athena, the queries also process the excluded .json files. The data engineer wants to resolve this issue. The data engineer needs a solution that will not affect access requirements for the .csv files in the source S3 bucket.

Which solution will meet this requirement with the SHORTEST query times?

A. Adjust the AWS Glue crawler settings to ensure that the AWS Glue crawler also excludes .json files.
B. Use the Athena console to ensure the Athena queries also exclude the .json files.
C. Relocate the .json files to a different path within the S3 bucket.
D. Use S3 bucket policies to block access to the .json files.


Suggested Answer: C

by teo2157 at Aug. 12, 2024, 12:48 p.m.


Comments


teo2157

Highly Voted 8 months, 2 weeks ago

Selected Answer: C

Athena does not recognize exclude patterns that you specify for an AWS Glue crawler. For example, if you have an Amazon S3 bucket that contains both .csv and .json files and you exclude the .json files from the crawler, Athena queries both groups of files. To avoid this, place the files that you want to exclude in a different location. https://docs.aws.amazon.com/athena/latest/ug/troubleshooting-athena.html
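As a rough illustration of "place the files in a different location", here is a minimal Python sketch. It assumes a boto3-style S3 client is passed in; the function names, the `data/` table prefix, and the `excluded-json/` target prefix are all hypothetical, not anything from the question or the AWS docs. The only calls made on the client are `copy_object` and `delete_object`.

```python
from typing import Dict, Iterable


def plan_relocation(keys: Iterable[str], source_prefix: str,
                    target_prefix: str, suffix: str = ".json") -> Dict[str, str]:
    """Map every key under source_prefix that ends with suffix to a new key
    under target_prefix, preserving the path relative to source_prefix."""
    # The target must sit outside the table location, otherwise Athena
    # would still read the moved files when it scans that location.
    if target_prefix.startswith(source_prefix):
        raise ValueError("target_prefix must be outside source_prefix")
    moves = {}
    for key in keys:
        if key.startswith(source_prefix) and key.lower().endswith(suffix):
            relative = key[len(source_prefix):]
            moves[key] = target_prefix + relative
    return moves


def relocate(s3_client, bucket: str, moves: Dict[str, str]) -> None:
    """Copy each object to its new key, then delete the original.
    s3_client is assumed to behave like boto3.client("s3")."""
    for old_key, new_key in moves.items():
        s3_client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        s3_client.delete_object(Bucket=bucket, Key=old_key)
```

In practice the key list would probably come from listing the bucket (for example with a `list_objects_v2` paginator), and it seems safer to review the planned moves before running `relocate`. The .csv files are never touched, so access to them stays as it was.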

upvoted 8 times


AdityaB

Most Recent 6 months, 2 weeks ago

If the AWS Glue crawler is configured to exclude .json files, then the AWS Glue Data Catalog will not have any metadata related to those .json files. In this case, the Athena table that uses the Glue Data Catalog would not be aware of the .json files at all, and Athena queries would only process the files that are included in the Glue catalog (e.g., .csv files).

upvoted 1 time


BenLearningDE

7 months, 2 weeks ago

Athena will scan both types of files. Although it may be feasible to adjust the Athena query to exclude the .json files, the SHORTEST query times would come from relocating the .json files to a different path.

upvoted 1 time

