Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 52 discussion

A company needs to set up a data catalog and metadata management for data sources that run in the AWS Cloud. The company will use the data catalog to maintain the metadata of all the objects that are in a set of data stores. The data stores include structured sources such as Amazon RDS and Amazon Redshift. The data stores also include semistructured sources such as JSON files and .xml files that are stored in Amazon S3.
The company needs a solution that will update the data catalog on a regular basis. The solution also must detect changes to the source metadata.
Which solution will meet these requirements with the LEAST operational overhead?

  • A. Use Amazon Aurora as the data catalog. Create AWS Lambda functions that will connect to the data catalog. Configure the Lambda functions to gather the metadata information from multiple sources and to update the Aurora data catalog. Schedule the Lambda functions to run periodically.
  • B. Use the AWS Glue Data Catalog as the central metadata repository. Use AWS Glue crawlers to connect to multiple data stores and to update the Data Catalog with metadata changes. Schedule the crawlers to run periodically to update the metadata catalog.
  • C. Use Amazon DynamoDB as the data catalog. Create AWS Lambda functions that will connect to the data catalog. Configure the Lambda functions to gather the metadata information from multiple sources and to update the DynamoDB data catalog. Schedule the Lambda functions to run periodically.
  • D. Use the AWS Glue Data Catalog as the central metadata repository. Extract the schema for Amazon RDS and Amazon Redshift sources, and build the Data Catalog. Use AWS Glue crawlers for data that is in Amazon S3 to infer the schema and to automatically update the Data Catalog.
Suggested Answer: B
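
For readers who want to see what option B looks like in practice, here is a minimal sketch using boto3, assuming a crawler that covers both the S3 paths and the JDBC connections for RDS and Redshift. The crawler name, role ARN, database name, bucket path, connection names, and the nightly cron schedule are all hypothetical placeholders, not values from the question. The request is built by a plain helper so it can be inspected without AWS credentials.

```python
def build_crawler_request(name, role_arn, database, s3_paths, jdbc_targets,
                          schedule="cron(0 2 * * ? *)"):
    """Build keyword arguments for the Glue create_crawler call.

    jdbc_targets is a list of (connection_name, include_path) tuples that
    refer to Glue connections assumed to exist already (hypothetical names).
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {
            # Semistructured files (JSON, XML) stored in Amazon S3
            "S3Targets": [{"Path": path} for path in s3_paths],
            # Structured sources (RDS, Redshift) reached through JDBC connections
            "JdbcTargets": [
                {"ConnectionName": conn, "Path": path}
                for conn, path in jdbc_targets
            ],
        },
        # Run on a schedule so the catalog is refreshed regularly
        "Schedule": schedule,
        # Write detected schema changes back to the Data Catalog
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "DEPRECATE_IN_DATABASE",
        },
    }


def create_crawler(request):
    """Submit the request to AWS Glue (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)


# Hypothetical example values, for illustration only
request = build_crawler_request(
    name="catalog-refresh",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="company_catalog",
    s3_paths=["s3://example-bucket/raw/"],
    jdbc_targets=[("example-rds-conn", "salesdb/%"),
                  ("example-redshift-conn", "dw/public/%")],
)
```

Calling `create_crawler(request)` would register the crawler. The scheduled runs and the schema change policy together address the "update on a regular basis" and "detect changes" requirements without any custom Lambda code.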

Comments

pypelyncar
Highly Voted 4 months, 2 weeks ago
Selected Answer: B
The AWS Glue Data Catalog is a purpose-built, fully managed service designed to serve as a central metadata repository for your data sources. It provides a unified view of your data across various sources, including structured databases (like Amazon RDS and Amazon Redshift) and semi-structured data formats (like JSON and XML files in Amazon S3).
upvoted 7 times
...
valuedate
Most Recent 5 months ago
Selected Answer: B
AWS Glue Data Catalog with crawlers.
upvoted 3 times
...
hnk
5 months, 1 week ago
Selected Answer: A
B is the obvious answer
upvoted 1 time
Just_Ninja
5 months, 1 week ago
Sorry, there is no Aurora Data Catalog :)
upvoted 1 time
...
tgv
4 months, 3 weeks ago
Even though you picked A.
upvoted 3 times
...
...
GiorgioGss
7 months, 1 week ago
Selected Answer: B
A and C are out for obvious reasons. D is out because it involves manual schema extraction.
upvoted 4 times
...
rralucard_
8 months, 3 weeks ago
Selected Answer: B
Option B, using the AWS Glue Data Catalog with AWS Glue Crawlers, is the best solution to meet the requirements with the least operational overhead. It provides a fully managed, integrated solution for cataloging both structured and semistructured data across various AWS data stores without the need for extensive manual configuration or custom coding.
upvoted 3 times
...