Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 163 discussion

Exam question from Amazon's AWS Certified Data Engineer - Associate DEA-C01

Question #: 163
Topic #: 1

A company has three subsidiaries. Each subsidiary uses a different data warehousing solution. The first subsidiary hosts its data warehouse in Amazon Redshift. The second subsidiary uses Teradata Vantage on AWS. The third subsidiary uses Google BigQuery.

The company wants to aggregate all the data into a central Amazon S3 data lake. The company wants to use Apache Iceberg as the table format.

A data engineer needs to build a new pipeline to connect to all the data sources, run transformations by using each source engine, join the data, and write the data to Iceberg.

Which solution will meet these requirements with the LEAST operational effort?

A. Use native Amazon Redshift, Teradata, and BigQuery connectors to build the pipeline in AWS Glue. Use native AWS Glue transforms to join the data. Run a Merge operation on the data lake Iceberg table.
B. Use the Amazon Athena federated query connectors for Amazon Redshift, Teradata, and BigQuery to build the pipeline in Athena. Write a SQL query to read from all the data sources, join the data, and run a Merge operation on the data lake Iceberg table.
C. Use the native Amazon Redshift connector, the Java Database Connectivity (JDBC) connector for Teradata, and the open source Apache Spark BigQuery connector to build the pipeline in Amazon EMR. Write code in PySpark to join the data. Run a Merge operation on the data lake Iceberg table.
D. Use the native Amazon Redshift, Teradata, and BigQuery connectors in Amazon AppFlow to write data to Amazon S3 and AWS Glue Data Catalog. Use Amazon Athena to join the data. Run a Merge operation on the data lake Iceberg table.

Suggested Answer: A

by Parandhaman_Margan at Oct. 27, 2024, 5:21 a.m.

Comments

Mitchdu

2 weeks, 2 days ago

Selected Answer: A

Glue, for sure. Athena is an ad-hoc querying tool, not an ETL tool, and besides, it doesn't have connectors for BigQuery and Teradata!

upvoted 1 times

...

AWSMM

2 months ago

Selected Answer: A

Native Connectors: AWS Glue provides built-in connectors for Amazon Redshift, Teradata, and Google BigQuery. This eliminates the need for custom-built connectors, reducing development and maintenance overhead.

upvoted 1 times

...

bad1ccc

2 months, 4 weeks ago

Selected Answer: B

https://docs.aws.amazon.com/athena/latest/ug/federated-queries.html

upvoted 2 times

...

Palee

3 months, 2 weeks ago

Selected Answer: D

The requirement is to aggregate the data in S3. Only option D explicitly calls this out, so answer D is correct.

upvoted 1 times

...

MerryLew

5 months, 2 weeks ago

Selected Answer: A

Athena can be used to build certain types of data pipelines, particularly when the primary focus is on ad-hoc analysis and querying large datasets stored in S3 without the need for complex data transformations. For more intricate data processing and heavy ETL operations, however, other AWS services like Glue are often more suitable because of their dedicated data processing capabilities.

upvoted 1 times

...

Eeshav15

5 months, 2 weeks ago

Selected Answer: A

Glue is the right tool to build the pipeline.

upvoted 1 times

...

michele_scar

7 months, 2 weeks ago

Selected Answer: B

https://docs.aws.amazon.com/athena/latest/ug/connectors-available.html

upvoted 2 times

...

Eleftheriia

7 months, 2 weeks ago

Selected Answer: B

Would it be B? "If you have data in sources other than Amazon S3, you can use Athena Federated Query to query the data in place or build pipelines that extract data from multiple data sources and store them in Amazon S3. With Athena Federated Query, you can run SQL queries across data stored in relational, non-relational, object, and custom data sources." https://docs.aws.amazon.com/athena/latest/ug/connect-to-a-data-source.html

upvoted 3 times

...

kupo777

7 months, 4 weeks ago

Correct Answer: B. Use the Amazon Athena federated query connectors for Amazon Redshift, Teradata, and BigQuery to build the pipeline in Athena. Write a SQL query to read from all the data sources, join the data, and run a Merge operation on the data lake Iceberg table.

upvoted 3 times

...
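For readers weighing option B, the "single SQL query" it describes could look roughly like the statement built below. This is only a sketch under assumptions: the catalog names (redshift_cat, teradata_cat, bigquery_cat), the schema, table, and column names, and the target table are all hypothetical placeholders, and each source is assumed to be already registered in Athena as a federated data source. The MERGE INTO shape follows Athena's Iceberg syntax.

```python
from typing import Dict

# Hypothetical identifiers: none of these catalog, schema, table, or column
# names come from the question; they only illustrate the statement's shape.
FEDERATED_SOURCES: Dict[str, str] = {
    "rs": '"redshift_cat"."sales"."totals"',
    "td": '"teradata_cat"."sales"."totals"',
    "bq": '"bigquery_cat"."sales"."totals"',
}
TARGET = '"awsdatacatalog"."lake"."customer_revenue"'


def build_merge_sql(sources: Dict[str, str], target: str,
                    key: str = "customer_id", measure: str = "revenue") -> str:
    """Build one MERGE INTO statement that joins all sources on `key`."""
    aliases = list(sources)
    first = aliases[0]
    # One output column per source, e.g. rs_revenue, td_revenue, bq_revenue.
    out_cols = [f"{a}_{measure}" for a in aliases]
    select_cols = [f"{first}.{key}"] + [
        f"{a}.{measure} AS {a}_{measure}" for a in aliases
    ]
    # The first source anchors the FROM clause; the rest are inner-joined to it.
    joins = [f"FROM {sources[first]} {first}"] + [
        f"JOIN {sources[a]} {a} ON {first}.{key} = {a}.{key}"
        for a in aliases[1:]
    ]
    updates = ", ".join(f"{c} = s.{c}" for c in out_cols)
    insert_cols = ", ".join([key] + out_cols)
    insert_vals = ", ".join(f"s.{c}" for c in [key] + out_cols)
    return "\n".join([
        f"MERGE INTO {target} t",
        "USING (",
        "  SELECT " + ", ".join(select_cols),
        *("  " + j for j in joins),
        ") s",
        f"ON t.{key} = s.{key}",
        f"WHEN MATCHED THEN UPDATE SET {updates}",
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})",
    ])


merge_sql = build_merge_sql(FEDERATED_SOURCES, TARGET)
print(merge_sql)
```

The resulting string would then be submitted to Athena, for example through the console or the boto3 Athena client's start_query_execution call. Whether this counts as less operational effort than a Glue job is exactly what the thread disagrees on.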

ae35a02

8 months ago

Selected Answer: A

AWS Glue has native connectors to Redshift, BigQuery, and Teradata, and integrates with the Iceberg format. Athena is not for building pipelines; AppFlow is for transferring data from SaaS applications.

upvoted 3 times

...
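To illustrate the last step of option A, here is a minimal sketch of the Merge operation as Spark SQL, the form a Glue Spark job would typically run against an Iceberg table. The names are assumptions, not from the question: glue_catalog is a hypothetical Spark catalog configured for Iceberg, and joined_sources is a hypothetical temporary view holding the already-joined data from the three connectors.

```python
def iceberg_merge_sql(target: str, source_view: str, key: str) -> str:
    """Build a Spark SQL MERGE INTO statement for an Iceberg target table.

    UPDATE SET * and INSERT * copy every column by name, so this assumes the
    source view and the target table share the same column names.
    """
    return (
        f"MERGE INTO {target} t\n"
        f"USING {source_view} s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Hypothetical names: an Iceberg-enabled catalog and a temp view of joined data.
stmt = iceberg_merge_sql(
    target="glue_catalog.lake.customer_revenue",
    source_view="joined_sources",
    key="customer_id",
)
print(stmt)
```

Inside the job, the joined DataFrame would first be registered with createOrReplaceTempView("joined_sources"), and the statement would then be executed with spark.sql(stmt).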

Parandhaman_Margan

8 months, 1 week ago

Answer: A

upvoted 2 times

...