Welcome to ExamTopics

Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 39 discussion

A company is migrating its database servers from Amazon EC2 instances that run Microsoft SQL Server to Amazon RDS for Microsoft SQL Server DB instances. The company's analytics team must export large data elements every day until the migration is complete. The data elements are the result of SQL joins across multiple tables. The data must be in Apache Parquet format. The analytics team must store the data in Amazon S3.
Which solution will meet these requirements in the MOST operationally efficient way?

  • A. Create a view in the EC2 instance-based SQL Server databases that contains the required data elements. Create an AWS Glue job that selects the data directly from the view and transfers the data in Parquet format to an S3 bucket. Schedule the AWS Glue job to run every day.
  • B. Schedule SQL Server Agent to run a daily SQL query that selects the desired data elements from the EC2 instance-based SQL Server databases. Configure the query to direct the output .csv objects to an S3 bucket. Create an S3 event that invokes an AWS Lambda function to transform the output format from .csv to Parquet.
  • C. Use a SQL query to create a view in the EC2 instance-based SQL Server databases that contains the required data elements. Create and run an AWS Glue crawler to read the view. Create an AWS Glue job that retrieves the data and transfers the data in Parquet format to an S3 bucket. Schedule the AWS Glue job to run every day.
  • D. Create an AWS Lambda function that queries the EC2 instance-based databases by using Java Database Connectivity (JDBC). Configure the Lambda function to retrieve the required data, transform the data into Parquet format, and transfer the data into an S3 bucket. Use Amazon EventBridge to schedule the Lambda function to run every day.
Suggested Answer: D

Comments

Christina666
Highly Voted 5 months ago
Selected Answer: C
  • Leveraging SQL views: creating a view on the source database simplifies the data extraction process and keeps your SQL logic centralized.
  • Glue crawler efficiency: using a Glue crawler to automatically discover and catalog the view's metadata reduces manual setup.
  • Glue job for ETL: a dedicated Glue job is well suited to the data transformation (to Parquet) and loading into S3. Glue jobs offer built-in scheduling capabilities.
  • Operational efficiency: this approach minimizes custom code and leverages native AWS services for data movement and cataloging.
upvoted 6 times
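For concreteness, a Glue job script along the lines this answer describes might look like the sketch below. All resource names (the catalog database `sqlserver_catalog`, the cataloged view `daily_export_view`, the bucket and prefix) are hypothetical, and the AWS-specific imports only resolve inside the Glue runtime, so they are kept inside the job function:

```python
# Sketch of a daily AWS Glue ETL job: read the cataloged SQL Server view
# and write it to S3 as Parquet. All resource names are hypothetical.

def daily_output_path(bucket: str, prefix: str, run_date: str) -> str:
    """Build a date-partitioned S3 path for one day's export."""
    return f"s3://{bucket}/{prefix}/export_date={run_date}/"

def run_job():
    # These imports are only available inside the AWS Glue runtime.
    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME", "RUN_DATE"])
    glue_context = GlueContext(SparkContext.getOrCreate())

    # The crawler has already registered the view's schema in the Data Catalog.
    frame = glue_context.create_dynamic_frame.from_catalog(
        database="sqlserver_catalog",    # hypothetical catalog database
        table_name="daily_export_view",  # hypothetical cataloged view
    )

    # Glue handles the Parquet conversion natively on write.
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={
            "path": daily_output_path("analytics-bucket", "exports",
                                      args["RUN_DATE"]),
        },
        format="parquet",
    )
```

Scheduling this job daily is then a built-in Glue trigger rather than extra infrastructure, which is the operational-efficiency point being argued.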
Dummy92yash
3 weeks, 1 day ago
A Glue crawler is used to catalog the data and infer the schema. In this requirement the data is already stored in Microsoft SQL Server, which is a relational database. Hence I think A is correct.
upvoted 1 times
bakarys
Most Recent 2 months, 2 weeks ago
Selected Answer: A
Option A involves creating a view in the EC2 instance-based SQL Server databases that contains the required data elements. An AWS Glue job is then created to select the data directly from the view and transfer the data in Parquet format to an S3 bucket. This job is scheduled to run every day. This approach is operationally efficient as it leverages managed services (AWS Glue) and does not require additional transformation steps.

Option D involves creating an AWS Lambda function that queries the EC2 instance-based databases using JDBC. The Lambda function is configured to retrieve the required data, transform the data into Parquet format, and transfer the data into an S3 bucket. This approach could work, but managing and scheduling Lambda functions could add operational overhead compared to using managed services like AWS Glue.
upvoted 1 times
GiorgioGss
6 months ago
Selected Answer: C
Just because it decouples the whole architecture, I will go with C.
upvoted 2 times
taka5094
6 months ago
Selected Answer: C
Choice A is almost the same approach, but it doesn't use the AWS Glue crawler, so you have to manage the view's metadata manually.
upvoted 4 times
Felix_G
6 months, 2 weeks ago
Option C seems to be the most operationally efficient: It leverages Glue for both schema discovery (via the crawler) and data transfer (via the Glue job). The Glue job can directly handle the Parquet format conversion. Scheduling the Glue job ensures regular data export without manual intervention.
upvoted 1 times
helpaws
6 months ago
you're right: https://aws.amazon.com/blogs/big-data/extracting-multidimensional-data-from-microsoft-sql-server-analysis-services-using-aws-glue/
upvoted 1 times
taka5094
6 months ago
Is this right? https://aws.amazon.com/jp/blogs/big-data/extracting-multidimensional-data-from-microsoft-sql-server-analysis-services-using-aws-glue/
upvoted 1 times
rralucard_
7 months, 2 weeks ago
Selected Answer: A
Option A (Creating a view in the EC2 instance-based SQL Server databases and creating an AWS Glue job that selects data from the view, transfers it in Parquet format to S3, and schedules the job to run every day) seems to be the most operationally efficient solution. It leverages AWS Glue’s ETL capabilities for direct data extraction and transformation, minimizes manual steps, and effectively automates the process.
upvoted 1 times
evntdrvn76
7 months, 2 weeks ago
A. Create a view in the EC2 instance-based SQL Server databases that contains the required data elements. Create an AWS Glue job that selects the data directly from the view and transfers the data in Parquet format to an S3 bucket. Schedule the AWS Glue job to run every day. This solution is operationally efficient for exporting data in the required format.
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other