Exam AWS Certified Solutions Architect - Associate SAA-C03 All Questions

View all questions & answers for the AWS Certified Solutions Architect - Associate SAA-C03 exam

Exam AWS Certified Solutions Architect - Associate SAA-C03 topic 1 question 317 discussion

Exam question from Amazon's AWS Certified Solutions Architect - Associate SAA-C03

Question #: 317
Topic #: 1

[All AWS Certified Solutions Architect - Associate SAA-C03 Questions]

A company uses a legacy application to produce data in CSV format. The legacy application stores the output data in Amazon S3. The company is deploying a new commercial off-the-shelf (COTS) application that can perform complex SQL queries to analyze data that is stored in Amazon Redshift and Amazon S3 only. However, the COTS application cannot process the .csv files that the legacy application produces.

The company cannot update the legacy application to produce data in another format. The company needs to implement a solution so that the COTS application can use the data that the legacy application produces.

Which solution will meet these requirements with the LEAST operational overhead?

A. Create an AWS Glue extract, transform, and load (ETL) job that runs on a schedule. Configure the ETL job to process the .csv files and store the processed data in Amazon Redshift.
B. Develop a Python script that runs on Amazon EC2 instances to convert the .csv files to .sql files. Invoke the Python script on a cron schedule to store the output files in Amazon S3.
C. Create an AWS Lambda function and an Amazon DynamoDB table. Use an S3 event to invoke the Lambda function. Configure the Lambda function to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in the DynamoDB table.
D. Use Amazon EventBridge to launch an Amazon EMR cluster on a weekly schedule. Configure the EMR cluster to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in an Amazon Redshift table.

Show Suggested Answer

Suggested Answer: A 🗳️

by cloudbusting at Feb. 18, 2023, 3:25 p.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

Submit Cancel

awsgeek75

Highly Voted 11 months, 2 weeks ago

Selected Answer: A

Time to sell some Glue. I believe these kind of questions are there to indoctrinate us into acknowledging how blessed we are to have managed services like AWS Glue when you look at other horrible and painful options

upvoted 20 times

...

elearningtakai

Highly Voted 1 year, 9 months ago

Selected Answer: A

A, AWS Glue is a fully managed ETL service that can extract data from various sources, transform it into the required format, and load it into a target data store. In this case, the ETL job can be configured to read the CSV files from Amazon S3, transform the data into a format that can be loaded into Amazon Redshift, and load it into an Amazon Redshift table. B requires the development of a custom script to convert the CSV files to SQL files, which could be time-consuming and introduce additional operational overhead. C, while using serverless technology, requires the additional use of DynamoDB to store the processed data, which may not be necessary if the data is only needed in Amazon Redshift. D, while an option, is not the most efficient solution as it requires the creation of an EMR cluster, which can be costly and complex to manage.

upvoted 8 times

...

pentium75

Most Recent 1 year ago

Selected Answer: A

B - Developing a script is surely not minimizing operational effort C - Stores data in DynamoDB where the new app cannot use it D - Could work but is total overkill (EMR is for Big Data analysis, not for simple ETL)

upvoted 4 times

...

Ruffyit

1 year, 1 month ago

A-ETL is serverless & best suited with the requirement who primary job is ETL B-Usage of Ec2 adds operational overhead & incur costs C-DynamoDB(NoSql) does suit the requirement as company is performing SQL queries D-EMR adds operational overhead & incur costs

upvoted 3 times

...

ACloud_Guru15

1 year, 1 month ago

Selected Answer: A

upvoted 2 times

...

TariqKipkemei

1 year, 2 months ago

Selected Answer: A

Data transformation = AWS Glue

upvoted 2 times

...

Guru4Cloud

1 year, 4 months ago

Selected Answer: A

Create an AWS Glue ETL job to process the CSV files Configure the job to run on a schedule Output the transformed data to Amazon Redshift The key points: Legacy app generates CSV files in S3 New app requires data in Redshift or S3 Need to transform CSV to support new app with minimal ops overhead

upvoted 2 times

...

kraken21

1 year, 9 months ago

Selected Answer: A

Glue is server less and has less operational head than EMR so A.

upvoted 2 times

...

[Removed]

1 year, 9 months ago

Selected Answer: C

o meet the requirement with the least operational overhead, a serverless approach should be used. Among the options provided, option C provides a serverless solution using AWS Lambda, S3, and DynamoDB. Therefore, the solution should be to create an AWS Lambda function and an Amazon DynamoDB table. Use an S3 event to invoke the Lambda function. Configure the Lambda function to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in the DynamoDB table. Option A is also a valid solution, but it may involve more operational overhead than Option C. With Option A, you would need to set up and manage an AWS Glue job, which would require more setup time than creating an AWS Lambda function. Additionally, AWS Glue jobs have a minimum execution time of 10 minutes, which may not be necessary or desirable for this use case. However, if the data processing is particularly complex or requires a lot of data transformation, AWS Glue may be a more appropriate solution.

upvoted 1 times

pentium75

1 year ago

Creating and maintaining a Lambda function is more "operational overhead" than using a ready-made service such as Glue. But more important, answer C says "store the processed data in the DynamoDB table" while the application can "analyze data that is stored in Amazon Redshift and Amazon S3 only".

upvoted 2 times

...

MssP

1 year, 9 months ago

Important point: The COTS performs complex SQL queries to analyze data in Amazon Redshift. If you use DynamoDB -> No SQL querires. Option A makes more sense.

upvoted 4 times

...

LuckyAro

1 year, 10 months ago

Selected Answer: A

A would be the best solution as it involves the least operational overhead. With this solution, an AWS Glue ETL job is created to process the .csv files and store the processed data directly in Amazon Redshift. This is a serverless approach that does not require any infrastructure to be provisioned, configured, or maintained. AWS Glue provides a fully managed, pay-as-you-go ETL service that can be easily configured to process data from S3 and load it into Amazon Redshift. This approach allows the legacy application to continue to produce data in the CSV format that it currently uses, while providing the new COTS application with the ability to analyze the data using complex SQL queries.

upvoted 4 times

...

jennyka76

1 year, 10 months ago

A https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-csv-home.html I AGREE AFTER READING LINK

upvoted 2 times

...

cloudbusting

1 year, 10 months ago

A: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html

upvoted 2 times

...