Exam AWS Certified Solutions Architect - Associate SAA-C03 topic 1 question 317 discussion

A company uses a legacy application to produce data in CSV format. The legacy application stores the output data in Amazon S3. The company is deploying a new commercial off-the-shelf (COTS) application that can perform complex SQL queries to analyze data that is stored in Amazon Redshift and Amazon S3 only. However, the COTS application cannot process the .csv files that the legacy application produces.

The company cannot update the legacy application to produce data in another format. The company needs to implement a solution so that the COTS application can use the data that the legacy application produces.

Which solution will meet these requirements with the LEAST operational overhead?

  • A. Create an AWS Glue extract, transform, and load (ETL) job that runs on a schedule. Configure the ETL job to process the .csv files and store the processed data in Amazon Redshift.
  • B. Develop a Python script that runs on Amazon EC2 instances to convert the .csv files to .sql files. Invoke the Python script on a cron schedule to store the output files in Amazon S3.
  • C. Create an AWS Lambda function and an Amazon DynamoDB table. Use an S3 event to invoke the Lambda function. Configure the Lambda function to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in the DynamoDB table.
  • D. Use Amazon EventBridge to launch an Amazon EMR cluster on a weekly schedule. Configure the EMR cluster to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in an Amazon Redshift table.
Suggested Answer: A

Comments

awsgeek75
Highly Voted 6 months, 1 week ago
Selected Answer: A
Time to sell some Glue. I believe these kinds of questions are there to indoctrinate us into acknowledging how blessed we are to have managed services like AWS Glue when you look at the other horrible and painful options.
upvoted 17 times
...
elearningtakai
Highly Voted 1 year, 3 months ago
Selected Answer: A
A, AWS Glue is a fully managed ETL service that can extract data from various sources, transform it into the required format, and load it into a target data store. In this case, the ETL job can be configured to read the CSV files from Amazon S3, transform the data into a format that can be loaded into Amazon Redshift, and load it into an Amazon Redshift table.
B requires the development of a custom script to convert the CSV files to SQL files, which could be time-consuming and introduce additional operational overhead.
C, while using serverless technology, requires the additional use of DynamoDB to store the processed data, which may not be necessary if the data is only needed in Amazon Redshift.
D, while an option, is not the most efficient solution as it requires the creation of an EMR cluster, which can be costly and complex to manage.
upvoted 8 times
...
pentium75
Most Recent 6 months, 3 weeks ago
Selected Answer: A
B - Developing a script is surely not minimizing operational effort
C - Stores data in DynamoDB where the new app cannot use it
D - Could work but is total overkill (EMR is for Big Data analysis, not for simple ETL)
upvoted 4 times
...
Ruffyit
8 months, 2 weeks ago
A - Glue ETL is serverless and best suited to the requirement, since its primary job is ETL
B - Using EC2 adds operational overhead and incurs costs
C - DynamoDB (NoSQL) does not suit the requirement, as the company is performing SQL queries
D - EMR adds operational overhead and incurs costs
upvoted 3 times
...
ACloud_Guru15
8 months, 2 weeks ago
Selected Answer: A
A - Glue ETL is serverless and best suited to the requirement, since its primary job is ETL
B - Using EC2 adds operational overhead and incurs costs
C - DynamoDB (NoSQL) does not suit the requirement, as the company is performing SQL queries
D - EMR adds operational overhead and incurs costs
upvoted 2 times
...
TariqKipkemei
9 months, 2 weeks ago
Selected Answer: A
Data transformation = AWS Glue
upvoted 2 times
...
Guru4Cloud
10 months, 3 weeks ago
Selected Answer: A
  • Create an AWS Glue ETL job to process the CSV files
  • Configure the job to run on a schedule
  • Output the transformed data to Amazon Redshift

The key points:
  • Legacy app generates CSV files in S3
  • New app requires data in Redshift or S3
  • Need to transform CSV to support new app with minimal ops overhead
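For anyone wondering what "create a Glue job and run it on a schedule" might look like in practice, here is a minimal sketch of the request parameters for boto3's `glue.create_job` and `glue.create_trigger`. It is only an illustration under assumptions: the job name, role ARN, script path, connection name, and cron expression are all hypothetical, and the ETL script itself (the part that reads the CSV files and writes to Redshift) is assumed to exist already at the given S3 path. Nothing is sent to AWS unless `deploy` is called.

```python
def build_glue_job_request(job_name, role_arn, script_s3_path, redshift_connection):
    """Build keyword arguments for boto3's glue.create_job (no AWS call is made here)."""
    return {
        "Name": job_name,
        "Role": role_arn,
        # "glueetl" is the command name for a Spark-based Glue ETL job.
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3",
        },
        # Name of a Glue connection to the Redshift cluster (assumed to exist).
        "Connections": {"Connections": [redshift_connection]},
        "GlueVersion": "4.0",
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
    }


def build_schedule_trigger_request(trigger_name, job_name, cron_expression):
    """Build keyword arguments for boto3's glue.create_trigger (no AWS call is made here)."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        "Schedule": cron_expression,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def deploy(job_request, trigger_request):
    """Send both requests to AWS Glue. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builders above work without AWS access

    glue = boto3.client("glue")
    glue.create_job(**job_request)
    glue.create_trigger(**trigger_request)


# Hypothetical values, for illustration only.
job_request = build_glue_job_request(
    job_name="csv-to-redshift",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    script_s3_path="s3://example-bucket/scripts/csv_to_redshift.py",
    redshift_connection="example-redshift-connection",
)
trigger_request = build_schedule_trigger_request(
    trigger_name="csv-to-redshift-daily",
    job_name="csv-to-redshift",
    cron_expression="cron(0 2 * * ? *)",  # daily at 02:00 UTC
)
```

Compared with options B and D, there is no EC2 instance or EMR cluster to provision or patch here; the schedule and the job are both managed by Glue.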
upvoted 2 times
...
kraken21
1 year, 3 months ago
Selected Answer: A
Glue is serverless and has less operational overhead than EMR, so A.
upvoted 2 times
...
[Removed]
1 year, 4 months ago
Selected Answer: C
To meet the requirement with the least operational overhead, a serverless approach should be used. Among the options provided, option C provides a serverless solution using AWS Lambda, S3, and DynamoDB. Therefore, the solution should be to create an AWS Lambda function and an Amazon DynamoDB table. Use an S3 event to invoke the Lambda function. Configure the Lambda function to perform an extract, transform, and load (ETL) job to process the .csv files and store the processed data in the DynamoDB table. Option A is also a valid solution, but it may involve more operational overhead than Option C. With Option A, you would need to set up and manage an AWS Glue job, which would require more setup time than creating an AWS Lambda function. Additionally, AWS Glue jobs have a minimum execution time of 10 minutes, which may not be necessary or desirable for this use case. However, if the data processing is particularly complex or requires a lot of data transformation, AWS Glue may be a more appropriate solution.
upvoted 1 times
pentium75
6 months, 3 weeks ago
Creating and maintaining a Lambda function is more "operational overhead" than using a ready-made service such as Glue. But more important, answer C says "store the processed data in the DynamoDB table" while the application can "analyze data that is stored in Amazon Redshift and Amazon S3 only".
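The data-store constraint in this reply can be checked mechanically. The snippet below is just a sketch of that elimination step: it encodes where each option stores its output (taken from the question text) and keeps only the options whose target the COTS application can read. The dictionary layout and function name are made up for illustration.

```python
# Stores the COTS application can query, per the question text.
READABLE_BY_COTS = {"Amazon Redshift", "Amazon S3"}

# Where each answer option writes its processed data, per the question text.
OPTIONS = {
    "A": {"engine": "AWS Glue", "target": "Amazon Redshift"},
    "B": {"engine": "Python script on Amazon EC2", "target": "Amazon S3"},
    "C": {"engine": "AWS Lambda", "target": "Amazon DynamoDB"},
    "D": {"engine": "Amazon EMR", "target": "Amazon Redshift"},
}


def readable_options(options, readable_stores):
    """Return the option letters whose output lands in a store the COTS app can read."""
    return sorted(
        letter
        for letter, option in options.items()
        if option["target"] in readable_stores
    )


print(readable_options(OPTIONS, READABLE_BY_COTS))  # ['A', 'B', 'D']
```

This check only rules out C. Choosing among A, B, and D still comes down to operational overhead, which is where the managed Glue job wins over a custom script on EC2 or an EMR cluster.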
upvoted 2 times
...
MssP
1 year, 3 months ago
Important point: The COTS application performs complex SQL queries to analyze data in Amazon Redshift. If you use DynamoDB -> no SQL queries. Option A makes more sense.
upvoted 4 times
...
...
LuckyAro
1 year, 5 months ago
Selected Answer: A
A would be the best solution as it involves the least operational overhead. With this solution, an AWS Glue ETL job is created to process the .csv files and store the processed data directly in Amazon Redshift. This is a serverless approach that does not require any infrastructure to be provisioned, configured, or maintained. AWS Glue provides a fully managed, pay-as-you-go ETL service that can be easily configured to process data from S3 and load it into Amazon Redshift. This approach allows the legacy application to continue to produce data in the CSV format that it currently uses, while providing the new COTS application with the ability to analyze the data using complex SQL queries.
upvoted 4 times
...
jennyka76
1 year, 5 months ago
A https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-csv-home.html I AGREE AFTER READING LINK
upvoted 2 times
...
cloudbusting
1 year, 5 months ago
A: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html
upvoted 2 times
...