Exam AWS Certified Solutions Architect - Associate SAA-C03 topic 1 question 557 discussion

Exam question from Amazon's AWS Certified Solutions Architect - Associate SAA-C03

Question #: 557
Topic #: 1

A solutions architect manages an analytics application. The application stores large amounts of semistructured data in an Amazon S3 bucket. The solutions architect wants to use parallel data processing to process the data more quickly. The solutions architect also wants to use information that is stored in an Amazon Redshift database to enrich the data.

Which solution will meet these requirements?

A. Use Amazon Athena to process the S3 data. Use AWS Glue with the Amazon Redshift data to enrich the S3 data.
B. Use Amazon EMR to process the S3 data. Use Amazon EMR with the Amazon Redshift data to enrich the S3 data.
C. Use Amazon EMR to process the S3 data. Use Amazon Kinesis Data Streams to move the S3 data into Amazon Redshift so that the data can be enriched.
D. Use AWS Glue to process the S3 data. Use AWS Lake Formation with the Amazon Redshift data to enrich the S3 data.

Suggested Answer: B

by zjcorpuz at Aug. 4, 2023, 3:30 p.m.

Disclaimers:

- ExamTopics website is not related to, affiliated with, endorsed or authorized by Amazon.
- Trademarks, certification & product names are used for reference only and belong to Amazon.

Comments

Guru4Cloud

Highly Voted 1 year, 4 months ago

Selected Answer: B

Option B is the correct solution that meets the requirements. Use Amazon EMR to process the semi-structured data in Amazon S3: EMR provides a managed Hadoop framework optimized for processing large datasets in S3, and it supports parallel data processing across multiple nodes to speed up the work. EMR can also integrate directly with Amazon Redshift through the EMR-Redshift integration, which allows querying the Redshift data from EMR and joining it with the S3 data. This enables enriching the semi-structured S3 data with the information stored in Redshift.
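The parallel-processing idea in this answer can be illustrated in miniature. What follows is only a sketch in plain Python, not EMR code: threads stand in for cluster nodes, and the in-memory `S3_PARTITIONS` list is a hypothetical stand-in for objects in an S3 bucket. The field names (`customer_id`, `event`, `amount`) are assumptions made for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for S3 objects: each partition is a list of JSON lines
# whose fields vary from record to record (semi-structured).
S3_PARTITIONS = [
    ['{"customer_id": "c1", "event": "login"}',
     '{"customer_id": "c2", "event": "purchase", "amount": 30}'],
    ['{"customer_id": "c1", "event": "logout"}'],
]

def normalize_partition(lines):
    """Parse one partition and give every record the same set of fields."""
    rows = []
    for line in lines:
        record = json.loads(line)
        rows.append({
            "customer_id": record["customer_id"],
            "event": record["event"],
            "amount": record.get("amount", 0),  # optional field in the input
        })
    return rows

def process_all(partitions, workers=4):
    """Process partitions concurrently; threads stand in for cluster nodes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input partitions
        results = list(pool.map(normalize_partition, partitions))
    return [row for part in results for row in part]

normalized = process_all(S3_PARTITIONS)
```

Each partition is handled independently, which is the property that lets a cluster framework spread the same work across many nodes.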

upvoted 18 times

...

zjcorpuz

Highly Voted 1 year, 5 months ago

By combining AWS Glue and Amazon Redshift, you can process the semistructured data in parallel using Glue ETL jobs and then store the processed and enriched data in a structured format in Amazon Redshift. This approach allows you to perform complex analytics efficiently and at scale.

upvoted 9 times

...

upliftinghut

Most Recent 11 months, 3 weeks ago

Selected Answer: B

D: not relevant; the data is semistructured, and Glue is geared more toward batch than streaming data.
A: not correct; Athena is for querying data.
B and C look OK, but C is out: Kinesis Data Streams is redundant, since EMR has already processed the data as input into Redshift for parallel processing.
Only B is logical.

upvoted 4 times

...

awsgeek75

11 months, 4 weeks ago

Selected Answer: B

Key requirement: parallel data processing.
Parallel data processing is EMR (a kind of Apache Hadoop), so that only leaves B and C.
C is Kinesis to Redshift, which is pointless here.
B, EMR for S3 and EMR for Redshift, gives maximum parallel processing here.

upvoted 3 times

...

pentium75

1 year ago

Selected Answer: B

A has a pitfall: "use Amazon Athena to PROCESS the data". With Athena you can query, not process, data. C is wrong because Kinesis has no place here. D is wrong because it does not process the Redshift data, and Glue does ETL, not analysis. Thus it's B. EMR can use semi-structured data from S3 and structured data from Redshift and is ideal for "parallel data processing" of "large amounts" of data.

upvoted 8 times

...

aws94

1 year ago

Selected Answer: B

large amount of data + parallel data processing = EMR

upvoted 3 times

...

[Removed]

1 year, 1 month ago

Selected Answer: A

Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL.

upvoted 1 times

pentium75

1 year ago

Yes, but A says "process", not "query" data with Athena.

upvoted 2 times

...

SHAAHIBHUSHANAWS

1 year, 1 month ago

Selected Answer: D

Glue uses an Apache PySpark cluster for parallel processing. EMR or Glue are both possible options. Glue is serverless, so it is the better choice, plus PySpark does in-memory parallel processing.

upvoted 1 times

...

aragornfsm

1 year, 1 month ago

I think A is correct: semistructured data ==> Athena

upvoted 1 times

pentium75

1 year ago

"Hadoop [as used by EMR] helps you turn petabytes of un-structured or semi-structured data into useful insights" https://aws.amazon.com/emr/features/hadoop/

upvoted 2 times

...

riyasara

1 year, 1 month ago

Athena is not designed for parallel data processing. So it's B

upvoted 3 times

...

TariqKipkemei

1 year, 1 month ago

Selected Answer: A

Answer is A

upvoted 1 times

...

TariqKipkemei

1 year, 1 month ago

Selected Answer: B

From this documentation, it looks like EMR cannot interface with S3. https://aws.amazon.com/emr/ I will settle with option A.

upvoted 2 times

pentium75

1 year ago

Of course EMR can access S3 https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-file-systems.html

upvoted 2 times

...

bogobob

1 year, 1 month ago

Selected Answer: B

For those answering A: AWS Glue can directly query S3; it can't use Athena as a source of data. The question says the Redshift data should be used to "enrich", which means that the Redshift data needs to be "added" to the S3 data. A doesn't allow that.
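As a rough illustration of what "enrich" means here, the sketch below adds columns from a reference table to each S3-derived row, in the manner of a left join. It is plain Python with hypothetical in-memory data (`s3_rows`, `redshift_rows`) and hypothetical column names, not code for any AWS service.

```python
# Hypothetical rows standing in for processed S3 data.
s3_rows = [
    {"customer_id": "c1", "event": "login"},
    {"customer_id": "c9", "event": "purchase"},  # no match in the reference data
]

# Hypothetical reference data standing in for a Redshift table, keyed by customer_id.
redshift_rows = {
    "c1": {"segment": "enterprise", "region": "eu-west-1"},
}

def enrich(rows, reference, key, columns):
    """Left join: keep every row, add reference columns (None when unmatched)."""
    enriched = []
    for row in rows:
        match = reference.get(row[key], {})
        added = {col: match.get(col) for col in columns}
        enriched.append({**row, **added})
    return enriched

enriched_rows = enrich(s3_rows, redshift_rows, "customer_id", ["segment", "region"])
```

Rows without a match are kept and get `None` for the added columns, so no S3 data is lost during enrichment.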

upvoted 2 times

...

hungta

1 year, 1 month ago

Selected Answer: B

Choose option B. Option A is not correct. Amazon Athena is suitable for querying data directly from S3 using SQL and allows parallel processing of S3 data. AWS Glue can be used for data preparation and enrichment but might not directly integrate with Amazon Redshift for enrichment.

upvoted 2 times

...

potomac

1 year, 2 months ago

Selected Answer: A

Athena and Redshift both support SQL queries.

upvoted 1 times

...

Sab123

1 year, 3 months ago

Selected Answer: A

Semi-structured data is supported by Athena, not by EMR.

upvoted 4 times

pentium75

1 year ago

"Hadoop helps you turn petabytes of un-structured or semi-structured data into useful insights about your applications or users." https://aws.amazon.com/emr/features/hadoop/?nc1=h_ls

upvoted 2 times

...

JKevin778

1 year, 3 months ago

Selected Answer: A

Athena for S3

upvoted 1 times

...
