Exam AWS Certified Solutions Architect - Professional topic 1 question 710 discussion

A company is running an Apache Hadoop cluster on Amazon EC2 instances. The Hadoop cluster stores approximately 100 TB of data for weekly operational reports and allows occasional access for data scientists to retrieve data. The company needs to reduce the cost and operational complexity for storing and serving this data.
Which solution meets these requirements in the MOST cost-effective manner?

  • A. Move the Hadoop cluster from EC2 instances to Amazon EMR. Allow data access patterns to remain the same.
  • B. Write a script that resizes the EC2 instances to a smaller instance type during downtime and resizes the instances to a larger instance type before the reports are created.
  • C. Move the data to Amazon S3 and use Amazon Athena to query the data for reports. Allow the data scientists to access the data directly in Amazon S3.
  • D. Migrate the data to Amazon DynamoDB and modify the reports to fetch data from DynamoDB. Allow the data scientists to access the data directly in DynamoDB.
Suggested Answer: A

Comments

kejam
Highly Voted 3 years, 2 months ago
C: S3 and Athena. "The company needs to reduce the cost and operational complexity for storing and serving this data. Which solution meets these requirements in the MOST cost-effective manner?" EMR storage is ephemeral. The company has 100 TB that needs to persist, so they would have to use EMRFS to back up to S3 anyway. https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-storage.html
upvoted 29 times
AWSum1
3 years, 1 month ago
Great explanation. I suppose they deliberately put in EMR to confuse you into thinking it solves the Hadoop problem.
upvoted 5 times
...
...
doris0306
Highly Voted 3 years, 2 months ago
A. EMR helps create Hadoop clusters to analyse vast amounts of data.
upvoted 8 times
WhyIronMan
3 years, 1 month ago
But it is not cost-effective.
upvoted 3 times
...
...
rbm2023
Most Recent 1 year, 6 months ago
Selected Answer: A
Athena would not replace MapReduce for data analysis. You might reduce costs, but you are not applying the right tool for the current solution.
upvoted 1 times
...
romiao106
1 year, 7 months ago
Selected Answer: A
With 100 TB, I'm not actually sure S3 + Athena would work.
upvoted 1 times
...
dev112233xx
1 year, 7 months ago
Selected Answer: A
A - EMR is cheaper than S3 + Athena for such huge storage. Athena will cost $5 per TB of data scanned ($5 x 100 TB = $500 per query that scans the full dataset): https://aws.amazon.com/athena/pricing/
upvoted 1 times
...
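The arithmetic in the comment above can be checked with a minimal sketch. It assumes the $5 per TB scanned rate quoted by the commenter; the function name and the 10 TB partial-scan figure are hypothetical, chosen only for illustration. Note that the $500 figure applies only when a query scans all 100 TB, and the thread does not say how much data the weekly reports actually read.

```python
def athena_scan_cost(tb_scanned: float, usd_per_tb: float = 5.0) -> float:
    """Return the cost in USD of one query that scans tb_scanned terabytes.

    usd_per_tb defaults to the $5/TB rate quoted in the thread (an assumption;
    check the current pricing page before relying on it).
    """
    if tb_scanned < 0:
        raise ValueError("tb_scanned must be non-negative")
    return tb_scanned * usd_per_tb


# Worst case from the comment: a single query that scans the whole dataset.
full_scan_cost = athena_scan_cost(100)

# Hypothetical query that only touches 10 TB of the data.
partial_scan_cost = athena_scan_cost(10)

print(f"full scan: ${full_scan_cost:.2f}, partial scan: ${partial_scan_cost:.2f}")
```

Because the cost scales with data scanned rather than data stored, the per-query figure depends heavily on how the data is laid out and filtered.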
aws0909
1 year, 9 months ago
Selected Answer: C
The cost-effective solution is S3 and Athena.
upvoted 2 times
...
evargasbrz
1 year, 11 months ago
Selected Answer: A
I'll go with A
upvoted 1 times
...
davideccc
2 years, 1 month ago
Selected Answer: C
Athena + S3 is definitely the cheaper option here.
upvoted 1 times
...
JohnPi
2 years, 1 month ago
Selected Answer: C
Move the data to Amazon S3 and use Amazon Athena to query the data for reports. Allow the data scientists to access the data directly in Amazon S3.
upvoted 1 times
...
dcdcdc3
2 years, 2 months ago
Selected Answer: A
Per the article below, EMR is way cheaper than EC2. I would choose A, as I am not sure whether the structure of the data can be queried by Athena in a cost-effective way: https://blogs.perficient.com/2016/05/19/two-choices-1-amazon-emr-or-2-hadoop-on-ec2/
upvoted 2 times
...
chase12345
2 years, 2 months ago
I will choose A, Amazon EMR, because Amazon EMR makes it simple and cost-effective to run highly distributed processing frameworks such as Hadoop, Spark, and Presto when compared to on-premises: https://docs.aws.amazon.com/athena/latest/ug/when-should-i-use-ate.html
upvoted 1 times
...
AYANtheGLADIATOR
2 years, 3 months ago
C is the answer because EMR is not a cheap option.
upvoted 2 times
...
MarkChoi
2 years, 4 months ago
Selected Answer: A
100TB?? Is it possible to use Athena? I'll go with A
upvoted 3 times
...
AzureDP900
2 years, 12 months ago
I agree with C as right answer.
upvoted 1 times
...
cldy
2 years, 12 months ago
C. Move the data to Amazon S3 and use Amazon Athena to query the data for reports. Allow the data scientists to access the data directly in Amazon S3.
upvoted 2 times
...
andylogan
3 years, 1 month ago
It's C
upvoted 1 times
...
DerekKey
3 years, 1 month ago
100 TB on EBS: $8,109. On S3: $2,355. You have saved $5,754. This amount can be used for Athena. BTW, we don't know the indexes or the amount of data that is scanned. What we do know is that it will be "occasional access for data scientists to retrieve data". I am choosing C as the CORRECT answer.
upvoted 2 times
...
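The comparison in the comment above can be worked through as a rough sketch. The two storage figures are taken from the comment as given (the billing period is not stated there), and the $5 per TB scanned Athena rate is the one quoted elsewhere in this thread. The function names are hypothetical, and assuming every query scans the entire 100 TB is a deliberately pessimistic simplification.

```python
def storage_savings(ebs_usd: float, s3_usd: float) -> float:
    """Difference between the quoted EBS and S3 storage costs, in USD."""
    return ebs_usd - s3_usd


def full_scans_affordable(budget_usd: float, dataset_tb: float,
                          usd_per_tb: float = 5.0) -> int:
    """How many whole-dataset Athena scans a budget covers (rounded down)."""
    cost_per_full_scan = dataset_tb * usd_per_tb
    return int(budget_usd // cost_per_full_scan)


# Figures as quoted in the comment; rate as quoted elsewhere in the thread.
savings = storage_savings(ebs_usd=8109, s3_usd=2355)
scans = full_scans_affordable(savings, dataset_tb=100)

print(f"savings: ${savings}, full 100 TB scans covered: {scans}")
```

Under these assumptions the storage saving alone would cover several worst-case scans per billing period, which is the point the commenter is making about weekly reports and occasional access.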
Community vote distribution
A (35%)
C (25%)
B (20%)
Other