
Exam AWS Certified Solutions Architect - Professional SAP-C02 topic 1 question 75 discussion

A solutions architect is designing the data storage and retrieval architecture for a new application that a company will be launching soon. The application is designed to ingest millions of small records per minute from devices all around the world. Each record is less than 4 KB in size and needs to be stored in a durable location where it can be retrieved with low latency. The data is ephemeral and the company is required to store the data for 120 days only, after which the data can be deleted.

The solutions architect calculates that, during the course of a year, the storage requirements would be about 10-15 TB.

Which storage strategy is the MOST cost-effective and meets the design requirements?

  • A. Design the application to store each incoming record as a single .csv file in an Amazon S3 bucket to allow for indexed retrieval. Configure a lifecycle policy to delete data older than 120 days.
  • B. Design the application to store each incoming record in an Amazon DynamoDB table properly configured for the scale. Configure the DynamoDB Time to Live (TTL) feature to delete records older than 120 days.
  • C. Design the application to store each incoming record in a single table in an Amazon RDS MySQL database. Run a nightly cron job that runs a query to delete any records older than 120 days.
  • D. Design the application to batch incoming records before writing them to an Amazon S3 bucket. Update the metadata for the object to contain the list of records in the batch and use the Amazon S3 metadata search feature to retrieve the data. Configure a lifecycle policy to delete the data after 120 days.
Suggested Answer: B
Community vote distribution
B (76%)
D (24%)

Comments

masetromain
Highly Voted 2 years, 2 months ago
Selected Answer: B
The most cost-effective and efficient solution that meets the design requirements is option B: store each incoming record in an Amazon DynamoDB table properly configured for the scale, and configure the DynamoDB Time to Live (TTL) feature to delete records older than 120 days. DynamoDB is a NoSQL key-value store designed for high scale and performance. It is fully managed by AWS and can easily handle millions of small records per minute. Additionally, with the TTL feature you can set an expiration time for each record, so the data is automatically deleted after the specified period.
upvoted 23 times
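A minimal sketch of the ingest-plus-TTL pattern described above, assuming boto3 and a hypothetical table named device-records with an expires_at attribute (none of these names come from the question):

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# Enable TTL once per table; DynamoDB deletes expired items in the background
# at no extra request cost.
dynamodb.update_time_to_live(
    TableName="device-records",                      # hypothetical table name
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# On ingest, stamp each record with an epoch-seconds expiry 120 days out.
now = int(time.time())
dynamodb.put_item(
    TableName="device-records",
    Item={
        "device_id": {"S": "device-1234"},           # hypothetical partition key
        "recorded_at": {"N": str(now)},              # hypothetical sort key
        "payload": {"S": "<up to 4 KB of record data>"},
        "expires_at": {"N": str(now + 120 * 24 * 60 * 60)},
    },
)
```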
masetromain
2 years, 2 months ago
Option A, storing each incoming record as a single .csv file in an Amazon S3 bucket, would not be a good option because it would be difficult to retrieve individual records from the .csv files, and it would likely increase the cost of data retrieval. Option C, storing each incoming record in a single table in an Amazon RDS MySQL database, would be more expensive, as RDS typically costs more than DynamoDB; additionally, running a cron job to delete old data adds operational overhead. Option D, storing incoming records in batches in an S3 bucket, would be less efficient, as it would require additional processing and parsing of the data to retrieve individual records.
upvoted 7 times
...
...
dkx
Highly Voted 1 year, 8 months ago
A. No, because millions of writes to a single .csv file would cause read and write latency.
B. Yes, because DynamoDB can support peaks of more than 20 million requests per second.
C. No, because a nightly cron job is unnecessary, and a relational database isn't designed to ingest millions of small records per minute.
D. No, because S3 supports 210,000 PUT requests per minute (3,500 requests per second * 60 seconds per minute), which is far less than 1,000,000+ writes per minute.
upvoted 6 times
ahhatem
3 months ago
Actually, the limit you mentioned for point D is per prefix or path, not the whole bucket. With proper data distribution across prefixes, S3 can easily accommodate the load mentioned.
upvoted 2 times
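A quick back-of-the-envelope check of that point, using the 1,000,000 records/minute figure from the question and S3's published 3,500 PUT/s per-prefix limit (plain Python, nothing AWS-specific):

```python
import math

records_per_minute = 1_000_000
puts_per_second_needed = records_per_minute / 60          # ~16,667 if each record is its own object
s3_put_limit_per_prefix = 3_500                            # PUT/COPY/POST/DELETE per second, per prefix

prefixes_needed = math.ceil(puts_per_second_needed / s3_put_limit_per_prefix)
print(f"~{puts_per_second_needed:,.0f} PUTs/s needs at least {prefixes_needed} prefixes "
      "if records are written one object each (batching reduces the request rate entirely)")
```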
...
...
vmia159
Most Recent 3 days, 17 hours ago
Selected Answer: D
For those who said B, how many WCUs are needed for DynamoDB?

Given: 1 million records per minute, 4 KB per record. This translates to approximately 16,667 records per second (1,000,000 / 60).

For the DynamoDB WCU calculation: 1 WCU = 1 write per second for an item up to 1 KB. For larger items, the size is rounded up to the next 1 KB, so a 4 KB item consumes 4 WCUs per write.

Therefore: WCUs needed = (records per second) x (item size in KB, rounded up) = 16,667 x 4 = 66,668 WCUs.

First, you need to increase the quotas for that table by submitting a support ticket: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ServiceQuotas.html

Second, this is very expensive. Obviously, combining it with the Kinesis agent and Firehose to write to S3 would be a much more reliable option, but it increases the cost significantly. It is still cheaper than the DynamoDB option, though. https://calculator.aws/#/estimate?id=87f1df21449660b0b9d61a6c1153632b1983d2e4
upvoted 1 times
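The same arithmetic as a sketch; the records-per-minute and item-size figures come from the question, while the $0.00065 per-WCU-hour figure is an assumed us-east-1 provisioned rate used purely for illustration:

```python
import math

records_per_minute = 1_000_000
item_size_kb = 4

records_per_second = records_per_minute / 60                     # ~16,667
wcu_per_item = math.ceil(item_size_kb / 1)                        # 1 WCU covers a 1 KB write -> 4 WCUs
wcus_needed = records_per_second * wcu_per_item                   # ~66,667

assumed_price_per_wcu_hour = 0.00065                              # illustrative us-east-1 provisioned rate
monthly_write_cost = wcus_needed * assumed_price_per_wcu_hour * 24 * 30
print(f"~{wcus_needed:,.0f} WCUs -> roughly ${monthly_write_cost:,.0f}/month for provisioned writes alone")
```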
...
soulation
1 week, 5 days ago
Selected Answer: D
Option B is too expensive.
upvoted 1 times
...
sergza
3 months ago
Selected Answer: D
If you really think about being cost-effective, then option D is the right choice.
upvoted 1 times
...
Heman31in
3 months, 1 week ago
Selected Answer: D
Why option D might be cost-effective:
Lower storage costs: S3 storage is generally cheaper than DynamoDB when dealing with large amounts of data (e.g., $0.023/GB/month for S3 Standard vs. $0.25/GB/month for DynamoDB on-demand).
Batching reduces API call costs: by batching multiple records into a single object, you reduce the number of PUT requests to S3. This can lead to lower API costs compared to writing each record individually to DynamoDB.
Lifecycle policies for data expiry: S3 lifecycle policies automatically clean up data older than 120 days, similar to DynamoDB's TTL feature.
upvoted 1 times
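A minimal sketch of the lifecycle expiry mentioned above, assuming boto3 and a hypothetical bucket and prefix (neither is named in the question):

```python
import boto3

s3 = boto3.client("s3")

# Expire (delete) every object under the prefix 120 days after creation,
# the S3 counterpart to DynamoDB's TTL.
s3.put_bucket_lifecycle_configuration(
    Bucket="device-record-batches",                  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-after-120-days",
                "Filter": {"Prefix": "records/"},    # hypothetical prefix
                "Status": "Enabled",
                "Expiration": {"Days": 120},
            }
        ]
    },
)
```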
...
amministrazione
6 months, 2 weeks ago
D. Design the application to batch incoming records before writing them to an Amazon S3 bucket. Update the metadata for the object to contain the list of records in the batch and use the Amazon S3 metadata search feature to retrieve the data. Configure a lifecycle policy to delete the data after 120 days.
upvoted 1 times
...
ahhatem
9 months, 1 week ago
Selected Answer: B
Obviously it is DynamoDB. Although, as a side note, I would say it is probably a very bad choice, as it would be astronomically expensive for millions of writes per minute... Kinesis Data Streams would make much more sense, especially since the data is only needed for about four months (120 days)...
upvoted 2 times
ahhatem
3 months ago
After a second thought, I am not sure it is B. D would be much cheaper if it means that records are buffered and combined before the write. But the word "batch" doesn't make me comfortable; batching can just mean writing the objects in one go, and nothing implies the records would be combined...
upvoted 1 times
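For reference, the buffer-and-combine behaviour the comment above is hoping for is what Kinesis Data Firehose provides: producers send individual records, and Firehose batches them into larger objects in S3. A minimal producer-side sketch, assuming boto3 and a hypothetical delivery stream already configured with an S3 destination (Firehose is not one of the listed options):

```python
import json
import boto3

firehose = boto3.client("firehose")

# Each small record is handed to Firehose individually; the delivery stream
# buffers by size/time and writes combined objects to S3.
record = {"device_id": "device-1234", "reading": 42}     # hypothetical record shape
firehose.put_record(
    DeliveryStreamName="device-records-to-s3",           # hypothetical stream name
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
)
```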
...
...
gofavad926
1 year ago
Selected Answer: B
B, DynamoDB is the best option
upvoted 1 times
...
8608f25
1 year, 1 month ago
Selected Answer: B
For small records less than 4 KB, DynamoDB can efficiently handle the ingestion of millions of records per minute from devices around the world, meeting the application's design requirements for low-latency data access. Additionally, DynamoDB's Time to Live (TTL) feature allows for automatic deletion of items after a specific period, aligning with the requirement to store data for only 120 days.
upvoted 1 times
...
ninomfr64
1 year, 2 months ago
Selected Answer: B
A = S3 is not great with many small files or with searching for data based on an index (a common pattern is to store object metadata in a database like DDB, OpenSearch, or RDS/Aurora). Many small files can also lead to high retrieval costs.
B = correct.
C = a single-table design with high-volume writes/retrievals of small objects and no need for complex queries is better served, and costs less, with DDB rather than RDS.
D = more efficient than A, but the S3 metadata search feature is still limited.
upvoted 1 times
...
severlight
1 year, 4 months ago
Selected Answer: B
see uC6rW1aB's answer
upvoted 1 times
...
vjp_training
1 year, 5 months ago
Selected Answer: B
B is the best for cost-effectiveness. D costs more because of the S3 requests.
upvoted 1 times
...
uC6rW1aB
1 year, 6 months ago
Selected Answer: B
Ref: https://aws.amazon.com/dynamodb/pricing/on-demand/ DynamoDB read requests can be either strongly consistent, eventually consistent, or transactional. A strongly consistent read request of up to 4 KB requires one read request unit. For items larger than 4 KB, additional read request units are required.
upvoted 3 times
uC6rW1aB
1 year, 6 months ago
For US East write prices: S3 Standard PUT requests cost $0.005 per thousand, so 1 million PUTs cost $5 (per minute in this scenario). DynamoDB charges $1.25 per 1 million writes, which is a lot cheaper.
upvoted 4 times
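The same comparison as a sketch, using the prices quoted in this thread and factoring in that a 4 KB item consumes 4 write request units (per the 1-unit-per-1 KB write rule mentioned elsewhere on this page); the prices are illustrative us-east-1 figures, not from the question:

```python
records = 1_000_000
item_size_kb = 4

# S3: one PUT per record, $0.005 per 1,000 PUT requests (price quoted above)
s3_request_cost = records / 1_000 * 0.005

# DynamoDB on-demand: $1.25 per million write request units (price quoted above);
# a 4 KB item consumes 4 write request units.
wru_per_item = item_size_kb                                  # 1 WRU per 1 KB of item size
ddb_request_cost = records * wru_per_item / 1_000_000 * 1.25

print(f"S3 PUTs:            ${s3_request_cost:.2f} per million records (one object each)")
print(f"DynamoDB on-demand: ${ddb_request_cost:.2f} per million 4 KB records")
```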
...
...
Gmail78
1 year, 6 months ago
Selected Answer: D
DynamoDB is at least 5x more expensive than S3 for this use case. There are millions of writes, each of 4 KB, and the total storage is 10-15 TB.
upvoted 1 times
vn_thanhtung
1 year, 6 months ago
D - the S3 metadata search feature does not exist
upvoted 2 times
...
...
Soweetadad
1 year, 6 months ago
Selected Answer: D
Although both B and D are correct, Option D is more cost effective.
upvoted 1 times
...
YodaMaster
1 year, 8 months ago
Selected Answer: D
Going with D as it's more cost-effective. The question didn't ask for the most efficient solution.
upvoted 1 times
blackgamer
1 year, 4 months ago
B satisfies the requirement but D does not. The keyword here is low latency: "a durable location where it can be retrieved with low latency."
upvoted 3 times
...
...