For workloads with low access frequency where you only need to query data occasionally (for example, during audits), option (A)—S3 Glacier Flexible Retrieval combined with S3 Glacier Select—provides the most cost-effective solution.
Transaction data is data that is updated frequently, and querying that data "occasionally" means it can be queried at any time (the question does not define when). So we can't risk making the customer wait hours for the result, and the best way to query data on top of an S3 bucket is Athena.
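For anyone who wants to see what the Athena route looks like in practice, here is a minimal sketch using boto3. The database name, table name, and results bucket are hypothetical placeholders, not something from the question, and it assumes a table has already been defined over the S3 data.

```python
def build_athena_request(sql, database, output_location):
    """Build keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(request):
    """Submit the query and return its execution ID (needs AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
request = build_athena_request(
    sql="SELECT * FROM transactions LIMIT 10",
    database="audit_db",
    output_location="s3://example-athena-results/",
)
```

Calling `start_query(request)` only submits the query; the results still have to be fetched once the execution finishes.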
https://aws.amazon.com/about-aws/whats-new/2017/11/amazon-glacier-select-makes-big-data-analytics-of-archive-data-possible/#:~:text=Amazon%20Glacier%20Select%20is%20a,archives%20to%20use%20for%20analytics.
Glacier Select allows queries to run directly on data stored in Amazon Glacier.
I am not sure whether to go for B or C. Can anyone comment on this?
B: No problem in general, but S3 Select is not available if the data is gzip-compressed Parquet. However, the problem statement doesn't say that the gzip-compressed data is Parquet.
C: Correct if the data is gzip-compressed Parquet, but B is more cost-effective if the data is gzip-compressed CSV or JSON.
I think the solution is either B or D, but I would go with B because they mention that the data is stored as gzip, not as Parquet, which is optimised for Athena queries.
B. Store the data in Amazon S3. Use Amazon S3 Select to query the data.
Amazon S3 is a cost-effective object storage service, and S3 Select allows you to retrieve only a subset of data from an object by using simple SQL expressions. S3 Select works on objects stored in CSV, JSON, or Apache Parquet format. It also supports GZIP and BZIP2 compression formats, which makes it suitable for the given scenario where the data is compressed with gzip.
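To make the S3 Select option concrete, here is a rough boto3 sketch of querying a gzip-compressed CSV object. The bucket, key, and column names are hypothetical, and it assumes the CSV has a header row; treat it as an illustration rather than a tested production snippet.

```python
def build_select_request(bucket, key, sql):
    """Build keyword arguments for S3's select_object_content call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        "InputSerialization": {
            # Assumes the first CSV row holds column names.
            "CSV": {"FileHeaderInfo": "USE"},
            # Tells S3 Select the object is gzip compressed.
            "CompressionType": "GZIP",
        },
        "OutputSerialization": {"CSV": {}},
    }


def run_select(request):
    """Run the query and return matching rows as text (needs AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    s3 = boto3.client("s3")
    response = s3.select_object_content(**request)
    chunks = []
    # The payload is an event stream; only "Records" events carry row data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Hypothetical bucket, key, and column, for illustration only.
request = build_select_request(
    bucket="example-transactions",
    key="2023/transactions.csv.gz",
    sql="SELECT * FROM s3object s WHERE s.account_id = '12345'",
)
```

Only the matching rows come back from `run_select(request)`, which is the point of the "retrieve only a subset" argument above.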
While Amazon Athena is a powerful query service, it can be more expensive than S3 Select for occasional queries. Amazon Glacier and Glacier Select are designed for long-term archival storage and not for frequent access or queries, which might not be suitable for occasional audits. Therefore, option B is the most cost-effective choice for this scenario.