Exam AWS Certified Data Analytics - Specialty topic 1 question 9 discussion

A manufacturing company has been collecting IoT sensor data from devices on its factory floor for a year and is storing the data in Amazon Redshift for daily analysis. A data analyst has determined that, at an expected ingestion rate of about 2 TB per day, the cluster will be undersized in less than 4 months. A long-term solution is needed. The data analyst has indicated that most queries only reference the most recent 13 months of data, yet there are also quarterly reports that need to query all the data generated from the past 7 years. The chief technology officer (CTO) is concerned about the costs, administrative effort, and performance of a long-term solution.
Which solution should the data analyst use to meet these requirements?

  • A. Create a daily job in AWS Glue to UNLOAD records older than 13 months to Amazon S3 and delete those records from Amazon Redshift. Create an external table in Amazon Redshift to point to the S3 location. Use Amazon Redshift Spectrum to join to data that is older than 13 months.
  • B. Take a snapshot of the Amazon Redshift cluster. Restore the cluster to a new cluster using dense storage nodes with additional storage capacity.
  • C. Execute a CREATE TABLE AS SELECT (CTAS) statement to move records that are older than 13 months to quarterly partitioned data in Amazon Redshift Spectrum backed by Amazon S3.
  • D. Unload all the tables in Amazon Redshift to an Amazon S3 bucket using S3 Intelligent-Tiering. Use AWS Glue to crawl the S3 bucket location to create external tables in an AWS Glue Data Catalog. Create an Amazon EMR cluster using Auto Scaling for any daily analytics needs, and use Amazon Athena for the quarterly reports, with both using the same AWS Glue Data Catalog.
Suggested Answer: A
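For readers who want to see what the daily job in option A might issue, here is a minimal sketch that only builds the SQL text. The table name, timestamp column, bucket, and IAM role are hypothetical placeholders, not details from the question. As a simplification, the cutoff is rounded down to a month boundary, so the cluster keeps between 13 and 14 months of data.

```python
from datetime import date


def months_before(day: date, months: int) -> date:
    """Return the first day of the month `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def archive_statements(table: str, time_column: str, s3_prefix: str,
                       iam_role: str, today: date) -> tuple[str, str]:
    """Build the UNLOAD and DELETE statements for one daily run."""
    # Compute the cutoff once so both statements use the same boundary.
    cutoff = months_before(today, 13).isoformat()
    predicate = f"{time_column} < '{cutoff}'"
    # Single quotes inside the UNLOAD query text must be doubled.
    inner = f"SELECT * FROM {table} WHERE {predicate}".replace("'", "''")
    # A per-run prefix keeps a later run from colliding with earlier files.
    target = f"{s3_prefix.rstrip('/')}/{today.isoformat()}/"
    unload = (f"UNLOAD ('{inner}') TO '{target}' "
              f"IAM_ROLE '{iam_role}' FORMAT AS PARQUET;")
    delete = f"DELETE FROM {table} WHERE {predicate};"
    return unload, delete


# Hypothetical names, for illustration only.
unload_sql, delete_sql = archive_statements(
    "sensor_readings", "event_time",
    "s3://example-bucket/sensor-archive",
    "arn:aws:iam::123456789012:role/ExampleUnloadRole",
    date(2024, 3, 15))
print(unload_sql)
print(delete_sql)
```

In a real job the DELETE should run only after the UNLOAD has succeeded. The deleted rows still occupy space in the cluster until they are vacuumed.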

Comments

Prodip
Highly Voted 3 years, 7 months ago
Option A. We have implemented this to save cost.
upvoted 53 times
...
awssp12345
Highly Voted 3 years, 6 months ago
B is not correct because snapshotting will save costs but will not solve the problem of the cluster being undersized. C is not correct because CTAS is not used to move data to S3 via Spectrum; CTAS creates a new table based on a query, and the owner of this table is the user who issues the command. D is incorrect because EMR cannot be used as a data warehouse solution, and they do not need interactive queries with Athena. A is correct because it specifies exactly how to move data to Redshift Spectrum and reduce cluster space: https://docs.aws.amazon.com/redshift/latest/dg/c-getting-started-using-spectrum.html
upvoted 20 times
...
kondi2309
Most Recent 1 year, 2 months ago
Selected Answer: A
Def A, to save cost and less admin.
upvoted 1 times
...
NikkyDicky
1 year, 8 months ago
Selected Answer: A
It's an A.
upvoted 1 times
...
pk349
1 year, 11 months ago
A: I passed the test
upvoted 1 times
...
Aina
2 years ago
A. The Udemy course by Stephane Maarek and Frank Kane has a really similar question in the practice exam.
upvoted 2 times
...
cloudlearnerhere
2 years, 5 months ago
Selected Answer: A
Correct answer is A, as the AWS Glue job can be used to offload data older than 13 months from Redshift to S3. The most recent 13 months of data can be queried from Redshift, while the 7 years of data in S3 can be queried using Redshift Spectrum. Option B is wrong as this would increase the cost further and would not scale far. Option C is wrong as CTAS is not used to move data to S3 via Spectrum. CTAS creates a new table based on a query. The owner of this table is the user who issues the command. Option D is wrong as EMR would increase the administrative effort compared to Redshift.
upvoted 4 times
...
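A rough back-of-the-envelope check of the "would not scale far" point, using only the numbers in the question (about 2 TB per day, 13 months hot, 7 years total). The assumptions are a constant ingestion rate, a 365-day year, and raw sizes with no compression, so these are order-of-magnitude figures for the steady state, not a sizing recommendation.

```python
DAILY_INGEST_TB = 2    # "about 2 TB per day" from the question
DAYS_PER_YEAR = 365    # assumption: leap days ignored


def retained_tb(months: int) -> float:
    """Raw volume accumulated over `months` months at a constant rate."""
    return DAILY_INGEST_TB * DAYS_PER_YEAR * months / 12


hot_tb = retained_tb(13)        # data most queries touch
all_tb = retained_tb(7 * 12)    # data the quarterly reports touch

print(f"13 months: ~{hot_tb:.0f} TB")                            # ~791 TB
print(f"7 years:   ~{all_tb:.0f} TB")                            # ~5110 TB
print(f"share that can live in S3: {1 - hot_tb / all_tb:.0%}")   # 85%
```

Under these assumptions, roughly 85% of the retained data is needed only by the quarterly reports, which is the portion option A moves out of the cluster.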
Rejju
2 years, 6 months ago
I am wondering why the portal gives B as the correct answer. Who validates the answers and decides which one is right here?
upvoted 2 times
...
Abep
2 years, 7 months ago
Selected Answer: A
Answer is A https://d1.awsstatic.com/whitepapers/amazon-redshift-cost-optimization.pdf
upvoted 3 times
...
rocky48
2 years, 9 months ago
Selected Answer: A
Answer-A
upvoted 1 times
...
Bik000
2 years, 11 months ago
Selected Answer: A
Answer should be A
upvoted 1 times
...
jrheen
2 years, 12 months ago
Answer-A
upvoted 1 times
...
jmensah60
3 years, 1 month ago
Selected Answer: A
A ticks all the boxes
upvoted 3 times
...
aws2019
3 years, 5 months ago
A is the right answer
upvoted 1 times
...
Shraddha
3 years, 5 months ago
B = wrong, this will not solve either cost or scale problem. C = wrong, to create table on S3 you use CREATE EXTERNAL TABLE not CTAS, also this does not remove older data. D = wrong, nonsense.
upvoted 1 times
...
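To illustrate the CREATE EXTERNAL TABLE point above, here is a minimal sketch that builds the DDL for an external table over the archived files, plus a late-binding view that combines it with the in-cluster table. The schema, table, column, and bucket names are hypothetical, and the sketch assumes an external schema named `spectrum` has already been created with CREATE EXTERNAL SCHEMA.

```python
# Hypothetical columns shared by the local table and the archive.
COLUMNS = [
    ("device_id", "VARCHAR(64)"),
    ("event_time", "TIMESTAMP"),
    ("reading", "DOUBLE PRECISION"),
]


def external_table_ddl(schema: str, table: str, columns, location: str) -> str:
    """DDL for a Parquet-backed external table at an S3 location."""
    cols = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (f"CREATE EXTERNAL TABLE {schema}.{table} (\n    {cols}\n)\n"
            f"STORED AS PARQUET\n"
            f"LOCATION '{location}';")


def combined_view_ddl(view: str, local_table: str, external_table: str,
                      columns) -> str:
    """Late-binding view over recent (local) and archived (external) rows."""
    names = ", ".join(name for name, _ in columns)
    # Views that reference external tables must use WITH NO SCHEMA BINDING,
    # and every table in such a view must be schema-qualified.
    return (f"CREATE VIEW {view} AS\n"
            f"SELECT {names} FROM {local_table}\n"
            f"UNION ALL\n"
            f"SELECT {names} FROM {external_table}\n"
            f"WITH NO SCHEMA BINDING;")


table_sql = external_table_ddl(
    "spectrum", "sensor_archive", COLUMNS,
    "s3://example-bucket/sensor-archive/")
view_sql = combined_view_ddl(
    "public.sensor_all", "public.sensor_readings",
    "spectrum.sensor_archive", COLUMNS)
print(table_sql)
print(view_sql)
```

Daily queries can keep hitting the local table directly, while the quarterly reports query the combined view.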
leliodesouza
3 years, 6 months ago
The answer is A.
upvoted 2 times
...
AjithkumarSL
3 years, 6 months ago
After reading the post https://aws.amazon.com/blogs/big-data/amazon-redshift-dense-compute-dc2-nodes-deliver-twice-the-performance-as-dc1-at-the-same-price/, option B makes more sense. Any thoughts?
upvoted 1 times
asg76
3 years, 6 months ago
It's not cost-effective.
upvoted 1 times
...
...
Community vote distribution: A (35%, most voted), C (25%), B (20%), Other