Exam AWS Certified Data Analytics - Specialty topic 1 question 77 discussion

A bank operates in a regulated environment. The compliance requirements for the country in which the bank operates say that customer data for each state should only be accessible by the bank's employees located in the same state. Bank employees in one state should NOT be able to access data for customers who have provided a home address in a different state.
The bank's marketing team has hired a data analyst to gather insights from customer data for a new campaign being launched in certain states. Currently, data linking each customer account to its home state is stored in a tabular .csv file within a single Amazon S3 folder in a private S3 bucket. The total size of the S3 folder is 2 GB uncompressed. Due to the country's compliance requirements, the marketing team is not able to access this folder.
The data analyst is responsible for ensuring that the marketing team gets one-time access to customer data for their campaign analytics project, while being subject to all the compliance requirements and controls.
Which solution should the data analyst implement to meet the desired requirements with the LEAST amount of setup effort?

  • A. Re-arrange data in Amazon S3 to store customer data about each state in a different S3 folder within the same bucket. Set up S3 bucket policies to provide marketing employees with appropriate data access under compliance controls. Delete the bucket policies after the project.
  • B. Load tabular data from Amazon S3 to an Amazon EMR cluster using s3DistCp. Implement a custom Hadoop-based row-level security solution on the Hadoop Distributed File System (HDFS) to provide marketing employees with appropriate data access under compliance controls. Terminate the EMR cluster after the project.
  • C. Load tabular data from Amazon S3 to Amazon Redshift with the COPY command. Use the built-in row-level security feature in Amazon Redshift to provide marketing employees with appropriate data access under compliance controls. Delete the Amazon Redshift tables after the project.
  • D. Load tabular data from Amazon S3 to Amazon QuickSight Enterprise edition by directly importing it as a data source. Use the built-in row-level security feature in Amazon QuickSight to provide marketing employees with appropriate data access under compliance controls. Delete Amazon QuickSight data sources after the project is complete.
Suggested Answer: D

Comments

Nicki1013
Highly Voted 3 years, 7 months ago
My answer is D
upvoted 26 times
awssp12345
3 years, 7 months ago
agreed with D.
upvoted 4 times
...
...
cloudlearnerhere
Highly Voted 2 years, 5 months ago
Selected Answer: D
Correct answer is D as using QuickSight with its built-in-row-level security features allows the data analyst to provide limited one-time access while maintaining data compliance requirements and controls and a minimal amount of setup. In the Enterprise edition of Amazon QuickSight, you can restrict access to a dataset by configuring row-level security (RLS) on it. You can do this before or after you have shared the dataset. Only the people whom you shared with can see any of the data. By adding row-level security, you can further control their access. Option A is wrong as it would take some amount of setup to repartition the data in S3. Options B & C are wrong as using EMR and Redshift would need set up and provisioning effort.
upvoted 8 times
...
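A minimal sketch of how the QuickSight row-level security rules could look for this scenario, assuming the customer dataset has a State column and per-state marketing groups exist in QuickSight (both assumptions). The rules file is uploaded to S3, imported as its own QuickSight dataset, and attached to the customer dataset as RLS rules; the bucket, key, and group names below are hypothetical.

```python
import boto3

# Hypothetical RLS rules for the QuickSight approach (option D): each row
# limits a QuickSight group to rows whose State column matches its state.
rls_rules_csv = """GroupName,State
marketing-ny,NY
marketing-ca,CA
marketing-tx,TX
"""

s3 = boto3.client("s3")
s3.put_object(
    Bucket="quicksight-rls-rules",          # hypothetical bucket
    Key="campaign/state-rls-rules.csv",
    Body=rls_rules_csv.encode("utf-8"),
)

# The uploaded file is then imported as its own QuickSight dataset and
# attached to the customer dataset as row-level security rules (via the
# console, or the RowLevelPermissionDataSet parameter of the dataset APIs).
```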
bansalhp
Most Recent 1 year, 3 months ago
I think option A is the right answer. Option A proposes reorganizing the data by storing customer data for each state in a different S3 folder within the same bucket, which makes it easier to manage access control at the folder level. Setting up S3 bucket policies allows for controlling access to specific folders, meeting the compliance requirements without requiring additional services. After the project is complete, the bucket policies can be deleted, ensuring that access control is removed as needed. Options C and D would be costlier, as we would need to spin up Redshift or QuickSight for 2 GB of data.
upvoted 1 times
...
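For comparison, a sketch of what option A's bucket policy could look like once the data has been re-arranged into one folder per state. The bucket name, role ARN, and state=NY prefix are assumptions, not part of the question.

```python
import json
import boto3

s3 = boto3.client("s3")

# Hypothetical names; assumes the .csv data was re-partitioned into
# per-state prefixes such as state=NY/.
BUCKET = "bank-customer-data"
MARKETING_NY_ROLE = "arn:aws:iam::123456789012:role/marketing-ny-analysts"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "MarketingNYReadOwnStateOnly",
            "Effect": "Allow",
            "Principal": {"AWS": MARKETING_NY_ROLE},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/state=NY/*",
        }
    ],
}

s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
# s3.delete_bucket_policy(Bucket=BUCKET)   # remove access after the project
```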
pk349
1 year, 12 months ago
D: I passed the test
upvoted 2 times
...
milofficial
2 years, 2 months ago
Selected Answer: D
Least operational overhead is D
upvoted 1 times
...
b33f
2 years, 5 months ago
Redshift now has RLS. Can't the answer be C as well?
https://aws.amazon.com/about-aws/whats-new/2022/07/amazon-redshift-row-level-security/
https://docs.aws.amazon.com/redshift/latest/dg/t_rls.html
https://aws.amazon.com/blogs/big-data/achieve-fine-grained-data-security-with-row-level-access-control-in-amazon-redshift/
upvoted 1 times
Gabba
2 years, 5 months ago
The solution should require the least effort. Setting up Redshift is a big task, hence C is incorrect.
upvoted 2 times
...
...
t47
2 years, 6 months ago
answer is D
upvoted 1 times
...
awsdatacert
2 years, 6 months ago
Will go with D.
upvoted 1 times
...
rocky48
2 years, 9 months ago
Selected Answer: D
upvoted 1 times
...
f4bi4n
2 years, 11 months ago
Selected Answer: D
D, everything else has too much setup or is not usable (A)
upvoted 1 times
...
keitahigaki
3 years, 5 months ago
Answer: D Quicksight Enterprise provides row-level security. https://docs.aws.amazon.com/ja_jp/quicksight/latest/user/restrict-access-to-a-data-set-using-row-level-security.html
upvoted 4 times
...
sayed
3 years, 5 months ago
For C, there is no built-in row-level security feature in Amazon Redshift; it is in QuickSight, so I think C is not the correct answer.
upvoted 3 times
god_father
1 year, 3 months ago
There is! https://docs.aws.amazon.com/redshift/latest/dg/t_rls.html "Using row-level security (RLS) in Amazon Redshift, you can have granular access control over your sensitive data. You can decide which users or roles can access specific records of data within schemas or tables, based on security policies that are defined at the database objects level. In addition to column-level security, where you can grant users permissions to a subset of columns, use RLS policies to further restrict access to particular rows of the visible columns."
upvoted 1 times
...
...
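For reference, a sketch of the Redshift RLS statements the linked documentation describes, issued here through the Redshift Data API. The cluster, database, table, column, and role names are hypothetical.

```python
import boto3

rsd = boto3.client("redshift-data")

# Hypothetical RLS policy: members of the marketing_ny role only see
# customer rows whose home_state column is 'NY'.
statements = [
    "CREATE RLS POLICY ny_customers_only "
    "WITH (home_state VARCHAR(2)) "
    "USING (home_state = 'NY');",
    "ATTACH RLS POLICY ny_customers_only ON customers TO ROLE marketing_ny;",
    "ALTER TABLE customers ROW LEVEL SECURITY ON;",
]

for sql in statements:
    rsd.execute_statement(
        ClusterIdentifier="analytics-cluster",  # hypothetical cluster
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )
```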
lostsoul07
3 years, 5 months ago
B is the right answer
upvoted 1 times
...
LRyan2020
3 years, 6 months ago
Topic 2 - Question 57
A healthcare company uses Amazon S3 to store all its data and is planning to use Amazon EMR, backed with EMR File System (EMRFS), to process and transform the data. The company data is stored in multiple buckets and encrypted using different encryption keys for each bucket. How can the EMR cluster be configured to access the encrypted data?
A) Modify the S3 bucket policies to grant public access to the S3 buckets.
B) Create a security configuration that specifies the encryption keys for the buckets using per bucket encryption overrides.
C) Configure the cluster to use S3 Select to access the data in the buckets and specify the encryption keys as options.
D) Copy the encryption keys to the master node and create a security configuration that references the keys.
upvoted 3 times
Umer24
3 years, 6 months ago
Again a BrainCert practice question.
Correct Answer: B. Create a security configuration that specifies the encryption keys for the buckets using per bucket encryption overrides.
Explanation: Correct answer is B as EMR provides a per bucket encryption overrides option. Refer to the AWS documentation - EMR Securing Data: https://aws.amazon.com/blogs/big-data/secure-your-data-on-amazon-emr-using-native-ebs-and-per-bucket-s3-encryption-options/
With S3 encryption on Amazon EMR, all the encryption modes use a single CMK by default to encrypt objects in S3. If you have highly sensitive content in specific S3 buckets, you may want to manage the encryption of these buckets separately by using different CMKs or encryption modes for individual buckets. You can accomplish this using the per bucket encryption overrides option in Amazon EMR.
upvoted 3 times
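A hedged sketch of the per-bucket override mechanism described above. The bucket names and key ARNs are made up, and the exact JSON keys should be checked against the current EMR security configuration reference.

```python
import json
import boto3

# Security configuration with a default S3 CMK plus per-bucket overrides,
# so each bucket can be read/written with its own KMS key.
security_config = {
    "EncryptionConfiguration": {
        "EnableInTransitEncryption": False,
        "EnableAtRestEncryption": True,
        "AtRestEncryptionConfiguration": {
            "S3EncryptionConfiguration": {
                "EncryptionMode": "SSE-KMS",
                "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/default-key",
                "Overrides": [
                    {
                        "BucketName": "claims-bucket",
                        "EncryptionMode": "SSE-KMS",
                        "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/claims-key",
                    },
                    {
                        "BucketName": "billing-bucket",
                        "EncryptionMode": "SSE-KMS",
                        "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/billing-key",
                    },
                ],
            }
        },
    }
}

emr = boto3.client("emr")
emr.create_security_configuration(
    Name="per-bucket-s3-encryption",
    SecurityConfiguration=json.dumps(security_config),
)
```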
blubb
3 years, 6 months ago
Are BrainCert practice questions not real exam questions?
upvoted 1 times
...
...
rocky48
2 years, 9 months ago
@LRyan2020 Why are these questions being posted here? Are they sample questions or exam practice questions?
upvoted 1 times
...
...
LRyan2020
3 years, 6 months ago
Topic 2 - Question 56
A company currently processes real-time streaming data using Apache Kafka. The company is facing challenges with managing, setting up, and scaling during production, and wants to optimize the deployment of the Kafka brokers. The solution should be managed by AWS, secure, and require minimal changes to the current client code. Which solution meets these requirements?
A) Use Apache Zookeeper to scale Kafka installed on Amazon EC2 instances.
B) Use Amazon Managed Streaming for Kafka to scale the brokers.
C) Use Apache Zookeeper to scale the brokers of Amazon Managed Streaming for Kafka.
D) Scale the number of client machines, and use a single broker with Amazon Managed Streaming for Kafka.
upvoted 1 times
Umer24
3 years, 6 months ago
@LRyan2020: these are BrainCert practice questions, buddy!
Correct Answer: B. Use Amazon Managed Streaming for Kafka to scale the brokers.
Explanation: Correct answer is B as Amazon Managed Streaming for Apache Kafka is a fully managed, secure Kafka streaming solution with no operational overhead. Refer to the AWS documentation - Managed Streaming for Kafka: https://aws.amazon.com/msk/
Amazon MSK lets you focus on creating your streaming applications without having to worry about the operational overhead of managing your Apache Kafka environment. Amazon MSK manages the provisioning, configuration, and maintenance of Apache Kafka clusters and Apache ZooKeeper nodes for you. Amazon MSK also shows key Apache Kafka performance metrics in the AWS console.
Options A, C, and D are wrong as they are either not managed by AWS or would still need user involvement in scaling.
upvoted 3 times
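A minimal sketch of provisioning the managed cluster that answer B relies on. The Kafka version, subnet IDs, and security group ID are placeholders; existing Kafka clients keep working because MSK exposes standard Kafka bootstrap brokers.

```python
import boto3

msk = boto3.client("kafka")

response = msk.create_cluster(
    ClusterName="clickstream-msk",
    KafkaVersion="2.8.1",
    NumberOfBrokerNodes=3,  # one broker per AZ; scale by adding brokers later
    BrokerNodeGroupInfo={
        "InstanceType": "kafka.m5.large",
        "ClientSubnets": ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],
        "SecurityGroups": ["sg-0123456789abcdef0"],
    },
)
print(response["ClusterArn"])
```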
...
...
LRyan2020
3 years, 6 months ago
Question 55
An online retailer is planning to capture clickstream data from its ecommerce website and then use the data to drive a new custom-built recommendation engine that provides product recommendations to online users. The retailer will use Amazon Kinesis Data Streams to ingest the streaming data and Amazon Kinesis Data Analytics to perform SQL queries on the stream, using windowed queries to process the data that arrives at inconsistent intervals. What type of windowed query must be used to aggregate the data using time-based windows that open as data arrives?
A) Continuous queries
B) Tumbling window queries
C) Sliding window queries
D) Stagger window queries
upvoted 1 times
Umer24
3 years, 6 months ago
Correct Answer: D. Stagger window queries
Explanation: Correct answer is D as a stagger window is a windowing method suited for analyzing groups of data that arrive at inconsistent times. It is well suited for any time-series analytics use case, such as a set of related sales or log records. Refer to the AWS documentation - Kinesis Stagger Window Concepts.
Option A is wrong as a continuous query is a query over a stream that executes continuously over streaming data. This continuous execution enables scenarios such as the ability for applications to continuously query a stream and generate alerts.
Option B is wrong as tumbling window queries are suitable when a windowed query processes each window in a non-overlapping manner.
Option C is wrong as sliding window queries help define a time-based or row-based window, instead of grouping records using GROUP BY.
upvoted 4 times
Umer24
3 years, 6 months ago
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/stagger-window-concepts.html
upvoted 2 times
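For concreteness, a sketch of what the stagger-window application SQL looks like, following the shape of the example in the linked documentation. The stream and column names are placeholders for the retailer's clickstream schema, shown here as a Python constant that would be pasted into the Kinesis Data Analytics SQL editor.

```python
# Application SQL for a Kinesis Data Analytics (SQL) application.
STAGGER_WINDOW_SQL = """
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    ticker_symbol VARCHAR(4),
    event_time    TIMESTAMP,
    ticker_count  INTEGER);

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "DESTINATION_SQL_STREAM"
    SELECT STREAM ticker_symbol,
                  FLOOR(EVENT_TIME TO MINUTE) AS event_time,
                  COUNT(*) AS ticker_count
    FROM "SOURCE_SQL_STREAM_001"
    -- A stagger window opens when the first record for a partition key
    -- arrives, which suits data that arrives at inconsistent times.
    WINDOWED BY STAGGER (
        PARTITION BY FLOOR(EVENT_TIME TO MINUTE), ticker_symbol
        RANGE INTERVAL '1' MINUTE);
"""
```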
...
...
...
Umer24
3 years, 6 months ago
Question 54
A marketing company is storing its campaign response data in Amazon S3. A consistent set of sources has generated the data for each campaign. The data is saved into Amazon S3 as .csv files. A business analyst will use Amazon Athena to analyze each campaign's data. The company needs the cost of ongoing data analysis with Athena to be minimized. Which combination of actions should a data analytics specialist take to meet these requirements? (Select TWO.)
A. Convert the .csv files to Apache Parquet.
B. Convert the .csv files to Apache Avro.
C. Partition the data by campaign.
D. Partition the data by source.
E. Compress the .csv files.
upvoted 1 times
Roontha
3 years, 6 months ago
Answer: A, C
upvoted 4 times
...
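A sketch of the A + C combination (Parquet plus partitioning by campaign) expressed as a single Athena CTAS statement run through boto3. The database, table, column, and bucket names are hypothetical.

```python
import boto3

athena = boto3.client("athena")

# CTAS that rewrites the .csv table as Parquet, partitioned by campaign.
# Partition columns must come last in the SELECT list.
ctas = """
CREATE TABLE campaign_responses_parquet
WITH (
    format = 'PARQUET',
    external_location = 's3://marketing-analytics/parquet/',
    partitioned_by = ARRAY['campaign']
) AS
SELECT response_id, source, responded_at, campaign
FROM campaign_responses_csv;
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "marketing"},
    ResultConfiguration={"OutputLocation": "s3://marketing-analytics/athena-results/"},
)
```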
dkp
3 years, 5 months ago
E also works. https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
upvoted 1 times
...
sivajiboss
3 years, 5 months ago
A and C
upvoted 2 times
...
...