Exam AWS Certified Data Analytics - Specialty topic 1 question 77 discussion

A bank operates in a regulated environment. The compliance requirements for the country in which the bank operates say that customer data for each state should only be accessible by the bank's employees located in the same state. Bank employees in one state should NOT be able to access data for customers who have provided a home address in a different state.
The bank's marketing team has hired a data analyst to gather insights from customer data for a new campaign being launched in certain states. Currently, data linking each customer account to its home state is stored in a tabular .csv file within a single Amazon S3 folder in a private S3 bucket. The total size of the S3 folder is 2 GB uncompressed. Due to the country's compliance requirements, the marketing team is not able to access this folder.
The data analyst is responsible for ensuring that the marketing team gets one-time access to customer data for their campaign analytics project, while being subject to all the compliance requirements and controls.
Which solution should the data analyst implement to meet the desired requirements with the LEAST amount of setup effort?

  • A. Re-arrange data in Amazon S3 to store customer data about each state in a different S3 folder within the same bucket. Set up S3 bucket policies to provide marketing employees with appropriate data access under compliance controls. Delete the bucket policies after the project.
  • B. Load tabular data from Amazon S3 to an Amazon EMR cluster using s3DistCp. Implement a custom Hadoop-based row-level security solution on the Hadoop Distributed File System (HDFS) to provide marketing employees with appropriate data access under compliance controls. Terminate the EMR cluster after the project.
  • C. Load tabular data from Amazon S3 to Amazon Redshift with the COPY command. Use the built-in row-level security feature in Amazon Redshift to provide marketing employees with appropriate data access under compliance controls. Delete the Amazon Redshift tables after the project.
  • D. Load tabular data from Amazon S3 to Amazon QuickSight Enterprise edition by directly importing it as a data source. Use the built-in row-level security feature in Amazon QuickSight to provide marketing employees with appropriate data access under compliance controls. Delete Amazon QuickSight data sources after the project is complete.
Suggested Answer: D

Comments

Nicki1013
Highly Voted 3 years, 7 months ago
My answer is D
upvoted 26 times
awssp12345
3 years, 7 months ago
agreed with D.
upvoted 4 times
...
...
cloudlearnerhere
Highly Voted 2 years, 5 months ago
Selected Answer: D
Correct answer is D as using QuickSight with its built-in-row-level security features allows the data analyst to provide limited one-time access while maintaining data compliance requirements and controls and a minimal amount of setup. In the Enterprise edition of Amazon QuickSight, you can restrict access to a dataset by configuring row-level security (RLS) on it. You can do this before or after you have shared the dataset. Only the people whom you shared with can see any of the data. By adding row-level security, you can further control their access. Option A is wrong as it would take some amount of setup to repartition the data in S3. Options B & C are wrong as using EMR and Redshift would need set up and provisioning effort.
upvoted 8 times
...
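A minimal sketch of how the QuickSight row-level security rules could look for this scenario, assuming the customer dataset has a State column and per-state marketing groups exist in QuickSight (both assumptions). The rules file is uploaded to S3, imported as its own QuickSight dataset, and attached to the customer dataset as RLS rules; the bucket, key, and group names below are hypothetical.

```python
import boto3

# Hypothetical RLS rules for the QuickSight approach (option D): each row
# limits a QuickSight group to rows whose State column matches its state.
rls_rules_csv = """GroupName,State
marketing-ny,NY
marketing-ca,CA
marketing-tx,TX
"""

s3 = boto3.client("s3")
s3.put_object(
    Bucket="quicksight-rls-rules",          # hypothetical bucket
    Key="campaign/state-rls-rules.csv",
    Body=rls_rules_csv.encode("utf-8"),
)

# The uploaded file is then imported as its own QuickSight dataset and
# attached to the customer dataset as row-level security rules (via the
# console, or the RowLevelPermissionDataSet parameter of the dataset APIs).
```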
bansalhp
Most Recent 1 year, 3 months ago
I think option A is the right answer. Option A proposes reorganizing the data by storing customer data for each state in a different S3 folder within the same bucket, which makes it easier to manage access control at the folder level. Setting up S3 bucket policies allows for controlling access to specific folders, meeting the compliance requirements without requiring additional services. After the project is complete, the bucket policies can be deleted, ensuring that access control is removed as needed. Options C and D would be costlier, as we would need to spin up Redshift or QuickSight for 2 GB of data.
upvoted 1 times
...
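For comparison, a sketch of what option A's bucket policy could look like once the data has been re-arranged into one folder per state. The bucket name, role ARN, and state=NY prefix are assumptions, not part of the question.

```python
import json
import boto3

s3 = boto3.client("s3")

# Hypothetical names; assumes the .csv data was re-partitioned into
# per-state prefixes such as state=NY/.
BUCKET = "bank-customer-data"
MARKETING_NY_ROLE = "arn:aws:iam::123456789012:role/marketing-ny-analysts"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "MarketingNYReadOwnStateOnly",
            "Effect": "Allow",
            "Principal": {"AWS": MARKETING_NY_ROLE},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/state=NY/*",
        }
    ],
}

s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
# s3.delete_bucket_policy(Bucket=BUCKET)   # remove access after the project
```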
pk349
1 year, 12 months ago
D: I passed the test
upvoted 2 times
...
milofficial
2 years, 2 months ago
Selected Answer: D
Least operational overhead is D
upvoted 1 times
...
b33f
2 years, 5 months ago
Redshift now has RLS. Can't the answer be C as well?
https://aws.amazon.com/about-aws/whats-new/2022/07/amazon-redshift-row-level-security/
https://docs.aws.amazon.com/redshift/latest/dg/t_rls.html
https://aws.amazon.com/blogs/big-data/achieve-fine-grained-data-security-with-row-level-access-control-in-amazon-redshift/
upvoted 1 times
Gabba
2 years, 5 months ago
The solution should require the least effort. Setting up Redshift is a big task, hence C is incorrect.
upvoted 2 times
...
...
t47
2 years, 6 months ago
answer is D
upvoted 1 times
...
awsdatacert
2 years, 6 months ago
Will go with D.
upvoted 1 times
...
rocky48
2 years, 9 months ago
Selected Answer: D
upvoted 1 times
...
f4bi4n
2 years, 11 months ago
Selected Answer: D
D, everything else has too much setup or is not usable (A)
upvoted 1 times
...
keitahigaki
3 years, 5 months ago
Answer: D Quicksight Enterprise provides row-level security. https://docs.aws.amazon.com/ja_jp/quicksight/latest/user/restrict-access-to-a-data-set-using-row-level-security.html
upvoted 4 times
...
sayed
3 years, 5 months ago
For C, there is no built-in row-level security feature in Amazon Redshift; it is in QuickSight, so I think C is not the correct answer.
upvoted 3 times
god_father
1 year, 3 months ago
There is! https://docs.aws.amazon.com/redshift/latest/dg/t_rls.html "Using row-level security (RLS) in Amazon Redshift, you can have granular access control over your sensitive data. You can decide which users or roles can access specific records of data within schemas or tables, based on security policies that are defined at the database objects level. In addition to column-level security, where you can grant users permissions to a subset of columns, use RLS policies to further restrict access to particular rows of the visible columns."
upvoted 1 times
...
...
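For reference, a sketch of the Redshift RLS statements the linked documentation describes, issued here through the Redshift Data API. The cluster, database, table, column, and role names are hypothetical.

```python
import boto3

rsd = boto3.client("redshift-data")

# Hypothetical RLS policy: members of the marketing_ny role only see
# customer rows whose home_state column is 'NY'.
statements = [
    "CREATE RLS POLICY ny_customers_only "
    "WITH (home_state VARCHAR(2)) "
    "USING (home_state = 'NY');",
    "ATTACH RLS POLICY ny_customers_only ON customers TO ROLE marketing_ny;",
    "ALTER TABLE customers ROW LEVEL SECURITY ON;",
]

for sql in statements:
    rsd.execute_statement(
        ClusterIdentifier="analytics-cluster",  # hypothetical cluster
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )
```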
lostsoul07
3 years, 5 months ago
B is the right answer
upvoted 1 times
...
LRyan2020
3 years, 6 months ago
Topic 2 - Question 57
A healthcare company uses Amazon S3 to store all its data and is planning to use Amazon EMR, backed with EMR File System (EMRFS), to process and transform the data. The company data is stored in multiple buckets and encrypted using different encryption keys for each bucket. How can the EMR cluster be configured to access the encrypted data?
A) Modify the S3 bucket policies to grant public access to the S3 buckets.
B) Create a security configuration that specifies the encryption keys for the buckets using per bucket encryption overrides.
C) Configure the cluster to use S3 Select to access the data in the buckets and specify the encryption keys as options.
D) Copy the encryption keys to the master node and create a security configuration that references the keys.
upvoted 3 times
Umer24
3 years, 6 months ago
Again a BrainCert practice question.
Correct Answer: B. Create a security configuration that specifies the encryption keys for the buckets using per bucket encryption overrides.
Explanation: Correct answer is B as EMR provides a per bucket encryption overrides option. Refer to the AWS documentation - EMR Securing Data: https://aws.amazon.com/blogs/big-data/secure-your-data-on-amazon-emr-using-native-ebs-and-per-bucket-s3-encryption-options/
With S3 encryption on Amazon EMR, all the encryption modes use a single CMK by default to encrypt objects in S3. If you have highly sensitive content in specific S3 buckets, you may want to manage the encryption of these buckets separately by using different CMKs or encryption modes for individual buckets. You can accomplish this using the per bucket encryption overrides option in Amazon EMR.
upvoted 3 times
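A hedged sketch of the per-bucket override mechanism described above. The bucket names and key ARNs are made up, and the exact JSON keys should be checked against the current EMR security configuration reference.

```python
import json
import boto3

# Security configuration with a default S3 CMK plus per-bucket overrides,
# so each bucket can be read/written with its own KMS key.
security_config = {
    "EncryptionConfiguration": {
        "EnableInTransitEncryption": False,
        "EnableAtRestEncryption": True,
        "AtRestEncryptionConfiguration": {
            "S3EncryptionConfiguration": {
                "EncryptionMode": "SSE-KMS",
                "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/default-key",
                "Overrides": [
                    {
                        "BucketName": "claims-bucket",
                        "EncryptionMode": "SSE-KMS",
                        "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/claims-key",
                    },
                    {
                        "BucketName": "billing-bucket",
                        "EncryptionMode": "SSE-KMS",
                        "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/billing-key",
                    },
                ],
            }
        },
    }
}

emr = boto3.client("emr")
emr.create_security_configuration(
    Name="per-bucket-s3-encryption",
    SecurityConfiguration=json.dumps(security_config),
)
```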
blubb
3 years, 6 months ago
Are BrainCert practice questions not real exam questions?
upvoted 1 times
...
...
rocky48
2 years, 9 months ago
@LRyan2020 Why are these questions being posted here? Are they sample questions or exam practice questions?
upvoted 1 times
...
...
LRyan2020
3 years, 6 months ago
Topic 2 - Question 56
A company currently processes real-time streaming data using Apache Kafka. The company is facing challenges with managing, setting up, and scaling during production, and wants to optimize the deployment of the Kafka brokers. The solution should be managed by AWS, secure, and require minimal changes to the current client code. Which solution meets these requirements?
A) Use Apache Zookeeper to scale Kafka installed on Amazon EC2 instances.
B) Use Amazon Managed Streaming for Kafka to scale the brokers.
C) Use Apache Zookeeper to scale the brokers of Amazon Managed Streaming for Kafka.
D) Scale the number of client machines, and use a single broker with Amazon Managed Streaming for Kafka.
upvoted 1 times
Umer24
3 years, 6 months ago
@LRyan2020: these are BrainCert practice questions, buddy!
Correct Answer: B. Use Amazon Managed Streaming for Kafka to scale the brokers.
Explanation: Correct answer is B as Amazon Managed Streaming for Apache Kafka is a fully managed, secure Kafka streaming solution with no operational overhead. Refer to the AWS documentation - Managed Streaming for Kafka: https://aws.amazon.com/msk/
Amazon MSK lets you focus on creating your streaming applications without having to worry about the operational overhead of managing your Apache Kafka environment. Amazon MSK manages the provisioning, configuration, and maintenance of Apache Kafka clusters and Apache ZooKeeper nodes for you. Amazon MSK also shows key Apache Kafka performance metrics in the AWS console.
Options A, C, and D are wrong as they are either not managed by AWS or would still need user involvement in scaling.
upvoted 3 times
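A minimal sketch of provisioning the managed cluster that answer B relies on. The Kafka version, subnet IDs, and security group ID are placeholders; existing Kafka clients keep working because MSK exposes standard Kafka bootstrap brokers.

```python
import boto3

msk = boto3.client("kafka")

response = msk.create_cluster(
    ClusterName="clickstream-msk",
    KafkaVersion="2.8.1",
    NumberOfBrokerNodes=3,  # one broker per AZ; scale by adding brokers later
    BrokerNodeGroupInfo={
        "InstanceType": "kafka.m5.large",
        "ClientSubnets": ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],
        "SecurityGroups": ["sg-0123456789abcdef0"],
    },
)
print(response["ClusterArn"])
```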
...
...
LRyan2020
3 years, 6 months ago
Question 55
An online retailer is planning to capture clickstream data from its ecommerce website and then use the data to drive a new custom-built recommendation engine that provides product recommendations to online users. The retailer will use Amazon Kinesis Data Streams to ingest the streaming data and Amazon Kinesis Data Analytics to perform SQL queries on the stream, using windowed queries to process the data that arrives at inconsistent intervals. What type of windowed query must be used to aggregate the data using time-based windows that open as data arrives?
A) Continuous queries
B) Tumbling window queries
C) Sliding window queries
D) Stagger window queries
upvoted 1 times
Umer24
3 years, 6 months ago
Correct Answer: D. Stagger window queries
Explanation: Correct answer is D as a stagger window is a windowing method suited for analyzing groups of data that arrive at inconsistent times. It is well suited for any time-series analytics use case, such as a set of related sales or log records. Refer to the AWS documentation - Kinesis Stagger Window Concepts.
Option A is wrong as a continuous query is a query over a stream that executes continuously over streaming data. This continuous execution enables scenarios such as the ability for applications to continuously query a stream and generate alerts.
Option B is wrong as tumbling window queries are suitable when a windowed query processes each window in a non-overlapping manner.
Option C is wrong as sliding window queries help define a time-based or row-based window, instead of grouping records using GROUP BY.
upvoted 4 times
Umer24
3 years, 6 months ago
https://docs.aws.amazon.com/kinesisanalytics/latest/dev/stagger-window-concepts.html
upvoted 2 times
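For concreteness, a sketch of what the stagger-window application SQL looks like, following the shape of the example in the linked documentation. The stream and column names are placeholders for the retailer's clickstream schema, shown here as a Python constant that would be pasted into the Kinesis Data Analytics SQL editor.

```python
# Application SQL for a Kinesis Data Analytics (SQL) application.
STAGGER_WINDOW_SQL = """
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
    ticker_symbol VARCHAR(4),
    event_time    TIMESTAMP,
    ticker_count  INTEGER);

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "DESTINATION_SQL_STREAM"
    SELECT STREAM ticker_symbol,
                  FLOOR(EVENT_TIME TO MINUTE) AS event_time,
                  COUNT(*) AS ticker_count
    FROM "SOURCE_SQL_STREAM_001"
    -- A stagger window opens when the first record for a partition key
    -- arrives, which suits data that arrives at inconsistent times.
    WINDOWED BY STAGGER (
        PARTITION BY FLOOR(EVENT_TIME TO MINUTE), ticker_symbol
        RANGE INTERVAL '1' MINUTE);
"""
```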
...
...
...
Umer24
3 years, 6 months ago
Question 54
A marketing company is storing its campaign response data in Amazon S3. A consistent set of sources has generated the data for each campaign. The data is saved into Amazon S3 as .csv files. A business analyst will use Amazon Athena to analyze each campaign's data. The company needs the cost of ongoing data analysis with Athena to be minimized. Which combination of actions should a data analytics specialist take to meet these requirements? (Select TWO.)
A. Convert the .csv files to Apache Parquet.
B. Convert the .csv files to Apache Avro.
C. Partition the data by campaign.
D. Partition the data by source.
E. Compress the .csv files.
upvoted 1 times
Roontha
3 years, 6 months ago
Answer: A, C
upvoted 4 times
...
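A sketch of the A + C combination (Parquet plus partitioning by campaign) expressed as a single Athena CTAS statement run through boto3. The database, table, column, and bucket names are hypothetical.

```python
import boto3

athena = boto3.client("athena")

# CTAS that rewrites the .csv table as Parquet, partitioned by campaign.
# Partition columns must come last in the SELECT list.
ctas = """
CREATE TABLE campaign_responses_parquet
WITH (
    format = 'PARQUET',
    external_location = 's3://marketing-analytics/parquet/',
    partitioned_by = ARRAY['campaign']
) AS
SELECT response_id, source, responded_at, campaign
FROM campaign_responses_csv;
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "marketing"},
    ResultConfiguration={"OutputLocation": "s3://marketing-analytics/athena-results/"},
)
```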
dkp
3 years, 5 months ago
E also works. https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
upvoted 1 times
...
sivajiboss
3 years, 5 months ago
A and C
upvoted 2 times
...
...