Exam AWS Certified Data Analytics - Specialty topic 1 question 101 discussion

A company wants to collect and process events data from different departments in near-real time. Before storing the data in Amazon S3, the company needs to clean the data by standardizing the format of the address and timestamp columns. The data varies in size based on the overall load at each particular point in time. A single data record can be 100 KB-10 MB.
How should a data analytics specialist design the solution for data ingestion?

  • A. Use Amazon Kinesis Data Streams. Configure a stream for the raw data. Use a Kinesis Agent to write data to the stream. Create an Amazon Kinesis Data Analytics application that reads data from the raw stream, cleanses it, and stores the output to Amazon S3.
  • B. Use Amazon Kinesis Data Firehose. Configure a Firehose delivery stream with a preprocessing AWS Lambda function for data cleansing. Use a Kinesis Agent to write data to the delivery stream. Configure Kinesis Data Firehose to deliver the data to Amazon S3.
  • C. Use Amazon Managed Streaming for Apache Kafka. Configure a topic for the raw data. Use a Kafka producer to write data to the topic. Create an application on Amazon EC2 that reads data from the topic by using the Apache Kafka consumer API, cleanses the data, and writes to Amazon S3.
  • D. Use Amazon Simple Queue Service (Amazon SQS). Configure an AWS Lambda function to read events from the SQS queue and upload the events to Amazon S3.
Suggested Answer: C
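
For context, here is a minimal sketch of the consumer application that option C describes, assuming the kafka-python and boto3 libraries; the topic, broker, and bucket names are hypothetical and the cleansing logic is a placeholder:

    import json
    import uuid
    from datetime import datetime, timezone

    import boto3
    from kafka import KafkaConsumer  # kafka-python

    s3 = boto3.client("s3")

    consumer = KafkaConsumer(
        "raw-events",                                # hypothetical topic
        bootstrap_servers=["b-1.msk.example:9092"],  # hypothetical brokers
        # Raise the per-partition fetch size so 10 MB records fit
        # (the broker-side message.max.bytes must be raised to match).
        max_partition_fetch_bytes=10 * 1024 * 1024,
        value_deserializer=json.loads,
    )

    def cleanse(event):
        # Placeholder: standardize the address and timestamp columns.
        event["address"] = " ".join(event.get("address", "").upper().split())
        if event.get("timestamp"):
            event["timestamp"] = (
                datetime.fromisoformat(event["timestamp"])
                .astimezone(timezone.utc)
                .isoformat()
            )
        return event

    for message in consumer:
        clean = cleanse(message.value)
        s3.put_object(
            Bucket="clean-events",                   # hypothetical bucket
            Key="events/{}.json".format(uuid.uuid4()),
            Body=json.dumps(clean).encode("utf-8"),
        )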

Comments

ariane_tateishi
Highly Voted 3 years, 6 months ago
C should be the right answer because of the main requirement: "A single data record can be 100 KB-10 MB."
 - Kinesis Data Firehose: the maximum size of a record sent to Kinesis Data Firehose, before base64 encoding, is 1,000 KiB.
 - Kinesis Data Streams: the maximum size of the data payload of a record, before base64 encoding, is 1 MB.
 - SQS: https://aws.amazon.com/pt/about-aws/whats-new/2015/10/now-send-payloads-up-to-2gb-with-amazon-sqs/
upvoted 26 times
lakediver
3 years, 4 months ago
I think the answer should be D. You rightly mentioned the Firehose and Streams limits above. However, MSK has a maximum record size of 8 MB - https://docs.aws.amazon.com/msk/latest/developerguide/limits.html. SQS now allows payloads up to 2 GB: using the Extended Client Library, message payloads larger than 256 KB are stored in an Amazon Simple Storage Service (S3) bucket, with SQS sending and receiving a reference to the payload location (a sketch of that pattern follows below).
upvoted 5 times
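For illustration, a hand-rolled sketch of the S3-pointer pattern that the SQS Extended Client Library implements (the official library is Java; this Python version, with its queue URL and bucket name, is hypothetical):

    import json
    import uuid

    import boto3

    s3 = boto3.client("s3")
    sqs = boto3.client("sqs")

    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/events"  # hypothetical
    BUCKET = "sqs-payload-overflow"                                        # hypothetical

    def send_event(payload: bytes) -> None:
        # Payloads over the 256 KB SQS limit are offloaded to S3 and only
        # a small pointer message travels through the queue.
        if len(payload) > 256 * 1024:
            key = "payloads/{}".format(uuid.uuid4())
            s3.put_object(Bucket=BUCKET, Key=key, Body=payload)
            body = json.dumps({"s3_bucket": BUCKET, "s3_key": key})
        else:
            body = payload.decode("utf-8")
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=body)

The consumer does the reverse: if a message body is a pointer, it fetches the real payload from S3 before processing.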
lucaseo90
3 years, 2 months ago
The Kafka record size limit depends on configuration and can be raised up to about 100 MB. I have experience setting this configuration for sending encoded images through Kafka.
upvoted 3 times
nadavw
2 years, 3 months ago
The 8 MB limit for MSK mentioned here applies to MSK Serverless only.
upvoted 5 times
khchan123
2 years, 10 months ago
It should be D. The maximum message size for Amazon MSK is 8 MB. https://docs.aws.amazon.com/msk/latest/developerguide/limits.html
upvoted 2 times
Monika14Sharma
Highly Voted 3 years, 6 months ago
It should be C. 1 MB is the soft limit for Kafka, which can be increased up to 10 MB. That's the main difference between KDA and Kafka. A sketch of the MSK configuration change is below.
upvoted 8 times
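For what it's worth, a sketch of raising that broker-side limit on an MSK cluster through a custom configuration via boto3 (the configuration name is hypothetical):

    import boto3

    kafka = boto3.client("kafka")

    kafka.create_configuration(
        Name="large-records",  # hypothetical name
        Description="Allow records up to 10 MB",
        ServerProperties=b"message.max.bytes=10485760\n"
                         b"replica.fetch.max.bytes=10485760\n",
    )

Producers also need max.request.size raised client-side, and consumers need max.partition.fetch.bytes, for the larger records to flow end to end.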
MLCL
Most Recent 1 year, 8 months ago
Selected Answer: C
MSK is the only one that can handle a single put of 10 MB. Firehose can buffer up to 128 MB of data, but it only catches single records of up to 3 MB in that window and sends them in a batch; you can't put a record larger than 3 MB.
upvoted 1 times
MLCL
1 year, 8 months ago
But answer C really needs a lot of setup; if the question were about 1 MB records, it would be Firehose immediately (a sketch of the Firehose preprocessing Lambda follows).
upvoted 1 times
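For reference, a sketch of the kind of preprocessing Lambda option B describes, following the standard Firehose transformation record contract (recordId / result / base64-encoded data); the cleansing logic itself is a placeholder:

    import base64
    import json

    def handler(event, context):
        output = []
        for record in event["records"]:
            payload = json.loads(base64.b64decode(record["data"]))
            # Placeholder: standardize the address column.
            payload["address"] = " ".join(payload.get("address", "").upper().split())
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",  # or "Dropped" / "ProcessingFailed"
                "data": base64.b64encode(
                    json.dumps(payload).encode("utf-8")
                ).decode("utf-8"),
            })
        return {"records": output}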
pk349
1 year, 11 months ago
C: I passed the test
upvoted 3 times
cloudlearnerhere
2 years, 5 months ago
Correct answer is C, as only Amazon Managed Streaming for Apache Kafka provides a maximum record size that can be configured up to 10 MB.

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that enables you to build and run applications that use Apache Kafka to process streaming data. Amazon MSK provides the control-plane operations, such as those for creating, updating, and deleting clusters. It lets you use Apache Kafka data-plane operations, such as those for producing and consuming data. It runs open-source versions of Apache Kafka, which means existing applications, tooling, and plugins from partners and the Apache Kafka community are supported without requiring changes to application code.

Options A & B are wrong, as the maximum size of a record sent to both Kinesis Data Streams and Kinesis Data Firehose, before base64 encoding, is 1,000 KiB. Option D is wrong, as the SQS data size limit is 256 KB.
upvoted 5 times
MultiCloudIronMan
2 years, 6 months ago
Selected Answer: D
Kafka can handle the limit; Firehose is 8 MB max.
upvoted 1 times
MultiCloudIronMan
2 years, 6 months ago
Sorry, I meant C, not D.
upvoted 1 times
rocky48
2 years, 9 months ago
Selected Answer: C
upvoted 3 times
GarfieldBin
2 years, 10 months ago
Selected Answer: D
KDS, KDF, and MSK all have record/message limits lower than 10 MB. https://aws.amazon.com/kinesis/data-streams/faqs/, https://aws.amazon.com/kinesis/data-streams/faqs/, https://docs.aws.amazon.com/msk/latest/developerguide/limits.html. SQS is recommended for "using the ability of Amazon SQS to scale transparently. For example, you buffer requests and the load changes as a result of occasional load spikes or the natural growth of your business. Because each buffered request can be processed independently, Amazon SQS can scale transparently to handle the load without any provisioning instructions from you.": https://aws.amazon.com/kinesis/data-streams/faqs/
upvoted 1 times
certificationJunkie
2 years, 11 months ago
Although it looks unnecessary to use EC2 to clean up the data, C is still the right answer. Both Kinesis Data Streams and Firehose have an upper limit of 1 MB on record size. Ideally, instead of using EC2, I would prefer to use something like Kafka Streams to perform the required transformation.
upvoted 1 times
[Removed]
2 years, 11 months ago
C. The maximum size of a record sent to Kinesis Data Firehose, before base64 encoding, is 1,000 KiB.
upvoted 1 times
jrheen
2 years, 11 months ago
Answer - C
upvoted 1 times
gaindas
3 years, 2 months ago
Selected Answer: C
C. For AWS Lambda processing, you can set a buffering hint between 1 MiB and 3 MiB using the BufferSizeInMBs processor parameter (see the sketch below).
upvoted 2 times
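A sketch of where that BufferSizeInMBs hint is set, via the boto3 create_delivery_stream call; the ARNs and names are hypothetical:

    import boto3

    firehose = boto3.client("firehose")

    firehose.create_delivery_stream(
        DeliveryStreamName="events-clean",
        ExtendedS3DestinationConfiguration={
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-role",
            "BucketARN": "arn:aws:s3:::clean-events",
            "ProcessingConfiguration": {
                "Enabled": True,
                "Processors": [{
                    "Type": "Lambda",
                    "Parameters": [
                        {"ParameterName": "LambdaArn",
                         "ParameterValue": "arn:aws:lambda:us-east-1:"
                                           "123456789012:function:cleanse"},
                        {"ParameterName": "BufferSizeInMBs",
                         "ParameterValue": "3"},
                    ],
                }],
            },
        },
    )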
npt
3 years, 4 months ago
C - it cleanses the data and writes to Amazon S3; D does not mention cleaning the data.
upvoted 1 times
arun004
3 years, 4 months ago
Why not B? The buffer size can be set up to 128 MB.
upvoted 1 times
Dr_Kiko
3 years, 5 months ago
Between B and C, I chose C because the maximum size of a record sent to Kinesis Data Firehose, before base64 encoding, is 1,000 KiB.
upvoted 1 times
tukai
3 years, 5 months ago
Why not B? The question mentions near-real time, and the Firehose quota can be increased using the Amazon Kinesis Data Firehose limits form.
upvoted 2 times
allanm
2 years, 5 months ago
Kinesis Firehose - the maximum size of a record sent to Kinesis Data Firehose, before base64 encoding, is 1,000 KiB.
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other