exam questions

Exam AWS Certified Data Engineer - Associate DEA-C01 All Questions

View all questions & answers for the AWS Certified Data Engineer - Associate DEA-C01 exam

Exam AWS Certified Data Engineer - Associate DEA-C01 topic 1 question 118 discussion

A banking company uses an application to collect large volumes of transactional data. The company uses Amazon Kinesis Data Streams for real-time analytics. The company’s application uses the PutRecord action to send data to Kinesis Data Streams.

A data engineer has observed network outages during certain times of day. The data engineer wants to configure exactly-once delivery for the entire processing pipeline.

Which solution will meet this requirement?

  • A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source.
  • B. Update the checkpoint configuration of the Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) data collection application to avoid duplicate processing of events.
  • C. Design the data source so events are not ingested into Kinesis Data Streams multiple times.
  • D. Stop using Kinesis Data Streams. Use Amazon EMR instead. Use Apache Flink and Apache Spark Streaming in Amazon EMR.
Show Suggested Answer Hide Answer
Suggested Answer: A 🗳️

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
Ramdi1
1 week, 1 day ago
Selected Answer: A
Amazon Kinesis Data Streams does not provide exactly-once delivery natively. It ensures at-least-once delivery, meaning that under certain conditions (e.g., network failures, retries), duplicate records can occur. To achieve exactly-once processing, deduplication must be handled at the application level.
upvoted 1 times
...
PashoQ
6 months ago
Selected Answer: A
A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source.
upvoted 1 times
...
Ja13
8 months, 2 weeks ago
Selected Answer: A
A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source. Explanation: Exactly-Once Delivery: Ensuring exactly-once delivery is a challenge in distributed systems, especially in the presence of network outages and retries. By embedding a unique ID in each record at the source, you can track and identify duplicate records during processing. This approach allows you to implement idempotent processing, where duplicate records can be detected and discarded, ensuring that each record is processed exactly once. De-duplication Logic: Implementing de-duplication logic based on unique IDs ensures that even if the same record is ingested multiple times due to retries or network issues, it will be processed only once by the downstream applications.
upvoted 2 times
...
bakarys
8 months, 2 weeks ago
Selected Answer: A
A. Design the application so it can remove duplicates during processing by embedding a unique ID in each record at the source. This approach ensures that even if a record is sent more than once due to network outages or other issues, it will only be processed once because the unique ID can be used to identify and remove any duplicates. This is a common pattern for achieving exactly-once processing semantics in distributed systems. The other options do not guarantee exactly-once delivery across the entire pipeline. Option B is partially correct but it only avoids duplicate processing within the Amazon Managed Service for Apache Flink, not across the entire pipeline. Option C is not always feasible because network issues and other factors can lead to events being ingested into Kinesis Data Streams multiple times. Option D involves changing the entire technology stack, which is not necessary to achieve the desired outcome and could introduce additional complexity and cost.
upvoted 3 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...
exam
Someone Bought Contributor Access for:
SY0-701
London, 1 minute ago