exam questions

Exam Certified Data Engineer Professional All Questions

View all questions & answers for the Certified Data Engineer Professional exam

Exam Certified Data Engineer Professional topic 1 question 201 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 201
Topic #: 1
[All Certified Data Engineer Professional Questions]

An upstream source writes Parquet data as hourly batches to directories named with the current date. A nightly batch job runs the following code to ingest all data from the previous day as indicated by the date variable:



Assume that the fields customer_id and order_id serve as a composite key to uniquely identify each order.

If the upstream system is known to occasionally produce duplicate entries for a single order hours apart, which statement is correct?

  • A. Each write to the orders table will only contain unique records, and only those records without duplicates in the target table will be written.
  • B. Each write to the orders table will only contain unique records, but newly written records may have duplicates already present in the target table.
  • C. Each write to the orders table will only contain unique records; if existing records with the same key are present in the target table, these records will be overwritten.
  • D. Each write to the orders table will run deduplication over the union of new and existing records, ensuring no duplicate records are present.
Show Suggested Answer Hide Answer
Suggested Answer: B 🗳️

Comments

Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.
Switch to a voting comment New
_lene_
3 weeks, 1 day ago
Selected Answer: B
No intra-batch duplicates, can be inter-batch duplicates
upvoted 1 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...
exam
Someone Bought Contributor Access for:
SY0-701
London, 1 minute ago