Exam Certified Data Engineer Associate topic 1 question 32 discussion

Actual exam question from Databricks's Certified Data Engineer Associate
Question #: 32
Topic #: 1

A dataset has been defined using Delta Live Tables and includes an expectations clause:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW
What is the expected behavior when a batch of data containing data that violates these constraints is processed?

  • A. Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.
  • B. Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.
  • C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
  • D. Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
  • E. Records that violate the expectation cause the job to fail.
Suggested Answer: C

Comments

XiltroX
Highly Voted 1 year, 7 months ago
Selected Answer: C
I am simply appalled by the number of wrong answers in this series of questions. The statement in the question already says "ON VIOLATION DROP ROW", which means that if the condition is violated, nothing is saved to a quarantine table and the invalid entries are recorded in the event log. All data that doesn't meet the condition is dropped. So C is the correct answer.
upvoted 17 times
...
rafahb
Highly Voted 1 year, 7 months ago
Selected Answer: C
C is correct
upvoted 5 times
...
806e7d2
Most Recent 2 days, 23 hours ago
Selected Answer: C
In Delta Live Tables, expectations are used to enforce data quality rules. In this specific case, the expectation is that the timestamp column should be greater than '2020-01-01'. When a batch of data is processed, if a record violates this expectation, the following happens:
  • Drop the violating rows: The rows that don't meet the expectation (timestamp > '2020-01-01') will be dropped from the dataset.
  • Logging of the violation: The fact that these rows were dropped due to the violation will be recorded in the event log for audit and tracking purposes.
This ensures that only valid data (according to the expectation) is loaded into the final dataset, while invalid data is tracked.
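To make the drop-and-log behaviour concrete, here is a rough plain-Python simulation. It is not Delta Live Tables code: the function `apply_drop_row`, the record layout, and the metric names `passed_records` / `dropped_records` are assumptions made up for illustration, and the real event log may be structured differently.

```python
from datetime import date


def apply_drop_row(records, cutoff=date(2020, 1, 1)):
    """Simulate EXPECT (timestamp > cutoff) ON VIOLATION DROP ROW.

    Hypothetical helper: returns the rows that would reach the target
    dataset plus a small dict standing in for an event log entry.
    """
    # Keep only rows that satisfy the expectation (strictly greater than cutoff).
    kept = [r for r in records if r["timestamp"] > cutoff]
    # The violating rows are not written anywhere; only counts are recorded.
    metrics = {
        "passed_records": len(kept),
        "dropped_records": len(records) - len(kept),
    }
    return kept, metrics


batch = [
    {"id": 1, "timestamp": date(2021, 5, 3)},
    {"id": 2, "timestamp": date(2019, 12, 31)},
    {"id": 3, "timestamp": date(2020, 1, 1)},  # equal to the cutoff, so it fails ">"
]

target, event_log_entry = apply_drop_row(batch)
print(target)
print(event_log_entry)
```

Note that the row dated exactly 2020-01-01 is dropped as well, because the expectation uses a strict `>` comparison.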
upvoted 1 times
...
Stefan94
2 months ago
Selected Answer: C
100% C
upvoted 1 times
...
gdc.moser
2 months, 3 weeks ago
Selected Answer: C
C is the correct answer.
upvoted 1 times
...
3fbc31b
4 months, 2 weeks ago
Selected Answer: C
C is the correct answer. The DROP ROW clause will cause them to NOT be added to the destination; only marked in the log.
upvoted 1 times
...
benni_ale
6 months, 3 weeks ago
Selected Answer: C
C is correct
upvoted 1 times
...
SerGrey
10 months, 2 weeks ago
Selected Answer: C
C is correct
upvoted 1 times
...
Garyn
10 months, 3 weeks ago
Selected Answer: C
C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log. Explanation: The defined expectation specifies that if the timestamp is not greater than '2020-01-01', the row will be considered in violation of the constraint. The ON VIOLATION DROP ROW clause states that rows that violate the constraint will be dropped from the target dataset. Additionally, the expectation clause will log these violations in the event log, indicating which records did not meet the specified constraint criteria. This behavior ensures that the rows failing the defined constraint are not included in the target dataset and are logged as invalid in the event log for reference or further investigation, maintaining data integrity within the dataset based on the specified constraints.
upvoted 3 times
...
Huroye
1 year ago
Who chooses these answers? The correct answer is C. The record is dropped. This is not about the default behavior. It is explicit.
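The difference between the default and the explicit policies can be sketched in plain Python. This is a simulation under stated assumptions, not the DLT API: `process` and the policy names `"warn"`, `"drop"`, and `"fail"` are hypothetical stand-ins for omitting ON VIOLATION, ON VIOLATION DROP ROW, and ON VIOLATION FAIL UPDATE respectively.

```python
from datetime import date

CUTOFF = date(2020, 1, 1)


def process(records, on_violation="warn"):
    """Simulate how an expectation treats violating rows under each policy."""
    if on_violation not in ("warn", "drop", "fail"):
        raise ValueError(f"unknown policy: {on_violation}")

    valid = [r for r in records if r["timestamp"] > CUTOFF]
    failed = len(records) - len(valid)

    # "fail": any violation aborts the update, nothing is written.
    if on_violation == "fail" and failed:
        raise ValueError(f"{failed} record(s) violated valid_timestamp")

    # "drop": only valid rows are written. "warn" (the default): all rows are written.
    written = valid if on_violation == "drop" else list(records)

    # In both non-failing cases the violation count is recorded.
    return written, {"failed_records": failed}


batch = [
    {"id": 1, "timestamp": date(2021, 5, 3)},
    {"id": 2, "timestamp": date(2019, 12, 31)},
]

warn_rows, warn_log = process(batch, "warn")  # option D describes this case
drop_rows, drop_log = process(batch, "drop")  # option C describes this case
print(len(warn_rows), warn_log)
print(len(drop_rows), drop_log)
```

Since the clause in the question spells out DROP ROW, the second call is the one that matches it.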
upvoted 1 times
...
DavidRou
1 year ago
Selected Answer: C
Right answer: C. Invalid rows will be dropped as requested by the constraint and flagged as such in the event log. If you need a quarantine table, you'll have to write more code.
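As a rough idea of what that extra code has to do, here is a plain-Python sketch that routes violating rows to a separate collection instead of discarding them. It is only an illustration of the idea, not a DLT pipeline: `is_valid`, `split_batch`, and the `quarantine` list are hypothetical names.

```python
from datetime import date


def is_valid(record, cutoff=date(2020, 1, 1)):
    """Same rule as the expectation: timestamp must be strictly after the cutoff."""
    return record["timestamp"] > cutoff


def split_batch(records):
    """Send valid rows to the target and violating rows to a quarantine list."""
    target, quarantine = [], []
    for record in records:
        if is_valid(record):
            target.append(record)
        else:
            quarantine.append(record)
    return target, quarantine


batch = [
    {"id": 1, "timestamp": date(2021, 5, 3)},
    {"id": 2, "timestamp": date(2019, 12, 31)},
    {"id": 3, "timestamp": date(2022, 8, 15)},
]

target_rows, quarantine_rows = split_batch(batch)
print(target_rows)
print(quarantine_rows)
```

DROP ROW on its own only gives you the first of the two outputs; the second one is what option A wrongly assumes you get for free.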
upvoted 1 times
...
vctrhugo
1 year, 2 months ago
Selected Answer: C
C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log. With the defined constraint and expectation clause, when a batch of data is processed, any records that violate the expectation (in this case, where the timestamp is not greater than '2020-01-01') will be dropped from the target dataset. These dropped records will also be recorded as invalid in the event log, allowing for auditing and tracking of the data quality issues without causing the entire job to fail.
upvoted 2 times
...
AndreFR
1 year, 3 months ago
Selected Answer: C
https://docs.databricks.com/en/delta-live-tables/expectations.html
upvoted 2 times
...
Atnafu
1 year, 4 months ago
C. When a batch of data is processed in Delta Live Tables and contains data that violates the defined expectations or constraints, the expected behavior is that the records violating the expectation are dropped from the target dataset. Additionally, these violating records are recorded as invalid in the event log.
upvoted 1 times
...
mehroosali
1 year, 4 months ago
Selected Answer: C
C is correct
upvoted 1 times
...
SHINGX
1 year, 7 months ago
B is correct. This question is number 35 on the practice test on the Databricks Partner Academy: https://partner-academy.databricks.com/ The correct answer there is "Records that violate the expectation are added to the target dataset and recorded as invalid in the event log".
upvoted 2 times
SHINGX
1 year, 7 months ago
Sorry, D
upvoted 1 times
SHINGX
1 year, 7 months ago
I was wrong, the ON VIOLATION DROP ROW makes C the correct answer
upvoted 5 times
...
...
...
surrabhi_4
1 year, 7 months ago
Selected Answer: C
option C
upvoted 4 times
...