Exam Certified Data Engineer Professional topic 1 question 150 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 150
Topic #: 1

A nightly job ingests data into a Delta Lake table using the following code:



The next step in the pipeline requires a function that returns an object that can be used to manipulate new records that have not yet been processed to the next table in the pipeline.

Which code snippet completes this function definition?

def new_records():

  • A. return spark.readStream.table("bronze")
  • B. return spark.read.option("readChangeFeed", "true").table("bronze")
  • C.
  • D.
Suggested Answer: B

Comments

Freyr
Highly Voted 5 months, 3 weeks ago
Selected Answer: B
Correct answer: B. The Change Data Feed (CDF) feature in Delta Lake enables reading only the row-level changes (inserts, updates, and deletes) made to a Delta table. This would allow the function to focus on new or modified data since the last trigger, making it a good fit for processing only the records that have not been processed yet. This directly meets the requirement of identifying and manipulating new records efficiently.
upvoted 6 times
practicioner
3 months, 1 week ago
We are ingesting Parquet data from a folder into the bronze table. It doesn't make any sense to use the CDF feature for a bronze table :)
upvoted 1 times
practicioner
3 months, 1 week ago
I've changed my mind. Yes, B looks like the correct answer.
upvoted 2 times
m79590530
Most Recent 1 month ago
Selected Answer: A
The correct answer is A: the job writes in append-only mode, which is ideal for a simple Structured Streaming read as the next step ;)
upvoted 2 times
shaojunni
1 month, 1 week ago
Selected Answer: A
A Delta table returns new records when it is read as a stream.
upvoted 2 times
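
To make the streaming argument concrete, here is a minimal plain-Python sketch of the idea. It is not Delta Lake or Spark code. It assumes that a streaming query keeps a checkpoint of which commits it has already consumed, so each run returns only rows appended since the previous run. The names `AppendOnlyTable` and `CheckpointedReader` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyTable:
    """Toy stand-in for an append-only table: one list of rows per commit."""
    commits: list = field(default_factory=list)

    def append(self, rows):
        # Each call models one committed batch (one new table version).
        self.commits.append(list(rows))


@dataclass
class CheckpointedReader:
    """Toy stand-in for a streaming query's checkpoint on one source table."""
    table: AppendOnlyTable
    next_version: int = 0  # index of the first commit not yet processed

    def new_records(self):
        # Return rows from commits added since the last call, then advance
        # the checkpoint so the same rows are not returned twice.
        pending = self.table.commits[self.next_version:]
        self.next_version = len(self.table.commits)
        return [row for commit in pending for row in commit]


bronze = AppendOnlyTable()
reader = CheckpointedReader(bronze)

bronze.append([{"id": 1}, {"id": 2}])  # first nightly load
first_run = reader.new_records()

bronze.append([{"id": 3}])             # second nightly load
second_run = reader.new_records()
third_run = reader.new_records()       # nothing new was appended
```

In real Structured Streaming the checkpoint belongs to the query that writes the stream (its `checkpointLocation`), not to the function that returns the streaming DataFrame.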
pk07
1 month, 4 weeks ago
Selected Answer: B
B. Set the skipChangeCommits flag to true on raw_iot.

Let's break down the requirements and explain why this is the best solution:

Retain manually deleted or updated records in raw_iot: The skipChangeCommits flag, when set to true, tells Delta Live Tables (DLT) to ignore any manual changes (updates or deletes) made to the table outside of the pipeline. This means that even if records are manually deleted or updated in the raw_iot table, these changes won't be reflected in the table when the pipeline runs again.

Recompute downstream bpm_stats table: By default, DLT will recompute downstream tables when their upstream dependencies change. Since bpm_stats is based on raw_iot, it will naturally be recomputed when the pipeline updates, without any special configuration.

Why the other options are not correct:

A. Setting pipelines.reset.allowed to false on raw_iot would prevent the table from being reset, but it wouldn't address the requirement to retain manually deleted or updated records.
upvoted 1 times
shaojunni
2 months ago
Selected Answer: D
You have to know the CDF's current version and the last processed version in order to get the unprocessed records. B does not provide those versions. It will just return the content of the bronze table with CDF turned on. D is the only possible solution.
upvoted 1 times
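
The version-tracking point above can be sketched as follows. This is a hedged example, assuming the change data feed is enabled on the table and that the caller stores the last processed version somewhere between runs. The helper name `read_changes_since` and its parameters are hypothetical. `readChangeFeed` and `startingVersion` are Delta reader options, and `spark` is passed in so the snippet does not create a session itself.

```python
def read_changes_since(spark, table_name, last_processed_version):
    """Batch-read change feed rows committed after last_processed_version.

    `spark` is an existing SparkSession. How `last_processed_version` is
    persisted between runs is left to the caller (an assumption here).
    """
    return (
        spark.read
        .option("readChangeFeed", "true")
        # Start at the first commit that has not been processed yet.
        .option("startingVersion", last_processed_version + 1)
        .table(table_name)
    )


# Hypothetical usage inside a job that already has a SparkSession:
# changes = read_changes_since(spark, "bronze", last_processed_version=41)
```

The sketch does not handle the case where no commit exists after `last_processed_version`.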
HelixAbdu
4 months ago
I did not test it, but I think D is wrong, as it is filtering against a directory path using ==.
upvoted 2 times
MDWPartners
5 months, 4 weeks ago
Selected Answer: D
Seems D
upvoted 4 times
Community vote distribution
A (35%)
C (25%)
B (20%)
Other