Exam Certified Data Engineer Professional topic 1 question 20 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 20
Topic #: 1

A data architect has designed a system in which two Structured Streaming jobs will concurrently write to a single bronze Delta table. Each job is subscribing to a different topic from an Apache Kafka source, but they will write data with the same schema. To keep the directory structure simple, a data engineer has decided to nest a checkpoint directory to be shared by both streams.
The proposed directory structure is displayed below:

Which statement describes whether this checkpoint directory structure is valid for the given scenario and why?

  • A. No; Delta Lake manages streaming checkpoints in the transaction log.
  • B. Yes; both of the streams can share a single checkpoint directory.
  • C. No; only one stream can write to a Delta Lake table.
  • D. Yes; Delta Lake supports infinite concurrent writers.
  • E. No; each of the streams needs to have its own checkpoint directory.
Suggested Answer: E
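A minimal sketch of what answer E implies in practice, assuming PySpark with the Kafka and Delta connectors available. The helper names (`checkpoint_dirs`, `start_bronze_stream`), the `/checkpoints/bronze` root, and the topic names are hypothetical and not part of the question; only the Spark reader and writer calls are standard API.

```python
from posixpath import join


def checkpoint_dirs(checkpoint_root, stream_names):
    """Return one distinct checkpoint directory per stream (hypothetical layout)."""
    if len(set(stream_names)) != len(stream_names):
        raise ValueError("stream names must be unique")
    return {name: join(checkpoint_root, name) for name in stream_names}


def start_bronze_stream(spark, bootstrap_servers, topic, table_path, checkpoint_dir):
    """Start one Kafka-to-Delta stream that owns its checkpoint directory."""
    source = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )
    return (
        source.writeStream.format("delta")
        .outputMode("append")
        # Each query gets its own location; two running queries never share one.
        .option("checkpointLocation", checkpoint_dir)
        .start(table_path)
    )


# Assumed names: two topics, one checkpoint directory each.
dirs = checkpoint_dirs("/checkpoints/bronze", ["topic_a", "topic_b"])
```

With a live `spark` session, each topic would be started through `start_bronze_stream(spark, servers, topic, "/bronze", dirs[topic])`, so both queries append to the same table while tracking their Kafka offsets separately.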

Comments

thxsgod
Highly Voted 1 year, 2 months ago
Selected Answer: E
Correct, E. Source: https://docs.databricks.com/en/optimizations/isolation-level.html#:~:text=If%20a%20streaming%20query%20using%20the%20same%20checkpoint%20location%20is%20started%20multiple%20times%20concurrently%20and%20tries%20to%20write%20to%20the%20Delta%20table%20at%20the%20same%20time.%20You%20should%20never%20have%20two%20streaming%20queries%20use%20the%20same%20checkpoint%20location%20and%20run%20at%20the%20same%20time.
upvoted 8 times
...
benni_ale
Most Recent 1 month, 1 week ago
Selected Answer: E
E is the correct answer.
upvoted 1 times
...
imatheushenrique
5 months, 3 weeks ago
E. No; each of the streams needs to have its own checkpoint directory. Checkpoint directories map one-to-one to streaming queries.
upvoted 1 times
...
svik
6 months, 2 weeks ago
Selected Answer: B
It is not clear from the question whether year_week=2020_01 and year_week=2020_02 are used by stream 1 and stream 2 respectively. If the streams use a common parent checkpoint directory with individual subfolders for checkpointing, that should work fine. In that case the answer should be B.
upvoted 1 times
Kill9
5 months ago
Those are table partitions. They are not used to build the checkpoint address. The address ends at /bronze.
upvoted 1 times
...
...
Jay_98_11
10 months, 2 weeks ago
Selected Answer: E
correct E
upvoted 1 times
...
kz_data
10 months, 2 weeks ago
Selected Answer: E
E is correct
upvoted 1 times
...
sturcu
1 year, 1 month ago
E is correct. If the user wants a single checkpoint directory, they need to union the streams before writing.
upvoted 2 times
...
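The union approach mentioned in the comment above could look roughly like the sketch below, assuming PySpark with the Kafka and Delta connectors. The function names `union_all` and `start_unioned_stream` and all parameters are hypothetical; the point is that a single query, and therefore a single checkpoint, covers both topics.

```python
from functools import reduce


def union_all(frames):
    """Union a non-empty list of DataFrames that share a schema."""
    return reduce(lambda left, right: left.union(right), frames)


def start_unioned_stream(spark, bootstrap_servers, topics, table_path, checkpoint_dir):
    """Read each topic as its own stream, union them, and write with one checkpoint."""
    sources = [
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
        for topic in topics
    ]
    return (
        union_all(sources)
        .writeStream.format("delta")
        .outputMode("append")
        # One query, so one checkpoint directory is enough.
        .option("checkpointLocation", checkpoint_dir)
        .start(table_path)
    )
```

The Kafka source's `subscribe` option also accepts a comma-separated list of topics, which may be a simpler way to get one query over both topics when no per-topic handling is needed.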
Eertyy
1 year, 2 months ago
answer is correct
upvoted 3 times
...