Exam Certified Data Engineer Professional topic 1 question 107 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 107
Topic #: 1

Which statement describes Delta Lake optimized writes?

  • A. Before a Jobs cluster terminates, OPTIMIZE is executed on all tables modified during the most recent job.
  • B. An asynchronous job runs after the write completes to detect if files could be further compacted; if yes, an OPTIMIZE job is executed toward a default of 1 GB.
  • C. Data is queued in a messaging bus instead of committing data directly to memory; all data is committed from the messaging bus in one batch once the job is complete.
  • D. Optimized writes use logical partitions instead of directory partitions; because partition boundaries are only represented in metadata, fewer small files are written.
  • E. A shuffle occurs prior to writing to try to group similar data together resulting in fewer files instead of each executor writing multiple files based on directory partitions.
Suggested Answer: E
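
To see why option E leads to fewer files, here is a toy simulation in plain Python (not Spark code). It assumes a deliberately simplified model: without a shuffle, every task writes one file per directory partition it holds rows for, and with a shuffle, all rows for a partition key are routed to a single writer. The function names and sample dates are made up for illustration, and a real optimized write may still split a large partition into several files.

```python
import zlib


def count_files_direct_write(tasks):
    """Each task writes one file per partition key it holds rows for."""
    return len({(task_id, key) for task_id, keys in enumerate(tasks) for key in keys})


def count_files_shuffled_write(tasks, num_writers):
    """Rows are shuffled by partition key first, so each key has one writer."""

    def writer_for(key):
        # Deterministic stand-in for a hash partitioner (an assumption).
        return zlib.crc32(key.encode("utf-8")) % num_writers

    return len({(writer_for(key), key) for keys in tasks for key in keys})


# Hypothetical input: 4 tasks, each holding rows for the same 3 date partitions.
dates = ["2024-01-01", "2024-01-02", "2024-01-03"]
tasks = [list(dates) for _ in range(4)]

direct_files = count_files_direct_write(tasks)
shuffled_files = count_files_shuffled_write(tasks, num_writers=4)
print(direct_files, shuffled_files)  # 12 3
```

In this model the direct write produces tasks × partitions files, while the shuffled write produces one file per partition. The price is the extra shuffle step before the write.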

Comments

vctrhugo
9 months, 3 weeks ago
Selected Answer: E
Optimized writes improve file size as data is written and benefit subsequent reads on the table. Optimized writes are most effective for partitioned tables, as they reduce the number of small files written to each partition. Writing fewer large files is more efficient than writing many small files, but you might still see an increase in write latency because data is shuffled before being written. https://learn.microsoft.com/en-us/azure/databricks/delta/tune-file-size#--optimized-writes-for-delta-lake-on-azure-databricks
upvoted 1 time
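For anyone who wants to try this, here is a minimal sketch of turning optimized writes on from PySpark. It assumes the setting names `spark.databricks.delta.optimizeWrite.enabled` (session level) and `delta.autoOptimize.optimizeWrite` (table property) as I recall them from the Databricks docs linked above, so check them against your runtime version. The helper name `enable_optimized_writes` and the table name in the usage note are hypothetical.

```python
def enable_optimized_writes(spark, table_name=None):
    """Enable optimized writes for the session, or for one table if given.

    `spark` is expected to be an active SparkSession on a Delta-enabled
    cluster. The setting names below are assumptions to verify in the docs.
    """
    if table_name is None:
        # Session-level setting: applies to writes made from this session.
        spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
    else:
        # Table-level property: applies to writes into this table.
        spark.sql(
            f"ALTER TABLE {table_name} "
            "SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true)"
        )
```

Usage would look like `enable_optimized_writes(spark)` for the whole session, or `enable_optimized_writes(spark, "sales_partitioned")` for a single (hypothetical) table.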
lexaneon
10 months, 3 weeks ago
Selected Answer: E
https://docs.databricks.com/en/delta/tune-file-size.html#optimized-writes
upvoted 3 times
alexvno
11 months, 1 week ago
Selected Answer: E
Optimized writes are most effective for partitioned tables, as they reduce the number of small files written to each partition. Writing fewer large files is more efficient than writing many small files, but you might still see an increase in write latency because data is shuffled before being written.
upvoted 3 times