Exam Certified Data Engineer Professional topic 1 question 128 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 128
Topic #: 1

The data engineering team has configured a job to process customer requests to be forgotten (have their data deleted). All user data that needs to be deleted is stored in Delta Lake tables using default table settings.

The team has decided to process all deletions from the previous week as a batch job at 1am each Sunday. The total duration of this job is less than one hour. Every Monday at 3am, a batch job executes a series of VACUUM commands on all Delta Lake tables throughout the organization.

The compliance officer has recently learned about Delta Lake's time travel functionality. They are concerned that this might allow continued access to deleted data.

Assuming all delete logic is correctly implemented, which statement correctly addresses this concern?

  • A. Because the VACUUM command permanently deletes all files containing deleted records, deleted records may be accessible with time travel for around 24 hours.
  • B. Because the default data retention threshold is 24 hours, data files containing deleted records will be retained until the VACUUM job is run the following day.
  • C. Because the default data retention threshold is 7 days, data files containing deleted records will be retained until the VACUUM job is run 8 days later.
  • D. Because Delta Lake's delete statements have ACID guarantees, deleted records will be permanently purged from all storage systems as soon as a delete job completes.
Suggested Answer: C
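
The timing behind answer C can be checked with simple date arithmetic. The sketch below is only an illustration: it assumes the default 7-day deleted-file retention, picks hypothetical calendar dates (2024-01-07 is a Sunday), and uses a made-up helper name, `first_effective_vacuum`. It does not call Delta Lake itself.

```python
from datetime import datetime, timedelta

# Assumption: the default delta.deletedFileRetentionDuration of 7 days.
RETENTION = timedelta(days=7)


def first_effective_vacuum(delete_finished: datetime,
                           first_vacuum: datetime,
                           vacuum_interval: timedelta = timedelta(days=7)) -> datetime:
    """Return the first scheduled VACUUM run that can remove the data files
    invalidated by a delete that finished at `delete_finished`."""
    # Files only become eligible for removal once the retention window has passed.
    eligible_at = delete_finished + RETENTION
    run = first_vacuum
    while run < eligible_at:
        run += vacuum_interval
    return run


# Hypothetical dates for illustration only.
delete_started = datetime(2024, 1, 7, 1, 0)    # Sunday 1am
delete_finished = datetime(2024, 1, 7, 2, 0)   # the job takes less than one hour
first_vacuum = datetime(2024, 1, 8, 3, 0)      # Monday 3am, the day after the delete

purged_at = first_effective_vacuum(delete_finished, first_vacuum)
print(purged_at)                           # 2024-01-15 03:00:00
print((purged_at - delete_started).days)   # 8
```

The Monday VACUUM right after the delete skips the files because they are only about one day old. The run one week later is the first that finds them past the 7-day threshold, which is where the "8 days later" in option C comes from.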

Comments

cales
1 month, 1 week ago
Selected Answer: C
The answer is C: by default, VACUUM retains data files that are no longer referenced by the current table version for 7 days. https://docs.databricks.com/en/delta/history.html#configure-data-retention-for-time-travel-queries
upvoted 1 times
...
Hadiler
4 months ago
Selected Answer: C
C is the correct answer
upvoted 2 times
...
vexor3
4 months ago
Selected Answer: C
C is correct
upvoted 3 times
...
03355a2
5 months ago
Selected Answer: A
The team deletes last week's data on Sunday between 1am and 2am, so the data will remain available for approximately 24 hours until the VACUUM command runs on Monday at 3am.
upvoted 1 times
cales
1 month, 1 week ago
No! By default, VACUUM does not remove data files that were deleted within the last 7 days. To change that, you need to modify the table property delta.deletedFileRetentionDuration. https://docs.databricks.com/en/delta/history.html#configure-data-retention-for-time-travel-queries
upvoted 1 times
...
...
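
For anyone wondering what changing delta.deletedFileRetentionDuration (mentioned in the reply above) might look like, here is a minimal sketch that only builds the SQL strings. The table name `customer_events` and the helper `retention_statements` are hypothetical; the statements would still have to be run in a Delta-enabled Spark session. Shortening the retention also shortens how far back time travel can reach, so treat this as an illustration and not a recommendation.

```python
from typing import List


def retention_statements(table: str, hours: int) -> List[str]:
    """Build SQL that shortens the deleted-file retention window for a
    Delta table and then vacuums it with the same threshold."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return [
        # Table property that controls how long removed data files are kept.
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('delta.deletedFileRetentionDuration' = 'interval {hours} hours')",
        # VACUUM with an explicit retention threshold in hours.
        f"VACUUM {table} RETAIN {hours} HOURS",
    ]


# Hypothetical table name, for illustration only.
for stmt in retention_statements("customer_events", 24):
    print(stmt)
```

With the table settings left at their defaults, as in the question, none of this applies and the 7-day threshold stays in effect.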