Exam Certified Data Engineer Professional topic 1 question 164 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 164
Topic #: 1

All records from an Apache Kafka producer are being ingested into a single Delta Lake table with the following schema:

key BINARY, value BINARY, topic STRING, partition LONG, offset LONG, timestamp LONG

There are 5 unique topics being ingested. Only the "registration" topic contains Personally Identifiable Information (PII). The company wishes to restrict access to PII. The company also wishes to retain records containing PII in this table for only 14 days after initial ingestion. However, it would like to retain non-PII records indefinitely.

Which solution meets the requirements?

  • A. All data should be deleted biweekly; Delta Lake's time travel functionality should be leveraged to maintain a history of non-PII information.
  • B. Data should be partitioned by the registration field, allowing ACLs and delete statements to be set for the PII directory.
  • C. Data should be partitioned by the topic field, allowing ACLs and delete statements to leverage partition boundaries.
  • D. Separate object storage containers should be specified based on the partition field, allowing isolation at the storage level.
Suggested Answer: C
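
To make option C concrete, here is a minimal sketch of the two SQL statements it implies: a table partitioned by topic, and a periodic delete limited to the PII partition. The table name `kafka_events` is hypothetical, and the sketch assumes the `timestamp` column holds epoch milliseconds and is an acceptable stand-in for ingestion time; neither is stated in the question. The Python only builds the SQL text, which could then be passed to `spark.sql(...)` on a cluster.

```python
from datetime import datetime, timedelta, timezone

TABLE = "kafka_events"        # hypothetical table name
PII_TOPIC = "registration"    # the only topic with PII, per the question
RETENTION = timedelta(days=14)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Schema from the question; keyword-like column names are backtick-quoted.
CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS {TABLE} (
  key BINARY, value BINARY, topic STRING,
  `partition` LONG, `offset` LONG, `timestamp` LONG
) USING DELTA
PARTITIONED BY (topic)
""".strip()


def cutoff_millis(now):
    """Epoch milliseconds 14 days before `now` (must be timezone-aware)."""
    return (now - RETENTION - EPOCH) // timedelta(milliseconds=1)


def pii_delete_sql(now):
    """DELETE restricted to the PII partition and to rows older than the cutoff.

    Assumption: `timestamp` is epoch milliseconds.
    """
    return (
        f"DELETE FROM {TABLE} "
        f"WHERE topic = '{PII_TOPIC}' AND `timestamp` < {cutoff_millis(now)}"
    )


if __name__ == "__main__":
    print(CREATE_TABLE_SQL)
    print(pii_delete_sql(datetime.now(timezone.utc)))
```

Because the predicate includes the partition column, the delete only needs to consider files under the registration partition. Note that a Delta `DELETE` is a logical removal: the old data files stay reachable through time travel until they are vacuumed, so a real retention job would likely pair this with `VACUUM`.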

Comments

hpkr
5 months, 2 weeks ago
Selected Answer: C
C is correct
upvoted 2 times
imatheushenrique
5 months, 4 weeks ago
C. Partitioning the data by the topic field allows the company to apply different access control policies and retention policies to different topics. There is also a performance optimization gain, because reads are confined to the relevant partition path.
upvoted 1 time
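
The point about the partition path can be illustrated with a toy model in plain Python. It assumes the common Hive-style layout in which each topic's files sit under a `topic=<value>` directory; the storage root and the non-PII topic names are hypothetical. This is only a sketch of the idea, not how Delta Lake itself plans a query.

```python
from collections import defaultdict

TABLE_ROOT = "/mnt/delta/kafka_events"  # hypothetical storage location


def partition_dir(topic):
    """Hive-style directory a table partitioned by `topic` would use."""
    return f"{TABLE_ROOT}/topic={topic}"


def layout(records):
    """Group records by the partition directory they would be written to."""
    dirs = defaultdict(list)
    for rec in records:
        dirs[partition_dir(rec["topic"])].append(rec)
    return dict(dirs)


def dirs_for_topic(dirs, topic):
    """Directories that a filter on one topic has to touch after pruning."""
    target = partition_dir(topic)
    return [d for d in dirs if d == target]


records = [
    {"topic": "registration", "offset": 0},
    {"topic": "clicks", "offset": 0},   # hypothetical non-PII topic
    {"topic": "orders", "offset": 0},   # hypothetical non-PII topic
    {"topic": "registration", "offset": 1},
]
dirs = layout(records)
pii_dirs = dirs_for_topic(dirs, "registration")
print(pii_dirs)
```

In this model every PII record lands under a single directory, so a read, a delete, or a storage-level ACL aimed at the registration topic has one well-defined target and never needs to touch the other topics' directories.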