
Exam Certified Data Engineer Professional topic 1 question 39 discussion

Actual exam question from Databricks' Certified Data Engineer Professional
Question #: 39
Topic #: 1

Which of the following is true of Delta Lake and the Lakehouse?

  • A. Because Parquet compresses data row by row, strings will only be compressed when a character is repeated multiple times.
  • B. Delta Lake automatically collects statistics on the first 32 columns of each table which are leveraged in data skipping based on query filters.
  • C. Views in the Lakehouse maintain a valid cache of the most recent versions of source tables at all times.
  • D. Primary and foreign key constraints can be leveraged to ensure duplicate values are never entered into a dimension table.
  • E. Z-order can only be applied to numeric values stored in Delta Lake tables.
Suggested Answer: B

Comments

PrashantTiwari
9 months, 2 weeks ago
B is correct
upvoted 1 times
guillesd
9 months, 3 weeks ago
Selected Answer: B
B is correct
upvoted 2 times
spaceexplorer
10 months ago
Selected Answer: B
B is correct
upvoted 1 times
Crocjun
10 months, 3 weeks ago
Can anyone explain why D is not correct?
upvoted 1 times
cryptoflam
10 months, 3 weeks ago
Because Primary & Foreign Key information is not enforced. "Primary and foreign keys are informational only and are not enforced" from: https://docs.databricks.com/en/tables/constraints.html#declare-primary-key-and-foreign-key-relationships
upvoted 2 times
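To illustrate cryptoflam's point, here is a minimal PySpark sketch (the table and constraint names are hypothetical, and declaring a PRIMARY KEY assumes a Unity Catalog-backed schema): both inserts below succeed because the constraint is informational only, which is exactly why D is wrong.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical dimension table; a PRIMARY KEY declaration requires
# Unity Catalog and a NOT NULL key column.
spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_id BIGINT NOT NULL,
        name STRING,
        CONSTRAINT pk_customer PRIMARY KEY (customer_id)
    )
""")

# Both inserts succeed: the constraint is informational, not enforced,
# so the table now holds two rows with customer_id = 1. Deduplication
# must be handled by the pipeline (e.g. MERGE), not by the constraint.
spark.sql("INSERT INTO dim_customer VALUES (1, 'Ada')")
spark.sql("INSERT INTO dim_customer VALUES (1, 'Ada duplicate')")

print(spark.sql(
    "SELECT count(*) FROM dim_customer WHERE customer_id = 1"
).first()[0])  # prints 2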
Patito
11 months ago
Selected Answer: B
B is correct since statistics are collected for the first 32 columns and stored in the transaction log.
upvoted 3 times
AndreFR
3 months ago
Delta data skipping automatically collects the stats (min, max, etc.) for the first 32 columns for each underlying Parquet file when you write data into a Delta table. Databricks takes advantage of this information (minimum and maximum values) at query time to skip unnecessary files in order to speed up the queries. (https://www.databricks.com/discover/pages/optimize-data-workloads-guide#delta-data)
upvoted 1 times
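A minimal PySpark sketch of the behavior described above (table and column names are hypothetical): the default cutoff of 32 statistics columns can be changed per table with the documented delta.dataSkippingNumIndexedCols property, and filters on indexed columns are matched against the per-file min/max stats in the transaction log.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical table: lower the stats cutoff from its default of 32
# columns to 3, so the wide `payload` column collects no statistics.
spark.sql("""
    CREATE TABLE IF NOT EXISTS events (
        event_ts TIMESTAMP,
        user_id  BIGINT,
        country  STRING,
        payload  STRING
    )
    USING DELTA
    TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '3')
""")

# A filter on an indexed column is compared against per-file min/max
# statistics, so Parquet files whose ranges cannot match the predicate
# are skipped without being read.
spark.sql(
    "SELECT count(*) FROM events WHERE event_ts >= '2024-01-01'"
).show()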
ervinshang
11 months ago
Selected Answer: B
B is correct. C is wrong: a view does not keep a cache of its source tables; it re-reads them on every query.
upvoted 1 times
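A short PySpark sketch of this point (table and view names are hypothetical): a view is just a stored query, re-evaluated against the current state of the source table on every read, so there is no cache that could go stale.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql(
    "CREATE TABLE IF NOT EXISTS orders (id BIGINT, status STRING) USING DELTA"
)
spark.sql("""
    CREATE OR REPLACE VIEW open_orders AS
    SELECT * FROM orders WHERE status = 'open'
""")

spark.sql("INSERT INTO orders VALUES (1, 'open')")
print(spark.sql("SELECT count(*) FROM open_orders").first()[0])  # 1

# The view re-runs its query on read, so a later change to the source
# table is visible immediately; nothing was cached to go stale.
spark.sql("UPDATE orders SET status = 'closed' WHERE id = 1")
print(spark.sql("SELECT count(*) FROM open_orders").first()[0])  # 0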
f728f7f
11 months, 1 week ago
Selected Answer: C
C is correct
upvoted 1 times
chokthewa
1 year, 1 month ago
B is correct. https://docs.delta.io/2.0.0/table-properties.html
upvoted 1 times
Community vote distribution: A (35%), C (25%), B (20%), Other