Exam Certified Data Engineer Professional topic 1 question 127 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 127
Topic #: 1

The data science team has requested assistance in accelerating queries on free-form text from user reviews. The data is currently stored in Parquet with the below schema:

item_id INT, user_id INT, review_id INT, rating FLOAT, review STRING

The review column contains the full text of the review left by the user. Specifically, the data science team is looking to identify if any of 30 key words exist in this field.

A junior data engineer suggests converting this data to Delta Lake will improve query performance.

Which response to the junior data engineer’s suggestion is correct?

  • A. Delta Lake statistics are not optimized for free text fields with high cardinality.
  • B. Delta Lake statistics are only collected on the first 4 columns in a table.
  • C. ZORDER ON review will need to be run to see performance gains.
  • D. The Delta log creates a term matrix for free text fields to support selective filtering.
Suggested Answer: A
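
Why per-file statistics do not help here can be sketched in plain Python. This is only a simulation under stated assumptions, not Delta Lake's API: the `files` dictionary, the sample reviews, and every helper name below are hypothetical. It assumes the statistics available for skipping are per-file minimum and maximum values for a column. A range filter on rating can be decided from those bounds, while a "does the review contain this key word" filter cannot, because the lexicographic bounds of a text column say nothing about words inside the other rows.

```python
# Hypothetical data files, each a list of (rating, review) rows.
files = {
    "part-0": [
        (1.0, "arrived broken"),
        (2.0, "asked for a refund twice"),
        (1.5, "zipper stuck after a week"),
    ],
    "part-1": [
        (4.0, "battery lasts all day"),
        (5.0, "great value"),
        (4.5, "works well"),
    ],
}


def can_skip_rating_at_least(rows, threshold):
    """A range predicate is decidable from min/max: skip when max < threshold."""
    max_rating = max(rating for rating, _ in rows)
    return max_rating < threshold


def string_bounds(rows):
    """Lexicographic min and max of the review column, as min/max stats would see it."""
    reviews = [review for _, review in rows]
    return min(reviews), max(reviews)


def keyword_visible_in_bounds(rows, keyword):
    """Check whether the key word happens to appear in either bound."""
    low, high = string_bounds(rows)
    return keyword in low or keyword in high


def file_contains_keyword(rows, keyword):
    """Ground truth, obtained only by scanning every review in the file."""
    return any(keyword in review for _, review in rows)


# Files that a filter "rating >= 4.0" could skip using min/max alone.
skipped_by_rating = [
    name for name, rows in files.items() if can_skip_rating_at_least(rows, 4.0)
]

# The key word "refund" is in part-0, but its bounds give no hint of that.
part0_bounds = string_bounds(files["part-0"])
refund_in_part0 = file_contains_keyword(files["part-0"], "refund")
refund_seen_in_bounds = keyword_visible_in_bounds(files["part-0"], "refund")

print(skipped_by_rating)
print(part0_bounds)
print(refund_in_part0, refund_seen_in_bounds)
```

In this sketch the rating filter prunes a file without reading it, whereas the key word sits in a row that is neither the minimum nor the maximum review, so no file can be ruled out for a substring search and every review still has to be scanned.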

Comments

m79590530
1 month ago
Selected Answer: A
Delta Lake optimizations are not well suited to long TIMESTAMP or STRING fields and cannot provide good indexing, data skipping, or statistics logging for them.
upvoted 1 time
Community vote distribution
A (35%), most voted
C (25%)
B (20%)
Other