Exam DP-203 topic 4 question 43 discussion

Actual exam question from Microsoft's DP-203

Question #: 43
Topic #: 4

You are designing a solution that will use tables in Delta Lake on Azure Databricks.

You need to minimize how long it takes to perform the following:

• Queries against non-partitioned tables
• Joins on non-partitioned columns

Which two options should you include in the solution? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

A. the clone command
B. Z-Ordering
C. Apache Spark caching
D. dynamic file pruning (DFP)

Suggested Answer: BD

by [deleted] at May 9, 2023, 9:30 a.m.

Comments

[Removed]

Highly Voted 1 year, 11 months ago

Selected Answer: BD

Seems correct: https://learn.microsoft.com/en-us/azure/databricks/optimizations/dynamic-file-pruning https://learn.microsoft.com/en-us/azure/databricks/delta/data-skipping

upvoted 9 times

vctrhugo

Highly Voted 1 year, 10 months ago

Selected Answer: BD

Dynamic file pruning (DFP) can significantly improve the performance of many queries on Delta Lake tables. It is especially efficient for non-partitioned tables and for joins on non-partitioned columns. Its performance impact often correlates with how well the data is clustered, so consider using Z-Ordering to maximize the benefit.

upvoted 8 times
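
The link between dynamic file pruning and data clustering described in the comment above can be illustrated with a toy model. This is only a sketch in plain Python, not Databricks code: files are modeled as lists of key values with min/max statistics, a simple single-column sort stands in for Z-Ordering, and the names `make_files`, `files_to_scan`, and `join_keys` are hypothetical helpers invented for this illustration.

```python
# Toy model of file-level pruning driven by join keys (not the Databricks API).
# Assumption: each "file" stores min/max statistics for the join column, and a
# file can be skipped when no surviving join key falls inside [min, max].

def make_files(values, rows_per_file):
    """Split values into fixed-size files and record per-file min/max stats."""
    files = []
    for start in range(0, len(values), rows_per_file):
        chunk = values[start:start + rows_per_file]
        files.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return files

def files_to_scan(files, join_keys):
    """Return the files whose [min, max] range could contain a join key."""
    return [f for f in files
            if any(f["min"] <= key <= f["max"] for key in join_keys)]

# 1,000 key values written in an interleaved order, so every file spans
# almost the whole key range (poorly clustered layout).
interleaved = [k for offset in range(10) for k in range(offset, 1000, 10)]
unclustered = make_files(interleaved, rows_per_file=100)

# The same values sorted before writing: a single-column stand-in for the
# clustering that Z-Ordering would provide.
clustered = make_files(sorted(interleaved), rows_per_file=100)

# Keys that survive a selective filter on the other side of the join.
join_keys = {5, 17, 42}

scanned_unclustered = len(files_to_scan(unclustered, join_keys))
scanned_clustered = len(files_to_scan(clustered, join_keys))

print("files scanned without clustering:", scanned_unclustered)  # 10
print("files scanned with clustering:", scanned_clustered)       # 1
```

In this model the pruning logic is identical in both cases; only the file layout changes. With the interleaved layout every file's min/max range covers the join keys, so nothing can be skipped, while the clustered layout lets nine of the ten files be skipped.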

Aurangzaib

Most Recent 8 months, 2 weeks ago

Selected Answer: BC

By sorting the data files based on one or more columns, Z-Ordering can significantly improve the performance of queries that filter on those columns. Caching in Apache Spark allows frequently accessed data to be stored in memory, reducing the time it takes to read the data for subsequent operations. This can be particularly useful for speeding up joins and repeated queries against non-partitioned tables, as the data is readily available without having to be read from disk repeatedly.

upvoted 1 time

MBRSDG

1 year ago

https://www.databricks.com/blog/2020/04/30/faster-sql-queries-on-delta-lake-with-dynamic-file-pruning.html

upvoted 1 time

kkk5566

1 year, 8 months ago

Selected Answer: BD

correct

upvoted 2 times
