Exam DP-600 topic 1 question 48 discussion

Actual exam question from Microsoft's DP-600
Question #: 48
Topic #: 1

DRAG DROP -
You have a Fabric tenant that contains a lakehouse named Lakehouse1.
Readings from 100 IoT devices are appended to a Delta table in Lakehouse1. Each set of readings is approximately 25 KB. Approximately 10 GB of data is received daily.
All the table and SparkSession settings are set to the default.
You discover that queries are slow to execute. In addition, the lakehouse storage contains data and log files that are no longer used.
You need to remove the files that are no longer used and combine small files into larger files with a target size of 1 GB per file.
What should you do? To answer, drag the appropriate actions to the correct requirements. Each action may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.

Suggested Answer:
Remove the files that are no longer used: Run the VACUUM command on a schedule.
Combine small files into larger files: Run the OPTIMIZE command on a schedule.
Comments
SamuComqi
Highly Voted 1 year, 2 months ago
VACUUM: to remove old files no longer referenced. OPTIMIZE: to create fewer files with a larger size. Sources: * https://learn.microsoft.com/en-us/fabric/data-engineering/delta-optimization-and-v-order?tabs=sparksql * VACUUM: https://docs.delta.io/latest/delta-utility.html#-delta-vacuum * OPTIMIZE: https://docs.delta.io/latest/optimizations-oss.html
upvoted 23 times
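A minimal Spark SQL sketch of the two maintenance commands described above, run in a Fabric notebook or scheduled job. The table name `iot_readings` is hypothetical, and the RETAIN window shown is Delta's 7-day default:

```sql
-- Remove data files no longer referenced by the Delta table
-- (168 hours = Delta's default 7-day retention threshold)
VACUUM iot_readings RETAIN 168 HOURS;

-- Compact many small files into fewer, larger files;
-- Fabric also supports appending VORDER here for V-Order rewrites
OPTIMIZE iot_readings;
```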
Rakesh16
Most Recent 5 months, 2 weeks ago
Remove the files --> Run the VACUUM command on a schedule
Combine the files --> Run the OPTIMIZE command on a schedule
upvoted 2 times
Pegooli
9 months, 2 weeks ago
answer is correct :)
upvoted 2 times
282b85d
11 months, 1 week ago
• Remove the files that are no longer used: Run the VACUUM command on a schedule. VACUUM deletes data files that are no longer referenced by the Delta table (old log files are pruned separately, according to the table's log retention settings), freeing storage and reducing the number of files the query engine has to consider.
• Combine small files into larger files: Run the OPTIMIZE command on a schedule. OPTIMIZE compacts small files into larger ones, improving read performance by cutting the overhead of opening many small files. This is particularly useful when frequent small appends (here, roughly 25 KB per set of readings) create a large number of small files.
upvoted 1 times
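On the 1 GB target the question asks for: OPTIMIZE's default target file size is commonly 1 GB already, and it can be pinned explicitly before running the command. A sketch, with the caveat that the exact conf name varies by runtime and should be treated as an assumption (`iot_readings` is a hypothetical table name):

```sql
-- Assumed conf name (Databricks/Delta naming); 1073741824 bytes = 1 GB
SET spark.databricks.delta.optimize.maxFileSize = 1073741824;

-- Compaction now targets ~1 GB output files
OPTIMIZE iot_readings;
```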
stilferx
11 months, 3 weeks ago
IMHO, Vacuum & Optimize are good for optimizing Delta Lake :)
upvoted 1 times
Valcon_doo_NoviSad
1 year, 1 month ago
I agree that it is VACUUM and OPTIMIZE, but I would say Set the optimizeWrite table setting (B) and not Run the OPTIMIZE command on a schedule (E).
upvoted 2 times
thuss
1 year, 1 month ago
Isn't optimizeWrite set by default though? However that would only optimize the data as it is written, not over time.
upvoted 1 times
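For reference, the optimizeWrite setting debated in this subthread can be enabled per session or per table. A sketch, with the conf and property names as commonly documented for Fabric/Synapse Spark and Delta; treat the exact names as assumptions:

```sql
-- Session level (Fabric Spark conf)
SET spark.microsoft.delta.optimizeWrite.enabled = true;

-- Per-table property (hypothetical table name)
ALTER TABLE iot_readings
SET TBLPROPERTIES ('delta.autoOptimize.optimizeWrite' = 'true');
```

As thuss notes, optimizeWrite only shapes files at write time; small files that already exist still need an OPTIMIZE pass to be compacted.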
Momoanwar
1 year, 2 months ago
Correct. OPTIMIZE improves query performance by optimizing file sizes (see "Compact data files with optimize on Delta Lake"). VACUUM reduces storage costs by deleting data files no longer referenced by the table (see "Remove unused data files with vacuum").
upvoted 3 times