Exam Certified Data Engineer Professional topic 1 question 143 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 143
Topic #: 1

A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median times to complete a task as roughly the same, but the maximum duration as roughly 100 times as long as the minimum.

Which situation is causing increased duration of the overall job?

  • A. Task queueing resulting from improper thread pool assignment.
  • B. Spill resulting from attached volume storage being too small.
  • C. Network latency due to some cluster nodes being in different regions from the source data.
  • D. Skew caused by more data being assigned to a subset of Spark partitions.
Suggested Answer: D
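
To illustrate why this pattern points to skew, here is a minimal sketch in plain Python (not Spark code). It assumes integer keys, a simple modulo partitioner as a hypothetical stand-in for a shuffle, and that task duration is roughly proportional to rows per partition. The function names and row counts are made up for illustration.

```python
import statistics
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count rows per partition under a simple modulo partitioner.

    This is a hypothetical stand-in for a shuffle, not Spark's partitioner.
    """
    counts = Counter(k % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def summarize(sizes):
    """Return (min, median, max), mirroring the Spark UI summary columns."""
    return min(sizes), statistics.median(sizes), max(sizes)


# Assumed data: 8 keys with 100 rows each, plus one hot key (0)
# that receives 9,900 extra rows.
keys = [k for k in range(8) for _ in range(100)]
keys += [0] * 9900

sizes = partition_sizes(keys, num_partitions=8)
smallest, median, largest = summarize(sizes)

print("rows per partition:", sizes)
print("min:", smallest, "median:", median, "max:", largest)
print("max/min ratio:", largest / smallest)
```

Seven of the eight partitions hold the same number of rows, so the minimum and median match, while the single partition holding the hot key is 100 times larger. The stage cannot finish until that one straggler task completes, which is the signature described in the question.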

Comments

benni_ale
2 weeks, 4 days ago
Selected Answer: D
D is ok
upvoted 1 times
vexor3
4 months, 1 week ago
Selected Answer: D
D is correct
upvoted 1 times