

Exam Certified Data Engineer Professional topic 1 question 25 discussion

Actual exam question from Databricks's Certified Data Engineer Professional
Question #: 25
Topic #: 1

A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median time to complete a task as roughly the same, but the max duration for a task to be roughly 100 times as long as the minimum.
Which situation is causing increased duration of the overall job?

  • A. Task queueing resulting from improper thread pool assignment.
  • B. Spill resulting from attached volume storage being too small.
  • C. Network latency due to some cluster nodes being in different regions from the source data.
  • D. Skew caused by more data being assigned to a subset of Spark partitions.
  • E. Credential validation errors while pulling data from an external system.
Suggested Answer: D
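
The pattern described in the question (minimum and median close together, maximum far above both) can be checked mechanically once task durations are exported from the Spark UI. The following is a minimal sketch in plain Python, not a Spark API: the function names, the thresholds, and the sample durations are all assumptions chosen for illustration.

```python
from statistics import median


def summarize_durations(durations_ms):
    """Return (min, median, max) for a list of task durations in milliseconds."""
    return min(durations_ms), median(durations_ms), max(durations_ms)


def looks_skewed(durations_ms, ratio_threshold=10.0):
    """Heuristic (assumed thresholds): the median stays close to the minimum,
    while the maximum is at least ratio_threshold times the median."""
    lo, med, hi = summarize_durations(durations_ms)
    return med <= 2 * lo and hi >= ratio_threshold * med


# Hypothetical stage: 199 tasks finish in about 1 s, one straggler takes 100 s.
durations = [1000 + (i % 7) * 10 for i in range(199)] + [100_000]

lo, med, hi = summarize_durations(durations)
print(f"min={lo} ms, median={med} ms, max={hi} ms, skewed={looks_skewed(durations)}")
```

A stage where every task is slow (for example, uniform network latency) would not trip this check, because the median would rise along with the maximum.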

Comments

AndreFR
3 months ago
A is excluded because task queueing does not increase the duration of an individual task.
B is excluded because spill means writing to storage when memory is insufficient, not when storage is insufficient.
C is excluded because a difference in region cannot account for a 100-fold difference in duration.
E is excluded because the question mentions no errors.
upvoted 1 time
...
imatheushenrique
5 months, 3 weeks ago
D. Skew caused by more data being assigned to a subset of Spark partitions.
upvoted 1 time
...
vikram12apr
8 months, 3 weeks ago
Selected Answer: D
Because a few tasks are processing the majority of the data while the rest process very little. The total execution time of the stage depends on its slowest task. Answer is D.
upvoted 2 times
...
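
To illustrate the point above that the slowest task sets the stage's total time, here is a small plain-Python simulation. It is only a sketch under stated assumptions: rows are assigned to partitions by `key % num_partitions`, one task runs per partition, all tasks run concurrently, and task cost is proportional to row count. The helper names and the data are hypothetical.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count rows per partition when an integer key maps to key % num_partitions."""
    counts = Counter(k % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def stage_time(sizes, ms_per_row=1.0):
    """Assuming all tasks run in parallel, the stage ends when the largest partition ends."""
    return max(sizes) * ms_per_row


def balanced_time(sizes, ms_per_row=1.0):
    """Stage time if the same rows were spread evenly across the partitions."""
    return sum(sizes) / len(sizes) * ms_per_row


# Hypothetical data: 8,000 distinct keys with one row each, plus a "hot" key 7
# that carries 99,000 extra rows.
keys = list(range(8000)) + [7] * 99_000

sizes = partition_sizes(keys, num_partitions=8)
print("rows per partition:", sizes)
print("stage time (ms):", stage_time(sizes))
print("evenly balanced time (ms):", balanced_time(sizes))
```

Seven partitions hold the same number of rows, so the minimum and median task times match, while the partition holding the hot key is about 100 times larger and dominates the stage.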
Jay_98_11
10 months, 2 weeks ago
Selected Answer: D
correct
upvoted 1 time
...
kz_data
10 months, 2 weeks ago
Selected Answer: D
I think D is correct
upvoted 1 time
...
sturcu
1 year, 1 month ago
Selected Answer: D
D is correct
upvoted 1 time
...
Eertyy
1 year, 2 months ago
D is the correct answer
upvoted 2 times
...