Exam Certified Associate Developer for Apache Spark topic 1 question 47 discussion

The code block shown below contains an error. The code block is intended to return a new 12-partition DataFrame from the 8-partition DataFrame storesDF by inducing a shuffle. Identify the error.
Code block:
storesDF.coalesce(12)

  • A. The coalesce() operation cannot guarantee the number of target partitions – the repartition() operation should be used instead.
  • B. The coalesce() operation does not induce a shuffle and cannot increase the number of partitions – the repartition() operation should be used instead.
  • C. The coalesce() operation will only work if the DataFrame has been cached to memory – the repartition() operation should be used instead.
  • D. The coalesce() operation requires a column by which to partition rather than a number of partitions – the repartition() operation should be used instead.
  • E. The number of resulting partitions, 12, is not achievable for an 8-partition DataFrame.
Suggested Answer: B

Comments

Raju_Bhai
1 year, 1 month ago
With version 3.4.0, df.repartition(12).coalesce(16).rdd.getNumPartitions() returns 12. It doesn't throw an error, but it doesn't increase the number of partitions either.
upvoted 1 times
4be8126
1 year, 6 months ago
Selected Answer: B
The correct answer is B. The coalesce() operation can decrease the number of partitions but cannot increase the number of partitions. It also does not induce a shuffle, and is therefore more efficient when decreasing the number of partitions. If the goal is to increase the number of partitions, repartition() should be used instead.
upvoted 3 times