Option E is incorrect because Spark does not reassign the driver program to another worker node if the driver's node fails. The driver program is responsible for coordinating and controlling the Spark application and runs on a separate machine, typically the client machine (client deploy mode) or a node allocated by the cluster manager (cluster deploy mode). If the driver's node fails, the Spark application as a whole fails or has to be restarted; the driver is not automatically reassigned to another worker node.
In Spark, the driver node is crucial for orchestrating the execution of the Spark application. If the driver node fails, the Spark application fails; Spark does not automatically reassign the driver to another node. Recovering requires restarting the application, either manually or through an external high-availability mechanism, such as the one sketched below.
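As a rough sketch of one such external mechanism: on the standalone cluster manager, submitting in cluster deploy mode with --supervise asks the master to restart a failed driver. Note that this is a restart, not a reassignment of a running driver to a worker node. The master URL, class name, and jar path below are hypothetical placeholders; the snippet simply shells out to spark-submit.

    # Hypothetical example: restartable driver on the Spark standalone cluster manager.
    import subprocess

    subprocess.run([
        "spark-submit",
        "--master", "spark://master-host:7077",  # hypothetical standalone master URL
        "--deploy-mode", "cluster",              # driver runs inside the cluster, not on the client
        "--supervise",                           # ask the master to restart the driver if it exits abnormally
        "--class", "com.example.Main",           # hypothetical application entry point
        "/path/to/app.jar",                      # hypothetical application jar
    ], check=True)

Even here the driver process is restarted from scratch; in-flight state is lost unless the application itself checkpoints it, which is why option E remains the incorrect statement.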
Considering the words "any set," option A does not look correct either. What if all the worker nodes fail?
A. "Spark is designed to support the loss of any set of worker nodes."
The incorrect statement about Spark's stability is:
E. Spark will reassign the driver to a worker node if the driver’s node fails.
Explanation:
Option A is correct because Spark is designed to handle the failure of worker nodes. When a worker node fails, Spark redistributes the lost tasks to other available worker nodes to ensure fault tolerance.
Option C is correct because Spark is able to recompute data that was cached on failed worker nodes. Spark maintains lineage information about RDDs (Resilient Distributed Datasets), allowing it to reconstruct lost data partitions in case of failures.
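A minimal PySpark sketch of that lineage-based recovery (the SparkSession setup and data here are made up for illustration): caching an RDD records its chain of transformations, and toDebugString shows the lineage Spark would replay to rebuild any cached partition lost with a failed executor, instead of failing the job.

    # Minimal illustration of RDD lineage behind cached data (assumed local setup).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[2]").appName("lineage-demo").getOrCreate()
    sc = spark.sparkContext

    # Build an RDD through a chain of transformations and cache it.
    numbers = sc.parallelize(range(1_000_000), numSlices=8)
    squares = numbers.map(lambda x: x * x).filter(lambda x: x % 3 == 0).cache()

    print(squares.count())  # materializes the cached partitions

    # The lineage below is what Spark replays to recompute any cached
    # partition that is lost when the node holding it fails.
    print(squares.toDebugString().decode("utf-8"))

    spark.stop()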
The driver is responsible for maintaining the SparkContext. If it fails, there is no recourse. The driver can mitigate the failure of worker nodes through limited fault-tolerance mechanisms.
All of the following statements about Spark's stability are correct except for:
E. Spark will reassign the driver to a worker node if the driver’s node fails.
The driver is a special process in Spark that is responsible for coordinating tasks and executing the main program. If the driver fails, the entire Spark application fails and cannot recover on its own. Therefore, Spark does not reassign the driver to a worker node if the driver's node fails.
If the driver node fails, your cluster fails. If a worker node fails, Databricks spawns a new worker node to replace it and resumes the workload.
If the node running the driver program fails, Spark's built-in fault-tolerance mechanism can reassign the driver program to run on another node.