

Exam Certified Associate Developer for Apache Spark topic 1 question 124 discussion

Which of the following statements about the Spark driver is true?

  • A. Spark driver is horizontally scaled to increase overall processing throughput.
  • B. Spark driver is the most coarse level of the Spark execution hierarchy.
  • C. Spark driver is fault tolerant — if it fails, it will recover the entire Spark application.
  • D. Spark driver is responsible for scheduling the execution of data by various worker nodes in cluster mode.
  • E. Spark driver is only compatible with its included cluster manager.
Suggested Answer: D

Comments

Sowwy1
7 months, 3 weeks ago
It's D
upvoted 1 times
...
martcerv
1 year, 3 months ago
I believe D is the correct one according to the Databricks documentation [1]: "The driver process runs your main() function, sits on a node in the cluster, and is responsible for three things: maintaining information about the Spark Application; responding to a user’s program or input; and analyzing, distributing, and scheduling work across the executors (defined momentarily)." Additionally: "The cluster manager controls physical machines and allocates resources to Spark Applications." Based on the above, we could say that the cluster manager is in charge of assigning resources (CPU, memory, etc.) to the VMs used, while the driver schedules the work. Keep in mind that this is based on the definition from Databricks; other definitions may include what was mentioned by cookiemonster42. [1] https://www.databricks.com/glossary/what-are-spark-applications
upvoted 2 times
...
cookiemonster42
1 year, 3 months ago
Should be B. As for D: the Spark driver is not directly responsible for scheduling the execution of data by various worker nodes in cluster mode. It submits tasks to the cluster manager (e.g., YARN, Mesos, or Kubernetes), and the cluster manager handles the scheduling of tasks on worker nodes.
upvoted 1 times
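
To make the split discussed in the comments above concrete, here is a minimal toy sketch in plain Python. It is not Spark code and uses no Spark API: the names `ToyClusterManager`, `ToyDriver`, and `Executor` are hypothetical stand-ins, and the round-robin assignment is an assumption chosen for simplicity rather than how Spark's scheduler actually places tasks. It only illustrates the division of labor in the Databricks quote: the cluster manager allocates resources, and the driver splits the job into tasks and schedules them on the executors it was given.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Toy stand-in for a worker-side process that runs tasks."""
    executor_id: int
    completed: list = field(default_factory=list)

    def run(self, task_id, partition):
        # The "work" for one task: sum the numbers in one partition.
        result = sum(partition)
        self.completed.append(task_id)
        return result


class ToyClusterManager:
    """Hands out resources only; it never sees individual tasks."""

    def allocate(self, num_executors):
        return [Executor(i) for i in range(num_executors)]


class ToyDriver:
    """Requests executors, then schedules one task per partition."""

    def __init__(self, cluster_manager, num_executors):
        self.executors = cluster_manager.allocate(num_executors)

    def run_job(self, partitions):
        partial_results = []
        for task_id, partition in enumerate(partitions):
            # Assumption: simple round-robin placement of tasks.
            executor = self.executors[task_id % len(self.executors)]
            partial_results.append(executor.run(task_id, partition))
        # The driver combines the per-task results into the final answer.
        return sum(partial_results)


driver = ToyDriver(ToyClusterManager(), num_executors=2)
total = driver.run_job([[1, 2], [3, 4], [5, 6]])
print(total)  # 21
```

In this sketch the cluster manager is involved only once, when executors are allocated. Every per-task decision is made by the driver, which matches the reading of option D given by martcerv.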
...