Exam Professional Data Engineer topic 1 question 313 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 313
Topic #: 1

You want to migrate an Apache Spark 3 batch job from on-premises to Google Cloud. You need to minimally change the job so that the job reads from Cloud Storage and writes the result to BigQuery. Your job is optimized for Spark, where each executor has 8 vCPU and 16 GB memory, and you want to be able to choose similar settings. You want to minimize installation and management effort to run your job. What should you do?

  • A. Execute the job as part of a deployment in a new Google Kubernetes Engine cluster.
  • B. Execute the job from a new Compute Engine VM.
  • C. Execute the job in a new Dataproc cluster.
  • D. Execute as a Dataproc Serverless job.
Suggested Answer: D

Comments

chicity_de
Highly Voted 1 month, 3 weeks ago
Selected Answer: D
The priority is to "minimize installation and management effort", which Dataproc Serverless satisfies. Furthermore, with Dataproc Serverless you can still specify resource settings for your job, such as the number of vCPUs and the memory per executor (https://cloud.google.com/dataproc-serverless/docs/concepts/properties).
upvoted 6 times
...
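To make the point about per-executor settings concrete, here is a minimal sketch that assembles a Dataproc Serverless batch submission with executor sizing passed as Spark properties. It only builds and prints the command; it does not run it. The bucket path, region, and helper function names are hypothetical, and the example assumes the job is a PySpark script (the question does not say which language the job uses). Whether 8 cores with 16 GB is accepted depends on the limits listed on the properties page linked above, so treat the values as an illustration rather than a verified configuration.

```python
import shlex

# Hypothetical placeholders: replace with your own job location and region.
JOB_URI = "gs://example-bucket/jobs/batch_job.py"
REGION = "us-central1"


def executor_properties(cores, memory_gb):
    """Return standard Spark properties describing the size of one executor."""
    return {
        "spark.executor.cores": str(cores),
        "spark.executor.memory": f"{memory_gb}g",
    }


def build_submit_command(job_uri, region, properties):
    """Assemble a `gcloud dataproc batches submit pyspark` argument list."""
    # gcloud expects properties as a comma-separated list of key=value pairs.
    props = ",".join(f"{key}={value}" for key, value in properties.items())
    return [
        "gcloud", "dataproc", "batches", "submit", "pyspark", job_uri,
        f"--region={region}",
        f"--properties={props}",
    ]


# Mirror the on-premises executor shape from the question: 8 vCPU, 16 GB.
command = build_submit_command(JOB_URI, REGION, executor_properties(8, 16))
print(shlex.join(command))
```

Note that the properties describe executor resources, not a Compute Engine machine type, which is the distinction the comments in this thread disagree about.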
plum21
Most Recent 1 day, 11 hours ago
Selected Answer: C
It's not possible to specify a machine type using Dataproc Serverless
upvoted 1 times
...
marlon.andrei
3 weeks ago
Selected Answer: C
I chose "C" because of this requirement: "where each executor has 8 vCPU and 16 GB memory, and you want to be able to choose similar settings".
upvoted 1 times
...
Pime13
1 month ago
Selected Answer: D
Dataproc Serverless allows you to run Spark jobs without needing to manage the underlying infrastructure. It automatically handles resource provisioning and scaling, which simplifies the process and reduces management overhead
upvoted 1 times
...
mcdaley
2 months ago
Selected Answer: C
Dataproc supports Spark 3, ensuring compatibility with your existing job. It also allows you to customize the cluster configuration, including the number of executors, vCPUs, and memory per executor, to match your on-premises setup (8 vCPU and 16 GB memory)
upvoted 1 times
...