Exam Professional Machine Learning Engineer topic 1 question 287 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 287
Topic #: 1

You are tasked with building an MLOps pipeline to retrain tree-based models in production. The pipeline will include components related to data ingestion, data processing, model training, model evaluation, and model deployment. Your organization primarily uses PySpark-based workloads for data preprocessing. You want to minimize infrastructure management effort. How should you set up the pipeline?

  • A. Set up a TensorFlow Extended (TFX) pipeline on Vertex AI Pipelines to orchestrate the MLOps pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
  • B. Set up Vertex AI Pipelines to orchestrate the MLOps pipeline. Use the predefined Dataproc component for the PySpark-based workloads.
  • C. Set up Kubeflow Pipelines on Google Kubernetes Engine to orchestrate the MLOps pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
  • D. Set up Cloud Composer to orchestrate the MLOps pipeline. Use Dataproc workflow templates for the PySpark-based workloads in Cloud Composer.
Suggested Answer: B
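
For illustration only, here is a minimal sketch of what option B could look like. It assumes the KFP SDK (`kfp`) and the `google-cloud-pipeline-components` package, and uses the prebuilt `DataprocPySparkBatchOp` component, which submits the PySpark script as a Dataproc Serverless batch. The pipeline name, the `make_batch_id` helper, and the batch ID character rules it applies (lowercase letters, digits, hyphens, at most 63 characters) are assumptions for this sketch, not part of the question.

```python
import re


def make_batch_id(prefix: str, run_suffix: str) -> str:
    """Hypothetical helper: sanitize a batch ID for Dataproc Serverless.

    Assumes IDs may contain only lowercase letters, digits, and hyphens,
    and must be at most 63 characters long.
    """
    raw = f"{prefix}-{run_suffix}".lower()
    # Replace each run of disallowed characters with a single hyphen.
    cleaned = re.sub(r"[^a-z0-9-]+", "-", raw).strip("-")
    return cleaned[:63].rstrip("-")


def build_pipeline():
    """Return a KFP pipeline function.

    The SDK imports are deferred so they are only required when the
    pipeline is actually built.
    """
    from kfp import dsl
    from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp

    @dsl.pipeline(name="tree-model-retraining")  # hypothetical pipeline name
    def pipeline(project: str, location: str, batch_id: str, main_python_file_uri: str):
        # Predefined component: no custom container or cluster to manage.
        DataprocPySparkBatchOp(
            project=project,
            location=location,
            batch_id=batch_id,
            main_python_file_uri=main_python_file_uri,
        )
        # Training, evaluation, and deployment steps would be chained after this task.

    return pipeline
```

With both packages installed, `kfp.compiler.Compiler().compile(pipeline_func=build_pipeline(), package_path="pipeline.json")` should produce a spec that can be submitted to Vertex AI Pipelines, for example with `batch_id=make_batch_id("pyspark-preprocess", "2024-01-01")`.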

Comments

Pau1234
4 months, 2 weeks ago
Selected Answer: B
minimize infrastructure management effort -- hence B
upvoted 1 times
Omi_04040
4 months, 2 weeks ago
Selected Answer: B
A: Rejected because it requires a custom component for the PySpark-based workloads. C: Kubeflow Pipelines is not a managed service, and the question says to 'minimize infrastructure management effort'. D: see the reply below.
upvoted 1 times
Omi_04040
4 months, 2 weeks ago
D: Using Cloud Composer for orchestration adds overhead, hence B.
upvoted 1 times
AB_C
5 months ago
Selected Answer: B
This is the most suitable approach
upvoted 2 times
carolctech
6 months ago
Selected Answer: B
B) Best option because it offers the greatest ease of use, integrates with the existing PySpark infrastructure (via Dataproc), and keeps infrastructure management overhead minimal:
  • Vertex AI Pipelines is fully managed, which minimizes infrastructure management effort, and it is natively integrated with Dataproc for PySpark (while Composer is not).
  • The predefined Dataproc component for PySpark workloads reduces effort and the likelihood of errors.
  • It is suitable for tree-based models (the other options are too, but with more effort).
upvoted 2 times