Exam Professional Machine Learning Engineer topic 1 question 233 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 233
Topic #: 1

You are building a custom image classification model and plan to use Vertex AI Pipelines to implement the end-to-end training. Your dataset consists of images that need to be preprocessed before they can be used to train the model. The preprocessing steps include resizing the images, converting them to grayscale, and extracting features. You have already implemented some Python functions for the preprocessing tasks. Which components should you use in your pipeline?

  • A. DataprocSparkBatchOp and CustomTrainingJobOp
  • B. DataflowPythonJobOp, WaitGcpResourcesOp, and CustomTrainingJobOp
  • C. dsl.ParallelFor, dsl.component, and CustomTrainingJobOp
  • D. ImageDatasetImportDataOp, dsl.component, and AutoMLImageTrainingJobRunOp
Suggested Answer: B
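
For reference, option B could be wired together roughly as follows. This is a minimal sketch, not a definitive implementation: the project ID, bucket paths, container image, machine type, and command-line flag names are all hypothetical placeholders, and the existing preprocessing functions are assumed to have been wrapped in an Apache Beam module at `preprocess.py`. The SDK imports are deferred into `build_pipeline()` so the two helper functions can be inspected without `kfp` or `google-cloud-pipeline-components` installed.

```python
"""Sketch of option B: Dataflow preprocessing, wait for the job, then custom training."""

# Hypothetical placeholders; none of these values come from the question.
PROJECT = "my-project"
LOCATION = "us-central1"
BUCKET = "gs://my-bucket"
TRAINER_IMAGE = "us-docker.pkg.dev/my-project/my-repo/trainer:latest"


def dataflow_args(input_dir: str, output_dir: str) -> list:
    """Arguments forwarded to the Beam module (flag names are assumed)."""
    return ["--input", input_dir, "--output", output_dir]


def worker_pool_specs(image_uri: str, data_dir: str) -> list:
    """A single-replica worker pool that runs the custom training container."""
    return [
        {
            "machine_spec": {"machine_type": "n1-standard-4"},
            "replica_count": 1,
            "container_spec": {
                "image_uri": image_uri,
                "args": ["--data-dir", data_dir],
            },
        }
    ]


def build_pipeline():
    """Return the pipeline function; requires kfp and google-cloud-pipeline-components."""
    from kfp import dsl
    from google_cloud_pipeline_components.v1.custom_job import CustomTrainingJobOp
    from google_cloud_pipeline_components.v1.dataflow import DataflowPythonJobOp
    from google_cloud_pipeline_components.v1.wait_gcp_resources import (
        WaitGcpResourcesOp,
    )

    @dsl.pipeline(name="image-preprocess-and-train")
    def pipeline():
        # Launch the Beam module that resizes, grayscales, and extracts features.
        preprocess = DataflowPythonJobOp(
            project=PROJECT,
            location=LOCATION,
            python_module_path=f"{BUCKET}/src/preprocess.py",
            temp_location=f"{BUCKET}/tmp",
            requirements_file_path=f"{BUCKET}/src/requirements.txt",
            args=dataflow_args(f"{BUCKET}/raw", f"{BUCKET}/features"),
        )
        # Block until the Dataflow job launched above has finished.
        wait = WaitGcpResourcesOp(
            gcp_resources=preprocess.outputs["gcp_resources"]
        )
        # Train on the preprocessed output once the wait step completes.
        train = CustomTrainingJobOp(
            project=PROJECT,
            location=LOCATION,
            display_name="image-classifier-training",
            worker_pool_specs=worker_pool_specs(
                TRAINER_IMAGE, f"{BUCKET}/features"
            ),
        )
        train.after(wait)

    return pipeline
```

The explicit `train.after(wait)` is there because the training step consumes no output of the wait step, so the ordering would not otherwise be inferred from data dependencies.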

Comments

guilhermebutzke
Highly Voted 1 year, 2 months ago
Selected Answer: B
My answer: B. Looking at the options, DataflowPythonJobOp can parallelize the preprocessing tasks, which suits image resizing, grayscale conversion, and feature extraction. dsl.ParallelFor could also parallelize tasks, but it is not the most straightforward option for image preprocessing. DataflowPythonJobOp is generally followed by WaitGcpResourcesOp, which waits for the Dataflow job to finish before the next step runs. https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/fe7d3e4b8edc137d90ec061789b879b7cc8d3854/notebooks/community/ml_ops/stage3/get_started_with_dataflow_flex_template_component.ipynb
upvoted 5 times
...
Dirtie_Sinkie
Most Recent 7 months, 1 week ago
Selected Answer: B
B is definitely right, no doubt
upvoted 2 times
...
pinimichele01
1 year ago
Selected Answer: B
https://cloud.google.com/vertex-ai/docs/pipelines/dataflow-component#dataflowpythonjobop
upvoted 1 times
...
b1a8fae
1 year, 3 months ago
Selected Answer: B
I go with B. Custom training is surely required. I'm discarding A because Spark is not mentioned anywhere in the problem description. C involves Kubeflow, which seems a bit overkill imo. The DataflowPythonJobOp operator lets you create a Vertex AI Pipelines component that prepares data, which seems like the appropriate course of action to me. https://cloud.google.com/vertex-ai/docs/pipelines/dataflow-component#dataflowpythonjobop
upvoted 2 times
...
pikachu007
1 year, 3 months ago
Selected Answer: B
A. DataprocSparkBatchOp: While capable of data processing, it is less well suited to image-specific tasks like resizing and grayscale conversion than DataflowPythonJobOp.
C. dsl.ParallelFor, dsl.component: While they offer flexibility, they require more manual orchestration and are potentially less efficient for image preprocessing than DataflowPythonJobOp.
D. ImageDatasetImportDataOp, AutoMLImageTrainingJobRunOp: These components are designed for AutoML image training and are not directly compatible with custom preprocessing and training tasks.
upvoted 1 times
...
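
To make the three preprocessing steps from the question concrete, here is a dependency-free sketch of what such Python functions might look like. The question does not show the actual functions, so every name here is hypothetical, and the specific techniques (nearest-neighbor resizing, luminance-weighted grayscale, a normalized intensity histogram as the feature vector) are assumptions chosen for brevity. Images are represented as nested lists of `(R, G, B)` tuples rather than with an imaging library.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]  # rows of (R, G, B) pixels
GrayImage = List[List[int]]                  # rows of 0..255 intensities


def resize_nearest(pixels: RGBImage, new_w: int, new_h: int) -> RGBImage:
    """Resize by nearest-neighbor sampling (assumed method, simplest to show)."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def to_grayscale(pixels: RGBImage) -> GrayImage:
    """Convert RGB to grayscale using the common ITU-R BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]


def histogram_features(gray: GrayImage, bins: int = 4) -> List[float]:
    """Return a normalized intensity histogram as a simple feature vector."""
    counts = [0] * bins
    for row in gray:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
    total = sum(counts)
    return [count / total for count in counts]


def preprocess(pixels: RGBImage, size: Tuple[int, int] = (2, 2), bins: int = 4) -> List[float]:
    """Chain the three steps: resize, grayscale, then feature extraction."""
    width, height = size
    resized = resize_nearest(pixels, width, height)
    return histogram_features(to_grayscale(resized), bins)
```

A per-image function like `preprocess` is the kind of unit that could be applied to each element inside a Beam pipeline when the work is handed to Dataflow.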