Exam Professional Machine Learning Engineer All Questions

Exam Professional Machine Learning Engineer topic 1 question 70 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 70
Topic #: 1

You lead a data science team at a large international corporation. Most of the models your team trains are large-scale models using high-level TensorFlow APIs on AI Platform with GPUs. Your team usually takes a few weeks or months to iterate on a new version of a model. You were recently asked to review your team’s spending. How should you reduce your Google Cloud compute costs without impacting the model’s performance?

  • A. Use AI Platform to run distributed training jobs with checkpoints.
  • B. Use AI Platform to run distributed training jobs without checkpoints.
  • C. Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs with checkpoints.
  • D. Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs without checkpoints.
Suggested Answer: C

Comments

seifou
Highly Voted 1 year, 11 months ago
Selected Answer: C
https://cloud.google.com/blog/products/ai-machine-learning/reduce-the-costs-of-ml-workflows-with-preemptible-vms-and-gpus?hl=en
upvoted 10 times
...
sashimii14
Most Recent 2 weeks, 3 days ago
Selected Answer: C
C for me
upvoted 1 times
...
PhilipKoku
5 months, 2 weeks ago
Selected Answer: C
C) Preemptible VMs with checkpoints
upvoted 1 times
...
MultiCloudIronMan
7 months, 3 weeks ago
Selected Answer: C
Preemptible VMs are cheaper, and checkpoints let you stop training early once the result is acceptable.
upvoted 3 times
...
libo1985
1 year, 1 month ago
I guess distributed training is not cheap. So C.
upvoted 1 times
...
joaquinmenendez
1 year, 2 months ago
C is the best approach because it allows you to reduce your compute costs without impacting the model's performance. Preemptible VMs are much cheaper than standard VMs, but they can be terminated at any time. By using checkpoints, you can ensure that your training job can be resumed if a preemptible VM is terminated. Even if training takes days, the checkpoints prevent losing progress when preemptible VMs go down.
upvoted 4 times
...
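The checkpoint-and-resume idea described in the comment above can be sketched in plain Python. This is a toy illustration, not the TensorFlow or Kubeflow API: the function names, the JSON checkpoint file, and the `preempt_at` parameter that simulates a VM being reclaimed are all hypothetical.

```python
import json
import os
import tempfile


def train(total_steps, ckpt_path, checkpoint_every=10, preempt_at=None):
    """Toy training loop that resumes from ckpt_path if it exists.

    Returns the final state, or None if the simulated VM was preempted.
    """
    # Resume from the last checkpoint when one is present.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "weight": 0.0}

    while state["step"] < total_steps:
        if preempt_at is not None and state["step"] == preempt_at:
            return None  # simulated preemption: in-memory progress is lost
        state["weight"] += 0.5  # stand-in for one optimizer update
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            # Write to a temp file, then rename, so a preemption in the
            # middle of a write cannot leave a corrupt checkpoint behind.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, ckpt_path)
    return state


def run_with_preemption(total_steps=100, preempt_at=37):
    """Run once with a simulated preemption, then restart and finish."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "ckpt.json")
        first = train(total_steps, path, preempt_at=preempt_at)
        with open(path) as f:
            saved_step = json.load(f)["step"]
        final = train(total_steps, path)  # fresh "VM", same checkpoint
    return first, saved_step, final
```

With the defaults, the first run is cut off at step 37, the restart picks up from the step-30 checkpoint, and only the seven steps after that checkpoint are repeated.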
Liting
1 year, 4 months ago
Selected Answer: C
To optimize cost, use Kubeflow.
upvoted 2 times
...
M25
1 year, 6 months ago
Selected Answer: C
Went with C
upvoted 1 times
...
CloudKida
1 year, 6 months ago
Selected Answer: C
https://cloud.google.com/ai-platform/prediction/docs/ai-explanations/overview AI Explanations helps you understand your model's outputs for classification and regression tasks. Whenever you request a prediction on AI Platform, AI Explanations tells you how much each feature in the data contributed to the predicted result. You can then use this information to verify that the model is behaving as expected, recognize bias in your models, and get ideas for ways to improve your model and your training data.
upvoted 1 times
...
_learner_
1 year, 6 months ago
Selected Answer: A
Preemptible VMs are valid for only 24 hours, but the question says training needs months to complete, which makes A the answer.
upvoted 2 times
...
tavva_prudhvi
1 year, 8 months ago
I think it's A. By using distributed training jobs with checkpoints, you can train your models on multiple GPUs simultaneously, which reduces the training time. Checkpoints allow you to save the progress of your training jobs regularly, so if the training job gets interrupted or fails, you can restart it from the last checkpoint instead of starting from scratch. This saves time and resources, which reduces costs. Additionally, AI Platform's autoscaling feature can automatically adjust the number of resources used based on the workload, further optimizing costs.
upvoted 1 times
...
...
John_Pongthorn
1 year, 10 months ago
Is C out of date? AI Platform is now Vertex AI, so this is a simple scenario for which it would provide the infrastructure.
upvoted 1 times
...
ares81
1 year, 10 months ago
Selected Answer: A
It's A.
upvoted 2 times
...
hiromi
1 year, 11 months ago
Selected Answer: C
It seems to be C - https://www.kubeflow.org/docs/distributions/gke/pipelines/preemptible/ - https://cloud.google.com/optimization/docs/guide/checkpointing
upvoted 4 times
...
ares81
1 year, 11 months ago
"A Preemptible VM (PVM) is a Google Compute Engine (GCE) virtual machine (VM) instance that can be purchased for a steep discount as long as the customer accepts that the instance will terminate after 24 hours." This excludes C and D. Checkpoints are needed for long processing, so A.
upvoted 3 times
...
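One way to sanity-check the 24-hour concern raised above is a back-of-the-envelope estimate of how many VM lifetimes a long checkpointed job would span. The sketch below is an assumption-laden model, not a pricing tool: the function name, the one-hour checkpoint interval, and the pessimistic assumption that each VM loses up to one full checkpoint interval of work are all hypothetical. Only the 24-hour limit comes from the quoted definition.

```python
import math


def preemptible_plan(total_train_hours, max_vm_hours=24.0,
                     checkpoint_every_hours=1.0):
    """Estimate how many VM lifetimes a checkpointed job spans.

    Pessimistic model: every VM runs until its lifetime limit and loses
    up to one checkpoint interval of work when it is terminated.
    """
    if checkpoint_every_hours >= max_vm_hours:
        raise ValueError("checkpoint interval must be shorter than VM lifetime")
    # Hours of progress per VM that are guaranteed to reach a checkpoint.
    useful_hours_per_vm = max_vm_hours - checkpoint_every_hours
    vm_lifetimes = math.ceil(total_train_hours / useful_hours_per_vm)
    restarts = max(vm_lifetimes - 1, 0)
    return {
        "vm_lifetimes": vm_lifetimes,
        "restarts": restarts,
        # Upper bound on repeated work across all restarts.
        "max_rework_hours": restarts * checkpoint_every_hours,
    }
```

For example, `preemptible_plan(504)` models a three-week job per worker. Under these assumptions the 24-hour limit adds restarts and a bounded amount of repeated work, but it does not by itself stop a checkpointed job from finishing.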
neochaotic
1 year, 11 months ago
Selected Answer: C
C - Reduce cost with preemptible instances and add checkpoints to snapshot intermediate results
upvoted 3 times
...
LearnSodas
1 year, 11 months ago
Selected Answer: A
Saving checkpoints avoids re-run from scratch
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other