Exam Professional Machine Learning Engineer topic 1 question 198 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 198
Topic #: 1

You developed a Transformer model in TensorFlow to translate text. Your training data includes millions of documents in a Cloud Storage bucket. You plan to use distributed training to reduce training time. You need to configure the training job while minimizing the effort required to modify code and to manage the cluster’s configuration. What should you do?

  • A. Create a Vertex AI custom training job with GPU accelerators for the second worker pool. Use tf.distribute.MultiWorkerMirroredStrategy for distribution.
  • B. Create a Vertex AI custom distributed training job with Reduction Server. Use N1 high-memory machine type instances for the first and second pools, and use N1 high-CPU machine type instances for the third worker pool.
  • C. Create a training job that uses Cloud TPU VMs. Use tf.distribute.TPUStrategy for distribution.
  • D. Create a Vertex AI custom training job with a single worker pool of A2 GPU machine type instances. Use tf.distribute.MirroredStrategy for distribution.
Suggested Answer: A
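
For context, here is a minimal sketch of what option A means for the training code, assuming a Keras-style Transformer; build_transformer() and make_dataset() are hypothetical stand-ins for the existing model and input-pipeline code. The only change to single-machine code is creating the strategy and building/compiling the model inside its scope.

    import tensorflow as tf

    # Vertex AI sets TF_CONFIG on every replica, so the strategy discovers the
    # cluster without any manual configuration.
    strategy = tf.distribute.MultiWorkerMirroredStrategy()

    with strategy.scope():
        # build_transformer() is a hypothetical stand-in for the existing model code.
        model = build_transformer()
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # TensorFlow reads gs:// paths directly; make_dataset() is a hypothetical
    # tf.data pipeline over the documents in the Cloud Storage bucket.
    train_ds = make_dataset("gs://example-bucket/translation/*.tfrecord")
    model.fit(train_ds, epochs=10)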

Comments

fitri001
6 months, 1 week ago
Selected Answer: A
- Vertex AI custom training job: leverages a managed service within GCP, reducing cluster configuration and management overhead.
- GPU accelerators for the second worker pool: allows distributed training across multiple GPUs, significantly speeding up training compared to a single worker pool.
- tf.distribute.MultiWorkerMirroredStrategy: a TensorFlow strategy designed for distributed training across multiple machines. It minimizes code changes because it handles data parallelization and model replication across devices (see the job-submission sketch after this comment).
upvoted 3 times
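
To illustrate the setup described above, here is a rough sketch of submitting such a job with the google-cloud-aiplatform SDK. The project, bucket, container image, machine types, and replica counts are placeholder assumptions, not values given in the question.

    from google.cloud import aiplatform

    aiplatform.init(
        project="example-project",
        location="us-central1",
        staging_bucket="gs://example-bucket",
    )

    # Worker pool 0 is the chief; worker pool 1 holds the additional GPU workers.
    # Vertex AI provisions the machines and sets TF_CONFIG for each replica.
    gpu_spec = {
        "machine_type": "n1-standard-8",
        "accelerator_type": "NVIDIA_TESLA_T4",
        "accelerator_count": 2,
    }
    worker_pool_specs = [
        {
            "machine_spec": gpu_spec,
            "replica_count": 1,
            "container_spec": {"image_uri": "gcr.io/example-project/trainer:latest"},
        },
        {
            "machine_spec": gpu_spec,
            "replica_count": 3,
            "container_spec": {"image_uri": "gcr.io/example-project/trainer:latest"},
        },
    ]

    job = aiplatform.CustomJob(
        display_name="transformer-translation",
        worker_pool_specs=worker_pool_specs,
    )
    job.run()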
fitri001
6 months, 1 week ago
- B. Reduction Server: while Vertex AI supports Reduction Server, it's generally not required for text translation with Transformers; it's more commonly used for distributed training with specific model architectures.
- C. Cloud TPU VMs: while Cloud TPUs offer excellent performance, they require significant code modifications to work with Transformer models in TensorFlow (see the TPUStrategy sketch after this comment). Managing Cloud TPU VMs also involves more complexity than Vertex AI custom training jobs.
- D. Single worker pool: limits training to a single machine, negating the benefits of distributed training.
upvoted 1 times
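
For comparison, a rough sketch of the extra setup option C typically requires in the training code (shown for a TPU VM, where the resolver uses tpu="local"; build_transformer() is again a hypothetical helper):

    import tensorflow as tf

    # TPU-specific setup that the GPU/MultiWorkerMirroredStrategy path avoids.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="local")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)

    with strategy.scope():
        model = build_transformer()  # hypothetical model-building helper

Input pipelines also often need adjustments (for example, fixed shapes and drop_remainder=True when batching) to run efficiently on TPUs, which adds to the code-modification effort the question asks to minimize.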
Carlose2108
8 months ago
Why not C?
upvoted 2 times
pinimichele01
6 months, 2 weeks ago
For me, it's C.
upvoted 1 times
tavva_prudhvi
5 months, 3 weeks ago
Yeah, but the question says "minimizing the effort required to modify code and to manage the cluster’s configuration", and TPUs may require specific adaptations in the model code to fully exploit TPU capabilities.
upvoted 2 times
guilhermebutzke
8 months, 2 weeks ago
Selected Answer: A
My answer: A.
- Distributed training: uses GPUs in the second worker pool for speedup.
- Minimal code changes: Vertex AI custom job for ease of use.
- Managed cluster: no manual configuration needed.
Other options:
- B: complex setup with different machine types and Reduction Server.
- C: TPUs may not be optimal for Transformers and require code changes.
- D: lacks distributed training, limiting speed improvement.
upvoted 2 times
pikachu007
9 months, 2 weeks ago
Selected Answer: A
- Minimizes code modification: MultiWorkerMirroredStrategy often requires minimal code changes to distribute training across multiple workers, aligning with the goal of minimizing effort.
- Simplifies cluster management: Vertex AI handles cluster configuration and scaling for custom training jobs, reducing the need for manual management (see the TF_CONFIG sketch after this comment).
- Effective distributed training: MultiWorkerMirroredStrategy is well suited to large models and datasets, efficiently distributing training across GPUs.
upvoted 3 times
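
As a small illustration of the cluster-management point: on a Vertex AI custom training job, each replica receives a TF_CONFIG environment variable describing the cluster and the replica's own role, and MultiWorkerMirroredStrategy reads it automatically, so no cluster spec has to be written by hand. A quick sketch to inspect it:

    import json
    import os

    # Vertex AI injects TF_CONFIG, e.g.
    # {"cluster": {"chief": ["..."], "worker": ["...", "..."]},
    #  "task": {"type": "worker", "index": 0}}
    tf_config = json.loads(os.environ.get("TF_CONFIG", "{}"))
    print("cluster:", tf_config.get("cluster"))
    print("task:", tf_config.get("task"))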
Community vote distribution: A (35%), C (25%), B (20%), Other