Exam Professional Machine Learning Engineer topic 1 question 198 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 198
Topic #: 1

You developed a Transformer model in TensorFlow to translate text. Your training data includes millions of documents in a Cloud Storage bucket. You plan to use distributed training to reduce training time. You need to configure the training job while minimizing the effort required to modify code and to manage the cluster’s configuration. What should you do?

  • A. Create a Vertex AI custom training job with GPU accelerators for the second worker pool. Use tf.distribute.MultiWorkerMirroredStrategy for distribution.
  • B. Create a Vertex AI custom distributed training job with Reduction Server. Use N1 high-memory machine type instances for the first and second pools, and use N1 high-CPU machine type instances for the third worker pool.
  • C. Create a training job that uses Cloud TPU VMs. Use tf.distribute.TPUStrategy for distribution.
  • D. Create a Vertex AI custom training job with a single worker pool of A2 GPU machine type instances. Use tf.distribute.MirroredStrategy for distribution.
Suggested Answer: A
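
For context, here is a minimal sketch of what option A means for the training code, assuming a Keras-style Transformer; build_transformer() and make_dataset() are hypothetical stand-ins for the existing model and input-pipeline code. The only change to single-machine code is creating the strategy and building/compiling the model inside its scope.

    import tensorflow as tf

    # Vertex AI sets TF_CONFIG on every replica, so the strategy discovers the
    # cluster without any manual configuration.
    strategy = tf.distribute.MultiWorkerMirroredStrategy()

    with strategy.scope():
        # build_transformer() is a hypothetical stand-in for the existing model code.
        model = build_transformer()
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # TensorFlow reads gs:// paths directly; make_dataset() is a hypothetical
    # tf.data pipeline over the documents in the Cloud Storage bucket.
    train_ds = make_dataset("gs://example-bucket/translation/*.tfrecord")
    model.fit(train_ds, epochs=10)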

Comments

fitri001
6 months, 1 week ago
Selected Answer: A
- Vertex AI custom training job: leverages a managed service within GCP, reducing cluster configuration and management overhead.
- GPU accelerators for the second worker pool: allows distributed training across multiple GPUs, significantly speeding up training compared to a single worker pool.
- tf.distribute.MultiWorkerMirroredStrategy: a TensorFlow strategy designed for distributed training across multiple machines. It minimizes code changes because it handles data parallelization and model replication across devices (see the job-submission sketch after this comment).
upvoted 3 times
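
To illustrate the setup described above, here is a rough sketch of submitting such a job with the google-cloud-aiplatform SDK. The project, bucket, container image, machine types, and replica counts are placeholder assumptions, not values given in the question.

    from google.cloud import aiplatform

    aiplatform.init(
        project="example-project",
        location="us-central1",
        staging_bucket="gs://example-bucket",
    )

    # Worker pool 0 is the chief; worker pool 1 holds the additional GPU workers.
    # Vertex AI provisions the machines and sets TF_CONFIG for each replica.
    gpu_spec = {
        "machine_type": "n1-standard-8",
        "accelerator_type": "NVIDIA_TESLA_T4",
        "accelerator_count": 2,
    }
    worker_pool_specs = [
        {
            "machine_spec": gpu_spec,
            "replica_count": 1,
            "container_spec": {"image_uri": "gcr.io/example-project/trainer:latest"},
        },
        {
            "machine_spec": gpu_spec,
            "replica_count": 3,
            "container_spec": {"image_uri": "gcr.io/example-project/trainer:latest"},
        },
    ]

    job = aiplatform.CustomJob(
        display_name="transformer-translation",
        worker_pool_specs=worker_pool_specs,
    )
    job.run()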
fitri001
6 months, 1 week ago
- B. Reduction Server: while Vertex AI supports Reduction Server, it's generally not required for text translation with Transformers; it's more commonly used for distributed training with specific model architectures.
- C. Cloud TPU VMs: while Cloud TPUs offer excellent performance, they require significant code modifications to work with Transformer models in TensorFlow (see the TPUStrategy sketch after this comment). Managing Cloud TPU VMs also involves more complexity than Vertex AI custom training jobs.
- D. Single worker pool: limits training to a single machine, negating the benefits of distributed training.
upvoted 1 times
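
For comparison, a rough sketch of the extra setup option C typically requires in the training code (shown for a TPU VM, where the resolver uses tpu="local"; build_transformer() is again a hypothetical helper):

    import tensorflow as tf

    # TPU-specific setup that the GPU/MultiWorkerMirroredStrategy path avoids.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="local")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)

    with strategy.scope():
        model = build_transformer()  # hypothetical model-building helper

Input pipelines also often need adjustments (for example, fixed shapes and drop_remainder=True when batching) to run efficiently on TPUs, which adds to the code-modification effort the question asks to minimize.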
Carlose2108
8 months ago
Why not C?
upvoted 2 times
pinimichele01
6 months, 2 weeks ago
For me, it's C.
upvoted 1 times
tavva_prudhvi
5 months, 3 weeks ago
Yeah, but the question says "minimizing the effort required to modify code and to manage the cluster’s configuration", and TPUs may require specific adaptations in the model code to fully exploit TPU capabilities.
upvoted 2 times
guilhermebutzke
8 months, 2 weeks ago
Selected Answer: A
My answer: A.
- Distributed training: uses GPUs in the second worker pool for speedup.
- Minimal code changes: Vertex AI custom job for ease of use.
- Managed cluster: no manual configuration needed.
Other options:
- B: complex setup with different machine types and Reduction Server.
- C: TPUs may not be optimal for Transformers and require code changes.
- D: lacks distributed training, limiting speed improvement.
upvoted 2 times
pikachu007
9 months, 2 weeks ago
Selected Answer: A
- Minimizes code modification: MultiWorkerMirroredStrategy often requires minimal code changes to distribute training across multiple workers, aligning with the goal of minimizing effort.
- Simplifies cluster management: Vertex AI handles cluster configuration and scaling for custom training jobs, reducing the need for manual management (see the TF_CONFIG sketch after this comment).
- Effective distributed training: MultiWorkerMirroredStrategy is well suited to large models and datasets, efficiently distributing training across GPUs.
upvoted 3 times
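
As a small illustration of the cluster-management point: on a Vertex AI custom training job, each replica receives a TF_CONFIG environment variable describing the cluster and the replica's own role, and MultiWorkerMirroredStrategy reads it automatically, so no cluster spec has to be written by hand. A quick sketch to inspect it:

    import json
    import os

    # Vertex AI injects TF_CONFIG, e.g.
    # {"cluster": {"chief": ["..."], "worker": ["...", "..."]},
    #  "task": {"type": "worker", "index": 0}}
    tf_config = json.loads(os.environ.get("TF_CONFIG", "{}"))
    print("cluster:", tf_config.get("cluster"))
    print("task:", tf_config.get("task"))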
Community vote distribution: A (35%), C (25%), B (20%), Other