Exam Professional Machine Learning Engineer topic 1 question 289 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 289
Topic #: 1

You developed a BigQuery ML linear regressor model by using a training dataset stored in a BigQuery table. New data is added to the table every minute. You are using Cloud Scheduler and Vertex AI Pipelines to automate hourly model training, and use the model for direct inference. The feature preprocessing logic includes quantile bucketization and MinMax scaling on data received in the last hour. You want to minimize storage and computational overhead. What should you do?

  • A. Preprocess and stage the data in BigQuery prior to feeding it to the model during training and inference.
  • B. Use the TRANSFORM clause in the CREATE MODEL statement in the SQL query to calculate the required statistics.
  • C. Create a component in the Vertex AI Pipelines directed acyclic graph (DAG) to calculate the required statistics, and pass the statistics on to subsequent components.
  • D. Create SQL queries to calculate and store the required statistics in separate BigQuery tables that are referenced in the CREATE MODEL statement.
Suggested Answer: B
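
For reference, here is a rough sketch of what option B could look like. It only builds the SQL text in Python and does not call BigQuery. The model, table, and column names (`feature_a`, `feature_b`, `label`, `event_time`) and the bucket count of 10 are hypothetical placeholders, not taken from the question. `ML.QUANTILE_BUCKETIZE` and `ML.MIN_MAX_SCALER` are BigQuery ML preprocessing functions; when they are used inside `TRANSFORM`, the statistics computed at training time are stored with the model and applied again at prediction time.

```python
def build_create_model_sql(model: str, source_table: str) -> str:
    """Build a CREATE MODEL statement that preprocesses inside TRANSFORM.

    All column names below are hypothetical; adapt them to the real schema.
    """
    return f"""
CREATE OR REPLACE MODEL `{model}`
TRANSFORM(
  -- Quantile bucketization (10 buckets is an arbitrary choice here)
  ML.QUANTILE_BUCKETIZE(feature_a, 10) OVER () AS feature_a_bucket,
  -- MinMax scaling to the range [0, 1]
  ML.MIN_MAX_SCALER(feature_b) OVER () AS feature_b_scaled,
  label
)
OPTIONS(
  model_type = 'LINEAR_REG',
  input_label_cols = ['label']
) AS
SELECT feature_a, feature_b, label
FROM `{source_table}`
-- Train only on rows received in the last hour
WHERE event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
""".strip()


if __name__ == "__main__":
    # Hypothetical dataset, model, and table names.
    print(build_create_model_sql("my_dataset.my_model", "my_dataset.my_table"))
```

A pipeline component could submit the generated statement every hour. With this approach no separate statistics table or staged copy of the preprocessed data is needed, which is the argument usually made for B.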

Comments

f084277
6 days, 8 hours ago
Selected Answer: C
The docs say BQ is not suitable for full-pass transformations such as MinMax scaling.
upvoted 1 times
carolctech
3 weeks, 5 days ago
Selected Answer: A
A) Preprocessing and staging the data in BigQuery before training and inference is the most efficient approach because it: 1) uses BQ's optimized processing by preprocessing data before training; 2) avoids redundant calculations by directly using the preprocessed data (already bucketized and scaled) for training and inference; 3) reduces storage by keeping only the preprocessed data, rather than storing raw data and statistics separately.
upvoted 1 times
Community vote distribution
A (35%)
C (25%)
B (20%)
Other