Exam Professional Data Engineer topic 1 question 171 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 171
Topic #: 1

You work for a large real estate firm and are preparing 6 TB of home sales data to be used for machine learning. You will use SQL to transform the data and use BigQuery ML to create a machine learning model. You plan to use the model for predictions against a raw dataset that has not been transformed. How should you set up your workflow in order to prevent skew at prediction time?

  • A. When creating your model, use BigQuery's TRANSFORM clause to define preprocessing steps. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any transformations on the raw input data.
  • B. When creating your model, use BigQuery's TRANSFORM clause to define preprocessing steps. Before requesting predictions, use a saved query to transform your raw input data, and then use ML.EVALUATE.
  • C. Use a BigQuery view to define your preprocessing logic. When creating your model, use the view as your model training data. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any transformations on the raw input data.
  • D. Preprocess all data using Dataflow. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any further transformations on the input data.
Suggested Answer: A 🗳️
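
A minimal sketch of what option A could look like, written as Python strings holding BigQuery SQL (the kind of text one might submit through a BigQuery client). The dataset, table, and column names (`mydataset`, `home_sales_raw`, `square_feet`, and so on) are hypothetical, as is the choice of a linear regression and of these particular preprocessing functions. Note that the question speaks of ML.EVALUATE at prediction time; ML.PREDICT is the function that returns predictions, and a TRANSFORM clause defined at training time is applied automatically in both.

```python
# All dataset, table, and column names below are hypothetical.

# Training: preprocessing lives inside the model via the TRANSFORM clause.
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `mydataset.home_price_model`
TRANSFORM(
  ML.STANDARD_SCALER(square_feet) OVER () AS square_feet_scaled,
  ML.QUANTILE_BUCKETIZE(year_built, 5) OVER () AS year_built_bucket,
  sale_price
)
OPTIONS(model_type = 'linear_reg', input_label_cols = ['sale_price']) AS
SELECT square_feet, year_built, sale_price
FROM `mydataset.home_sales_raw`
"""

# Evaluation: raw columns only; the saved TRANSFORM is applied automatically.
EVALUATE_SQL = """
SELECT *
FROM ML.EVALUATE(
  MODEL `mydataset.home_price_model`,
  (SELECT square_feet, year_built, sale_price
   FROM `mydataset.home_sales_holdout_raw`)
)
"""

# Prediction: again raw columns only, with no preprocessing repeated here.
PREDICT_SQL = """
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.home_price_model`,
  (SELECT square_feet, year_built
   FROM `mydataset.new_listings_raw`)
)
"""

if __name__ == "__main__":
    for name, sql in [("create", CREATE_MODEL_SQL),
                      ("evaluate", EVALUATE_SQL),
                      ("predict", PREDICT_SQL)]:
        print(f"-- {name}{sql}")
```

Because the scaling and bucketing are declared only once, in the CREATE MODEL statement, the evaluation and prediction queries cannot drift away from the training-time preprocessing.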

Comments

AWSandeep
Highly Voted 2 years, 3 months ago
Selected Answer: A
A. When creating your model, use BigQuery's TRANSFORM clause to define preprocessing steps. At prediction time, use BigQuery's ML.EVALUATE clause without specifying any transformations on the raw input data. Using the TRANSFORM clause, you can specify all preprocessing during model creation. The preprocessing is automatically applied during the prediction and evaluation phases of machine learning. Reference: https://cloud.google.com/bigquery-ml/docs/bigqueryml-transform
upvoted 14 times
...
zellck
Highly Voted 2 years ago
Selected Answer: A
A is the answer. https://cloud.google.com/bigquery-ml/docs/bigqueryml-transform Using the TRANSFORM clause, you can specify all preprocessing during model creation. The preprocessing is automatically applied during the prediction and evaluation phases of machine learning
upvoted 6 times
...
SamuelTsch
Most Recent 2 months ago
Selected Answer: A
A
upvoted 1 times
...
Lenifia
5 months, 3 weeks ago
Selected Answer: B
The key to preventing skew in machine learning models is to ensure that the same data preprocessing steps are applied consistently to both the training data and the prediction data. In option B, the TRANSFORM clause in BigQuery ML is used to define preprocessing steps during model creation, and a saved query is used to apply the same transformations to the raw input data before making predictions. This ensures consistency and prevents skew. The ML.EVALUATE function is then used to evaluate the model’s performance on the transformed prediction data. This is the recommended workflow
upvoted 2 times
...
Matt_108
11 months, 2 weeks ago
Selected Answer: A
Option A
upvoted 1 times
...
Prudvi3266
1 year, 8 months ago
Selected Answer: A
A is the correct answer. If we use the TRANSFORM clause in BigQuery ML, there is no need to apply any transformations when evaluating or predicting. https://cloud.google.com/bigquery/docs/bigqueryml-transform
upvoted 3 times
...
Kvk117
1 year, 11 months ago
Selected Answer: A
A is the correct answer
upvoted 2 times
...
jkhong
2 years ago
Selected Answer: A
Problem: skew. One thing that I overlooked when answering previously is that B and C do not address skew. When we preprocess our training data, we need to save the scaling factors somewhere, and when performing predictions on our test data, we need to apply the scaling factors of our training data. ML.EVALUATE already applies the preprocessing steps to our test data using the saved scaling factors.
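
To illustrate the scaling-factor point, here is a small sketch in plain Python. It is a generic z-score example with made-up numbers, not BigQuery's implementation; the helper names `fit_scaler` and `apply_scaler` are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Compute scaling factors (mean and population std) from the given data."""
    return {"mean": mean(values), "std": pstdev(values)}


def apply_scaler(scaler, values):
    """Standardize values using previously saved scaling factors."""
    return [(v - scaler["mean"]) / scaler["std"] for v in values]


# Made-up square footage values, for illustration only.
train_sqft = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
raw_sqft = [2000.0, 4000.0]

# Consistent: reuse the factors learned from the training data.
saved = fit_scaler(train_sqft)
consistent = apply_scaler(saved, raw_sqft)

# Skewed: recompute factors from the prediction data itself.
skewed = apply_scaler(fit_scaler(raw_sqft), raw_sqft)

if __name__ == "__main__":
    print("consistent:", consistent)
    print("skewed:    ", skewed)
```

The same 2000 sq ft home maps to 0.0 with the training factors but to -1.0 when the factors are recomputed from the prediction data, so the model would see a different input than the one it was trained to interpret.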
upvoted 3 times
...
GCPSharon
2 years, 2 months ago
Selected Answer: C
Stew prediction time by remove the preprocessing!
upvoted 1 times
...
TNT87
2 years, 3 months ago
Selected Answer: A
https://cloud.google.com/bigquery-ml/docs/bigqueryml-transform Ans A
upvoted 4 times
...
ducc
2 years, 3 months ago
Selected Answer: A
This query's nested SELECT statement and FROM clause are the same as those in the CREATE MODEL query. Because the TRANSFORM clause is used in training, you don't need to specify the specific columns and transformations. They are automatically restored. Reference: https://cloud.google.com/bigquery-ml/docs/bigqueryml-transform
upvoted 2 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other