Exam Professional Machine Learning Engineer topic 1 question 36 discussion

Actual exam question from Google's Professional Machine Learning Engineer

Question #: 36
Topic #: 1

You are building a model to predict daily temperatures. You split the data randomly and then transformed the training and test datasets. Temperature data for model training is uploaded hourly. During testing, your model performed with 97% accuracy; however, after deploying to production, the model's accuracy dropped to 66%. How can you make your production model more accurate?

A. Normalize the data for the training and test datasets as two separate steps.
B. Split the training and test data based on time rather than a random split to avoid leakage.
C. Add more data to your test set to ensure that you have a fair distribution and sample for testing.
D. Apply data transformations before splitting, and cross-validate to make sure that the transformations are applied to both the training and test sets.

Suggested Answer: B

by maartenalexander at June 22, 2021, 12:46 p.m.

Comments

maartenalexander

Highly Voted 3 years, 10 months ago

B. If you do time series prediction, you can't borrow information from the future to predict the future. If you do, you are artificially increasing your accuracy.
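
A minimal sketch of what option B means in practice, assuming the data is a list of (timestamp, temperature) pairs. The function name `time_based_split`, the 80/20 cutoff, and the synthetic readings are illustrative assumptions, not part of the question:

```python
from datetime import datetime, timedelta


def time_based_split(rows, train_fraction=0.8):
    """Split (timestamp, value) rows so every training row precedes every test row."""
    # Sort by timestamp so the cutoff is a point in time, not a random draw.
    ordered = sorted(rows, key=lambda row: row[0])
    cutoff = int(len(ordered) * train_fraction)
    return ordered[:cutoff], ordered[cutoff:]


# Hypothetical hourly temperature readings (synthetic values for illustration only).
start = datetime(2021, 1, 1)
readings = [(start + timedelta(hours=h), 10.0 + (h % 24) * 0.5) for h in range(240)]

train, test = time_based_split(readings)

# The latest training timestamp is strictly earlier than the earliest test timestamp.
latest_train = max(ts for ts, _ in train)
earliest_test = min(ts for ts, _ in test)
```

With this split the model is evaluated only on hours it could not have seen during training, which is closer to how it will be used in production.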

upvoted 35 times

...

desertlotus1211

Most Recent 3 months, 3 weeks ago

Selected Answer: B

D is incorrect: Applying transformations before splitting is important, but it does not resolve the issue of time leakage. Even if transformations are done correctly, the random split will still lead to inflated test accuracy and poor production performance. This option focuses on correct data processing, but it does not address the leakage caused by random splitting in time series data.

upvoted 1 times

...

baimus

7 months, 1 week ago

Selected Answer: D

It's D

upvoted 1 times

baimus

7 months, 1 week ago

B I mean. Sorry I wrote that comment very early and there is no delete key!

upvoted 1 times

...

jsalvasoler

8 months, 2 weeks ago

Selected Answer: B

A temporal split is a must when evaluating time series forecasts.
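
If cross-validation is still wanted (as option D suggests), one common variant for time series is an expanding-window scheme where each fold trains only on earlier samples. A rough sketch, assuming samples are already sorted by time; the helper name `expanding_window_splits` and its parameters are hypothetical:

```python
def expanding_window_splits(n_samples, n_folds, test_size):
    """Yield (train_indices, test_indices) where training always precedes testing."""
    for fold in range(n_folds):
        # Later folds test on later blocks and train on everything before them.
        test_end = n_samples - (n_folds - 1 - fold) * test_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested folds")
        yield list(range(test_start)), list(range(test_start, test_end))


splits = list(expanding_window_splits(n_samples=10, n_folds=3, test_size=2))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Libraries such as scikit-learn offer a similar splitter (`TimeSeriesSplit`); the hand-written version above is only meant to show the idea.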

upvoted 1 times

...

PhilipKoku

10 months, 2 weeks ago

Selected Answer: B

B) Time split to avoid leaking data.

upvoted 1 times

...

fragkris

1 year, 4 months ago

Selected Answer: B

Definitely B

upvoted 1 times

...

Sum_Sum

1 year, 5 months ago

Selected Answer: B

they did not explicitly say forecasting, but splitting by time is the number one rule you learn

upvoted 1 times

...

M25

1 year, 11 months ago

Selected Answer: B

Went with B

upvoted 1 times

...

SergioRubiano

2 years ago

Selected Answer: D

D is correct. cross-validate

upvoted 2 times

...

Mohamed_Mossad

2 years, 10 months ago

Selected Answer: B

Test accuracy 97%, production accuracy 66% ---> time series data ---> random split ---> causes leakage. Answer is B.

upvoted 2 times

...

David_ml

2 years, 11 months ago

Selected Answer: B

You don't split data randomly for time series prediction.

upvoted 3 times

...

mmona19

3 years ago

Selected Answer: B

B should be the answer. D is incorrect because normalizing before the split causes data leakage: https://community.rapidminer.com/discussion/32592/normalising-data-before-data-split-or-after
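
To illustrate the point about normalization, here is a small sketch that computes the scaling statistics from the training data only and then reuses them on the test data. The helper names and the temperature values are made up for illustration:

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Compute mean and standard deviation from the training values only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # guard against a zero standard deviation
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply previously fitted statistics to any set of values."""
    return [(v - mu) / sigma for v in values]


# Hypothetical values: the test period is warmer than the training period.
train_temps = [10.0, 12.0, 14.0, 16.0]
test_temps = [30.0, 32.0]

mu, sigma = fit_standardizer(train_temps)
train_scaled = standardize(train_temps, mu, sigma)
test_scaled = standardize(test_temps, mu, sigma)

# Leaky alternative: statistics computed before splitting also "see" the test values.
leaky_mu = mean(train_temps + test_temps)
```

Here `mu` reflects only the training period, while `leaky_mu` is pulled toward the test values, which is the kind of information that would not be available at prediction time.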

upvoted 2 times

...

giaZ

3 years, 1 month ago

Selected Answer: B

If you do a random split on a time series, you risk that the training data will contain information about the target (the definition of leakage), but similar data won't be available when the model is used for prediction. Leakage causes the model to look accurate until you start making actual predictions with it.
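
A quick way to see this with hourly data is to count how many test hours have a training sample right next to them in time. The sketch below uses synthetic hour indices and a hypothetical helper, `share_with_training_neighbor`; it is an illustration of the effect, not a measurement from the question's dataset:

```python
import random


def share_with_training_neighbor(train_hours, test_hours):
    """Fraction of test hours that have a training sample exactly one hour away."""
    train_set = set(train_hours)
    hits = sum(1 for h in test_hours if (h - 1) in train_set or (h + 1) in train_set)
    return hits / len(test_hours)


hours = list(range(1000))  # hypothetical hourly sample indices

# Random split: shuffle, then take 80% for training.
rng = random.Random(0)
shuffled = hours[:]
rng.shuffle(shuffled)
random_train, random_test = shuffled[:800], shuffled[800:]

# Time-based split: the first 80% of hours for training, the rest for testing.
time_train, time_test = hours[:800], hours[800:]

random_share = share_with_training_neighbor(random_train, random_test)
time_share = share_with_training_neighbor(time_train, time_test)
print(f"random split: {random_share:.3f}, time split: {time_share:.3f}")
```

With a random split, most test hours sit right beside a training hour, so the test set is nearly an interpolation exercise. With a time-based split, only the first test hour touches the training period.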

upvoted 3 times

...

xiaoF

3 years, 2 months ago

agree B as well

upvoted 2 times

...

JobQ

3 years, 3 months ago

I think it is B

upvoted 2 times

...

Danny2021

3 years, 7 months ago

B. D doesn't improve anything at all. Split and Transform is no different than Transform and Split if the transform logic is the same.

upvoted 3 times

...

Jijiji

3 years, 7 months ago

seems like D

upvoted 1 times

...
