Welcome to ExamTopics
ExamTopics Logo
- Expert Verified, Online, Free.
exam questions

Exam Professional Data Engineer All Questions

View all questions & answers for the Professional Data Engineer exam

Exam Professional Data Engineer topic 1 question 2 discussion

Actual exam question from Google's Professional Data Engineer
Question #: 2
Topic #: 1
[All Professional Data Engineer Questions]

You are building a model to make clothing recommendations. You know a user's fashion preference is likely to change over time, so you build a data pipeline to stream new data back to the model as it becomes available. How should you use this data to train the model?

  • A. Continuously retrain the model on just the new data.
  • B. Continuously retrain the model on a combination of existing data and the new data.
  • C. Train on the existing data while using the new data as your test set.
  • D. Train on the new data while using the existing data as your test set.
Show Suggested Answer Hide Answer
Suggested Answer: B 🗳️

Comments

Chosen Answer:
This is a voting comment (?) , you can switch to a simple comment.
Switch to a voting comment New
serg3d
Highly Voted 4 years, 5 months ago
I think it should be B because we have to use a combination of old and new test data as well as training data
upvoted 36 times
dambilwa
4 years, 5 months ago
Yes - The training set should be shuffled well to represent data across all scenarios
upvoted 4 times
...
...
jagadamba
Highly Voted 4 years, 4 months ago
B, as we need to train the data with new data, so that it will keep learning, and as well as used for test
upvoted 11 times
...
SamuelTsch
Most Recent 1 month ago
Selected Answer: B
From my point of view, we should take both datasets.
upvoted 1 times
...
rocky48
1 year ago
Selected Answer: B
Option A is not recommended because retraining the model on just new data will cause the model to lose the information it has learned from the historical data. Option C and D are not recommended because they are using the new data as test set and this approach will lead to a model that is overfitting and not generalize well to new users. So answer is B
upvoted 1 times
...
rajkinz
1 year ago
Answer is C. It is time sensitive data so latest data should be used for testing. Reference: https://cloud.google.com/automl-tables/docs/prepare#ml-use
upvoted 2 times
...
rtcpost
1 year, 1 month ago
Selected Answer: B
This approach allows the model to benefit from both the historical data (existing data) and the new data, ensuring that it adapts to changing preferences while retaining knowledge from the past. By combining both types of data, the model can learn to make recommendations that are up-to-date and relevant to users' evolving preferences.
upvoted 1 times
...
Websurfer
1 year, 3 months ago
Selected Answer: B
train on old and new data
upvoted 1 times
...
AmmarFasih
1 year, 6 months ago
Selected Answer: B
Option B is the right answer. Since the questions states the models needs to be updated since the clothing preference changes. Hence we need the new data to be utilized for training/ updating model.
upvoted 1 times
...
bha11111
1 year, 8 months ago
Selected Answer: B
Have verified this
upvoted 1 times
...
jin0
1 year, 9 months ago
there are two point first when retraining second what data. I think retraining should be occur when the model could not predict well in this case there is monitoring metric should be needed first but no one said, second what data? in this case I think the answer is A. because when the model could not predict well it means the data variance and bias are changed so, it's no make sense what is combination new data with old data because the data being not be changed is not necessary anymore..
upvoted 1 times
jin0
1 year, 9 months ago
And the questions should explain in detail.. whether it's deep learning or tree based machine learning model.. and how large of new dataset is.. I think
upvoted 1 times
...
...
Morock
1 year, 9 months ago
Selected Answer: C
The trend keep changing, so must mix new and old data...
upvoted 1 times
...
samdhimal
1 year, 10 months ago
Selected Answer: B
B. Continuously retrain the model on a combination of existing data and the new data. This approach will help to ensure that the model remains up-to-date with the latest fashion preferences of the users, while also leveraging the historical data to provide context and improve the accuracy of the recommendations. Retraining the model on a combination of existing and new data will help to prevent the model from being overly influenced by the new data and losing its ability to generalize to users with different preferences. Option A is not recommended because retraining the model on just new data will cause the model to lose the information it has learned from the historical data. Option C and D are not recommended because they are using the new data as test set and this approach will lead to a model that is overfitting and not generalize well to new users.
upvoted 5 times
rocky48
1 year ago
Nice explanation bro.
upvoted 1 times
...
...
korntewin
1 year, 10 months ago
The answer can be A, if we implement online learning! But for regular model which can't implement online learning (everything with no gradient descent) the answer should be B.
upvoted 1 times
...
testoneAZ
1 year, 10 months ago
Correct answer is B
upvoted 1 times
...
javibadillo
2 years ago
Selected Answer: B
B is the correct answer
upvoted 2 times
...
Maeel92
2 years, 1 month ago
B is the correct answer. Model training using existing data + new data will allow model to learn new behaviour of users while retaining some of the older behaviour to make model.
upvoted 1 times
...
Hikky_111
2 years, 5 months ago
B seems like the best option because we need to make the prediction based on the latest preferences by an individual also we need not forget how the user used to prefer before .It's highly unlikely that a person's fashion sense changes a lot.
upvoted 1 times
...
Community vote distribution
A (35%)
C (25%)
B (20%)
Other
Most Voted
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

SaveCancel
Loading ...