
Exam Professional Machine Learning Engineer topic 1 question 6 discussion

Actual exam question from Google's Professional Machine Learning Engineer
Question #: 6
Topic #: 1

You work for an online retail company that is creating a visual search engine. You have set up an end-to-end ML pipeline on Google Cloud to classify whether an image contains your company's product. Expecting the release of new products in the near future, you configured a retraining functionality in the pipeline so that new data can be fed into your ML models. You also want to use AI Platform's continuous evaluation service to ensure that the models have high accuracy on your test dataset. What should you do?

  • A. Keep the original test dataset unchanged even if newer products are incorporated into retraining.
  • B. Extend your test dataset with images of the newer products when they are introduced to retraining.
  • C. Replace your test dataset with images of the newer products when they are introduced to retraining.
  • D. Update your test dataset with images of the newer products when your evaluation metrics drop below a pre-decided threshold.
Suggested Answer: B
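To make the suggested answer concrete, here is a minimal sketch of what option B looks like in practice: when new-product images are introduced to retraining, they are appended to the existing test set rather than replacing it. The JSONL layout, file names, and helper function are assumptions for illustration, not part of AI Platform's API.

    import json

    def extend_test_dataset(existing_path: str, new_products_path: str, out_path: str) -> None:
        """Append new-product examples to the test set, keeping the old ones."""
        with open(existing_path) as f:
            records = [json.loads(line) for line in f]       # original test set
        with open(new_products_path) as f:
            records += [json.loads(line) for line in f]      # new-product images
        with open(out_path, "w") as f:
            for record in records:                           # e.g. {"image_uri": ..., "label": ...}
                f.write(json.dumps(record) + "\n")

    # Extend, don't replace: the model is still evaluated on the old products
    # it must keep recognizing, plus the new ones.
    extend_test_dataset("test_set.jsonl", "new_products.jsonl", "test_set_extended.jsonl")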

Comments

esuaaaa
Highly Voted 3 years, 5 months ago
I think B is the right answer. A: Doesn't make sense; if the new products are never tested, the evaluation becomes useless. C: The conventional products are still necessary as test data. D: I don't see the need to wait until a threshold is crossed.
upvoted 32 times
mousseUwU
3 years, 1 month ago
Agree with you, B is correct
upvoted 1 times
...
q4exam
3 years, 2 months ago
Agree, B, as it extends the test set to the new products.
upvoted 1 times
...
VincenzoP84
1 year, 6 months ago
D could make sense, considering that the question explicitly mentions the intention to use AI Platform's continuous evaluation service.
upvoted 2 times
...
maukaba
1 year ago
It's D for two reasons: the question explicitly requires leveraging the continuous evaluation service, and the threshold check lets you decide when to retrain instead of retraining for every batch of new data that arrives.
upvoted 2 times
...
...
gcp2021go
Highly Voted 3 years, 5 months ago
answer is B
upvoted 11 times
...
joqu
Most Recent 4 days, 18 hours ago
Selected Answer: D
Task: "You also want to use AI Platform's continuous evaluation service to ensure that the models have high accuracy on your test dataset." Docs: https://cloud.google.com/vertex-ai/docs/evaluation/introduction "After your model is deployed to production, periodically evaluate your model with new incoming data. If the evaluation metrics show that your model performance is degrading, consider re-training your model. This process is called continuous evaluation."
upvoted 1 times
...
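For contrast, here is a sketch of the reactive pattern that maukaba and joqu describe for option D: the test set is refreshed only after the continuous evaluation service reports a metric below a pre-decided threshold. The metrics dict and the threshold value are assumptions for illustration; this is not an actual AI Platform client call.

    ACCURACY_THRESHOLD = 0.90  # the "pre-decided threshold" from option D (assumed value)

    def needs_test_set_refresh(latest_evaluation: dict) -> bool:
        """True only after the latest continuous-evaluation run has degraded,
        i.e. the test set is updated reactively rather than proactively."""
        return latest_evaluation.get("accuracy", 1.0) < ACCURACY_THRESHOLD

    # Option B avoids this gate entirely: the test set is extended as soon as
    # new products are introduced, before accuracy has a chance to drop.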
503b759
1 month, 1 week ago
D: It's definitely not a clear choice. B is the most obvious answer: you know you've got new data coming in, so why not incorporate it immediately? Except that the question clearly states that continuous evaluation should feature.
upvoted 1 times
...
MisterHairy
1 month, 4 weeks ago
=New Question 6= You work for a global footwear retailer and need to predict when an item will be out of stock based on historical inventory data. Customer behavior is highly dynamic, since footwear demand is influenced by many different factors. You want to serve models that are trained on all available data, but track your performance on specific subsets of data before pushing to production. What is the most streamlined and reliable way to perform this validation?
  • A. Use the TFX ModelValidator tools to specify performance metrics for production readiness.
  • B. Use k-fold cross-validation as a validation strategy to ensure that your model is ready for production.
  • C. Use the last relevant week of data as a validation set to ensure that your model is performing accurately on current data.
  • D. Use the entire dataset and treat the area under the receiver operating characteristic curve (AUC ROC) as the main metric.
upvoted 4 times
oddsoul
2 months, 1 week ago
Option A. You can define specific performance metrics for different subsets of your data.
upvoted 1 times
...
VJlaxmi
5 months, 3 weeks ago
option A is correct
upvoted 1 times
VJlaxmi
5 months, 3 weeks ago
TFX ModelValidator tools are designed to integrate performance tracking into the ML pipeline, providing robust validation on specific subsets of data before deploying models to production.
upvoted 1 times
...
...
wences
2 years, 9 months ago
A is correct.
upvoted 6 times
...
sid515
2 years, 10 months ago
B looks OK to me, since with cross-validation the test results are more even.
upvoted 2 times
...
...
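For the embedded footwear question above, here is a hedged sketch of option A: in current TFX, the old ModelValidator functionality lives in the Evaluator component, which can compute metrics on specific data slices and gate production readiness with thresholds. The label key, the "region" slice feature, and the 0.8 AUC bound are assumptions for illustration.

    import tensorflow_model_analysis as tfma

    eval_config = tfma.EvalConfig(
        model_specs=[tfma.ModelSpec(label_key="out_of_stock")],
        slicing_specs=[
            tfma.SlicingSpec(),                         # overall metrics
            tfma.SlicingSpec(feature_keys=["region"]),  # metrics per data subset (assumed feature)
        ],
        metrics_specs=[
            tfma.MetricsSpec(metrics=[
                tfma.MetricConfig(
                    class_name="AUC",
                    threshold=tfma.MetricThreshold(
                        value_threshold=tfma.GenericValueThreshold(
                            lower_bound={"value": 0.8}))),
            ]),
        ],
    )

    # In a TFX pipeline this config would feed the Evaluator component, e.g.:
    # from tfx.components import Evaluator
    # evaluator = Evaluator(
    #     examples=example_gen.outputs["examples"],  # assumed upstream ExampleGen
    #     model=trainer.outputs["model"],            # assumed upstream Trainer
    #     eval_config=eval_config,
    # )
    # The model is blessed for production only if the thresholds hold,
    # overall and on every slice.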
harithacML
1 month, 4 weeks ago
Selected Answer: B
A. Keep the original test dataset unchanged even if newer products are incorporated into retraining: this would never test on the new products.
B. Extend your test dataset with images of the newer products when they are introduced to retraining: tests on old + new products. Great.
C. Replace your test dataset with images of the newer products when they are introduced to retraining: no need to test the old products? Their recognition might change when new products are added to training, so this option is not good.
D. Update your test dataset with images of the newer products when your evaluation metrics drop below a pre-decided threshold: why wait? No need.
upvoted 1 times
...
EFIGO
1 month, 4 weeks ago
Selected Answer: B
You need to correctly classify the newer products, so your test data must include them ==> A is wrong. You need to keep doing a good job on the older products; you can't just ignore them ==> C is wrong. You know when you are introducing new products, so there is no need to wait for a drop in performance ==> D is wrong. B is correct.
upvoted 2 times
...
oddsoul
2 months, 1 week ago
Selected Answer: B
B correct
upvoted 1 times
...
PhilipKoku
5 months, 2 weeks ago
Selected Answer: B
The best approach is option B: Extend your test dataset with images of the newer products. This ensures accurate evaluation as your product catalog evolves.
upvoted 1 times
...
guilhermebutzke
10 months ago
Selected Answer: B
My initial confusion with option B arose from the phrase "with images of the newer products when they are introduced to retraining." Initially, I mistakenly interpreted it as recommending the use of the same images in both training and testing, which is incorrect. However, upon further reflection, I realized that using the same product does not necessarily mean using identical images. Therefore, I now believe that option B is the most suitable choice.
upvoted 1 times
...
bugger123
11 months, 3 weeks ago
Selected Answer: B
A and C make no sense - you don't want to lose any of the performance on existing products. D - Why would you wait for your performance to drop in the first place? That's a reactive rather than proactive approach. The answer is B
upvoted 1 times
...
fragkris
11 months, 3 weeks ago
Selected Answer: B
B for sure
upvoted 1 times
...
Sum_Sum
1 year ago
B is the only thing we do in practice
upvoted 1 times
...
M25
1 year, 6 months ago
Selected Answer: B
Went with B
upvoted 2 times
...
will7722
1 year, 8 months ago
Selected Answer: B
You can't just replace the old product data with the new products' data; the old products still need testing until you no longer sell them.
upvoted 2 times
...
SharathSH
1 year, 10 months ago
Ans: B. A would not use the newer data, hence not an ideal option. C: Replacing is not a good option, as it swaps the older data for newer data, which in turn hampers accuracy on the older products. D: Waiting for a threshold is not the better option.
upvoted 1 times
...
koakande
1 year, 11 months ago
B is the most plausible answer. The key principle is that the test set should represent the ground-truth distribution for the model evaluation to be credible. So once new products become available, the test set should be updated to reflect the new product distribution.
upvoted 2 times
...
Community vote distribution: A (35%), C (25%), B (20%), Other