Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?
A. Create alerts to monitor for skew, and retrain the model.
B. Perform feature selection on the model, and retrain the model with fewer features.
C. Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service.
D. Perform feature selection on the model, and retrain the model on a monthly basis with fewer features.
A
Data value skews: these are significant changes in the statistical properties of the data, which means the data patterns are changing and you need to trigger a retraining of the model to capture those changes.
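As a concrete illustration of monitoring for this kind of skew, one common approach is to compute a Population Stability Index (PSI) between the training baseline and recent serving data, and alert when it crosses a threshold. This is a minimal pure-Python sketch; the function name, binning scheme, and the 0.2 alert threshold are illustrative assumptions (a common rule of thumb), not part of any Google-documented API.

```python
import math
import random

def psi(baseline, production, bins=10):
    """Population Stability Index between two samples of one feature.
    Assumed rule of thumb: PSI > 0.2 signals significant skew."""
    lo = min(min(baseline), min(production))
    hi = max(max(baseline), max(production))
    width = (hi - lo) / bins or 1.0

    def fractions(data):
        counts = [0] * bins
        for x in data:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        # Light smoothing so empty bins don't blow up the log term
        return [(c + 0.5) / (len(data) + 0.5 * bins) for c in counts]

    b, p = fractions(baseline), fractions(production)
    return sum((pi - bi) * math.log(pi / bi) for bi, pi in zip(b, p))

random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(5000)]    # training baseline
same = [random.gauss(0.0, 1.0) for _ in range(5000)]     # serving data, no shift
shifted = [random.gauss(1.5, 1.0) for _ in range(5000)]  # serving data after drift

print(psi(train, same) < 0.2)     # stable distribution: no alert
print(psi(train, shifted) > 0.2)  # shifted distribution: alert and retrain
```

In practice a managed service (such as Vertex AI Model Monitoring, linked below) does this comparison for you, but the underlying idea is the same: quantify the distance between the training and serving distributions and alert on it.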
https://developers.google.com/machine-learning/guides/rules-of-ml/#rule_37_measure_trainingserving_skew
Rule #37:
The difference between the performance on the holdout data and the "next-day" data. Again, this will always exist. You should tune your regularization to maximize the next-day performance. However, large drops in performance between holdout and next-day data may indicate that some features are time-sensitive and possibly degrading model performance.
Maybe it should be C
A
Data drift doesn't necessarily call for changing the feature set or the regularization (e.g., L2).
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning#challenges
When the distribution of input data changes, the model may not perform as well as it did during training. It is important to monitor the performance of the model in production and identify any changes in the distribution of input data. By creating alerts to monitor for skew, you can detect when the input data distribution has changed and take action to retrain the model using more recent data that reflects the new distribution. This will help ensure that the model continues to perform well in production.
It's A: the model itself was performing well, neither overfitting nor failing suddenly; the change is gradual, so tuning regularization on the original model would not help. C is incorrect.
Creating alerts to monitor for skew in the input data can help to detect when the distribution of the data has changed and the model's performance is affected. Once a skew is detected, retraining the model with the new data can improve its performance.
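A minimal sketch of the alerting side described above: compare a recent-window error metric against the baseline measured at deployment and flag retraining once degradation exceeds a tolerance. The function name, the 10% tolerance, and the sample RMSE readings are all hypothetical illustrations, not from any specific service.

```python
def should_retrain(baseline_rmse, recent_rmse, tolerance=0.10):
    """Flag retraining when recent serving error degrades more than
    `tolerance` (relative) beyond the error measured at deployment."""
    return recent_rmse > baseline_rmse * (1.0 + tolerance)

# Hypothetical weekly RMSE readings pulled from a serving log
baseline = 4.2
weekly_rmse = [4.3, 4.4, 4.9, 5.6]
alerts = [should_retrain(baseline, r) for r in weekly_rmse]
print(alerts)  # later weeks cross the 10% degradation threshold
```

The same comparison can drive a pipeline trigger: once the alert fires, kick off retraining on a window of recent data that reflects the new distribution.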
Skew & drift monitoring: production data tends to change constantly along different dimensions (e.g., over time and across systems), and this causes the model's performance to drop.
https://cloud.google.com/vertex-ai/docs/model-monitoring/using-model-monitoring
A model learns the distribution of the data; if it has done its job well, any change in that distribution will lead to underperformance, not because the model is poor but by definition.