You recently deployed a scikit-learn model to a Vertex AI endpoint. You are now testing the model on live production traffic. While monitoring the endpoint, you discover twice as many requests per hour as expected throughout the day. You want the endpoint to scale efficiently when demand increases in the future, so that users do not experience high latency. What should you do?
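For anyone reasoning through the scaling part of the scenario, here is a rough sketch of the arithmetic behind target-utilization autoscaling. It assumes the common rule that the desired replica count is the ceiling of current replicas times current utilization divided by target utilization, clamped to the configured minimum and maximum. The function name, parameters, and numbers are all hypothetical illustrations, not values from the question or calls from the Vertex AI SDK.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Hypothetical helper: estimate the replica count an autoscaler would aim for.

    Assumes the rule ceil(current_replicas * current_utilization / target_utilization),
    clamped to [min_replicas, max_replicas]. Utilization values are percentages.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Clamp to the configured bounds so the result never leaves [min, max].
    return max(min_replicas, min(max_replicas, raw))


# Illustrative numbers only: 2 replicas running at 90% with a 60% target.
scaled_up = desired_replicas(2, 90, 60, min_replicas=1, max_replicas=10)

# Same load, but a low ceiling prevents the scale-up.
capped = desired_replicas(2, 90, 60, min_replicas=1, max_replicas=2)

# Light load: the floor keeps capacity warm instead of scaling down further.
floored = desired_replicas(4, 30, 60, min_replicas=3, max_replicas=10)

print(scaled_up, capped, floored)
```

Under these assumptions, the minimum sets how much capacity is already warm when traffic spikes, the maximum caps how far the endpoint can grow, and a lower target utilization makes scale-up kick in earlier.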