You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?
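Before picking a remedy, it helps to quantify the latency you are actually seeing at the load balancer. Below is a minimal benchmarking sketch, assuming the pods expose an HTTP prediction endpoint (TensorFlow Serving's REST API is a common Kubeflow setup, but not guaranteed here); the endpoint URL, model name, and input payload are hypothetical placeholders you would replace with your own.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

# Hypothetical endpoint behind the load balancer; the path shown follows the
# TensorFlow Serving REST convention, which this sketch assumes.
ENDPOINT = "http://serving.example.com/v1/models/my_model:predict"
PAYLOAD = {"instances": [[0.1, 0.2, 0.3]]}  # dummy input; match your model's schema

def time_one_request() -> float:
    """Send a single prediction request and return its latency in milliseconds."""
    start = time.perf_counter()
    resp = requests.post(ENDPOINT, json=PAYLOAD, timeout=5)
    resp.raise_for_status()
    return (time.perf_counter() - start) * 1000.0

def main(n_requests: int = 200, concurrency: int = 16) -> None:
    # Fire requests concurrently to approximate production load, then report
    # tail latencies -- the numbers that matter at a few thousand QPS.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda _: time_one_request(), range(n_requests)))
    print(f"p50: {statistics.median(latencies):.1f} ms")
    print(f"p95: {latencies[int(0.95 * len(latencies))]:.1f} ms")
    print(f"p99: {latencies[int(0.99 * len(latencies))]:.1f} ms")

if __name__ == "__main__":
    main()
```

The sketch measures tail latency (p95/p99) under concurrent load rather than sequential requests, since at a few thousand queries per second the tail is what users experience, and it gives a baseline for comparing any serving-side change that keeps the infrastructure fixed.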