You work for a large bank that serves customers through an application hosted on Google Cloud and running in the US and Singapore. You have developed a PyTorch model to classify transactions as potentially fraudulent or not. The model is a three-layer perceptron that takes both numerical and categorical features as input, and hashing happens within the model.
You deployed the model to the us-central1 region on n1-highcpu-16 machines, and predictions are served in real time. The model's current median response latency is 40 ms. You want to reduce latency, especially in Singapore, where some customers are experiencing the longest delays. What should you do?