You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?
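One technique that fits these constraints (an already trained model, no retraining, a small accuracy loss tolerated) is post-training quantization, which stores weights as 8-bit integers instead of 32-bit floats. In TensorFlow this is normally applied through the TFLite converter (`tf.lite.TFLiteConverter` with `optimizations = [tf.lite.Optimize.DEFAULT]`). The sketch below is not that implementation. It is a minimal pure-Python illustration of the underlying idea, assuming symmetric per-tensor scaling; the function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale.

    Assumption: symmetric per-tensor quantization, chosen for illustration.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one layer of a trained model.
weights = [0.82, -1.27, 0.05, 0.4, -0.33]

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

# The rounding error per weight is bounded by half the scale,
# which is the "small decrease in accuracy" being traded for speed.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, scale, max_error)
```

Each weight now needs one byte instead of four, and integer arithmetic is typically faster on mobile CPUs, which is where the latency reduction comes from. Whether it reaches the 50% target depends on the model and the device, so it has to be measured.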