Submitted by AutoModerator t3_zp1q0s in MachineLearning
Awekonti t1_j0y7esq wrote
Reply to comment by AstroBullivant in [D] Simple Questions Thread by AutoModerator
>Is quantization ultimately a kind of scaling
Not really. It is about approximating (or, more precisely, mapping) real-valued numbers onto a limited set of discrete levels. As a result the model shrinks: computations and other model operations are executed at lower bit-widths.
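To make the mapping concrete, here is a minimal sketch of affine (asymmetric) quantization to 8 bits — the names and the toy weight array are illustrative, not from any particular library:

```python
import numpy as np

def quantize(x, num_bits=8):
    # Map real values in [x.min(), x.max()] onto integers in [0, 2^bits - 1].
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = qmin - x.min() / scale
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Approximate reconstruction of the original real values.
    return scale * (q.astype(np.float32) - zero_point)

weights = np.array([-1.2, 0.0, 0.37, 2.5], dtype=np.float32)
q, scale, zero_point = quantize(weights)
approx = dequantize(q, scale, zero_point)
```

Each weight now occupies 1 byte instead of 4, and the reconstruction error is bounded by roughly half the scale — this is the "approximation brings the limits" trade-off.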
trnka t1_j1hcd0f wrote
Adding a practical example:
I worked on SDKs for mobile phone keyboards on Android devices. The phone manufacturers at the time didn't let us download language data, so it needed to ship on the phones out of the box. One of the biggest parts of each language's data was the n-gram model. Quantization allowed us to store the language model probabilities with less precision, so we could shrink them down with minimal impact on the quality of the language model. That extra space allowed us to ship more languages and/or ship higher-quality models in the same space.
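A rough sketch of the idea (the array and value range are made up for illustration; the actual SDK's format is not shown here): store n-gram log-probabilities as single-byte codes over a linear grid instead of 32-bit floats.

```python
import numpy as np

# Hypothetical n-gram log-probabilities, e.g. in [-12, 0].
rng = np.random.RandomState(0)
logprobs = rng.uniform(-12.0, 0.0, size=10000).astype(np.float32)

# Linearly quantize the observed range into 256 levels (1 byte per entry).
lo, hi = float(logprobs.min()), float(logprobs.max())
scale = (hi - lo) / 255
codes = np.round((logprobs - lo) / scale).astype(np.uint8)

# Dequantize at lookup time; error is bounded by scale / 2.
restored = lo + codes.astype(np.float32) * scale
```

This cuts the probability table to a quarter of its float32 size, and because keyboard ranking only needs to compare candidate scores, the small per-entry error rarely changes which suggestion wins.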