sebzim4500 t1_je10iu2 wrote

>Lower-precision fine-tuning (like INT8, INT4)

How would this work? Are the weights internally represented as f16 and then rounded stochastically whenever they are used?
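
To make the question concrete, here is a minimal sketch of the scheme being asked about. It is a guess at the mechanism, not a description of how any particular library implements low-precision fine-tuning. The names `stochastic_round`, `quantize_int8`, `master`, and `scale` are all made up for illustration, and plain Python floats stand in for f16.

```python
import math
import random

def stochastic_round(x, rng):
    """Round x to a neighbouring integer; P(round up) equals the fractional part."""
    lo = math.floor(x)
    return lo + (1 if rng.random() < x - lo else 0)

def quantize_int8(weights, scale, rng):
    """Map float weights to INT8 codes with stochastic rounding, clamped to [-128, 127]."""
    return [max(-128, min(127, stochastic_round(w / scale, rng))) for w in weights]

def dequantize(codes, scale):
    """Recover approximate float weights from INT8 codes."""
    return [c * scale for c in codes]

rng = random.Random(0)
master = [0.0137, -0.2051, 0.0999]  # hypothetical higher-precision "master" weights
scale = 0.01                        # hypothetical per-tensor scale

# Re-quantize each time the weights are "used"; the master copy stays untouched.
codes = quantize_int8(master, scale, rng)
approx = dequantize(codes, scale)
```

The point of rounding stochastically rather than to nearest is that the rounded value is unbiased: averaged over many uses it equals the master weight, so updates smaller than one quantization step are not systematically lost.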
