Submitted by Qwillbehr t3_11xpohv in MachineLearning
fnbr t1_jd6j7gh wrote
Right now, the tech isn't there to train a language model from scratch on a single GPU — you'd end up training for roughly a month to do so. It is getting slightly more efficient, though.
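A quick back-of-envelope calculation supports that timescale. This is just a sketch using the common ~6 × params × tokens FLOPs approximation for transformer training; the parameter count, token budget, and utilization figures are all illustrative assumptions, not measurements:

```python
# Rough estimate of single-GPU training time via the ~6*N*D FLOPs rule.
# All numbers below are assumptions for illustration.

PARAMS = 1e9           # assume a 1B-parameter model
TOKENS = 20e9          # assume ~20 tokens per parameter (Chinchilla-style)
GPU_FLOPS = 312e12     # A100 peak bf16 throughput, FLOP/s
MFU = 0.4              # assume ~40% utilization of that peak

train_flops = 6 * PARAMS * TOKENS
seconds = train_flops / (GPU_FLOPS * MFU)
print(f"~{seconds / 86400:.0f} days")   # ~11 days under these assumptions
# At the same tokens-per-param ratio, a ~1.5-2B model lands
# in the ~1 month range, since time scales with N^2 here.
```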
Lots of people are looking at running models locally. In addition to everything that's been said here, I've heard rumours from employees that a bunch of companies will soon be releasing models that can just barely fit on an A100.
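For a sense of scale, here's a rough sketch of what "just barely fits on an A100" might mean for inference. The 80 GB variant and fp16 weights are my assumptions, and this ignores activation/KV-cache overhead, which pushes the real limit lower:

```python
# Rough sizing of the largest model whose weights fit in A100 memory.
# Assumes the 80 GB A100 and fp16/bf16 weights; overhead ignored.

A100_MEM_GB = 80        # assume the 80 GB variant
BYTES_PER_PARAM = 2     # fp16/bf16 weights

max_params = A100_MEM_GB * 1e9 / BYTES_PER_PARAM
print(f"~{max_params / 1e9:.0f}B params")   # ~40B params in fp16
# int8 quantization (1 byte/param) would roughly double that to ~80B.
```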