HateRedditCantQuitit t1_iztdm8c wrote

> but I'm still wondering whether its worth it to just build a graphics card rig for the long term.

Pretty much never, assuming it's for personal use.

If you're going to use this rig exclusively for ML, then maybe it still makes sense. The calculation is simple: (cost to buy) + (energy cost per hour) × (hours of use before it no longer fits your needs), versus the cloud cost for the same work. And even if you use it enough for buying to win, you might be surprised how quickly you outgrow it (e.g. maybe you'll want to run some experiments in parallel sometimes, or in a year or few you'll want models bigger than this thing's VRAM).
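
For concreteness, here's a minimal sketch of that break-even math in Python. Every number in it (rig price, power draw, electricity rate, cloud rate) is a made-up placeholder, not a real quote; swap in your own figures.

```python
# Rough buy-vs-rent break-even sketch. All numbers below are
# hypothetical placeholders, not real prices.

def local_cost(purchase_price, power_kw, price_per_kwh, hours):
    """Total cost to buy the rig and run it for `hours` hours."""
    return purchase_price + power_kw * price_per_kwh * hours

def cloud_cost(hourly_rate, hours):
    """Cost to rent an equivalent cloud GPU for the same hours."""
    return hourly_rate * hours

# Example: $2000 rig drawing 0.5 kW at $0.15/kWh vs a $1.50/hr cloud GPU.
for hours in (500, 1000, 2000, 5000):
    print(f"{hours:>5} h: local ${local_cost(2000, 0.5, 0.15, hours):,.0f} "
          f"vs cloud ${cloud_cost(1.50, hours):,.0f}")
```

With these made-up numbers the rig only wins past roughly 1,400 hours of use, and the hours term is capped by how long the card still fits your needs. That cap is the catch.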

If you also want to use it for non-ML stuff, then no, just use the cloud. If you're using it enough that the calculation above says to buy, you won't actually get to use it for anything else, which will just annoy the hell out of you.

7

HateRedditCantQuitit t1_ix4d4sx wrote

You can scale semi-supervised learning far more easily, cheaply, and safely than you can scale human-in-the-loop RL. It's similar to why we don't train self-driving cars by putting them on real roads and making them learn by RL.

If we could put a language model in a body and let it learn safely through human tutoring in a more time-effective and cost-effective way, maybe it could be worthwhile. Today, that doesn't seem to be the time-effective or cost-effective solution.

And while I’m on my podium: once LMs are in any commercial loop talking to people at scale, I expect this will become a huge topic.

Tangentially, check out this short story/novella that explores the idea from a fictional perspective. It's incredibly well written and interesting, and it's by a favorite author of mine: "The Lifecycle of Software Objects" by Ted Chiang. https://web.archive.org/web/20130306030242/http://subterraneanpress.com/magazine/fall_2010/fiction_the_lifecycle_of_software_objects_by_ted_chiang

18