Submitted by lifesthateasy t3_11ktxjl in MachineLearning
I've finally pulled the trigger on a 4090 that'll arrive by the end of this week after ages with a 1050, and besides throwing everything ray traced at it, I also want to use it to train some deep learning models.
I do know the talk of the town, LLMs, are waaay too big to be trained on such a card (iirc ChatGPT was trained on over a thousand datacenter GPUs), but I was wondering if there are some neat DIY projects I could set up and train in a human amount of time (something that's not neural style transfer, which already ran on the 1050 too).
FYI I'm not specifically looking for language modeling, ChatGPT was just an example of a model that'd definitely be too big.
Disastrous_Elk_6375 t1_jb8y5r2 wrote
GPT-NeoX should fit with 8-bit quantization and short prompt sizes. GPT-J-6B should fit as well with 16-bit inference. On smaller models you might even be able to do some finetuning for fun.
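To see why those models fit on a 24 GB card, a rough back-of-envelope estimate is bytes-per-parameter times parameter count for the weights alone (this sketch ignores activations, KV cache, and optimizer state, which matter a lot for finetuning):

```python
def weight_vram_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Rough VRAM needed just for model weights, in GiB."""
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

# GPT-J-6B in fp16 (2 bytes/param): ~11 GiB -> fits on a 24 GB 4090
print(round(weight_vram_gb(6, 2), 1))

# GPT-NeoX-20B in int8 (1 byte/param): ~19 GiB -> tight but fits for inference
print(round(weight_vram_gb(20, 1), 1))
```

Note this is a lower bound: inference also needs room for activations and the attention KV cache, which is why the comment above stresses keeping prompt sizes small for the 20B model.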
There are a couple of coding models from Salesforce (the CodeGen family) that you could fit comfortably. Check out FauxPilot for a Copilot "clone".