Submitted by head_robotics t3_1172jrs in MachineLearning
I've been looking into open source large language models to run locally on my machine.
Seems GPT-J and GPT-Neo are out of reach for me because of RAM / VRAM requirements.
What models would be doable with this hardware?
CPU: AMD Ryzen 7 3700X 8-Core, 3600 MHz
RAM: 32 GB
GPUs:
- NVIDIA GeForce RTX 2070 8GB VRAM
- NVIDIA Tesla M40 24GB VRAM
catch23 t1_j9b9upb wrote
Could try something like this: https://github.com/Ying1123/FlexGen
This was only released a few hours ago, so you wouldn't have come across it before. It basically uses various offloading strategies when your machine has lots of regular CPU memory. The paper authors were able to fit a 175B-parameter model on a lowly 16GB T4 GPU (on a machine with 200GB of regular memory).
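The core idea is simple to sketch: keep the full set of weights in plentiful CPU RAM and stream one layer at a time through a small GPU-resident buffer. Here's a toy illustration of that strategy (not FlexGen's actual API — the layer sizes and the list standing in for VRAM are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# All layer weights live in (cheap, plentiful) CPU RAM.
cpu_weights = [rng.standard_normal((64, 64)) for _ in range(8)]

GPU_BUDGET = 1  # pretend VRAM can only hold one layer at a time


def run_offloaded(x, layers):
    """Stream layers through a tiny 'GPU' buffer, one at a time."""
    gpu_buffer = []  # stand-in for VRAM
    for w in layers:
        gpu_buffer.append(w)             # "upload" this layer to the GPU
        assert len(gpu_buffer) <= GPU_BUDGET
        x = np.tanh(x @ gpu_buffer[0])   # run the layer
        gpu_buffer.pop()                 # free VRAM before the next layer
    return x


x = rng.standard_normal(64)
out = run_offloaded(x, cpu_weights)
print(out.shape)  # (64,)
```

The trade-off is that every layer's weights cross the CPU-GPU bus on every forward pass, so throughput drops sharply — that's why this approach targets batch/offline use rather than interactive chat.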