caninerosie

caninerosie t1_j12qgv6 wrote

there are a ton of consumer motherboards that support 128GB max RAM. a single 3090 also has 24GB GDDR6X of memory. If you need more than that you can NVLink another 3090 with the added benefit of speeding up training. That’s getting pretty pricey though.

other than that, there’s the M1 Ultra Mac Studio? won’t be as fast as training on a dedicated GPU but you’ll have the memory for it and faster throughput than normal DRAM

edit: for an extremely large model like GPT-3 you would need almost 400 GB of RAM. theoretically you could build multiple machines with NVLinked 3090/4090s, all networked together for distributed training

2