Submitted by Zondartul t3_zrbfcr in MachineLearning
caninerosie t1_j12qgv6 wrote
there are a ton of consumer motherboards that support up to 128GB of RAM. a single 3090 also has 24GB of GDDR6X memory. If you need more than that you can NVLink a second 3090, with the added benefit of speeding up training. That’s getting pretty pricey though.
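if you do bridge two cards, you can sanity-check that PyTorch actually sees the peer link with something like this (just a quick check, not a benchmark):

```python
import torch

# verify both cards are visible and can access each other's memory
# directly (NVLink or PCIe peer-to-peer)
assert torch.cuda.device_count() >= 2
print(torch.cuda.can_device_access_peer(0, 1))  # True if P2P works
```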
other than that, there’s the M1 Ultra Mac Studio? it won’t be as fast as training on a dedicated GPU, but you’ll have the memory for it, and the unified memory gives you much higher bandwidth than normal DRAM
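for context, pointing PyTorch at the Studio’s unified memory is just the `mps` backend; a minimal sketch (the layer sizes are arbitrary):

```python
import torch

# Apple-silicon GPU via the Metal Performance Shaders backend
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = torch.nn.Linear(8192, 8192).to(device)  # arbitrary toy layer
x = torch.randn(16, 8192, device=device)

loss = model(x).sum()
loss.backward()  # runs on the GPU cores out of unified memory
```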
edit: for an extremely large model like GPT-3 you would need almost 400 GB of RAM. theoretically you could build multiple machines with NVLinked 3090/4090s, all networked together for distributed training
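rough math behind that number, fp16 weights only (training needs several times more once you add gradients, activations, and optimizer state):

```python
import math

params = 175e9          # GPT-3 parameter count
bytes_per_param = 2     # fp16

weights_gb = params * bytes_per_param / 1024**3
print(f"{weights_gb:.0f} GB for the weights alone")  # ~326 GB

# a 3090 has 24 GB, so even just holding the weights takes
print(math.ceil(weights_gb / 24), "cards")           # 14
```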
DavesEmployee t1_j1426ms wrote
4090s don’t support NVLink unfortunately 🥲
BelialSirchade t1_j174fni wrote
You don’t need NVLink though; PyTorch supports model parallelism through DeepSpeed anyway, so go ahead and buy that extra 4090
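a minimal sketch of what that looks like with DeepSpeed’s ZeRO stage 3, which shards the model across cards over plain PCIe (`MyBigModel` and `dataloader` are hypothetical placeholders, and the config values are illustrative):

```python
import deepspeed

model = MyBigModel()  # hypothetical: swap in your actual model

# ZeRO stage 3 shards parameters, gradients, and optimizer state across
# all GPUs, so no single card ever holds the full model -- no NVLink needed
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-5}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
}

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for batch in dataloader:  # hypothetical dataloader
    loss = engine(batch)   # forward
    engine.backward(loss)  # backward over sharded grads
    engine.step()          # optimizer step + grad zeroing
```

launch it with `deepspeed train.py` and it spawns one process per GPU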
caninerosie t1_j14738h wrote
really? NVIDIA is so weird
DavesEmployee t1_j147fki wrote
I think it’s because they’re mostly used for games, which almost never take advantage of the technology. You can tell from the designs that they were going to support it, but the feature was taken out, probably due to price or power concerns