joossss

joossss OP t1_j3q9uip wrote

Only this server is planned. I just went with the recommendation on NVIDIA's website, which stated 100 Gbps per A100, but now that I think of it, I guess that figure makes more sense for distributed training. What NIC speed would be enough in that case?

1

joossss OP t1_j3q9m8r wrote

The main reason for going to the cloud for us is that we are a research institution, so our funding is project-based, meaning we have to use the funding in the allotted time. The second reason is that we already have the GPUs, so the time it takes to pay for itself is shorter.

2