dddd0 t1_iwghfua wrote

Interconnect

Supercomputer nodes are usually connected by 100-200 Gbit/s fabrics with latencies in the microsecond range. That's pretty expensive and power-hungry, too, but it lets you treat a supercomputer much more like one Really Big Computer (and previous generations of supercomputers were indeed SSI - Single System Image - systems) instead of A Bunch Of Servers. Simulations prefer Really Big Computers over A Bunch Of Servers. On an ELI5 level, something like a weather simulation divides the world into many regions, and each node of the supercomputer handles one region. Interactions between neighboring regions go through the interconnect, so its speed is really important for performance.
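
To make the "regions plus interconnect" idea concrete, here is a toy sketch in Python. It is not how a real weather code works: the 1D grid, the simple diffusion update, and every name in it (`step`, `simulate_partitioned`, `alpha`, and so on) are made up for illustration, and it assumes the grid divides evenly among the nodes. Each "node" owns a slice of the world, and before every time step it needs the edge values of its neighbors. That swap of edge values is the part a real machine would send over the interconnect.

```python
def step(cells, left, right, alpha=0.1):
    """One diffusion step on a region, given the neighbor value on each side."""
    padded = [left] + cells + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def simulate_single(world, steps):
    """Reference run: the whole world on one Really Big Computer."""
    for _ in range(steps):
        world = step(world, world[0], world[-1])
    return world


def simulate_partitioned(world, n_nodes, steps):
    """Same simulation, with the world split into one region per node."""
    size = len(world) // n_nodes  # assumes len(world) is divisible by n_nodes
    regions = [world[k * size:(k + 1) * size] for k in range(n_nodes)]
    for _ in range(steps):
        # "Interconnect" phase: every node fetches its neighbors' edge cells.
        # On a real machine this is network traffic, once per time step.
        halos = []
        for k, region in enumerate(regions):
            left = regions[k - 1][-1] if k > 0 else region[0]
            right = regions[k + 1][0] if k < n_nodes - 1 else region[-1]
            halos.append((left, right))
        # Compute phase: every node updates its own region independently.
        regions = [step(r, lo, hi) for r, (lo, hi) in zip(regions, halos)]
    return [x for region in regions for x in region]


world = [0.0] * 12
world[5] = 100.0  # a single hot spot in the middle of the world
single = simulate_single(world, steps=50)
split = simulate_partitioned(world, n_nodes=4, steps=50)
print(max(abs(a - b) for a, b in zip(single, split)))
```

The partitioned run matches the single-machine run only because the edge values are swapped on every step. Each node waits on that exchange before it can compute, which is why interconnect latency matters so much.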

18

LaconicLacedaemonian t1_iwgt8rr wrote

I maintain a 20k-node cluster of computers that pretends to be a single computer. The reason to do it that way is that if we 10x our size we can just 10x the hardware, and individual machines that die simply get replaced.

3

_ytrohs t1_iwgziml wrote

and cost... and hypervisor overhead, etc.

1