Today they are quite close to just a bunch of regular computers that work together, built from quite standard parts; the network hardware is the most nonstandard part. Historically there were quite different designs, but over time using existing hardware, just a lot of it, has proven more cost-efficient. They might not use consumer variants of the parts, but they are top-of-the-line server parts.
Take a look at the current highest-ranked system on the Top 500 [https://www.top500.org/system/180047/](https://www.top500.org/system/180047/). On that page you can see a short description:
>HPE Cray EX235a, AMD Optimized 3rd Generation EPYC 64C 2GHz, AMD Instinct MI250X, Slingshot-11
The CPU is quite clear: it is an AMD Optimized 3rd Generation EPYC 64C 2GHz.
That is a 64-core server variant of the Zen 3 architecture.
Then what is an AMD Instinct MI250X? A look at https://en.wikipedia.org/wiki/AMD_Instinct shows it uses the CDNA 2 architecture, that is Compute DNA, as opposed to the Radeon DNA (RDNA) that regular GPUs have. A look at https://en.wikipedia.org/wiki/RDNA_(microarchitecture) shows RDNA 2 is what the Radeon RX 6xxx series use.
So it is made up of computers with 64-core Zen 3 CPUs and GPUs closely related to the same generation as the Radeon RX 6xxx graphics cards.
Then take a look at the computer's own homepage
https://www.olcf.ornl.gov/frontier/ where you can find the press release for it:
https://www.ornl.gov/news/frontier-supercomputer-debuts-worlds-fastest-breaking-exascale-barrier
>Frontier has 74 HPE Cray EX supercomputer cabinets, which are purpose-built to support next-generation supercomputing performance and scale, once open for early science access.
>Each node contains one optimized EPYC™ processor and four AMD Instinct™ accelerators, for a total of more than 9,400 CPUs and more than 37,000 GPUs in the entire system.
9400/74 ≈ 127 CPUs per cabinet; in reality that is a round 128, for a total of 74 × 128 = 9472 CPUs and 9472 × 4 = 37888 GPUs.
https://www.olcf.ornl.gov/wp-content/uploads/2020/02/frontier_node_diagram_lr.png talks about 2 nodes per blade, so 64 blades per cabinet.
If you take a look at https://www.hpe.com/us/en/collaterals/collateral.a50002389.HPE-Cray-EX-Liquid-Cooled-Cabinet-for-Large-Scale-Systems-brochure.html?rpv=cpf&parentPage=/fi/en/products/compute/hpc/supercomputing/cray-exascale-supercomputer they say there are 64 compute blades per cabinet. That is exactly what we calculated above.
What is quite special in a computer cluster like this is the network card. To maximize performance you need minimal delay and high speed across the whole network. That takes special hardware, because regular network cards would be the bottleneck if used: https://www.hpe.com/fi/en/compute/hpc/slingshot-interconnect.html
The racks also have special hardware for cooling. It looks like they are built so the racks have pipes with liquid coolant delivered directly to them.
The OS is Linux-based. The software you run on the computer needs to be explicitly coded to work in an environment with multiple computers. There are libraries for programming languages that support this kind of development. It can be surprisingly easy to spread out the work, but it is hard to do it efficiently. One example of such a library is OpenMP https://en.wikipedia.org/wiki/OpenMP, which parallelizes work across the cores within a single node; spreading work across many nodes is typically done with a message-passing library such as MPI.
So the fastest supercomputer in the world is 9472 computers with a powerful CPU and 4 GPUs each, all interconnected with special network cards, running software developed to spread work across multiple computers and use them all efficiently.