What you’re missing is the width of the connections.
A 100Gbit connection is usually a serial link over fibre, so you have a small piece of dedicated hardware on the card that can turn electrical signals into light at that speed.
But to get the data to that piece of hardware you have a bus that is 16 lanes wide (a PCIe x16 slot), with each lane running at 32 billion transfers per second (PCIe 5.0). So now we have a bus capable of feeding 512Gbps to the network card, and 100Gbit suddenly seems relatively pedestrian. But again, that's way more than the 5GHz of the CPU, so what gives?
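To make that 512Gbps figure concrete, here's the back-of-the-envelope arithmetic (raw line rate; real throughput is a few percent lower after 128b/130b encoding and protocol overhead):

```python
# Raw bit rate of a PCIe 5.0 x16 link (back-of-the-envelope).
lanes = 16                  # x16 slot
transfers_per_sec = 32e9    # 32 GT/s per lane (PCIe 5.0)
bits_per_transfer = 1       # each lane is serial: 1 bit per transfer

raw_gbps = lanes * transfers_per_sec * bits_per_transfer / 1e9
print(f"PCIe 5.0 x16 raw: {raw_gbps:.0f} Gbit/s")   # -> 512 Gbit/s
```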
Well, a CPU's clock speed doesn't measure bits, it measures cycles (roughly, instructions), and each of those can move a 64-bit register's worth of data, which at 5GHz works out to a theoretical 320Gbps per core.
Wait, per core? Well now we get to the real meat and potatoes of the question: the CPU that is connected to that network card over the PCIe bus likely has upwards of 32 cores, and as many as 128 for EPYC or 192 for AmpereOne, all running at multiple GHz. And the CPU as a whole could have not just the 16 lanes of PCIe connected to a single card but up to 128 lanes connected to multiple cards, all running at line rate at the same time. The data may come either from memory or from disks that are also connected to those PCIe buses.
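Putting rough numbers on that, using the same deliberately crude model as above (one 64-bit register's worth of data per clock cycle, which is a simplification, not how cores are actually measured):

```python
# Crude per-core model: one 64-bit register moved per clock cycle.
clock_hz = 5e9          # 5 GHz
bits_per_cycle = 64     # one 64-bit register per cycle (simplification)

per_core_gbps = clock_hz * bits_per_cycle / 1e9
print(f"Per core: {per_core_gbps:.0f} Gbit/s")          # -> 320 Gbit/s

# Scale up to the core counts mentioned above.
for name, cores in [("32-core", 32), ("128-core EPYC", 128), ("192-core AmpereOne", 192)]:
    print(f"{name}: {per_core_gbps * cores / 1e3:.1f} Tbit/s of register traffic")

# And the PCIe side of the same CPU: up to 128 Gen5 lanes in total.
total_lanes = 128
print(f"128 PCIe 5.0 lanes: {total_lanes * 32 / 1e3:.3f} Tbit/s raw")  # -> 4.096 Tbit/s
```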
Main memory on a machine like this (12-channel DDR5 for EPYC) can supply up to 3.6Tb/s of throughput. That's 36 times as fast as that poor little network card.
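That 3.6Tb/s figure checks out from the channel math, assuming DDR5-4800 (the speed the first 12-channel EPYC parts shipped with):

```python
# 12-channel DDR5 memory bandwidth, assuming DDR5-4800.
channels = 12
transfers_per_sec = 4.8e9   # DDR5-4800: 4.8 GT/s
bytes_per_transfer = 8      # 64-bit data bus per channel

bytes_per_sec = channels * transfers_per_sec * bytes_per_transfer
print(f"{bytes_per_sec / 1e9:.0f} GB/s = {bytes_per_sec * 8 / 1e12:.2f} Tbit/s")
# -> 461 GB/s = 3.69 Tbit/s, roughly 36x a 100Gbit NIC
```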
Modern NVMe disks can each supply about four PCIe lanes' worth of bandwidth, so in reality you can get 100Gbit/s of data out of just two modern SSDs.
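Same exercise for the SSDs, taking a PCIe 4.0 x4 drive as the baseline (an assumption; a Gen5 drive roughly doubles these numbers):

```python
# Two NVMe SSDs vs. one 100Gbit NIC, assuming PCIe 4.0 x4 drives.
lanes = 4
gen4_gbps_per_lane = 16                      # PCIe 4.0: 16 GT/s per lane raw
per_ssd_gbps = lanes * gen4_gbps_per_lane    # ~64 Gbit/s raw, ~56 Gbit/s (7 GB/s) in practice

print(f"One Gen4 x4 SSD:  ~{per_ssd_gbps} Gbit/s raw")
print(f"Two Gen4 x4 SSDs: ~{2 * per_ssd_gbps} Gbit/s raw -> comfortably over 100 Gbit/s")
```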
All that to say that in a modern datacentre, it's quite possibly the poor little 100Gbit/s network card that is the actual bottleneck, which is why they often have two ports on a card.
With PCIe 5.0 these days we're looking at the possibility of a 400Gbps card *per slot* in a modern server.