Why do all supercomputers in the world use Linux?



In: Technology

22 Answers

Anonymous 0 Comments

Linux can be updated without restarting, which allows these computers and their services to stay up 24/7 with no downtime.

Anonymous 0 Comments

To make something go fast, it generally helps if it's lighter. And we want these computers flying as fast as possible, in the computational sense. Linux can be a very, very light operating system: it only has what it needs to make those computers fly. Windows and macOS are huge operating systems. You could never get these things moving as fast as you want.

Anonymous 0 Comments

Holy shit. Since when is this "Explain it like I'm 5 years post-doctorate in computer science"!

Anonymous 0 Comments

Linux isn’t an operating system, it’s a kernel. A kernel interacts with the computer’s hardware, like memory, CPU, etc. An operating system (OS) includes the kernel along with other software that lets users and applications interact with the system. So, when you use Linux as an OS you’re actually using an operating system built around the Linux kernel, which may be completely custom or be based on an existing family.

The Linux kernel is based on the very popular Unix specifications (Unix being an older OS that lent many of its design principles to modern OSes) and is developed by companies and individuals from all around the world, as it's an open-source and free kernel. Being the most popular kernel of its kind means it has had the lion's share of development effort put into it, turning it into a very robust kernel. It's also extremely flexible, allowing for the creation of custom OSes, something essential for supercomputers, which often use very tailored solutions. Once you have the Linux kernel, you can mold your OS around your hardware and software needs.

In the past, companies often built their own custom OSes for supercomputers, each based on a different kernel. In fact, nothing prevents any company from doing so today; for example, I'm sure Apple or Microsoft could come up with solutions for their needs based on their own software. HOWEVER, the fact that Linux has matured so much and has had so much development put into it means there's often no interest in spending a lot of money when they could take the tried-and-true Linux kernel as a starting point.

Anonymous 0 Comments

Because you don’t really want it updating in the middle of your next calculation, do you?

Anonymous 0 Comments

Aside from the technical reasons already listed here, there is the economic issue. While supercomputers (and large-scale scientific computing in general) are sexy and provide bragging rights, the actual number of supercomputers in the world is tiny compared to the overall computing market. So there is no business incentive to purpose-build a proprietary OS just for that. Better to just customize Linux.

Plus, the applications and code run on supercomputers are very specialized for narrow use-cases, often developed by universities or by researchers themselves. These apps are built on open-source tools that were created on Linux and run most easily on Linux. No way Microsoft or Oracle or SAP is going to develop, say, a quantum chromodynamic simulation of gravitation (I made that up 🙂) and make any money selling it. So they don’t.

Anonymous 0 Comments

Adding a bit beyond the licensing and hardware discussion…

The way programs run on a supercomputer is by dividing up a large problem into smaller tasks: if it takes me 24 hours to solve a problem, then two of us can solve it in 12 hours (in reality it’s not an exact doubling in speed).
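A side note that isn't from the original answer: one common way to put a number on that "not an exact doubling" is Amdahl's law, which limits the speedup based on how much of the job can actually be split up. A toy sketch in Python:

```python
# Toy illustration (not from the answer above): Amdahl's law estimates the
# best-case speedup when only a fraction of the work can be parallelized.

def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Best-case speedup with `workers` workers, if `parallel_fraction`
    of the job can be split up and the rest has to run serially."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# If 95% of the job parallelizes, two workers give ~1.9x, not a full 2x:
print(amdahl_speedup(0.95, 2))     # ~1.90
print(amdahl_speedup(0.95, 1000))  # caps out near 20x, no matter how many workers
```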

More specifically, each task usually involves some set of equations for a particular area. Imagine a square that you’ve divided up into a bunch of smaller squares. One task is going to solve some equations for one of the smaller squares, another task is going to solve the same set of equations on a different square, etc. For some technical/mathematical reasons, neighboring squares will have to share some data with each other (the values they computed that lie on the border with their neighbors). Now, hold that thought for a second.

For small problems, this task division can probably fit into your computer’s memory, and we can probably get some speedup by using multiple cores; we divide up the squares and have each core of your processor work on some of the squares.
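To make that concrete, here's a toy sketch (mine, not the original poster's) of the "give each core some of the squares" idea using Python's multiprocessing module; `solve_square` is a made-up stand-in for whatever math each square actually needs:

```python
# Toy sketch of "give each CPU core some of the squares".
# solve_square is hypothetical; it stands in for the real per-square equations.
from multiprocessing import Pool

def solve_square(square_id: int) -> int:
    # Pretend this solves the equations for one small square
    # and returns some result for it.
    return sum(i * i for i in range(square_id * 1000)) % 97

if __name__ == "__main__":
    squares = list(range(64))      # 64 small squares to work on
    with Pool() as pool:           # one worker process per CPU core by default
        results = pool.map(solve_square, squares)
    print(len(results), "squares solved")
```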

But let’s say you want to solve a bigger problem. Now the square you want to solve equations on is so big it can’t fit into memory. So you make a supercomputer that is really just a bunch of smaller computers that are all connected to each other.

Now you have a problem…
Remember when I said that neighboring squares needed to share some information? That’s difficult if that data is sitting in the memory of a different computer. We need a way for computers to send and receive data, and we need it to be fast.

Typical network protocols are too slow for this…they rely on a lot of response and acknowledgement:

“I’m going to send you a message, are you ready?”

“Yes, I’m ready”

“I’m sending the message”

“I understand you’re sending the message”

….

This is fine for things like the internet where you want this for security and reliability, but for supercomputers it gets in the way.

So, supercomputers have special networks that allow processors to just fire a message off and bypass all the response/acknowledgement stuff.

Now, you have to write a program to handle this. We use a sort of programming language that simplifies all of this “I need to quickly share data with other processors”, and that programming language knows how to use the special networks.
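What's being described here is almost certainly MPI (the Message Passing Interface), which is technically a library rather than a language; its implementations know how to drive those special networks (InfiniBand and the like) directly. Here's a minimal sketch of neighbors swapping their border values, using the Python bindings (mpi4py); the data and layout are made up just for illustration:

```python
# Minimal halo-exchange sketch (my own example, assuming MPI via mpi4py;
# the arrays here are made up for illustration).
# Run with something like: mpirun -n 4 python halo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # which process ("square") am I?
size = comm.Get_size()   # how many processes in total?

# Each process owns a strip of values, plus one "ghost" cell on each side
# that will hold a copy of the neighbor's border value.
local = np.full(10, float(rank))
left_ghost = np.zeros(1)
right_ghost = np.zeros(1)

left = rank - 1 if rank > 0 else MPI.PROC_NULL         # no neighbor at the edges
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Send my right border to the right neighbor while receiving the left
# neighbor's border into my left ghost cell, and vice versa.
comm.Sendrecv(local[-1:], dest=right, recvbuf=left_ghost, source=left)
comm.Sendrecv(local[:1], dest=left, recvbuf=right_ghost, source=right)

print(f"rank {rank}: left ghost = {left_ghost[0]}, right ghost = {right_ghost[0]}")
```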

So, the point of all of this… none of this is actively developed for Windows.
Besides everything said here about GPUs and custom filesystems, a lot of it comes down to the fact that the way programs that run on supercomputers are written is basically incompatible with the Windows OS.

Anonymous 0 Comments

Windows comes with a lot of extra baggage that is only sometimes useful on desktop machines (extra drivers, services, etc.) but is a total waste on supercomputers, where you usually strip down the OS to the bare minimum you need to run the specialized software required for the computations.

Among that baggage there’s graphics. Windows is *heavily* designed around graphical user interfaces, and even when building custom images using specialized tools, you can remove some of that graphics dependency from the OS, but a ton of code needs to stay because it’s part of the kernel (the core of the OS).

Another aspect that others have mentioned is performance. On the same hardware, a fine-tuned Linux doesn’t just kick Windows’ ass, it’s simply on another level, like comparing fast cars with a supersonic plane, and I’m not exaggerating (too much, at least).
On Linux there are several dozen options just for the file system, which can be picked and chosen to fit the specific workload (e.g. many small files vs. fewer but very large ones, network-distributed filesystems, etc.). On the other hand, the default Windows filesystem, NTFS, can easily be brought to its knees by a single user on a desktop, while on Linux tools to defrag the disk are only needed in rare and esoteric cases.

Microsoft has put some effort into allowing more optimization, but for the few supercomputers that run Windows, it took a dedicated team of Microsoft specialists to help with the process, literally hacking the OS. The same could be done on Linux by a single experienced sysadmin. Also, beyond a certain point, Windows simply doesn’t scale that well: Windows 11 Pro supports up to 4 CPUs with up to 256 cores in total, while a minimally customized Linux can support up to 8192 cores.

This means that when using Windows you’re bound to be inefficient, and even if it’s less than 30% less efficient than it would be on Linux (which is very optimistic), who would take that cut?

Interestingly, Microsoft knows this very well, since Azure, their cloud computing platform, runs Linux almost exclusively (so much for eating their own dog food).

Anonymous 0 Comments

Windows is a sandboxed OS where you’re at the mercy of Microsoft’s decisions on how an OS is designed and how it should function.

Microsoft’s goals are making an OS that caters to a normal person’s use case and retaining compatibility with all their batshit decisions over the last 25+ years.

Linux is a kernel that is incredibly customizable. You can tailor the functions of the kernel for your exact use case. Then you can swap out entire operating systems on top of it and customize the behavior even further for how you want it to work.

If you’re going to be building a computer to do complex calculations, the ability to tailor the OS and its functionality for the specific architecture and design use case is extremely important.

But even aside from that, Linux gives the user much more control over how the operating system functions are used. The design is much better thought out for people who understand computers and know exactly what they want to do and how they want to do it.

Anonymous 0 Comments

They didn’t use to. If you go back to the early 2000s you’ll find the majority were proprietary Unixes (IRIX, AIX, HP-UX, Solaris, and a bunch of even weirder ones), macOS, and even one or two Windows machines.

These days those Unixes have largely fallen out of use, while Microsoft and Apple don’t really care enough to compete. Microsoft DID release a “Windows HPC Edition” which was designed for supercomputer farms, but it didn’t get enough traction so they retired it again. All that Unix knowledge translated most easily to Linux.

A supercomputer is really a farm of thousands of smaller computers, and it’s difficult and expensive to run a huge Windows farm. You need more hardware to coordinate, and it’s always a bit fragile trying to keep them all running with a “good” configuration. With *nix you can just netboot everything from a shared image. *nixes also tend to make tuning their kernel a bit more accessible than Windows does (though if you WERE building a Windows-based supercomputer, I’m sure MS would offer up a lot of engineering support).