What kind of humongous tasks do supercomputers do? What type of mathematical models can be so complex that it requires a computer close to $1B?

The math formulas are not always complex; it's the amount of computing and data exchange that is huge. Complicated math is actually great when it lets you use one formula to predict what the system as a whole will do.

Supercomputers are needed when we don’t have an “answer” formula. We only have a set of (maybe simple) rules, and we need to apply them *everywhere* and *over time* to actually find out what happens.
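
To make “apply simple rules everywhere, over time” concrete, here is a toy Python sketch (not the real weather equations, just a made-up diffusion-like rule): each cell repeatedly nudges itself toward the average of its neighbors, and the only way to learn the end state is to actually run all the steps.

```python
# Toy example: a simple local rule (averaging with neighbours, a crude
# stand-in for heat/moisture spreading) applied everywhere, over and over.
# There is no closed-form "answer" formula for the whole grid; we just step.

import numpy as np

n_cells = 100
temperature = np.zeros(n_cells)
temperature[n_cells // 2] = 100.0   # a hot spot in the middle

for step in range(1000):
    left = np.roll(temperature, 1)
    right = np.roll(temperature, -1)
    # each cell moves a little toward the average of its two neighbours
    temperature = temperature + 0.1 * (left + right - 2 * temperature)

print(temperature.max())  # the spike has spread out and flattened
```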

Imagine simulating a huge storm cloud. You have a 10x10x10 km cube of virtual moist air and need to compute, step by step, what happens to it over time. You split it into a grid of 100x100x100 sub-cubes. Now a million separate computers can each simulate one smaller 100x100x100 meter cube with whatever method you like (maybe a grid of 100x100x100 one-meter cells, each treated as uniform inside).
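
A rough sketch of how that split might look, assuming a hypothetical `worker_id` handed out to each machine (a real code would get this from the job scheduler or its MPI rank):

```python
# Sketch of splitting the 10x10x10 km domain into 100x100x100 sub-cubes
# (1,000,000 sub-cubes of 100x100x100 metres), one per worker.

CUBES_PER_SIDE = 100          # 100 sub-cubes along each axis
CUBE_SIZE_M = 100             # each sub-cube is 100 m on a side

def my_subcube(worker_id):
    """Map a worker number (0..999999) to its sub-cube's grid coordinates."""
    x = worker_id % CUBES_PER_SIDE
    y = (worker_id // CUBES_PER_SIDE) % CUBES_PER_SIDE
    z = worker_id // (CUBES_PER_SIDE * CUBES_PER_SIDE)
    return x, y, z

def subcube_bounds_m(x, y, z):
    """Physical bounds (in metres) of the sub-cube this worker simulates."""
    return ((x * CUBE_SIZE_M, (x + 1) * CUBE_SIZE_M),
            (y * CUBE_SIZE_M, (y + 1) * CUBE_SIZE_M),
            (z * CUBE_SIZE_M, (z + 1) * CUBE_SIZE_M))

print(my_subcube(0))        # (0, 0, 0) -- one corner of the storm cloud
print(my_subcube(999999))   # (99, 99, 99) -- the opposite corner
print(subcube_bounds_m(*my_subcube(12345)))
```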

What makes this “hard” enough for a supercomputer is the information exchange with adjacent cubes. To know what happens to the air in each cube you need to know what happens on the borders with its 6 neighbors (how much air was pushed in, at what humidity and pressure, across a 100×100 grid of border cells if we use one-meter cells internally). Fast and frequent exchange is hard (the simulation has to manage what gets sent where and how often) and expensive (it needs the specialized network hardware built into a supercomputer).
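
Here is a minimal sketch of that border (“halo”) exchange between two neighboring sub-cubes, using plain NumPy array copies to stand in for what would really be network messages over the supercomputer’s interconnect, every neighbor, every time step:

```python
# Sketch of the boundary ("halo") exchange between two neighbouring sub-cubes.
# Each worker needs its neighbour's edge layer before it can compute the next
# time step for its own cells near the border.

import numpy as np

N = 100  # one-metre cells per side inside one 100 m sub-cube

# Two neighbouring sub-cubes along the x axis, each owning an N x N x N block.
left_cube = np.random.rand(N, N, N)
right_cube = np.random.rand(N, N, N)

def exchange_halo(left, right):
    """Each side hands its boundary face (a 100x100 slab of values) to the
    other, so both know the air state just across the border.
    On a real machine this is a network message, not an array copy."""
    face_from_left = left[-1, :, :].copy()    # left cube's rightmost layer
    face_from_right = right[0, :, :].copy()   # right cube's leftmost layer
    return face_from_right, face_from_left   # what each side receives

left_ghost, right_ghost = exchange_halo(left_cube, right_cube)
print(left_ghost.shape, right_ghost.shape)   # (100, 100) each, every step
```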

If your problem is not large or connected enough to need that specialized network hardware, you can compute it on a less expensive cluster of commodity hardware (the kind that used to be called a Beowulf cluster, for some reason).

If your problem is more about churning through huge datasets than about simulation, intermediate results usually need to be exchanged between nodes only a handful of times in the whole calculation. For this you can use a “Big Data” framework like Hadoop or Spark, where the parallel computation is more limited in form but is managed for you by the framework.
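
As a rough illustration of that style, here is a tiny PySpark sketch (it assumes a Spark installation; the file name and record format are made up): the framework scatters the file across the cluster, each node churns through its own chunk, and data only moves between nodes at the single reduce step.

```python
# "Big Data" style: split the dataset across many machines, let each node
# process its own chunk, and only exchange results at a few points.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("count-per-sensor").getOrCreate()
lines = spark.sparkContext.textFile("events.txt")    # split across the cluster

counts = (lines
          .map(lambda line: (line.split(",")[0], 1))  # (sensor_id, 1) per record
          .reduceByKey(lambda a, b: a + b))           # the one exchange step

print(counts.collect())   # results gathered on the driver at the very end
spark.stop()
```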

If your problem is big but the running sub-tasks don’t need to share *any* data (processing separate events from the Large Hadron Collider, trying different ways to fold proteins), you use a grid (see WLCG) or a desktop grid (see Folding@home or BOINC). These can use almost any hardware to do their job: desktop computers, computing clusters, “leftover” resources on supercomputers (though depending on the problem, the need to ship input data around may rule some of those out). Some grids can even launch their jobs in Big Data frameworks, though this is a less developed approach: Big Data originated in business, so there is less history of compatibility with the scientific ecosystem of grids, clusters and supercomputers.
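
For contrast, a small sketch of the “no sharing at all” case, using Python’s multiprocessing as a stand-in for a grid. The work function and its inputs are hypothetical; the point is simply that the workers never exchange anything while running, so any pile of machines will do.

```python
# Embarrassingly parallel: every work unit is independent, results are only
# collected at the end. "score_fold" is a made-up stand-in for real work
# (one candidate protein fold, one collider event, ...).

from multiprocessing import Pool

def score_fold(seed):
    """Pretend to evaluate one candidate fold; just burns some CPU."""
    x = seed
    for _ in range(100000):
        x = (x * 6364136223846793005 + 1442695040888963407) % 2**64
    return seed, x % 1000  # (work unit id, its score)

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(score_fold, range(32))   # 32 independent work units
    print(max(results, key=lambda r: r[1]))         # pick the best-scoring one
```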

Edit: a cube has 6 neighbors in a grid, not 8. I don’t know wtf I was thinking. Probably Minesweeper (squares including diagonals)
