What kind of humongous tasks do supercomputers do? What type of mathematical models can be so complex that they require a computer costing close to $1B?

Mostly simulations: nuclear explosions, climate/weather, earthquakes and other things of that nature, where it would be worth a whole lot to know more about them. Simulating them in detail involves a lot of computation, because the underlying systems are chaotic, so small details matter.
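
To make the "chaotic nature" part concrete, here is a toy sketch (mine, with purely illustrative numbers) using the classic Lorenz system: two starting points that differ by one part in a billion end up in completely different places, which is why you need fine detail and tiny timesteps, and therefore enormous amounts of computation, to predict such systems for any length of time.

```python
# Toy illustration of chaos: the Lorenz system integrated with simple Euler steps
# from two almost-identical starting points. Tiny initial differences grow fast,
# which is why detailed simulation of chaotic systems is so expensive.
import numpy as np

def lorenz_step(state, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.array([dx, dy, dz])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-9, 0.0, 0.0])      # differs by one part in a billion

for step in range(40000):               # 40 simulated seconds at dt = 0.001
    a, b = lorenz_step(a), lorenz_step(b)
    if step % 10000 == 0:
        print(f"t = {step / 1000:4.0f} s   separation = {np.linalg.norm(a - b):.3e}")
```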

You can learn more by looking at the Top 500 list of supercomputers, in particular the sublist broken down by application area.

https://www.top500.org/statistics/list/

Keep in mind, though, that much of what governments do with them is classified. (Hint: the US Department of Energy is in charge of the nukes.)

We make simplifications constantly, for everything.

Supercomputers allow us to get answers without having to simplify as much: they can be used to more accurately predict weather patterns, visualize the result of protein folding and interactions with other molecules for genetic and drug research, potentially model a brain down to the individual neurotransmitter…

Sometimes a simplified model can be good enough and can predict the state of something in the near future. However, the more accurate you want your result, or the further into the future you want to look, the fewer assumptions you are allowed to make and the more computing power is required to get every detail just right.
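
To put rough numbers on that trade-off (my illustrative numbers, not anyone's real model): in a 3D simulation, halving the grid spacing gives you 8x the cells, and stability usually forces roughly twice as many timesteps, so every "just a bit more detail" costs about 16x the work.

```python
# Rough cost scaling for a 3D grid simulation (illustrative numbers only):
# halving the grid spacing gives 2**3 = 8x the cells, and stability usually
# forces ~2x more timesteps, so each refinement costs roughly 16x more work.
base_cells = 100 ** 3        # e.g. a 100x100x100 grid
base_steps = 1_000           # timesteps for the coarse run

for halvings in range(5):
    cells = base_cells * 8 ** halvings
    steps = base_steps * 2 ** halvings
    work = cells * steps     # cell-updates, a crude proxy for compute time
    print(f"spacing / {2 ** halvings:>2}: {cells:>15,} cells x {steps:>6,} steps "
          f"= {work:.2e} cell-updates")
```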

Essentially, if something is very important to us and can be broken down into smaller and smaller components, supercomputers are good for it.

Cryptology and simulations, mostly. And it’s less “we need a supercomputer” and more “it’s nice to have one”.

For most of them, it’s not that a single problem runs for a long time; people book time on the machine. So instead of many people using medium-sized computers to run their complex code over weeks/months/years, they each book a smaller slot on a supercomputer.

Often you come up with a model that you want to test, let’s say for weather prediction. It’s complex and would take months to run on a normal PC. Instead, you run it for a few hours on a supercomputer, look at the results, compare them to real-world results, adjust your model accordingly, run it again a few days or weeks later, and so on. This is done for lots of different complex mathematical models in all sorts of different areas.
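
Here is a toy sketch of that tune-and-rerun loop, with everything invented for illustration; in real life the "run the model" step is the part that eats hours of supercomputer time per call.

```python
# Toy version of the "run, compare to reality, adjust, rerun" loop.
# The 'model' and 'observations' here are made up; a real weather model
# would be the part that needs hours on a supercomputer per run.
import numpy as np

rng = np.random.default_rng(0)
observations = 3.0 * np.arange(10) + rng.normal(0, 0.5, 10)  # pretend real-world data

def run_model(sensitivity):
    """Stand-in for an expensive simulation with one tunable parameter."""
    return sensitivity * np.arange(10)

sensitivity = 1.0
for iteration in range(20):
    forecast = run_model(sensitivity)           # the expensive step
    error = np.mean(forecast - observations)    # compare with reality
    sensitivity -= 0.1 * error                  # adjust the model, then rerun
    print(f"iteration {iteration:2d}: sensitivity={sensitivity:.3f}, mean error={error:+.3f}")
```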

Also, if you are doing crypto, it’s usually something you don’t need all the time, but when you do need it, you don’t have months or years to wait for the results.

Traditionally – computational fluid dynamics, weather, “high energy physics”

More recently – computational chemistry, comp life sciences like protein folding and DNA analysis.

Computer generated imagery. Machine learning.

Fun fact – most of the non-NSA DoD HPC is… classified weather forecasting. They use a lot of the same programs as the weather services, but the exact areas they are focusing on are the secret part.

For example, “why are you calculating the flying weather over a specific city in Ukraine for tomorrow night?” or “Why are you calculating the wave height on a specific beach in area XXX on the next moonless night?”

The math formulas are not always complex; it’s the computing and data exchange that are huge. Complicated math is actually great when it allows you to use one formula to predict what will happen to the system as a whole.

Supercomputers are needed when we don’t have the “answer” formula, we have a set of (maybe simple) rules, but need to apply them *everywhere* and *over time* to actually find out what happens.

Imagine simulating a huge storm cloud. You have a cube of 10x10x10 km of virtual moist air and need to compute step by step what happens to it over time. You split it into a grid of 100x100x100 cubes. So now you can use a million separate computers, each simulating a smaller 100x100x100 meter cube with whatever method you like (maybe a grid of 100x100x100 1-meter cubes, each treated as uniform inside).

What makes this “hard” enough for a supercomputer is the information exchange with adjacent cubes. To know what happens to the air in each cube, you need to know what happens at the border with each of its 6 neighbors (how much air was pushed in, at what humidity and pressure, across a 100×100 grid of faces if we use 1-meter cubes internally). Fast and frequent exchange is hard (the simulation needs to manage what gets sent where and how often) and expensive (it needs specialized network hardware built into the supercomputer).
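
Here is a toy 1D version of that neighbor exchange in plain NumPy (a real run would spread the chunks across thousands of nodes and do the exchange over the network, typically with MPI; here the "nodes" are just array chunks in one process):

```python
# Toy 1D version of the "exchange with neighbors" problem, using plain NumPy.
# Each chunk plays the role of one compute node's sub-cube; the edge values it
# swaps every step are the "halo" that real supercomputer networks exist to move fast.
import numpy as np

n_chunks, chunk_size = 4, 10
# Each chunk gets one "ghost" cell on each end to hold its neighbor's edge value.
chunks = [np.zeros(chunk_size + 2) for _ in range(n_chunks)]
chunks[0][1:-1] = 100.0                      # heat one sub-domain and watch it spread

for step in range(200):
    # 1) Halo exchange: copy edge values between adjacent chunks.
    #    On a real machine this is the network traffic between compute nodes.
    for i in range(n_chunks - 1):
        chunks[i][-1] = chunks[i + 1][1]     # my right ghost <- right neighbor's edge
        chunks[i + 1][0] = chunks[i][-2]     # their left ghost <- my right edge
    chunks[0][0] = chunks[0][1]              # insulated outer walls
    chunks[-1][-1] = chunks[-1][-2]
    # 2) Local update: simple diffusion on each chunk's own cells.
    for c in chunks:
        c[1:-1] += 0.1 * (c[:-2] - 2.0 * c[1:-1] + c[2:])

interior = np.concatenate([c[1:-1] for c in chunks])
print(f"total heat: {interior.sum():.1f} (roughly conserved), hottest cell now: {interior.max():.1f}")
```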

If your problem is not large or connected enough to need specialized network hardware, you can compute it on a less expensive commodity-hardware cluster (which used to be called a Beowulf cluster, for some reason).

If your problem is more about churning through huge datasets and not simulation, usually intermediate results need to be exchanged between nodes only several times in the whole calculation. For this you can use a “Big Data” framework, like Hadoop or Spark, where the parallel computation is more limited, but it’s managed for you by the framework.
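
For a flavour of that style, here is a minimal PySpark sketch (this assumes you have pyspark and a local Spark runtime installed, and the data is made up): the per-record work is spread across the cluster, and the only cross-node data exchange is the shuffle triggered by reduceByKey.

```python
# Minimal "Big Data" style sketch with PySpark (assumes pyspark is installed).
# Counts records per station; in a real job the dataset would be huge and distributed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("churn-through-logs").getOrCreate()

# Each line: "station_id,temperature" (made-up data standing in for a huge dataset).
rdd = spark.sparkContext.parallelize([
    "A,12.5", "B,9.1", "A,13.0", "C,7.8", "B,10.2",
])

counts = (rdd.map(lambda line: (line.split(",")[0], 1))
             .reduceByKey(lambda a, b: a + b)   # the only cross-node exchange (shuffle)
             .collect())
print(counts)
spark.stop()
```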

If your problem is big but does not need sharing *any* data between running sub-tasks (processing separate events from the Large Hadron Collider, trying different ways to fold proteins), you use a grid (see WLCG) or a desktop grid (see Folding At Home or BOINC). They can use any hardware to do their job (though depending on the problem they may need to get input data that makes one of those unusable): desktop computers, computing clusters, “leftover” resources on supercomputers. Some grids may even be able to start their jobs in Big Data frameworks, though this is a less developed approach. Big Data originated in business, so there is less history of compatibility with the scientific ecosystem (grids, clusters, supercomputers).
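
And here is a minimal sketch of the "no sharing at all" style, using Python's multiprocessing as a stand-in for a grid (real grids farm the work units out to volunteer or cluster machines; here they are just local processes):

```python
# Minimal "embarrassingly parallel" sketch: independent work units with no
# communication between them while they run -- the pattern behind desktop grids
# like BOINC (the workers here are local processes, not volunteer machines).
from multiprocessing import Pool

def work_unit(seed):
    """Stand-in for one independent job, e.g. one protein-folding attempt."""
    x = seed
    for _ in range(100_000):
        x = (1103515245 * x + 12345) % 2**31   # crunch some numbers
    return seed, x

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(work_unit, range(16))   # 16 independent jobs
    print(results[:3])
```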

Edit: a cube has 6 neighbors in a grid, not 8. I don’t know wtf I was thinking. Probably Minesweeper (squares including diagonals)

Tesla uses theirs for machine learning. It analyzes all the data from the self-driving cars.

Most are modeling things like weather, protein structures, and AI/machine learning

An example of a recent use case: machine learning.

Although AI and ML sound super fancy (and in some ways are), the core concept of machine learning is “if I have billions of multiplications one after the other, and make them the *right numbers*, I can tell whether the input image was a cat”. But for that, you need to “train” your ML model, which means you need to update those billions of multiplications thousands of times.

This requires tons of computing power. Supercomputers are the equivalent of thousands of PCs, so they just do these multiplications much faster. It turns out that simply increasing how many multiplications you do really does make the algorithm “smarter”, so ML researchers keep adding more and more, and they need bigger and bigger computers.
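
Scaled down from billions of multiplications to four, here is a toy sketch of what "training" means (all numbers invented): a one-layer model is just a matrix multiply, and training is nudging the numbers in that matrix over and over until the output matches.

```python
# Scaled-down sketch of "training is just adjusting lots of multiplications":
# a one-layer model (a single matrix multiply) learns a made-up rule.
# Real networks do the same thing with billions of weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))              # made-up input data
true_w = np.array([2.0, -1.0, 0.5, 3.0])   # the rule we pretend not to know
y = X @ true_w

w = np.zeros(4)                            # the "multiplications" we will adjust
for step in range(500):
    pred = X @ w                           # forward pass: just multiplications
    grad = 2 * X.T @ (pred - y) / len(X)   # how wrong each weight is, on average
    w -= 0.1 * grad                        # nudge the weights ("training")

print("learned:", np.round(w, 3), " true:", true_w)
```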

Source: I do exactly that, in a supercomputer.

It’s proportional to the complexity of the calculations. Some examples might be:

– fluid dynamics

– seismic phenomena

– weather simulation/prediction

– protein modeling/decoding

– other kinds of models, like ecosystems or the universe, where you run simulations in which these evolve at a quicker rate than they normally would.

Reinsurance modelling requires ridiculously large models. You basically want to model the probability of every possible natural disaster based on every possible input in every country you insure. Which, for many reinsurers, is most of them.
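
A tiny Monte Carlo sketch of the idea, with every probability and loss figure invented: simulate a huge number of possible years and look at the tail of the total-loss distribution. Real catastrophe models do this per peril, per region, per policy, which is where the size comes from.

```python
# Toy catastrophe-model sketch: simulate many possible "years" of disasters and
# look at the tail of the total-loss distribution. All numbers here are invented;
# real reinsurance models do this per peril, per region, per policy.
import numpy as np

rng = np.random.default_rng(42)
n_years = 100_000

# Pretend portfolio: (annual probability of an event, mean loss in $M) per peril.
perils = {"hurricane": (0.30, 800.0), "earthquake": (0.05, 2500.0), "flood": (0.50, 150.0)}

total_loss = np.zeros(n_years)
for name, (prob, mean_loss) in perils.items():
    events = rng.random(n_years) < prob              # did it happen this year?
    severity = rng.exponential(mean_loss, n_years)   # how bad, if it did
    total_loss += events * severity

print(f"average annual loss: ${total_loss.mean():,.0f}M")
print(f"1-in-200-year loss : ${np.quantile(total_loss, 1 - 1/200):,.0f}M")
```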

I remember a fulla telling me about a time they took him into the supercomputer room at Swiss Re. Apparently it was, no shit, a double key-turn entry with security guards, opening into a few dudes in a room running code.

Pretty cool.

It’s why I always find it hilarious when people deny climate change is even happening. Like, buddy, hundreds of billions of dollars worth of private capital is being redeployed on the basis of simulations run by some of the most expensive computers in the world, programmed by some of the smartest, highest-paid people in the world, and you think that all the competitors, at the same time, are independently too stupid to exploit the market inefficiency?

How about that Sgr A* data crunching?

An example could be trying to find a new cure for some disease. This requires a lot of calculations, as far as I understand.

While I don’t know every use for them, one is modeling the galaxy and universe.

Keeping track of every particle in the model: speed, direction, mass, energy/temperature. Then making sure each particle properly interacts with every other particle the model is tracking… it starts to add up.

It’s not so much that it’s overly complex; there’s just an insane number of calculations that need to be made to move the model forward any amount of time.
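
A small sketch of why that blows up (simplified gravity, made-up units): with N bodies there are roughly N²/2 interacting pairs to evaluate every single timestep, so doubling the particle count quadruples the work.

```python
# Why particle models get expensive: every body interacts with every other body,
# so one timestep costs ~N^2 pair calculations (simplified gravity, made-up units).
import numpy as np

rng = np.random.default_rng(1)
n = 500
pos = rng.normal(size=(n, 3))
mass = rng.uniform(1.0, 10.0, size=n)

# Pairwise displacement vectors between all bodies: an (n, n, 3) array.
diff = pos[None, :, :] - pos[:, None, :]
dist = np.linalg.norm(diff, axis=-1) + np.eye(n)   # avoid dividing by zero on the diagonal
force = (mass[:, None] * mass[None, :] / dist**3)[..., None] * diff
force[np.arange(n), np.arange(n)] = 0.0            # no self-interaction
accel = force.sum(axis=1) / mass[:, None]

print(f"{n} bodies -> {n * (n - 1) // 2:,} interacting pairs per timestep")
print("double the bodies and each step costs ~4x as much")
```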

I used supercomputers to simulate the universe in ridiculously high detail where things get interesting and in lower detail where not much is happening. You can actually run small simulations on your home computer, but you can’t get anywhere near enough detail to match the phenomena we observe in local stars and distant galaxies. Something that would take months to run on my laptop can be done overnight on the right supercomputer. Give it a shot yourself if you’re up for it: https://enzo-project.org/

They have a supercomputer at the university I went to, and one of my astrophysics professors used time on it for a (partial) universe simulation.

Take a deck of 52 playing cards. The number of possibilities of the order of the cards in that deck when properly shuffled is 52 factorial (written as “52!”), which is 52 × 51 × 50 × 49 × 48 … × 2 × 1.

The product is a 68-digit number. How large is that? I defer to data scientist Scott Czepiel:

>How large, really, is 52 Factorial?

>This number is beyond astronomically large. So, just how large is it? Let’s try to wrap our puny human brains around the magnitude of this number with a fun little theoretical exercise.

>Start a timer that will count down the number of seconds from 52! to 0. We’re going to see how much fun we can have before the timer counts down all the way.

>Start by picking your favorite spot on the equator. You’re going to walk around the world along the equator, but take a very leisurely pace of one step every billion years. 

>After you complete your round the world trip, remove one drop of water from the Pacific Ocean.

>Now do the same thing again: walk around the world at one billion years per step, removing one drop of water from the Pacific Ocean each time you circle the globe.

>Continue until the ocean is empty.

>Once it’s empty, take one sheet of paper and place it flat on the ground.

>Now, fill the ocean back up and start the entire process all over again, adding a sheet of paper to the stack each time you’ve emptied the ocean.

>Do this until the stack of paper reaches from the Earth to the Sun.

>(Take a glance at the timer, you will see that the three left-most digits haven’t even changed. You still have 8.063e67 more seconds to go.)

>So, repeat the entire process. One step every billion years, one water drop every time around, one sheet of paper every ocean. Build a second stack to the Sun.

>Now build 1000 more stacks.

>Good news! You’re just about a third of the way done!

>To pass the remaining time, start shuffling your deck of cards. Every billion years deal yourself a 5-card poker hand.

>Each time you get a royal flush, buy yourself a lottery ticket.

>If that ticket wins the jackpot, throw a grain of sand into the Grand Canyon.

>Keep dealing, and when you’ve filled up the entire canyon with sand, remove one ounce of rock from Mt. Everest.

>Empty out the sand and start over again. Play some poker, buy lotto tickets, drop grains of sand, and chisel some rock. When you’ve removed all 357 trillion pounds of Mt. Everest, look at the timer; you still have 5.364e67 seconds remaining.

>Do that whole mountain levelling thing 255 more times. You would still be looking at 3.024e64 seconds.

>The timer would finally reach zero sometime during your 256th attempt.

>But, let’s be realistic here. In truth you wouldn’t make it more than five steps around the earth before the Sun becomes a Red Giant and boils off the oceans. You’d still be shuffling while all the stars in the universe slowly flickered out into a vast cosmic nothingness.

That’s just a deck of playing cards. Now imagine trying to model the possibilities of Earth’s climate, for example.
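
For what it's worth, the number itself takes two lines of Python to compute; it's enumerating the possibilities that's hopeless:

```python
# The exact value of 52! is trivial to compute; exploring the possibilities is not.
import math

permutations = math.factorial(52)
print(permutations)                                              # the exact value of 52!
print(f"{len(str(permutations))} digits, roughly {permutations:.3e}")
```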

My company buys supercomputer time to run complex weather models. We turn out two runs per day. Each model run takes 2 hours, and that’s with more than 4,000 cores dedicated to the computations. It’s worth it because we can provide very accurate weather information (over the next 48 hours) to paying customers.

Oil companies use them to model stuff, others use them to do complex mathematics, etc. Basically any computationally-intensive job benefits from the power of a supercomputer.

Interestingly, GPUs are used in many of these situations nowadays – a serious reduction in cost!

You would be surprised how much computational power “simple models” need, and it’s more that those tasks could always use a bit more computer time; there’s no fixed number of “computer hours” required.

Also academics write bad code.

I see little or no mention of the industrial business uses for this type of computing. Oil and gas geological research. Stock futures forecast modeling. Air and water fluid dynamics modeling for aerospace engineering and contract military work.

I would imagine a lot of the heavy lifting these machines do outside of nice science is actually just market analysis. Trades are executed in microseconds, and it really just comes down to an arms race in the end. When retail wins, it’s priced in. When retail loses, it’s priced in.