What some comments here have not mentioned is that it could also just be bad or lazy programming. By default a program can usually only do one thing at a time, and you have to do extra work to make it handle multiple tasks at once, so you end up with a program that can’t perform the task and update the loading bar at the same time.
While it’s true that some loading bars are arbitrary because you can’t really calculate the progress, there are some where you can, like copying X number of files or processing X number of rows in a spreadsheet.
Your explanation is contained in this [article](https://www.rockpapershotgun.com/yes-video-game-loading-bars-are-fake-indie-devs-admit).
If you don’t have time or patience for that, the ***TLDR*** is that people generally don’t trust smooth loading bars: stuttering progress tends to be viewed as more believable than smooth, accurate tracking. It’s counterintuitive, but hey – so are people.
Not all operations are linear, and most operations aren’t predictable as to when they will finish, particularly if they are complex. For instance, if your computer is to copy 1000 files, there is no constant rate of file transfer. Some files may take a second to transfer, others may take 10 seconds, others may take much longer. A lot of these “progress” indicators use a very cheap but fast way to give you feedback along the lines of “I’m working, I’m _this_ far” – but it’s not linear. Having taken 5 minutes to reach 50% doesn’t mean there are 5 minutes left. There’s no way to predict how fast a file will be copied – it depends on a lot of things, not just size. And the programmer never set out to tell you how long the operation would take to begin with.
Say you’re making a program (app, game, whatever) and you want to show the user that behind the scenes *something* is happening so that they don’t think the program is frozen or locked up. You could go with a spinning circle or hourglass, that will at least tell them something is happening, but it doesn’t give the impression that the process has an *end* or *finish* that it’s working toward.
So you put in a progress bar. Great. Except… How do you know exactly how much progress to show? How far along is it? You’ve got to measure something.
Maybe you’re copying files. The easiest thing to do is count how many files you’re copying, divide up the progress bar into that many equal sections, then every time a file finishes copying you move the bar along a chunk.
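In code, that first approach really is only a few lines. Here’s a minimal Python sketch (the function name and the flat list of files are hypothetical, and it assumes everything just gets copied into one destination folder):

```python
import shutil

def copy_with_file_count_progress(files, dest_dir):
    """Advance the bar one equal chunk per file, no matter how big the file is."""
    total = len(files)
    for i, path in enumerate(files, start=1):
        shutil.copy(path, dest_dir)        # the actual work
        percent = 100 * i / total          # every file counts the same
        print(f"\rProgress: {percent:5.1f}%", end="", flush=True)
    print()
```

Simple and cheap, but as the next part explains, it lurches whenever the files aren’t all roughly the same size.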
But there’s a problem with that. What if you have 9 teeny tiny files, like text files that have config info or key bindings or something, and one HUGE file that has all the graphical textures or music or something. You’ve got 10 files, but the 9 small ones will jump the progress bar ahead quickly, and it’ll kind of stop for a long time for the big one. If the big one is near the start, it goes slowly at first, then jumps from 10% to 100% super fast. If it’s at the end the first 90% happens instantly but the last 10% takes **forever** to finish.
So you need something better. With files, maybe you could add up the total size of everything in MB, then divide the bar up by that. But then you need to spend extra time gathering that info and you need a way to know how much of a file has been copied. That’s more complex. It’ll make the bar behave better, but if you tell the boss that you can have the first way done this morning and the second way done by the end of the week, he’ll be like “It’s a progress bar. Who cares? Just get it done quick, the project is already behind”…
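A rough sketch of that second, byte-weighted approach (again with made-up names, and assuming plain file-to-file copies):

```python
import os

def copy_with_byte_progress(files, dest_dir, chunk_size=1024 * 1024):
    """Weight the bar by bytes copied rather than by number of files."""
    total_bytes = max(sum(os.path.getsize(p) for p in files), 1)  # extra pass up front
    copied = 0
    for path in files:
        dest = os.path.join(dest_dir, os.path.basename(path))
        with open(path, "rb") as src, open(dest, "wb") as dst:
            while chunk := src.read(chunk_size):   # copy in chunks so the bar
                dst.write(chunk)                   # can move mid-file
                copied += len(chunk)
                print(f"\rProgress: {100 * copied / total_bytes:5.1f}%", end="", flush=True)
    print()
```

Notice it’s already doing noticeably more work than the first version: an extra pass to add up sizes, plus a hand-rolled chunked copy instead of a one-line `shutil.copy`. That’s the “end of the week” estimate in miniature.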
And that’s if you’re counting file copying. Other operations and processes have their own challenges finding things to count to indicate how far along they are.
So you think “wait, I could count TIME and use that!”. So you run the process once, time how long it takes in seconds, and tell the bar that it takes X seconds to go from 0 to 100.
But then what happens if the client/user’s computer is slower or faster at doing the process than yours? You could have a progress bar that gets “stuck” at 100% for two minutes because it counted up the three minutes it took on your machine, but it will take five here. Or your process could finish when the bar is at 50% because their computer is super fast. What do you do then? Do you code a way to see if the process is done then just skip to 100% if it finishes early? Or do you just let it count to 3min no matter how long it takes?
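A sketch of that time-based version, assuming the developer measured `expected_seconds` on their own machine and that `is_done` is some hypothetical way to check whether the real work has finished. It caps at 99% so it never claims to be done early, and jumps to 100% whenever the work actually ends:

```python
import time

def show_timed_progress(expected_seconds, is_done):
    """Fake progress against a duration pre-measured on the developer's machine."""
    start = time.monotonic()
    while not is_done():
        elapsed = time.monotonic() - start
        percent = min(99.0, 100 * elapsed / expected_seconds)  # never claim 100% early
        print(f"\rProgress: {percent:5.1f}%", end="", flush=True)
        time.sleep(0.1)
    print("\rProgress: 100.0%")
```

On a slower machine this is exactly the bar that sits at 99% for two minutes; on a faster one it skips straight from somewhere in the middle to 100%.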
Hopefully this look behind the scenes of building a program with a progress bar sheds some light on why it’s so hard to make a smooth and accurate one, and why so many of them suck.
And of course, relevant XKCD from the days when the Windows File Copy dialog counted the old way (# of files) instead of the new way (% of bytes total):
https://xkcd.com/612/
During a loading process, the program knows it has a certain number of tasks to perform, so developers typically just program the loading bar to advance as each task completes. However, each task can take a different amount of time.
Imagine that you have a to-do list for your day with 10 items on it. As you finish each item, you check it off the list. The first 5 items take 10 minutes, but the remaining items take the rest of the day. As such, you’ll be 50% done with your list in 10 minutes and the remaining 50% takes the rest of the day.
You could have estimated how long each task would take before starting and then tracked your percent complete based on how long it’s been since you started, but what happens if your estimate was incorrect? It’s easier just to keep track of how many items on the list you’ve finished.
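For what it’s worth, here are those two bookkeeping styles side by side, with made-up numbers; the second one is only as honest as the original guess:

```python
# Made-up numbers for the to-do list example above.
items_done, total_items = 5, 10
print(f"{100 * items_done / total_items:.0f}% by item count")   # 50%, after only 10 minutes

elapsed_min, guessed_total_min = 10, 480   # suppose you guessed the list takes 8 hours
print(f"{100 * elapsed_min / guessed_total_min:.0f}% by time estimate")  # ~2%, if the guess was right
```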
Let’s say you’re doing the simplest possible thing: You have InstallMyApp.exe that will copy files to your hard drive from inside itself. There’s no “download the latest version from the Internet” stuff happening, no mucking with the Registry, no installing parts of the OS like a C++ runtime or DirectX drivers.
You might think “Okay you’re writing 10,000 MB of files, you just increase the percent by 0.01% for every 1 MB you write. How hard can it be to program this?”
You’ve just made some very naive, very wrong assumptions about what’s going to happen behind the scenes when you actually run InstallMyApp.exe on an actual PC.
A file has to be put into blocks on the hard disk. If you think of a disk as a notebook, there’s a front part with a “table of contents” and a back part with the actual information. When copying large files or a lot of files, the OS needs to figure out where free blocks are, and mark them as occupied.
– Smaller files have more filesystem overhead compared to larger files. An extremely small file (whose data fits in one block) still needs at least one “table of contents” block update to keep track of it, so only about 50% of the disk accesses you need are spent on the file’s data. On the other hand, a large file (say, > 100 MB) will use less than 1 MB worth of “table of contents” blocks. So a file’s size is not the sole factor that affects how long it will take to copy; there’s a rough back-of-the-envelope sketch of this effect after the list.
– If you have one of those new-fangled SSD disks, compared to a traditional HDD, your computer can seek quickly (flip between the table of contents and the back part). It’s non-trivial to account for this effect, and it benefits smaller files more.
– What kind of filesystem is your disk using? (Roughly, this is “how is the table of contents designed?”) This depends on what OS / version you’re using and how your disk was originally set up. What filesystem features are enabled? FAT vs. NTFS vs. whatever the newest Windows systems use might make a difference. If you have compression turned on, that will definitely affect performance.
– How fragmented is the filesystem? If you need 100 blocks and the OS sees a single table-of-contents entry that says “Blocks 259-473 are free,” it can put it all in there. If it has to cobble together a bunch of entries like “Blocks 114-117 are free, blocks 938-941 are free, block 1057 is free, blocks 1279-1318 are free…” it will have to do a lot more seeking, which lowers performance — much more so for HDD than SSD, of course.
– Caching. Whenever a program asks the OS “Please write this data to disk, and tell me when you’re done” the OS will normally *immediately* say: “OK I wrote it, you can go ahead.” *The OS is lying to the program*: The OS actually put the data into spare RAM (cache) and will slowly write it to disk later. (This is a white lie to improve the program’s performance so it doesn’t wait as much for the disk.) Of course if you keep telling the OS to write more data faster than the disk can keep up, eventually the OS will run out of spare RAM to use for cache — at which point the OS will stop lying to you about how fast you can write to the disk. If there’s 8 GB of free RAM, from your point of view, megabytes 1-9000 may write really fast (to RAM) — but as soon as you get over 9000, the 9001st and later megabytes write at the disk’s actual speed (they’re still written to cache, but they have to wait for a previous megabyte to be written to disk). (If you’re curious how 9000 MB fit in 8 GB of RAM, the answer is that megabytes 1-1000 were written to disk while your program was telling the OS to write megabytes 1001-9000.)
– Compression. The files inside InstallMyApp.exe are probably compressed. That means the source file might be smaller than the destination file, and it’s no longer just a matter of totalling up the disk usage; you have to account for some amount of CPU usage in between. Also, the ratio of bytes-in to bytes-out can vary quite a bit between files. Of course, every system also has a different ratio of hard-disk to CPU performance. And significantly older / newer / different-manufacturer CPUs have different instruction timings and out-of-order execution shenanigans, so you can’t assume a “40% faster” CPU will *always* be 40% faster. Some parts of the decompression code might run 40% faster while other parts only run 5% faster. How often each part of the code runs can vary heavily between different files or parts of files. And it all interacts with the RAM and CPU cache in non-trivial ways, even before we get to…
– Other programs. Modern OS’s can run more than one program at a time. Some of these programs (“services”) are parts of the OS, or other background tasks that you might not even be aware of. These programs compete with your installer for the CPU / disk, and also affect things like the caches and overall RAM usage. Of course these other programs could do literally anything at any time — the CPU, memory, and disk loads may be quite spiky and unpredictable.
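To put a rough number on the small-file overhead from the first bullet: here’s a back-of-the-envelope sketch where the throughput and per-file cost are invented purely for illustration, not measured from any real disk.

```python
# Hypothetical numbers, only to show how per-file overhead skews a naive estimate.
THROUGHPUT_MB_S = 150       # assumed sequential write speed
PER_FILE_OVERHEAD_S = 0.01  # assumed metadata ("table of contents") cost per file

def naive_estimate_seconds(total_mb):
    """What a 'just count the bytes' progress bar implicitly assumes."""
    return total_mb / THROUGHPUT_MB_S

def overhead_aware_estimate_seconds(file_sizes_mb):
    """Same bytes, but every file also pays a fixed metadata cost."""
    return sum(size / THROUGHPUT_MB_S + PER_FILE_OVERHEAD_S for size in file_sizes_mb)

one_big = [10_000.0]             # one 10 GB file
many_small = [0.01] * 1_000_000  # a million 10 KB files, same total bytes

print(naive_estimate_seconds(10_000))               # ~67 s either way, says the naive bar
print(overhead_aware_estimate_seconds(one_big))     # ~67 s
print(overhead_aware_estimate_seconds(many_small))  # ~10,067 s: overhead dominates
```

Same total bytes, wildly different wall-clock time, which is why a bar that only counts bytes can still be way off.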
You might have been able to write a good progress bar in 1982 when just about every PC ran MS-DOS, a very simple operating system with a single, very simple filesystem, that only ran one program at a time, had no caching, and you could perhaps make some widely applicable assumptions about typical CPU and disk drive types, speeds and performance characteristics.
Because in most cases it’s impossible to predict how long some processes will take, especially when the algorithm behind your progress bar tries to cover many cases. Copying tens of thousands of small files is going to be very slow; copying one or two big files is fast; each of those cases is predictable on its own. But now imagine the user is copying some typical mix of data, with big and small files in random order. It’s really hard to predict how long that will take, and generally not worth it. If you wanted your program to be precise, you’d have to benchmark the media first (SSD, HDD or whatever it is), then look up every file’s size, then calculate the estimate. That means an extra pass over every file before the copy even starts, which is inefficient and isn’t worth it just to make the progress bar move steadily.
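That extra “look up every file” pass is roughly this (a sketch, assuming a plain directory tree):

```python
import os

def total_bytes_to_copy(src_dir):
    """Walk the whole source tree up front, just to learn what '100%' will mean."""
    total = 0
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total
```

On a folder with hundreds of thousands of files, that walk alone can take long enough that you’d almost want a progress bar for it.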