Why do loading bars jump around instead of smoothly increasing in percent?

3%. 5%. 34%! 97%! 97… 98…

24 Answers

Anonymous

Let’s say you’re doing the simplest possible thing: You have InstallMyApp.exe that will copy files to your hard drive from inside itself. There’s no “download the latest version from the Internet” stuff happening, no mucking with the Registry, no installing parts of the OS like a C++ runtime or DirectX drivers.

You might think “Okay you’re writing 10,000 MB of files, you just increase the percent by 0.01% for every 1 MB you write. How hard can it be to program this?”

You’ve just made some very naive, very wrong assumptions about what’s going to happen behind the scenes when you actually run InstallMyApp.exe on an actual PC.
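For reference, the naive scheme described above amounts to something like this minimal sketch. The function name `naive_percent` and the constants are made up for illustration; the key point is that it treats every megabyte as costing the same amount of time to write.

```python
MB = 1024 * 1024


def naive_percent(bytes_written: int, total_bytes: int) -> float:
    """Percent complete, assuming every byte takes equally long to write."""
    if total_bytes <= 0:
        return 100.0  # nothing to copy, so treat it as done
    return 100.0 * bytes_written / total_bytes


total = 10_000 * MB  # the hypothetical 10,000 MB install from the text

# Each additional 1 MB moves the bar by the same fixed amount,
# no matter how long that megabyte actually took to write.
one_mb_step = naive_percent(1 * MB, total)
print(f"{one_mb_step:.2f}% per MB")
```

The arithmetic is fine; the trouble is that the bar measures bytes, while the user experiences time.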

A file has to be put into blocks on the hard disk. If you think of a disk as a notebook, there’s a front part with a “table of contents” and a back part with the actual information. When copying large files or a lot of files, the OS needs to figure out where free blocks are, and mark them as occupied.

– Smaller files have more overhead in the filesystem compared to larger files. An extremely small file (whose data fits in one block) still needs at least one “table of contents” block update to keep track of it, so at most 50% of the disk accesses you need are spent on the file’s data. On the other hand, a large file (say, > 100 MB) will use less than 1 MB worth of “table of contents” blocks. So a file’s size is not the sole factor that affects how long it will take to copy the file; the number of files matters too.

– If you have one of those new-fangled SSDs instead of a traditional HDD, your computer can seek quickly (flip between the table of contents and the back part). This effect is non-trivial to account for, and it benefits smaller files more.

– What kind of filesystem is your disk using? (Roughly, this is “how is the table of contents designed?”) This depends on what OS / version you’re using, and how your disk was originally set up. What filesystem features are enabled? FAT vs. NTFS vs. whatever the newest Windows systems use might make a difference. If you have compression turned on, that will definitely affect the performance.

– How fragmented is the filesystem? If you need 100 blocks and the OS sees a single table of contents entry that says “Blocks 259-473 are free”, it can put it all in there. If it has to cobble together a bunch of entries like “Blocks 114-117 are free, blocks 938-941 are free, block 1057 is free, blocks 1279-1318 are free…” it will have to do a lot more seeking, which will lower performance (much more so for HDD than SSD, of course).

– Caching. Whenever a program asks the OS “Please write this data to disk, and tell me when you’re done” the OS will normally *immediately* say: “OK I wrote it, you can go ahead.” *The OS is lying to the program*: The OS actually put the data into spare RAM (cache) and will slowly write it to disk later. (This is a white lie to improve the program’s performance so it doesn’t wait as much for the disk.) Of course if you keep telling the OS to write more data faster than the disk can keep up, eventually the OS will run out of spare RAM to use for cache, at which point the OS will stop lying to you about how fast you can write to the disk. If there’s 8 GB of free RAM, from your point of view, megabytes 1-9000 may write really fast (to RAM), but as soon as you get over 9000, the 9001st and later megabytes write at the disk’s actual speed (they’re still written to cache, but they have to wait for a previous megabyte to be written to disk). (If you’re curious how 9000 MB fit in 8 GB of RAM, the answer is that megabytes 1-1000 were written to disk while your program was telling the OS to write megabytes 1001-9000.)

– Compression. The files inside InstallMyApp.exe are probably compressed. That means the source file might be smaller than the destination file, and it’s no longer a matter of totalling up the disk usage: you have to account for the fact that there’s some amount of CPU usage in between. Also the ratio of bytes-in to bytes-out can be quite variable between files. Of course every system also has a different ratio of hard disk to CPU performance. And significantly older / newer / different manufacturer CPUs have different timings of instructions and out-of-order execution shenanigans, so you can’t assume a “40% faster” CPU will *always* be 40% faster. Some parts of the decompression code might run 40% faster while other parts only run 5% faster. How often which parts of the code run can vary heavily between different files or parts of files. And it all interacts with the RAM and CPU cache in non-trivial ways, even before we get to…

– Other programs. Modern OSes can run more than one program at a time. Some of these programs (“services”) are parts of the OS, or other background tasks that you might not even be aware of. These programs compete with your installer for the CPU / disk, and also affect things like the caches and overall RAM usage. Of course these other programs could do literally anything at any time, so the CPU, memory, and disk loads may be quite spiky and unpredictable.
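To make the caching point above concrete, here is a toy simulation of a write-back cache. It is only a sketch under made-up assumptions: the function name `simulate_writes`, the fixed per-tick rates, and the "flush first, then accept" ordering are all hypothetical, and a real OS is far less tidy. It uses the numbers from the caching example (8 GB of spare RAM, modeled as 8000 MB) and shows the naive percentage racing ahead while the cache has room, then crawling at disk speed.

```python
def simulate_writes(total_mb: int, cache_mb: int,
                    ram_rate: int, disk_rate: int) -> list[int]:
    """Toy model: MB the OS has 'accepted' from the program after each tick.

    ram_rate  -- MB per tick the program can hand to the cache (assumed constant)
    disk_rate -- MB per tick the disk can really flush (assumed constant)
    """
    if ram_rate <= 0 or disk_rate <= 0:
        raise ValueError("rates must be positive")
    accepted = 0  # MB the program has been told are "written"
    on_disk = 0   # MB that have actually reached the disk
    history = []
    while accepted < total_mb:
        # The disk flushes some of whatever is sitting in the cache.
        on_disk += min(disk_rate, accepted - on_disk)
        # The program hands over as much as the cache has room for.
        free = cache_mb - (accepted - on_disk)
        accepted += min(ram_rate, free, total_mb - accepted)
        history.append(accepted)
    return history


# Hypothetical rates: RAM accepts 900 MB/tick, the disk flushes 100 MB/tick.
history = simulate_writes(total_mb=10_000, cache_mb=8_000,
                          ram_rate=900, disk_rate=100)
for tick, mb in enumerate(history, start=1):
    print(f"tick {tick:2d}: {100 * mb / 10_000:5.1f}%")
```

With these invented rates, the bar climbs about 9 percentage points per tick until the cache fills at around 9000 MB, then only 1 point per tick after that, even though nothing about the installer itself changed.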

You might have been able to write a good progress bar in 1982, when just about every PC ran MS-DOS: a very simple operating system with a single, very simple filesystem, that ran only one program at a time and had no caching. Back then you could perhaps make some widely applicable assumptions about typical CPU and disk drive types, speeds and performance characteristics.
