I understand that machines fail for numerous physical reasons. However, I’ve never understood how computers or programs that were working fine all along can suddenly crash or break down if there are no moving parts and the code hasn’t been recently patched or updated. This has bugged me for over 25 years and it finally occurred to me that I should ask about it.
Like others have pointed out, sometimes the circumstances that trigger a failure are just very, very specific. One real-world example is called integer overflow.
For background, signed numbers in computers are most often stored using a method called two’s complement. I’ll try to keep it simple, but basically it’s like a number line that starts at 0, with the negative numbers coming after the positives (0 1 2 3 -4 -3 -2 -1). So if you count one past the biggest positive number, you land on the most negative one. That example is only 3 bits’ worth of data, and older computers usually used 32.
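If it helps to see that number line come out of actual bits, here is a minimal Python sketch. The helper name `to_signed` is made up for illustration, and Python's own integers never overflow, so the wraparound is simulated by hand.

```python
def to_signed(raw: int, bits: int) -> int:
    """Interpret the low `bits` bits of `raw` as a two's complement number."""
    mask = (1 << bits) - 1          # e.g. 0b111 for 3 bits
    raw &= mask                     # throw away anything that does not fit
    sign_bit = 1 << (bits - 1)      # the top bit marks the negative half
    return raw - (1 << bits) if raw & sign_bit else raw


# Walk through all eight 3-bit patterns, 000 through 111.
number_line = [to_signed(pattern, 3) for pattern in range(8)]
print(number_line)          # [0, 1, 2, 3, -4, -3, -2, -1]

# Counting one past the largest 3-bit positive value wraps to the most negative.
print(to_signed(3 + 1, 3))  # -4
```

With 32 bits the picture is the same, just with a much longer line: the positives stop at 2,147,483,647 and the next step lands on -2,147,483,648.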
So, for example, you might be counting the number of milliseconds your server has been running. With a signed 32-bit number, that won’t overflow for about 24 days. So let’s say you plan to restart your server every week so the count resets. It might take years before the server ever has to run long enough without a restart for the count to overflow into negatives.
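To put numbers on that, here is a rough Python sketch of the uptime counter. The function `wrap_int32` and the variable names are invented for this example; they just imitate what a signed 32-bit counter would do.

```python
INT32_MAX = 2**31 - 1                  # largest signed 32-bit value
MS_PER_DAY = 24 * 60 * 60 * 1000       # 86,400,000 milliseconds


def wrap_int32(value: int) -> int:
    """Simulate storing `value` in a signed 32-bit integer (hypothetical helper)."""
    return (value + 2**31) % 2**32 - 2**31


# How long can the counter run before it overflows?
days_until_overflow = INT32_MAX / MS_PER_DAY
print(f"{days_until_overflow:.2f} days")   # 24.86 days

# Restarting weekly keeps the count well inside the safe range.
uptime_after_week = wrap_int32(7 * MS_PER_DAY)
print(uptime_after_week)                   # 604800000

# Skip the restarts once, and after 25 days the count has gone negative.
uptime_after_25_days = wrap_int32(25 * MS_PER_DAY)
print(uptime_after_25_days)                # -2134967296
```

Any code that assumes uptime is never negative (timeouts, scheduling, "has it been 5 seconds yet?" checks) can start misbehaving at that moment, even though nothing about the program itself changed.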
There are ways to account for this, like using a bigger number, not allowing negatives, or resetting your count occasionally. Usually this kind of problem happens when the developers never expected the program to need to run that long, or when it’s circumstances they just didn’t anticipate during planning.
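As a back-of-the-envelope check on how much the first two fixes buy you, here is a short Python sketch. It assumes the same millisecond counter as before, and the variable names are made up for illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 milliseconds
DAYS_PER_YEAR = 365.25

# Original setup: signed 32-bit counter.
signed32_days = (2**31 - 1) / MS_PER_DAY
print(f"signed 32-bit:   {signed32_days:.1f} days")      # 24.9 days

# "Not allowing negatives": an unsigned 32-bit counter doubles the range.
unsigned32_days = (2**32 - 1) / MS_PER_DAY
print(f"unsigned 32-bit: {unsigned32_days:.1f} days")    # 49.7 days

# "Using a bigger number": a signed 64-bit counter.
signed64_years = (2**63 - 1) / MS_PER_DAY / DAYS_PER_YEAR
print(f"signed 64-bit:   {signed64_years:,.0f} years")   # roughly 292 million years
```

Dropping the negatives only delays the problem to about 50 days, while going to 64 bits pushes it so far out that it stops mattering in practice.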
Other people have already said other good things about how and why, but hopefully this concrete example can help you conceptualize how it can happen and how it can fail to show up until years later.