Back in the 1970s, the people developing the UNIX computer operating system decided to design the system clock like this:
– Pick a particular date and time in the past (the “epoch”)
– The system clock keeps track of the number of seconds since the epoch.
They picked midnight (UTC) on January 1, 1970 for the epoch. This is reasonable so far, but the 2038 problem is due to a single bad design decision:
– They used 31 bits to store the number (it’s actually a 32-bit counter, but one bit is used as a sign bit).
Now with 2 bits, you have 4 possible patterns: 00, 01, 10, 11. You can use these 4 patterns to store the numbers 0-3.
With 3 bits, you have 8 patterns: 000, 001, 010, 011, 100, 101, 110, 111.
With 4 bits, you have 16 patterns: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111.
Whenever you add a bit, you get twice as many patterns. (Because when you have one more bit, you can do all the patterns you had before starting with an extra 0, then do all the patterns you had before *again* starting with an extra 1.)
With, say, 5 bits, you don’t have to list all the patterns to know how many there are. You just multiply the number of 4-bit patterns by 2, to give 16×2 = 32 possible patterns for 5 bits. You can also figure this out “from scratch” by multiplying five twos together, that is 32 = 2×2×2×2×2 = 2^5.
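If you want to check the doubling rule yourself, here is a minimal Python sketch. The function name `bit_patterns` is made up for this illustration; it just lists every pattern for a given number of bits and confirms the count is 2 to that power.

```python
from itertools import product

def bit_patterns(n_bits):
    """Return every pattern of n_bits as a string, e.g. '010' (illustrative helper)."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

for n in range(2, 6):
    patterns = bit_patterns(n)
    # Each extra bit doubles the count, so there are always 2**n patterns.
    assert len(patterns) == 2 ** n
    print(n, "bits:", len(patterns), "patterns")

print(bit_patterns(2))  # ['00', '01', '10', '11']
```

Listing patterns only works for small sizes; for 31 bits you would just compute `2 ** 31` directly.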
So for 31 bits, you have 2^31 possible patterns, which works out to 2^31 = 2,147,483,648. Those patterns cover the numbers 0 through 2,147,483,647, so the clock overflows when it tries to tick over to 2,147,483,648 seconds. We can translate that number of seconds into years as follows:
– There are 60 seconds in a minute.
– There are 60 minutes in an hour, so an hour is 60×60 = 3600 seconds.
– There are 24 hours in a day, so a day is 60×60×24 = 86,400 seconds.
– There are 365 days in a year, so a year is 60×60×24×365 = 31,536,000 seconds.
Dividing 2,147,483,648 by 31,536,000 gives 68.096, meaning the clock runs out of patterns approximately 68 years after January 1, 1970. (A more precise calculation, one that accounts for leap years, shows the exact time it occurs is when the clock ticks over from 03:14:07 to 03:14:08 UTC on January 19, 2038.)
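The exact rollover time can be checked with a few lines of Python. This is only a sketch: `as_date` and `tick_int32` are names invented for the example, and it assumes the counter wraps around to its most negative value on overflow, which is the commonly described failure mode but not guaranteed on every system.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_COUNT = 2**31 - 1  # largest value a signed 32-bit counter can hold

def as_date(seconds):
    """Convert a seconds-since-epoch count into a UTC date and time."""
    return EPOCH + timedelta(seconds=seconds)

def tick_int32(count):
    """Add one second, wrapping like a signed 32-bit counter (assumed behavior)."""
    return ctypes.c_int32(count + 1).value

wrapped = tick_int32(MAX_COUNT)

print(as_date(MAX_COUNT))  # 2038-01-19 03:14:07+00:00
print(wrapped)             # -2147483648
print(as_date(wrapped))    # 1901-12-13 20:45:52+00:00
```

Under that assumption, the clock does not just stop: one second after 03:14:07 it jumps back to December 1901.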
When they decided to use 31 bits, the original designers of UNIX may have been aware of this problem, but thought “Nobody will still be using our operating system in 68 years.” If that’s what they thought, they were wrong: The descendants of the UNIX operating system are very widely used today, especially in servers and mobile devices (including most smartphones, as the phone operating systems of both Apple and Google are descendants of UNIX).
Operating systems are finally [starting to fix the problem](https://en.wikipedia.org/wiki/Year_2038_problem#Implemented_solutions), but this problem is far bigger than just an OS issue. It affects a whole ecosystem, for example:
– You need one fix for your operating system, another fix for your programming language, and a third fix for programs written in that programming language.
– If you have a program that saves dates/times in a file, you have to update how they’re stored, but you still want to be able to read older files that were created with a non-updated version.
– If your file has fixed-size records, then you’re going to have a big problem: where do you put more bits for the timestamp without making the record bigger and breaking all the code that works with the old record size?
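To make the fixed-size record problem concrete, here is a small Python sketch. The record layout (a 4-byte timestamp followed by an 8-byte name) is entirely hypothetical, chosen only to show what happens to the record size when the timestamp grows to 64 bits.

```python
import struct

# Hypothetical layouts: a timestamp followed by an 8-byte name field.
OLD_RECORD = struct.Struct("<i8s")  # signed 32-bit timestamp
NEW_RECORD = struct.Struct("<q8s")  # signed 64-bit timestamp

print(OLD_RECORD.size)  # 12
print(NEW_RECORD.size)  # 16

def fits_old_record(timestamp):
    """Return True if the timestamp can be stored in the old 32-bit layout."""
    try:
        OLD_RECORD.pack(timestamp, b"event")
        return True
    except struct.error:
        return False

print(fits_old_record(2**31 - 1))  # True  (03:14:07 on January 19, 2038)
print(fits_old_record(2**31))      # False (one second later)
```

Any code that assumes every record is 12 bytes long would misread a file written with the 16-byte layout, which is why widening the field is not a drop-in fix.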
The industry faced a similar problem when dates rolled over from 1999 to 2000, which turned out to be a non-problem partly because it was easy to explain the problem and convince companies / governments to put resources into fixing it.
With the 2038 problem things are worse:
– It’s harder to explain “We’ll have a problem rolling over from 03:14:07 to 03:14:08 on January 19, 2038” than it is to explain “We’ll have a problem when the year rolls over from 99 to 00”, so it’s harder to convince people that we should be spending resources fixing it.
– We have far more computers and are far more dependent on them than we were in 2000 (and nobody thinks this trend will reverse itself between now and 2038).
– The ease of sharing code over the Internet means a lot of software isn’t self-contained: Your software may have a 2038 problem because you used Bob’s code, and Bob used Charlie’s code, and Charlie used Dan’s code, and Dan’s code has a 2038 bug. It’s not uncommon for a website to have something like 1000 different pieces of code from different people.