Think about the counting numbers. Basically, counting numbers pile on leading 1’s first. Only when they’ve finished doing that do they start on the 2’s. Leading 3’s don’t get a look in until the 1’s and 2’s have finished. And so on. 9’s don’t catch up until the number is a string of 9’s and about to tick over to the next power of 10. At which point the whole cycle restarts; 1’s take over again, and start stretching their lead back out.
For the vast majority of the time, in other words, the leading digits the count has produced so far are mostly the lower ones, with the lowest most common. Many real-world numbers reflect counting to some degree, so they tend to show the same skewed first digits. Invented ones often don’t, because we’re lousy at making up “realistic” data. That is why checking first digits is often a useful test of whether a dataset is genuine rather than invented.
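If you want to see the pile-up for yourself, here is a rough Python sketch. It simply counts from 1 up to some stopping point and tallies the first digit of every number along the way. The function name `leading_digit_counts` and the stopping points are made up for illustration, and this only models plain counting, not any real dataset.

```python
from collections import Counter

def leading_digit_counts(n):
    """Tally the first digit of every integer from 1 to n.

    Returns a list of nine counts, for leading digits 1 through 9.
    (Hypothetical helper, written only to illustrate the counting argument.)
    """
    counts = Counter(int(str(k)[0]) for k in range(1, n + 1))
    return [counts[d] for d in range(1, 10)]

# Stop the count at a few arbitrary places and look at the tallies so far.
for n in (20, 350, 999, 2500):
    print(n, leading_digit_counts(n))
```

Stopping at 20 gives eleven numbers that start with 1, two that start with 2, and one each for 3 to 9. Stopping at 999 is the one moment when everything is level, at 111 each. Anywhere in between, the lower digits are ahead.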
But it’s important to say that the “Law” is a statistical observation more than an immutable rule. It most definitely doesn’t apply to all data; it’s also important to look at how the data arose.