What is the logic behind programming languages 0-indexing?


As someone who primarily uses R, I don’t understand why Python indexes lists starting from 0. I’m very slow at simple mental calculations and doing something like subsetting an array in Python often takes me several extra seconds.

I think I read that it has something to do with memory, but that’s so much less of a consideration for people who only use high-level languages.


Historically an array/list was (and in languages like C, is) implemented by storing the memory address (“pointer”) of the first item in the array. Thus following the pointer takes you straight to the first item in the list. The memory location of any particular item is then calculated as “pointer to first item + size of each item * array index”. Hence the first item is number 0 because you don’t move the memory pointer to find it. The second item is “1 space to the right of the first item”, etc.
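To make the formula concrete, here is a minimal sketch of that address calculation. The base address and item size are made-up numbers for illustration, and `element_address` is a hypothetical helper, not something Python or C provides.

```python
def element_address(base, item_size, index):
    """Address of the item at `index`, counting from 0."""
    return base + item_size * index

BASE = 1000      # hypothetical address of the first item
ITEM_SIZE = 4    # hypothetical size of each item, in bytes

# Index 0 lands exactly on the base address: no movement needed.
addresses = [element_address(BASE, ITEM_SIZE, i) for i in range(4)]
print(addresses)  # [1000, 1004, 1008, 1012]
```

With index 0 the multiplication contributes nothing, so the first item sits exactly where the pointer points.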

When people say that C is a very low-level language, this is the sort of thing they mean.

Changing this behaviour might involve having the pointer point just before the first item, but then either you’re wasting an item slot’s worth of memory or the pointer is technically invalid, since it points at memory that isn’t part of the array. This would be inconsistent with other uses of variables pointing at memory. Alternatively you could subtract 1 from every array index before doing the memory location math, but that adds an extra operation to every access for no real benefit.
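A rough sketch of the two 1-based workarounds next to the plain 0-based version may help. All names and numbers here are hypothetical, chosen only to show the arithmetic.

```python
BASE = 1000      # hypothetical address of the first item
ITEM_SIZE = 4    # hypothetical size of each item, in bytes

def address_zero_based(index):
    # The usual scheme: index 0 is the base address itself.
    return BASE + ITEM_SIZE * index

def address_one_based_subtract(index):
    # Workaround 1: subtract 1 on every single access.
    return BASE + ITEM_SIZE * (index - 1)

# Workaround 2: keep a pointer one slot *before* the first item.
# Address 996 is not part of the array at all.
SHIFTED_BASE = BASE - ITEM_SIZE

def address_one_based_shifted(index):
    return SHIFTED_BASE + ITEM_SIZE * index

# All three find the first item at the same place.
print(address_zero_based(0),
      address_one_based_subtract(1),
      address_one_based_shifted(1))  # 1000 1000 1000
```

Both workarounds reach the right item, but one pays a subtraction per access and the other carries around an address that does not belong to the array.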

Most (but not all) languages keep this convention for consistency with other languages.

Because that’s how address registers in the hardware work. If a “pointer” in a language is the address of the start of an array, the first element is at that address. The hardware’s “load indexed” instruction takes an address register for the base and another register as the index. The address+0 location contains the first element in the array.

At a deeper level, you need to figure out where in memory each item is. When the application asks the OS for memory for an array, it is given some address X. Call the size of one item the offset. You will have the first item in the list at memory address X, the second at memory address X + offset, the third at X + 2 * offset, the fourth at X + 3 * offset, and so on.

So when you want to get a specific item from the array, you need to tell it how many offsets to use. The first item is right at the start of the list, so it has 0 offsets.

The values in the array are stored in memory. Rather than explicitly store the address for each element in the array, the computer stores the address only for the beginning of the array. To access other elements of the array, you must specify an offset.

Naturally, to access the beginning of the array itself, the offset would be 0. So the index starts at 0.
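You can imitate this in Python by packing a few numbers into one block of bytes and reading them back by offset. This is only a sketch: the 32-bit item format and the `read_item` helper are assumptions made for the example, and real Python lists are not stored this way.

```python
import struct

ITEM_FORMAT = "<i"                          # assumed: little-endian 32-bit int
ITEM_SIZE = struct.calcsize(ITEM_FORMAT)    # 4 bytes per item

# One contiguous block of bytes holding four items back to back.
buffer = struct.pack("<4i", 10, 20, 30, 40)

def read_item(buf, index):
    """Read the item `index` steps from the start of the block."""
    offset = index * ITEM_SIZE              # index 0 means offset 0
    return struct.unpack_from(ITEM_FORMAT, buf, offset)[0]

print(read_item(buffer, 0))  # 10
print(read_item(buffer, 3))  # 40
```

Only the start of the block is known. Everything else is found by stepping forward, and the first item needs zero steps.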

The other responses are all correct, but if you want a shortened version, think about how a number is stored. It’s stored in bits: 1’s and 0’s. Four bits can hold 16 different patterns, from 0000 to 1111. If you start your index at 1 (0001), the pattern 0000 is never used, so you miss out on a spot in the array.
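A quick way to count this, assuming a 4-bit index as in the example above (the variable names are just for illustration):

```python
BITS = 4
patterns = 2 ** BITS                      # 16 distinct bit patterns

zero_based = list(range(0, patterns))     # indexes 0..15
one_based = list(range(1, patterns))      # indexes 1..15, 0000 is skipped

print(format(zero_based[0], "04b"), format(zero_based[-1], "04b"))  # 0000 1111
print(len(zero_based), len(one_based))    # 16 15
```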