Why do some web pages have addresses that contain several dozen nonsense characters in the address bar? Even if there are quadrillions of individual web pages, that is still far more characters than are needed for every page to be unique with room to spare.



6 Answers

Anonymous

I imagine you’re talking about the autogenerated IDs?

URLs such as this post’s, [https://www.reddit.com/r/explainlikeimfive/comments/**1c74iqz**/eli5_why_do_some_specific_web_pages_have/](https://www.reddit.com/r/explainlikeimfive/comments/1c74iqz/eli5_why_do_some_specific_web_pages_have/), or everyone’s favorite YouTube video, [https://www.youtube.com/watch?v=**dQw4w9WgXcQ**](https://www.youtube.com/watch?v=dQw4w9WgXcQ), contain an automatically generated ID (shown in bold).

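This ID is essentially a representation of a number. Normally we count 0-9, then roll over into the next column and start again. If you add more “digits”, you can count 0-9, then a-z, then A-Z, and then two more special characters, for 64 symbols in total (usually **-** and **_** in URLs, because the conventional Base64 choices, **+** and **/**, already have special meanings in URLs). Each character then carries far more information than a decimal digit: 11 such characters, as in the YouTube ID, can represent 64^11 (about 7 × 10^19) different values. Some sites use a smaller alphabet; the Reddit ID above uses only 0-9 and a-z.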
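To make the counting idea concrete, here is a minimal sketch of writing a number with 64 "digits". The alphabet order (0-9, a-z, A-Z, then - and _) simply follows the description above and is an assumption for illustration; real sites and standard Base64 order their symbols differently, and the `encode`/`decode` names are made up for this example.

```python
import string

# 64 "digits" in the order described above (an illustrative choice,
# not the ordering used by standard Base64 or by any particular site).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase + "-_"


def encode(n: int) -> str:
    """Write a non-negative integer using 64 symbols per position."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, remainder = divmod(n, 64)  # peel off the lowest "digit"
        out.append(ALPHABET[remainder])
    return "".join(reversed(out))


def decode(s: str) -> int:
    """Turn the text form back into the integer it represents."""
    n = 0
    for ch in s:
        n = n * 64 + ALPHABET.index(ch)
    return n
```

With this alphabet, `encode(63)` is `"_"` and `encode(64)` rolls over to `"10"`, the same way 9 rolls over to 10 in decimal.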

There are reasons you don’t just increment this number. If I made videos 1, 2, and 3 but made video 2 unlisted, someone could find it just by trying the numbers in order. Picking IDs at random from a large range makes them much harder to guess: the range should be so much larger than the number of IDs in use that someone guessing random numbers almost never finds a result.
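As a rough back-of-the-envelope check, the chance that one random guess lands on a real ID is just the number of IDs in use divided by the size of the range. The function name and the figures below (10 billion IDs, a 64-bit range) are hypothetical numbers chosen for illustration, not statistics from any real site.

```python
def hit_chance(ids_in_use: int, id_bits: int) -> float:
    """Chance that a single uniformly random guess matches an existing ID."""
    return ids_in_use / 2**id_bits


# Hypothetical example: 10 billion IDs scattered across a 64-bit range.
chance = hit_chance(10**10, 64)
print(f"about 1 in {1 / chance:,.0f} guesses hits something")
```

With sequential IDs, by contrast, nearly every guess below the latest number would hit something.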

In addition, in a decentralized system, a server in, say, Australia needs to generate an ID and know that another server in, say, New York (or even a second server in the same location handling excess traffic) will not generate the same one. Having very large numbers is part of the solution to this.
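How large is "very large"? One common way to estimate it is the birthday approximation: if n IDs are drawn independently and uniformly from N possibilities, the chance that any two match is roughly 1 - exp(-n(n-1) / 2N). The sketch below assumes purely random, uncoordinated generation, which is a simplification; real systems may also mix in timestamps or server identifiers.

```python
import math


def collision_probability(n_ids: int, id_space: int) -> float:
    """Birthday approximation: chance that at least two of n_ids
    uniformly random IDs drawn from id_space possibilities coincide."""
    exponent = -n_ids * (n_ids - 1) / (2 * id_space)
    return -math.expm1(exponent)  # 1 - exp(exponent), accurate for tiny values


# Hypothetical comparison: one million IDs in a 32-bit vs. a 64-bit range.
for bits in (32, 64):
    print(bits, collision_probability(10**6, 2**bits))
```

The collision chance grows with the square of the number of IDs, which is why the range has to be far larger than the number of items you ever expect to create.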

So once you’ve figured out how big a range you need so that IDs don’t collide when posts or videos are created, and so that people can’t randomly guess IDs to find things, you’ve got your upper bound. Now you just generate a random number in that range and encode it as Base64 for the URL.
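Putting the last step into code, a minimal sketch using only Python's standard library might look like this. The `new_id` name and the 8-byte (64-bit) default are assumptions for illustration; this is not a description of how Reddit, YouTube, or any other site actually generates its IDs.

```python
import base64
import secrets


def new_id(n_bytes: int = 8) -> str:
    """Generate a random ID and encode it as URL-safe Base64 text."""
    raw = secrets.token_bytes(n_bytes)       # cryptographically strong random bytes
    text = base64.urlsafe_b64encode(raw)     # uses - and _ instead of + and /
    return text.rstrip(b"=").decode("ascii")  # drop the "=" padding


print(new_id())    # 11 characters for 8 random bytes
print(new_id(16))  # 22 characters for 16 random bytes
```

Eight random bytes come out as 11 characters, and doubling the bytes doubles the length, which is how URLs end up with long runs of seemingly meaningless characters.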
