What do all the characters in a URL mean?


A URL will be something like YouTube.com/r7@usq=616UUhF

What is all the jargon at the end?

Thanks in advance


*(disclaimer: this is all simplified and some of the names aren’t fully accurate because URLs are actually really, really complicated)*

Generically speaking a URL looks like this:

`scheme://subdomain.domain.tld/path/file?querystring=value#anchor`

Where the different parts are:

* The **scheme** is the communication protocol being used, a collection of recognized commands and data formats so the two computers talking know how to interpret each other. Generally this will be https or http.
* The **TLD** is the *top-level domain*, the broadest part of the URL’s “name.” It often tells you what country the site is associated with (`.uk`, `.de`) or, for generic TLDs like `.com`, `.edu`, and `.gov`, what kind of site it was originally meant to be (commercial, education, government, etc).
* The **domain** is which specific site you’re visiting—Google or Reddit or whatever—within a given TLD.
* The **subdomain** is an organizational structure within a domain, but how they’re used (if at all) changes a lot from domain to domain. Originally this was the name of the computer you were accessing, and websites were typically hosted on a computer called WWW, but this is rarely true anymore (even though a lot of sites will still recognize the `www` subdomain).
* The **path** is the folder structure to get from the site’s root to wherever the file is located.
* The **file** is the specific piece of content you’re looking at.
* The **query string** is a list of name-value pairs that convey additional information to the server. Early on this was often used for search queries, hence the name. They’re generally in the format `name1=value1&name2=value2`.
* The **anchor** tells the browser to move automatically to a specific piece of content within the file.
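
If it helps to see those pieces pulled apart, here is a minimal sketch using Python's standard `urllib.parse` module. The `split_url` helper and the `example.com` URL are made up for illustration, and splitting the hostname on dots is a deliberate simplification: it assumes a single-label TLD, so it would get something like `example.co.uk` wrong.

```python
from urllib.parse import urlsplit, parse_qsl

def split_url(url):
    """Break a URL into the parts named in the list above (hypothetical helper)."""
    parts = urlsplit(url)
    # Naive assumption: the last label is the TLD, the one before it is the
    # domain, and anything further left is the subdomain.
    labels = (parts.hostname or "").split(".")
    return {
        "scheme": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain": labels[-2] if len(labels) >= 2 else "",
        "tld": labels[-1],
        "path": parts.path,  # urlsplit does not separate "path" from "file"
        "query": dict(parse_qsl(parts.query)),
        "anchor": parts.fragment,
    }

example = split_url(
    "https://www.example.com/docs/page.html?name1=value1&name2=value2#section"
)
for name, value in example.items():
    print(f"{name}: {value!r}")
```

Note that the standard library calls the anchor the "fragment" and treats path and file as one thing, which is one of those naming simplifications mentioned in the disclaimer.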

Now that you know all of that, forget it, because URLs have become the god damn Wild West, with servers interpreting them in all kinds of weird ways, single-page apps doing silly hackery like hash-bang URLs, load balancers, JS rendering…

These days it’s difficult to answer your question for anything other than a specific site, because each site could be using its URLs in a lot of different ways. In the case of YouTube specifically you’ll see URLs that look like this:

`https://m.youtube.com/watch?v=_cUKCLbfK5w&pp=ygUPRGFuY2luZyBnb3BoZXJz`

Which you’ll observe has a scheme of `https`; subdomain of `m`; domain of `youtube`; TLD of `com`; path or file of `watch`; and then you’ll see that the `?` begins a query string with two named values, `v` and `pp`. Most likely the video player application is hosted at the path `/watch` and it uses the query string parameters to pass in the ID of the video you’re watching (v) and any additional settings it needs to be aware of (encoded into an unreadable string as the value of pp).
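
The same breakdown can be sketched in Python with the standard `urllib.parse` module. This only splits the URL text apart; what YouTube's servers actually do with `v` and `pp` is a guess, as noted above, and the variable names here are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://m.youtube.com/watch?v=_cUKCLbfK5w&pp=ygUPRGFuY2luZyBnb3BoZXJz"

parts = urlsplit(url)
# parse_qs maps each name to a list of values, because a name may repeat.
params = parse_qs(parts.query)

video_id = params["v"][0]   # presumably the ID of the video being watched
settings = params["pp"][0]  # opaque value; left undecoded here

print(parts.scheme)    # https
print(parts.hostname)  # m.youtube.com
print(parts.path)      # /watch
print(video_id)        # _cUKCLbfK5w
```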
