eli5 what’s the point of having so many sorting algorithms if the primary objective is to sort, i mean why can’t we just use one algorithm for all


Different requirements. Memory usage, number of writes, number of CPU instructions executed: optimize for any one of those and you get a different algorithm.

It’s like saying what’s the point of having so many different cars if the primary objective is to get from one place to another. Why can’t there just be one type of car?

Some people want to get there the fastest with no regard to fuel usage (memory usage).

Others want good fuel economy and can handle going a little slower.

Sometimes you have to handle transporting a lot of people so you’re limited to larger vehicles rather than driving multiple vehicles.

You may want a different car for road trips compared to daily commutes.

Generally we want to sort as fast as possible. We rank how fast algorithms are with big O notation; you’ll see things like O(n^2), O(n log n), etc. This says how much slower the algorithm gets as the number of items (n) you are sorting grows. So you’d think O(n^2) is always slower than O(n log n), but that isn’t necessarily true for small values of n. If you are sorting just a few items, the ‘slower’ algorithm might actually be better because there is less setup overhead.
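As a concrete illustration, here is a minimal Python sketch of insertion sort, the classic O(n^2) algorithm that often beats fancier O(n log n) algorithms on small inputs precisely because it has almost no setup cost:

```python
def insertion_sort(items):
    """O(n^2) in the worst case, but very low overhead: often the
    fastest choice for small lists, and O(n) if already sorted."""
    items = list(items)  # work on a copy
    for i in range(1, len(items)):
        value = items[i]
        j = i - 1
        # shift larger elements one slot to the right
        while j >= 0 and items[j] > value:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = value
    return items

print(insertion_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```

This is why real-world library sorts (like Python’s Timsort) switch to insertion sort internally once a run gets small enough.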

Other algorithms have other weaknesses. Some perform poorly if there are a lot of duplicate values; others perform poorly if the data is almost sorted.

You may want a stable sort: if two values compare equal, their relative order isn’t changed.
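For example, Python’s built-in sort is guaranteed stable, which you can see when records tie on the sort key:

```python
# Python's built-in sort is stable: records with equal keys keep
# their original relative order.
people = [("Smith", "Alice"), ("Jones", "Bob"), ("Smith", "Carol")]
by_last = sorted(people, key=lambda p: p[0])
# The two Smiths stay in their original order: Alice before Carol.
print(by_last)
# [('Jones', 'Bob'), ('Smith', 'Alice'), ('Smith', 'Carol')]
```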

Generally, you can use the standard sort algorithm that comes with your language of choice. However, if you run into performance problems, you may need a specialized sort.

For sorting, there are three major characteristics and a few secondary ones that are of interest. Different algorithms have differing values for those characteristics that may adjust their suitability for a given application.

The major characteristics are:

* Average and worst case run-time. The best possible worst case for a comparison-based sort, according to known mathematics, is O(N log N)*, meaning the run time grows with the number of items in the list times the logarithm of that number.

* Memory usage. Some algorithms sort in place and take only a constant amount of memory, while others need to copy chunks of the data to sort. If memory is tight, an O(1) (constant) amount of memory usage is preferable to having to hold an entire temporary copy of the data.

* Stability. If two items are equal, what order are they put in? A stable sort will maintain the order from the original data, while non-stable sorts may produce an effectively random result order. While this does not matter for simple data like numbers, it may if you are sorting people by name or any other compound record. Some applications require a stable sort, while many do not. An unstable sort can always be made stable by using extra memory and time to record the original order and include that in the compare, but that may not be ideal.
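The trick mentioned in the last bullet can be sketched in a few lines: record each item’s original position and use it as a tie-breaker, at the cost of O(n) extra memory plus extra comparison work. (The helper name `stable_sort_by` is just for illustration.)

```python
def stable_sort_by(items, key):
    # Decorate each item with its original index; ties on the key
    # fall back to the index, so equal items keep their input order
    # no matter what sort runs underneath.
    decorated = [(key(item), index, item) for index, item in enumerate(items)]
    decorated.sort()
    return [item for _, _, item in decorated]

# Equal-length words keep their original order:
print(stable_sort_by(["bb", "aa", "c"], key=len))  # ['c', 'bb', 'aa']
```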

Secondary characteristics include:

* Best case run-time.

* How long does the algorithm take if the data is already sorted? Checking whether data is already sorted costs extra time, so an algorithm known to handle that case innately can be useful if your data is likely already sorted.

* Similarly, how about data nearly sorted or in other clear patterns, such as reversed? These can be much more important, as it’s not uncommon to need to guarantee data is sorted while knowing that you likely only added or changed a few items since the last check.

* Performance with short data. Practically, a lot of data is small, and more complicated algorithms that meet the other needs are often slower to start up but faster for large data. Basically, there is some size of data below which a simple algorithm is faster than a “better” one.

* Code size. While generally not super important for most applications, sometimes keeping the code tiny, and generally simple, is a huge benefit.

* Is it an online sort? If you constantly have new data coming in that you need to maintain sorted, some algorithms exist to do so more efficiently than collecting all the data and sorting it. These are especially useful if you have a large chunk of initial data and new data coming in live.
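On the online-sort point, here is a small Python sketch using the standard `bisect` module to keep a list sorted as items arrive, rather than re-sorting after every insert:

```python
import bisect

# Maintain sorted order as data streams in. Each insort is O(log n)
# to find the spot plus O(n) to shift, which beats a full re-sort
# per item; a heap or balanced tree scales better still.
stream = [7, 2, 9, 4]
sorted_so_far = []
for value in stream:
    bisect.insort(sorted_so_far, value)

print(sorted_so_far)  # [2, 4, 7, 9]
```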


Practically, only a few of the general purpose algorithms are commonly used, with most of the others being more specialized in nature.


* To be complete, there are some types of data that can be sorted faster, such as integers with Radix Sort, but these are not general-purpose solutions. When working with large data sets of these types, using one of these specialized algorithms is generally more efficient, though their specialized nature makes them harder to use.
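To make the footnote concrete, here is a minimal sketch of least-significant-digit radix sort for non-negative integers. It runs in roughly O(d · (n + base)) time, where d is the number of digits, and never compares two items directly, which is how it sidesteps the O(N log N) comparison bound:

```python
def radix_sort(numbers, base=10):
    """LSD radix sort for non-negative integers: bucket by each
    digit from least to most significant, no comparisons needed."""
    if not numbers:
        return []
    numbers = list(numbers)
    digit_value = 1
    largest = max(numbers)
    while digit_value <= largest:
        # One bucket per possible digit; appending preserves order,
        # so each pass is itself a stable sort on that digit.
        buckets = [[] for _ in range(base)]
        for n in numbers:
            buckets[(n // digit_value) % base].append(n)
        numbers = [n for bucket in buckets for n in bucket]
        digit_value *= base
    return numbers

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```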

Different algorithms sort at different speeds, use different amounts of memory, and are better suited to different applications. It is sort of like saying “why can’t we just use one wheel for everything”; while the wheels on luggage and the wheels on an airplane are both wheels you wouldn’t want to use the same one for both.