# ElI5: Why isn’t there an equation to solve 5-factor polynomials?

There’s the quadratic formula and the one for 3 and 4 factor ones (apologies if I’m wording this wrong), but I just heard that apparently there isn’t anything like a quintic(?) formula and so on. Why is this?? Googling gives me a bunch of confusing terminology that’s difficult to parse.

There is a method. You can use long division to factor any polynomial into linear and quadratic factors, and from there the roots can be determined using the quadratic formula. This is a trial-and-error method though, so generally using a computer or plotting a graph is easier. The real answer involves [Galois theory](https://math.stackexchange.com/questions/1733072/the-quintic-equation-why-is-there-no-closed-formula), but I’m not sure any 5-year-old would understand that. It was apparently too difficult for even the famous mathematician Poisson (of Poisson distribution and Poisson equation fame), who said “[Galois’s] argument is neither sufficiently clear nor sufficiently developed to allow us to judge its rigor”.

I’ll add that in practice, nobody usually cares, because a fifth-order polynomial is guaranteed at least one real root. You can divide out the corresponding linear factor to get a fourth-order polynomial that you could then solve analytically. Or just keep solving numerically…

First, let’s clarify some terms. When we talk about “solving” a polynomial, we’re talking about finding its roots. The roots of a polynomial are the values of x that make the polynomial equal to zero. For example, if we have a quadratic polynomial like x^2 – 3x + 2, the roots are the values of x that make x^2 – 3x + 2 = 0. In this case, the roots are x = 1 and x = 2.
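As a concrete check of that definition (the helper name is mine), plugging candidate values into the polynomial shows which ones are roots:

```python
# A root is just a value of x that makes the polynomial zero.
def p(x):
    return x**2 - 3*x + 2

print(p(1))  # 0, so x = 1 is a root
print(p(2))  # 0, so x = 2 is a root
print(p(3))  # 2, so x = 3 is not a root
```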

Now, for polynomials of degree 2 (quadratics), 3 (cubics), and 4 (quartics), we have formulas that can find the roots. These are the quadratic formula, Cardano’s formula, and Ferrari’s formula, respectively. These formulas are great because they give us a systematic way to find the roots of any polynomial of degree 2, 3, or 4.
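Of those three, only the quadratic formula is short enough to show here. A minimal sketch (function name mine), assuming real coefficients and a non-negative discriminant:

```python
import math

def quadratic_roots(a, b, c):
    """Roots of a*x**2 + b*x + c = 0 via the quadratic formula.

    Assumes real coefficients and a non-negative discriminant."""
    disc = b**2 - 4*a*c
    s = math.sqrt(disc)
    return ((-b + s) / (2*a), (-b - s) / (2*a))

print(quadratic_roots(1, -3, 2))  # (2.0, 1.0), the roots of x^2 - 3x + 2
```

Cardano’s and Ferrari’s formulas have the same “plug in the coefficients, get the roots” shape, just with far more nested radicals.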

However, for polynomials of degree 5 and higher, things get more complicated. In the early 19th century, the mathematicians Paolo Ruffini and Niels Henrik Abel proved that there is no general formula, using only the usual algebraic operations (addition, subtraction, multiplication, division, and root extraction), that can find the roots of every polynomial of degree five or higher. This is known as the Abel–Ruffini theorem. Soon after, Évariste Galois developed a theory that explains *why*, and that tells you exactly which polynomials *can* be solved that way.

The reason for this has to do with the nature of the symmetries of the roots of polynomials. For polynomials of degree 4 and lower, these symmetries form what’s called a “solvable group,” which means that there’s a systematic way to break down the problem of finding the roots into simpler problems. But for degree 5 and higher, the symmetries form a more complex type of group, called a “non-solvable group,” and there’s no way to break down the problem in the same way.

This doesn’t mean that we can’t find the roots of a fifth-degree polynomial at all. It just means that there’s no one-size-fits-all formula that works for all fifth-degree polynomials. We have to use other methods, like numerical approximation, to find the roots in general. The (most common) proof is way, WAY beyond ELI5 level. It occupied about half a semester of graduate-level abstract algebra at my university, and the progression of the proof is not at all obvious, or at least wasn’t at all obvious to me (and I am pretty good at this sort of thing). It’s a pretty extraordinary piece of mathematics, all the more so for having been done by one guy at age 18.
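To show how easy the numerical route is, here’s a stdlib-only sketch (the bracketing interval is my choice) that approximates a real root of the quintic x^5 – x – 1 by bisection:

```python
def f(x):
    return x**5 - x - 1

# Bisection: f(1) < 0 and f(2) > 0, so a real root lies between 1 and 2.
# Each step halves the interval that must contain the root.
lo, hi = 1.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) < 0:
        lo = mid
    else:
        hi = mid

print(round(lo, 6))  # ≈ 1.167304
```

No radicals anywhere, just repeated halving; this is roughly what a computer algebra system falls back on when no closed form exists.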

But I’ll sketch it out, broadly speaking.

——

So. We need to have some way of describing a mathematical object that tells us whether a polynomial has a solution. It’s not even obvious where we would *start* with this. But here’s an idea: what if we think about what would happen if we took the rational numbers and “attached” the roots of the polynomial to them?

So for example, if we have the polynomial x^2 + 1 = 0, we can “attach” +i and -i (the two roots of this polynomial) to the rational numbers. This turns out to get us all numbers of the form a + bi where a and b are both *rational* numbers (note that this is a subset of the complex numbers, not all of them). We call this operation a **field extension**, because both objects are fields, a kind of mathematical object that “acts like” the rational numbers in some sense. (Specifically, a field is a set on which you can add, subtract, multiply, and divide by non-zero values, and it turns out that “take the rationals and stick the roots of a polynomial on them” always results in such an object.)
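A tiny stdlib sketch of that example (the pair encoding is mine): represent a + bi as a pair of rational numbers, and notice that multiplying two such numbers never takes you outside the field.

```python
from fractions import Fraction

# An element of Q[i] is a + b*i with rational a, b; store it as (a, b).
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i*i = -1
    a, b = x
    c, d = y
    return (a*c - b*d, a*d + b*c)

i = (Fraction(0), Fraction(1))
print(mul(i, i))  # i squared is -1 + 0i: still a pair of rationals
```

The closure under all four operations (division works too, via the conjugate) is exactly what makes Q[i] a field.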

More abstractly, we take the field Q of rational numbers (Q is the usual symbol for them), and for any polynomial P with roots x1, x2, x3, …, xn we can create a new field which we write Q[x1, x2, x3…, xn] of the rational numbers with these extra roots. We call this new, bigger, field the splitting field of our polynomial P.

This is helpful because it takes us out of the realm of our polynomial, and into the realm of talking about abstract algebraic objects (which is usually where mathematicians like to live). But how the hell does this help us?

—–

Well, it turns out – and again this is not at all obvious and takes a lot of work to prove – that the relationship between Q and the splitting field Q[x1, x2, …, xn] encodes the information we want.

Return to our earlier example: we took the rationals and added +i and -i to them. But the properties of these “rational complex” numbers wouldn’t change if we swapped +i and -i, and we wouldn’t be messing with the rationals that lack any imaginary part by doing so. In other words, we have an operation that:

* Preserves the properties of the bigger field (the splitting field) *and*
* Does not change the smaller field, in our case Q, at all.

If we take all the operations that do this – and there may be quite a few of them – they form another kind of mathematical structure called a *group*. Groups are a very common type of mathematical object, because they in some sense describe symmetries and transformations in a very general way, and studying the symmetries of an object is often a way to understand its properties.

This particular group, which we call the *Galois group*, encodes information about how our polynomial’s roots extend the rational numbers. In some cases, when the polynomial’s roots are all rational themselves, it doesn’t extend the rationals at all (because you could already “get to” those numbers). In other cases, it extends the rationals in various ways.
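Sticking with the Q[i] example, here’s a small stdlib check (names mine) that conjugation, the map a + bi to a - bi, really is such a symmetry: it respects multiplication and leaves every plain rational untouched.

```python
from fractions import Fraction

# Conjugation swaps the two roots of x^2 + 1: it sends a + bi to a - bi.
def conj(x):
    return (x[0], -x[1])

def mul(x, y):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    a, b = x
    c, d = y
    return (a*c - b*d, a*d + b*c)

x = (Fraction(2), Fraction(3))   # 2 + 3i
y = (Fraction(1), Fraction(-5))  # 1 - 5i

# Conjugate-then-multiply equals multiply-then-conjugate...
print(mul(conj(x), conj(y)) == conj(mul(x, y)))  # True
# ...and rationals with no imaginary part are left alone.
print(conj((Fraction(7), Fraction(0))))
```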

—–

Okay, but how does *that* help us?

Well, it turns out that the properties of the Galois group tell us something about the original polynomial’s roots – namely, if they can be described using just arithmetic and nth root operations.

It turns out that roots that can be described this way extend the rationals in a very specific kind of way. The resulting extensions – and their corresponding Galois groups from the previous section – can only take on a particular kind of structure.

Since we can build up our full extension by adjoining the roots one at a time, we get a sequence of Galois groups for each of those extensions in turn. And it turns out that if that sequence has particular properties – which turn out to be equivalent to the full Galois group being something called a [solvable group](https://en.wikipedia.org/wiki/Solvable_group) – then the original polynomial has a solution that can be written using only arithmetic and radicals.

—–

Finally, we show that there exists at least one polynomial of degree 5 – it turns out that x^5 – x – 1 works – whose Galois group is *not* solvable. Then we work backward:

* This polynomial has a non-solvable Galois group.
* Therefore, its chain of extensions does not have the property that it would have if every root could be written with only arithmetic and radicals.
* Therefore, **this** polynomial has no solution with only arithmetic and radicals.
* Therefore, there is no general formula that can solve every polynomial that way.

It turns out that the smallest non-solvable Galois group requires at least five roots, and therefore a polynomial of degree at least 5, which explains why degree five can’t be solved with a formula using only arithmetic and roots (and 2, 3, and 4 can be). There are formulae, but what there aren’t are formulae that only involve addition, subtraction, multiplication, division, and nth roots (eg, square roots or cube roots).
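To make “solvable” concrete, here’s a stdlib-only sketch (all names mine) of the usual test: repeatedly replace a group by its commutator subgroup, the subgroup generated by all elements g h g⁻¹ h⁻¹. Solvable groups shrink down to just the identity; the group of all shuffles of 5 roots (which is the Galois group of x^5 – x – 1) gets stuck partway.

```python
from itertools import permutations

def compose(p, q):
    # Apply q first, then p; result[i] = p[q[i]].
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def closure(gens):
    # Smallest set containing gens and closed under composition
    # (for a finite set of shuffles, that is automatically a group).
    group = set(gens)
    frontier = list(gens)
    while frontier:
        g = frontier.pop()
        for h in list(group):
            for new in (compose(g, h), compose(h, g)):
                if new not in group:
                    group.add(new)
                    frontier.append(new)
    return group

def derived(group):
    # The subgroup generated by all commutators g h g^-1 h^-1.
    comms = {compose(compose(g, h), compose(inverse(g), inverse(h)))
             for g in group for h in group}
    return closure(comms)

def is_solvable(group):
    # Solvable = taking commutator subgroups eventually reaches {identity}.
    while len(group) > 1:
        smaller = derived(group)
        if len(smaller) == len(group):  # stuck: non-solvable
            return False
        group = smaller
    return True

s3 = set(permutations(range(3)))  # all shuffles of 3 roots
s5 = set(permutations(range(5)))  # all shuffles of 5 roots

print(len(s3), is_solvable(s3))  # 6 True
print(len(s5), is_solvable(s5))  # 120 False
```

The shuffles of 3 roots shrink to the identity in two steps, so degree-3 polynomials get a formula; the shuffles of 5 roots stall at a subgroup of 60 elements, and that is the obstruction.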

As for why, that’s a question that vexed almost 300 years of mathematicians, until it was finally proven by a guy called Abel, and later another guy called Galois, who provided a general framework for this kind of question before being killed in a duel at the age of 20.

Lesson 1: if you like mathematics, don’t get involved in 19th century French politics.

Here’s roughly how Galois’ proof works.

**The first big insight** is that if you have a system of numbers (eg, the real numbers), you can take a polynomial that doesn’t have a solution (eg, x^2 + 1 = 0), then pretend it does have solutions, and get a bigger number system. Eg, if we say “actually, x^2 + 1 = 0 has a solution, and we call it i”, then the real numbers turn into the complex numbers.

Galois came up with a general system for “extending” number systems with roots of polynomials, and used it to shoot down a whole lot of questions that had been plaguing mathematicians since the Ancient Greeks, eg “how do you trisect an angle with ruler and compass?” (Galois: you can’t, and here’s why), or “how do you use ruler and compass to make a cube whose volume is twice that of a given one?” (Galois: you can’t, and here’s why).

Eg, we can extend the rational numbers with a solution to x^3 – 2 = 0, getting numbers of the form a + b 2^(1/3) + c 2^(2/3).
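A quick stdlib sketch of arithmetic in that extension (the triple encoding is mine): keep the coefficients (a, b, c) as rationals and reduce powers of t = 2^(1/3) using t^3 = 2, t^4 = 2t.

```python
from fractions import Fraction

# Elements a + b*t + c*t**2 with t = 2**(1/3), stored as (a, b, c).
def mul(x, y):
    a, b, c = x
    d, e, f = y
    # Expand the product, then reduce with t**3 = 2 and t**4 = 2*t.
    return (a*d + 2*(b*f + c*e),
            a*e + b*d + 2*c*f,
            a*f + b*e + c*d)

t = (Fraction(0), Fraction(1), Fraction(0))
t2 = mul(t, t)
print(mul(t2, t))  # t cubed comes out as the plain rational 2: (2, 0, 0)
```

Multiplying never produces anything outside the a + b·2^(1/3) + c·2^(2/3) form, which is exactly the “field” property from above.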

**The second big insight** is that sometimes, the roots of a polynomial are all pretty much indistinguishable. Not the *same*, but indistinguishable. Eg, x^2 + 1 has two roots, i and -i. But these two roots have exactly the same properties. Engineers use j instead of i to represent “the” root of x^2 + 1. But what if they were actually using j for -i all along? There’s no way anyone could know. If you take any chunk of maths and replace all the i with -i, and all the -i with i, it comes out exactly the same.

We could call this a “symmetry” of the roots of x^2 + 1. And there are, it turns out, two symmetries: {i -> -i and -i -> i}, and {i -> i and -i -> -i}.
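Those two symmetries really do form a group under composition; a tiny sketch (the dictionary encoding is mine):

```python
# The two symmetries of the roots {i, -i}, written as lookup tables.
identity = {"i": "i", "-i": "-i"}
swap = {"i": "-i", "-i": "i"}

def then(f, g):
    # Apply f first, then g.
    return {x: g[f[x]] for x in f}

# Doing the swap twice gets you back where you started,
# so composing symmetries never leaves the set {identity, swap}.
print(then(swap, swap) == identity)  # True
print(then(identity, swap) == swap)  # True
```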

If we collect all the symmetries of the roots of a polynomial together into a collection (mathematicians call this a “group” of symmetries), then there are all kinds of possibilities for what that “group” of symmetries might look like.

**The third big insight** concerns what kinds of “groups” of symmetries are possible – in general, and for roots of polynomials in particular.

One important type of group is when you just rotate a number of things: for example, {A -> B, B -> C, C -> D, D -> A}. The letters A, B, C and D just get rotated.

There are other, more complicated groups of symmetries: for example, if we look at all possible permutations of three things, there are 6 of them: first, there are three cycles:

* {A->A, B->B, C->C}, (kind of a trivial cycle, I know).
* {A->B, B->C, C->A},
* {A->C, B->A, C->B},

And there are also three swaps:

* {A->A, B->C, C->B},
* {A->C, B->B, C->A},
* {A->B, B->A, C->C}.

If A, B and C were the roots of a polynomial, this would represent a situation (like the complex numbers i and -i) where the roots could be swapped or cycled any old way, and nobody would be able to tell the difference.
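The six shuffles above can be enumerated and sorted in a few lines of stdlib Python (the fixed-point counting trick is mine): cycles fix zero or all three labels, swaps fix exactly one.

```python
from itertools import permutations

# All shuffles of three labels; each tuple says where (A, B, C) go.
labels = ("A", "B", "C")
shuffles = list(permutations(labels))
print(len(shuffles))  # 6

# A shuffle "fixes" a label when it sends that label to itself.
def fixed(s):
    return sum(a == b for a, b in zip(labels, s))

cycles = [s for s in shuffles if fixed(s) != 1]  # fix 0 or 3 labels
swaps = [s for s in shuffles if fixed(s) == 1]   # fix exactly 1 label
print(len(cycles), len(swaps))  # 3 3
```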

Not all polynomials are like this. For example,

x^3 – 3x – 1 = 0 over the rational numbers has three roots, but if you decide to shuffle them, then as soon as you decide what to replace the first one with, your decisions about the others are forced. It turns out that all the shuffles of the roots are cycles.

This is typical of polynomials of the form x^n – a = 0 (at least once you have the relevant roots of unity on hand).

And it turns out this cuts both ways: if your group of roots has only cycles (and powers of cycles), then your polynomial has to be x^n – a = 0 in disguise.

**The fourth big Insight:**

Not every group of root shuffles has just cycles, but many of them can be built out of things that are just cycles.

For example: We could build up this group:

* {A->A, B->B, C->C},
* {A->B, B->C, C->A},
* {A->C, B->A, C->B},
* {A->A, B->C, C->B},
* {A->C, B->B, C->A},
* {A->B, B->A, C->C}

by saying “let’s just have all the cycles of A, B and C”:

* {A->A, B->B, C->C},
* {A->B, B->C, C->A},
* {A->C, B->A, C->B},

(so this would correspond to taking a cube root) and then saying “let’s also throw in the swap of A and B” (so that would correspond to a square root).

So if you had a polynomial whose 3 roots could be shuffled in every possible way, it would be possible to solve it by doing a cube root, and then a square root, with possibly some normal addition, multiplication etc between, before, and after.
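The build-up described above can be checked mechanically; a stdlib sketch (names mine) starting from one cycle and one swap, and closing under composition:

```python
def compose(p, q):
    # Apply q first, then p; shuffles are tuples on positions 0..2.
    return tuple(p[j] for j in q)

def generate(gens):
    # Everything you can reach by composing the generators.
    group = set(gens) | {(0, 1, 2)}  # include the do-nothing shuffle
    changed = True
    while changed:
        changed = False
        for g in list(group):
            for h in list(group):
                new = compose(g, h)
                if new not in group:
                    group.add(new)
                    changed = True
    return group

cycle = (1, 2, 0)    # A->B, B->C, C->A
swap_ab = (1, 0, 2)  # A->B, B->A, C->C

print(len(generate({cycle})))           # 3: just the cycles (the cube root)
print(len(generate({cycle, swap_ab})))  # 6: all shuffles of three things
```

Adding the single swap on top of the cycles doubles the group from 3 to 6 shuffles, mirroring the “cube root, then square root” recipe.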

**The fifth big Insight:**

So, what polynomials can be solved with roots? Galois showed this is the same question as “which groups of shuffles can be built up out of cycles?”