Eli5: what are P and NP problems and how we are using it in real life?

46 viewsMathematicsOther

Eli5: what are P and NP problems and how we are using it in real life?

In: Mathematics

4 Answers

Anonymous 0 Comments

# ELI5 Answer

P is the class of problems that are “easy” to solve. NP is the class of problems where solutions are “easy” to verify.

Cryptography, Sudoku, finding efficient routes and paths for driving or flights, determining if a single move in Magic the Gathering is legal (known to be coNP-hard) are all in view of these kinds of problems, and they’re known to lie in some classes, and whether or not they lie in certain other classes is an open question and of great interest.

# Technical Answer

P stands for “Polynomial Time,” and NP stands for “Non-deterministic Polynomial Time.”

They refer to classes of decision problems—they are what you call complexity classes, in that they classify problems based on how complex they are, how much computational resources (either time or space) they take. Decision problems are problems that have a binary yes-or-no answer. “Does this boolean formula have a satisfying assignment?” “Is this graph 3-colorable?” “Does this Turing machine halt?” (this one is in a famous complexity class called RE) Etc.

P are those problems (formalized as *languages*, the set of all strings that represent a given instance of the problem, where the answer is “yes”) that are decidable in polynomial time, meaning there exists an algorithm that can answer the question in time that’s polynomial in the size of the input. I.e., the runtime of the algorithm scales polynomially with the size of the input. More on why polynomial is important in a minute.

NP are those problems that are decidable by a non-deterministic Turing machine in polynomial time. Or more colloquially, they are problems where given a candidate solution to the problem, you can verify whether or not the solution is correct in polynomial time.

A famous example is integer factorization (and similar problems like the discrete logarithm problem, or the elliptic curve discrete logarithm), one of the problems that underlies the heart of modern cryptography and therefore all digital communications. Given a composite number n, it is not known if there is a polynomial time algorithm to decompose it into its prime factors. However, given a propposed factorization of n, it’s easy to verify if those factors do indeed multiply out to n. In other words, it’s known to be in NP, but we’re uncertain if it’s in P.

Why is polynomial significant? Because it’s generally considered “easy” (there are exceptions to this, but that is the ELI5 simplification). If as the input size of the problem grows (bigger graph, bigger boolean formula, bigger sudoku puzzle, bigger chess board), the time it takes to find the answer only grows polynomially, well, that’s often tractable, and we can scale our solvers with relative ease. But if not, we have a problem: as the problem size grows, the resource usage blows up and we don’t have enough time, enough space in the universe to solve it even for relatively small inputs.

Let’s assume for a second P ≠ NP. That means there are some problems that truly are fundamentally “hard” to solve, but when given an alleged answer to an instance of the problem, you can quickly and efficiently verify.

If it could be proven P = NP, that could have disastarous consequences. Assuming the proof was constructive (someone came up with a polynomial-time solver for some NP-complete problem), you would instantly have a polynomial-time reduction of all problems in NP to a problem in P, collapsing the two and making all NP problems (all sorts of interesting verification problems) solvable in polynomial time. Assuming the polynomial wasn’t too large, that would upend cryptography.

Anonymous 0 Comments

Roughly:

A “P” problem is one that can be solved “quickly” in a very specific technical sense.

An example is “sort these numbers”.

An “NP” problem is one where we can check a proposed solution quickly (in that same very specific technical sense).

Intuitively, you’d expect that it might be harder to solve problems than to check their solutions. For example, if I give you pen and paper and say: “Find a seven-digit prime factor of 21398004009749.” you know you might need a lot of time. But if I say “Check that 8934773 is a seven-digit factor of 21398004009749” you know you’ll be done within a few minutes.

~~It turns out that factorising numbers is in P (even though it seems obviously slower than checking factors). The quick algorithm is unintuitive and leans on some pretty advanced mathematics, but it is fast in the technical sense used here.~~ This is wrong. We don’t know if factorisation is in P. See replies below.

However, there are some NP problems that might or might not be in P.

An example is

“Here are some cities, joined by highways. Find a route along highways only that visits each city exactly once”.

The best *known* methods for solving that problem either:

* Are very very slow, as the number of cities gets large, or
* Have a probability (ever so small) of failing, or
* Only work for some special arrangements of cities, not all.

There might be some special, super-tricky algorithm nobody has yet thought of that solves this problem quickly. We don’t know yet. So we don’t know if this problem is in P.

On the other hand, it’s trivially easy to solve *this* problem:

“Here are some cities, joined by highways. Here’s a route through the cities. Check that it goes along highways only, and visits each city exactly once”.

So the problem is in NP. It’s quick to *check* a proposed solution.

There are lots of problems like this. Another example is related to the game “battleships”.

“I give you an empty grid and some battleships. Can you place all the battleships on the grid so that no two battleships are next to each other?”

Nobody knows a quick way to solve this. On the other had, it’s easy to check a solution:

“I give you an empty grid and some battleships. Here is a proposed arrangement of battleships. Have all the battleships been placed on the grid so that no two battleships are next to each other?”

So this puzzle is NP, but may or may not be P.

The P vs NP problem is the question “Are there any NP problems that are *not* P?” Nobody knows the answer to this. If the answer is “yes” then it turns out that planning trucking routes or placing battleships on a grid are intrinsically difficult problems, and we will always have to rely on approximations and shortcuts. If the answer is “no” then it turns out that every problem that can be quickly checked can also be quickly solved (perhaps not *as* quickly, but “quickly” in the technical sense that’s relevant here).

Anonymous 0 Comments

I have a deck of cards with some cards missing. How many steps would it take you to determine whether the King of Spades is missing? If the king is missing, you need to look at all remaining cards to be sure. If the king _isn’t_ missing, you will need to look at half the deck (on average) before you run into it and can say it’s not missing.

Now, imagine If the deck was twice as big (because the numbers go 2—23 instead of 2—10). You’d need twice as many steps to determine whether the king of spades is missing. If it were half as big, you’d need half as many steps too. The average number of steps is always proportional to the size of the deck.

Now, imagine you know for a fact that the deck is sorted. You can just look at the last eight cards (presumably, the kings and aces) to see if the king of spades is in there. Doesn’t matter if you double the size or multiply it a hundredfold, it’ll always only ever take you at most eight steps to check if that particular card is missing.

Algorithmic Complexity is an area of computer science that studies this idea of how much work it takes to solve a problem as you change the size of the input data. One of the more interesting things they study is the big categories of problems that have similar complexity (that is: the number of steps grows in the same rough shape).

Which brings us to P and NP. P is the category of problems for which you can find a solution fairly efficiently (for a very specific, technical definition of “efficiently”). NP is the category of problems for which you might or might not be able to find a solution efficiently yourself, but, if I give you a solution, you can efficiently check if that solution is correct. (You might notice that all P problems are also NP problems, because you can just build your own solution to verify that my solution is correct).

The important question that we don’t have an answer to is: are there any NP problems that aren’t also P problems (P ≠ NP), or are all NP problems efficiently solvable (P = NP)? It’s more or less accepted that P ≠ NP, but we don’t know that for a fact (and some prominent computer scientists aren’t really sure it’s actually true).

One way this idea is used in practice is cryptography. E.g. multiplying two big prime numbers is P, but finding those numbers given the result is NP. We can take advantage of this difference to build so-called asymmetric encryption schemes — a setup where I have two keys (one public and one private), and anybody can use the public key to encrypt messages that can only be decrypted with the private key. This is one of the fundamental parts of how we keep the internet secure.

Anonymous 0 Comments

P is the set of problems where we can quickly find a solution. For example: multiplying two numbers.

NP is the set of problems where we can quickly check a solution. For example, a Sodoku puzzle. Hard to solve, but if I give you a filled Sudoku grid you can easily check if it’s right or not.

All P problems are also NP problems, that is, P is a subset of NP. Since if you can quickly find a solution, then you can quickly check a solution just by quickly solving the problem and comparing your result.