The Greatest Problem in Computer Science

Neel Parekh
9 min readDec 27, 2020

--

We’ll be looking at a problem that has often been called the biggest unsolved problem in computer science, the P versus NP. There’s actually a 1-million-dollar prize waiting for the person who solves it. It can roughly be translated to if a problem is easy to check the solution to is it also easy to find the solution to. This question sounds odd, but we’ll try to provide an intuitive explanation about why it’s even a question in the first place and also learn a lot more about the nature of computer science itself.

We’re going to start in a somewhat unexpected place by looking at a game called Number Scrabble. Given blocks of numbers between 1 to 9, the aim of the game is to pick 3 numbers that add up to 15. Whoever does this first wins. If all numbers are picked and no player has exactly 15 the game is a draw. You could do some trial and error spend some time trying to block your opponent from getting to 15.

Number scrabble

But what if we reframe the game? Instead of lining up the numbers in a line let’s arrange them in a square these numbers are arranged in a very specific order every triplet adds to 15.

A magic square

This is called a magic square where every triplet adds to the same number now how would you try to win the game. It seems more obvious now you would try to select 3 numbers which make a triplet and block your opponent from doing the same thing. Sounds oddly familiar doesn’t it?

Voila! Tic-tac-toe!

Tic-tac-toe! to win you need to draw three nuts or three crosses in a row while your opponent me from doing the same. When we arrange Number Scrabble in a magic square we see it’s really the same game as tic-tac-toe. A lot of games can be reduced to the exact same game and a lot of problems in math and computer science can be reduced to the exact same problem.

If two problems can be reduced to the same problem that can be solved by the same strategy any strategy that can solve tic-tac-toe can also solve Number Scrabble, in more technical terms, these strategies are called algorithms and they are essentially mechanical procedures. Kind of like a recipe. You don’t need to think when carrying them out. As long a you follow each step correctly, you’ll eventually reach the solution. Depending on the problem, algorithms can be more or less complex

How complicated an algorithm a problem needs to solve is one of the main areas of interest in computer science. It is so important in fact the problems are grouped into classes based on how complicated they are. This area of computer science is called computational complexity and the groups of problems are called complexity.

A Venn diagram representing the hierarchy

NP and P, are classes of problems based on how difficult they are to solve. P is the class of all problems which have an algorithm that can be computed in polynomial time. A polynomial is any number that has a power or exponent which is also a number like,

Or

n⁵

Where n represents the inputs to the problem. So we can say that the number of steps the algorithm needs to take is proportional to the input. Class P refers to all problems that a computer can solve in a reasonable amount of time.

NP on the other hand are the class of problems that can be verified in polynomial time but may take an exponential number of steps to solve.

An exponential is when the input number n is in the exponent like,

2ⁿ

The important difference here is that computers take extremely long time to solve these kinds of problems. To demonstrate the difference between a p-problem and an NP problem, if a problem is in the class P and has 100 inputs and if algorithm is proportional to it’ll solve this problem in about 1,000,000 steps. Which is reasonable. If it’s an NP problem with a completion time proportional to 2, it’ll take roughly 1,267,650,600,228,229,401,496,703,205,376 steps. Which is NOT reasonable by any means. It’ll take approximately 300 quintillion years to solve. That’s longer than the age of the universe! It’s safe to say that it takes a very long time to calculate NP problems. Although they take a relatively short time to verify. This is pretty intuitive. Sudoku, for example, is an NP problem it takes some effort and time to solve, but once it’s solved, it is pretty quick to check that it’s correctness. This finally brings us to the crux of the question the question,

Does P equal NP?

What we’re essentially trying to ask here is the two complexity classes P and NP are in fact the same class. Is it possible to perhaps solve these exponential problems in polynomial time? Or framed another way, are questions that are verified in polynomial time also solved in polynomial time?

This is the biggest open problem in computer science, the answer to which, would improve the world by unimaginable amounts.

If P did equal NP, it would mean that a lot of things would improve in the world

Optimization problems like transport routing, production of goods, job scheduling, circuit and database design are all NP problems. Imagine how much more efficient the economy would be if we could find the most optimal solutions to all these problems? There would be massive advances in machine learning as problems that are easy to check would become just as easy to solve. The problem of protein folding is an NP problem. If we could solve it, we could potentially cure cancer

To quote MIT complexity researcher Scott Aaronson,

“If P were equal to NP then the world would be a profoundly different place than we usually assume it to be there would be no special value in creative leaps no fundamental gap between solving a problem and recognizing the solution once it’s found everyone who could appreciate a symphony would be Mozart everyone who could follow a step-by-step argument would be Gauss everyone who could recognize a good investment strategy would be Warren Buffett.”

There is a downside to this golden metropolis. Encryption that keeps sensitive data like your bank details and privacy safe relies on P being different to NP. If P was shown to be equal to NP, cracking encryptions would become child’s play.

That being said, it’d be pretty cool if P did equal NP

Sometimes you can improve an algorithm so that it solves a problem faster the problem

For instance, if you had an array of 100 elements, using a sorting technique like bubble sort would give you a O(n²) complexity, which amounts to 10,000 steps. The same problem can be solved in about 200 steps if you were to use the merge-sort algorithm, which has an O(nLog n) complexity.

This is exactly what happened in the case of some NP problems. It was deemed that they couldn’t be solved in polynomial time but then some clever person came along with an amazing algorithm and managed to transport the problem from the NP domain to the P class.

Now you’re probably thinking, wait, if it was shown that some NP problems were in fact equal to P, doesn’t that prove that P equals NP?

The answer is a little fiddly, for the lack of a better word

Complexity classes aren’t totally separate things. Some of them are more like Russian dolls in that they contain other classes. The class of P is actually contained within NP. Not all problems in NP take the same amount of time to solve, but rather they can all be solved in exponential time.

Problems in P can obviously be solved in exponential time or less. To illustrate this with a simpler scenario, if you can solve something in less than a hundred years you can definitely solve it in less than a thousand years.

So NP is the class of all problems that can be solved in exponential time or less.

This finally brings us to the NP complete circle.

NP-Complete problems are a little different

These are the hardest problems in the whole class of NP and they take the longest time to solve. Some problems from the middle of NP have been shown to be in P, but none of the NP-complete problems ever have been.

So the question does P equal NP really means are NP-complete problems in the class P?

To sum it up again, NP problems can all be solved in exponential time or less. Some NP problems (those in P) can be solved in polynomial time or less, therefore the class P is contained within NP. Some NP problems have been shown to be in the class P.

However, the hardest problems in NP the np-complete ones have never been shown to be in the class P. So the question of whether P equals NP still is in salt because this NP actually refers to NP-complete problems.

If just one of the NP-complete problems was shown to be in the class P that would mean that all of the NP-complete problems would be in the class of P, just like solving Number Scrabble also solves tic-tac-toe. And if some NP problems have been shown to be in P it’s not too crazy to think that maybe one np-complete problem might be in the classes P. Maybe someday a brilliant mind with a totally radical idea will come up with an algorithm that will do this. If they showed that just one NP problem could be reduced to a P problem, this would show that all NP problems could be reduced to P problems, and that P does in fact equal NP. You could be Mozart, Warren Buffett and Gauss all rolled into one.

Here are some other interesting problems that are yet to be solved.

Real one-way functions

One-way functions can process every input very easily but hard to get the real input back from random output values. There are several candidates for the concept of one-way functions. However, the existence of one-way functions is still a conjecture. In other words, we believe that one-way functions exist. But no one could show and prove a real one-way function yet. One may argue about the cryptography hash functions. However, many types of hash functions were broken by ethical hackers by showing that those hashes are not real one-way functions.

This is essentially an NP-complete problem.

The fastest matrix multiplication algorithm

The matrix multiplication concept is vastly used in scientific computing, computer graphics, and pattern recognition fields. Therefore, many computer scientists tried to find faster algorithms. There are even special hardware related matrix multiplication algorithms such as, distributed and parallel computing algorithms. There are faster matrix multiplication algorithms than the naive approach like the Strassen algorithm. Further, some recent research papers have proposed faster matrix multiplication algorithms that have less asymptotic complexity than other popular algorithms.

However, the fastest algorithm for matrix multiplication hasn’t been invented yet. Also, the existing algorithms don’t have a clear asymptotic complexity.

Polynomial integer factorization

Integer factorization is a process that breaks an integer into a product of smaller integers. Also, if output values are restricted to only prime numbers, it is called prime factorization. This process is undoubtedly an easy task for computers only if the given integer is a smaller one. Whereas, when the given integer becomes larger this process will become a very challenging computational task for computers. No polynomial-time algorithm is invented yet to do faster integer factorization on non-quantum (also known as classical) computers. On the other hand, quantum computers have Shor’s algorithm for integer factorization. In fact, public-key encryption mechanisms such as RSA use the advantage of the weakness of the existing integer factorization algorithms to offer more security in computer systems. However, if someone invented a reasonably faster algorithm for this problem, all RSA-based encryption techniques will be invalidated. Non-quantum computers will process inputs until the universe exists to break RSA-2048 encryption. But, a perfect quantum computer will do it quickly. Unfortunately, the problem is that we don’t have such a computer yet.

By

Neel Parekh

Chaitanya Pathak

Prachi Pandey

Mihir Rabade

Rashi Wase

--

--