Problems in NP (Non-deterministic Polynomial time)

There are at least two equivalent ways to characterize the decision problems in the class $ \NP $:

Problems that can be solved in polynomial time with non-deterministic (Turing) machines.

Such non-deterministic machines can compute the same functions as our current CPUs in polynomial time. In addition, they can “branch” their computations into many “simultaneous” computations and return a solution if at least one of the branches finds a solution. They cannot be implemented on current hardware and are best seen as a conceptual tool; non-determinism is covered in more detail in the CS-C2160 ToC course.
Problems whose “yes” instances, and only those, have a certificate that is both
- reasonably small (i.e., of polynomial size w.r.t. the input), and
- easy to verify (i.e., in polynomial time w.r.t. the input).
However, the certificates are not necessarily easy to find (or to prove non-existent)!

In most cases, one can think of a “certificate” as a synonym for a “solution”.
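
The certificate characterization above can be stated a bit more formally. The following is only a sketch: the symbols $ L $, $ V $, $ p $, $ x $ and $ c $ are notation introduced here for illustration (they do not come from the text above), and instances and certificates are assumed to be encoded as strings. A decision problem $ L $ is in $ \NP $ if there are a polynomial $ p $ and a polynomial-time algorithm $ V $ (a verifier) such that, for every instance $ x $,

$$ x \in L \iff \exists c : |c| \le p(|x|) \text{ and } V(x, c) \text{ accepts} $$

Here $ x \in L $ means that $ x $ is a “yes” instance, the bound $ |c| \le p(|x|) $ says that the certificate is reasonably small, and the polynomial running time of $ V $ says that it is easy to verify.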

The subset-sum problem

As a first example, consider the $ \SUBSETSUM $ problem.

Definition: $ \SUBSETSUM $

Instance: A set $ S $ of integers and a target value $ t $.

Question: Does $ S $ have a subset $ S’ \subseteq S $ such that $ \sum_{s \in S’}s = t $?

For each instance $ \Tuple{S,t} $ with the “yes” answer (and for only those), a certificate is a set $ S’ $ of integers such that $ S’ \subseteq S $ and $ \sum_{v \in S’} v = t $. The set $ S’ $ is obviously of polynomial size w.r.t. the instance $ \Tuple{S,t} $ as $ S’ \subseteq S $. One can also easily check in polynomial time whether the certificate is valid, i.e., whether it holds that (i) $ S’ \subseteq S $ and (ii) $ \sum_{v \in S’} v = t $. Thus the $ \SUBSETSUM $ problem is in the class $ \NP $.
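
As an illustration, the two checks (i) and (ii) can be sketched in Scala as follows. The object and method names are hypothetical and chosen only for this example, and `BigInt` is used as an assumption to avoid overflow with large integers.

```scala
object SubsetSumVerifier {
  /** Returns true iff `cert` is a valid certificate for the instance (s, t):
    * (i) cert is a subset of s, and (ii) the elements of cert sum to t.
    * Both checks take polynomial time in the size of the instance. */
  def verify(s: Set[BigInt], t: BigInt, cert: Set[BigInt]): Boolean =
    cert.subsetOf(s) && cert.sum == t
}
```

For instance, `SubsetSumVerifier.verify(Set[BigInt](2, 7, 14, 49), 23, Set[BigInt](2, 7, 14))` accepts, because $ 2 + 7 + 14 = 23 $. Note that the verifier only checks a given certificate; it does not search for one.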

Example

A certificate for the instance $ \Tuple{S,t} $ with

$ S = \{2, 7, 14, 49, 98, 343, 686, 2409, 2793, 16808, 17206, 117705, 117993\} $ and
$ t = 138456 $

is $ S’ = \Set{2, 7, 98, 343, 686, 2409, 17206, 117705} $.

The instance $ \Tuple{S,t} $ with

$ S = \{2, 7, 14, 49\} $ and
$ t = 15 $

does not have any certificates and the answer for it is “no”.
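
To see why certificates are not necessarily easy to find, consider the most naive way of searching for one: trying every subset. The sketch below (hypothetical names, not a recommended algorithm) does exactly that; for a set of $ n $ integers it may have to inspect all $ 2^n $ subsets before it can conclude that the answer is “no”.

```scala
object SubsetSumSearch {
  /** Exhaustively searches for a subset of `s` whose elements sum to `t`.
    * Returns Some(certificate) for a "yes" instance and None for a "no" instance.
    * In the worst case all 2^|s| subsets are enumerated, so this is
    * feasible only for small sets. */
  def findCertificate(s: Set[BigInt], t: BigInt): Option[Set[BigInt]] =
    s.subsets().find(_.sum == t)
}
```

Called on the small “no” instance above, `SubsetSumSearch.findCertificate(Set[BigInt](2, 7, 14, 49), 15)` returns `None` only after all 16 subsets have been tried.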

Propositional satisfiability

Propositional formulas are built from

Boolean variables, and
connectives of unary negation $ \neg $, binary disjunction $ \lor $ and binary conjunction $ \land $.

In Scala terms, the negation $ \neg $ corresponds to the unary operation !, the conjunction $ \land $ to the binary operation &&, and the disjunction $ \lor $ to ||. The difference is that in Scala, a Boolean variable always has a value (either true or false). Therefore, in Scala a Boolean expression such as (a && !b) || (!a && b) can be evaluated at any point to some value. At the level of propositional formulas, a variable is like an unknown in mathematics: it does not have a value unless we associate one with it by some mapping. Thus the propositional formula $ (a \land \neg b) \lor (\neg a \land b) $ cannot be evaluated unless we give values to $ a $ and $ b $. An assignment $ \TA $ for a formula $ \phi $ is a mapping from the variables $ \VarsOf{\phi} $ in the formula to $ \Booleans = \Set{\False,\True} $. It satisfies the formula if the formula evaluates to true under it. A formula is satisfiable if there is an assignment that satisfies it; otherwise it is unsatisfiable.
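
The distinction between a formula and its value under an assignment can be made concrete with a small sketch. The data type and the names `Formula`, `Var`, `Not`, `And`, `Or` and `eval` below are hypothetical and introduced only for illustration: a formula is a piece of data, and it gets a truth value only when it is combined with an assignment.

```scala
object PropEval {
  // A propositional formula as a syntax tree (illustrative representation).
  sealed trait Formula
  case class Var(name: String) extends Formula
  case class Not(f: Formula) extends Formula
  case class And(l: Formula, r: Formula) extends Formula
  case class Or(l: Formula, r: Formula) extends Formula

  /** Evaluates `f` under the assignment `tau`.
    * May throw NoSuchElementException if a needed variable has no value in `tau`. */
  def eval(f: Formula, tau: Map[String, Boolean]): Boolean = f match {
    case Var(x)    => tau(x)
    case Not(g)    => !eval(g, tau)
    case And(l, r) => eval(l, tau) && eval(r, tau)
    case Or(l, r)  => eval(l, tau) || eval(r, tau)
  }

  // The formula (a ∧ ¬b) ∨ (¬a ∧ b) from the text.
  val xor: Formula =
    Or(And(Var("a"), Not(Var("b"))), And(Not(Var("a")), Var("b")))
}
```

With this sketch, `PropEval.eval(PropEval.xor, Map("a" -> true, "b" -> false))` is true, so that assignment satisfies the formula, whereas the assignment `Map("a" -> true, "b" -> true)` does not.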

Example

The formula $$((a \land b \land \neg c) \lor (a \land \neg b \land c) \lor (\neg a \land b \land c) \lor (\neg a \land \neg b \land \neg c)) \land (a \lor b \lor c)$$ is satisfiable as the assignment $ \TA = \{a \mapsto \True,b \mapsto \False,c \mapsto \True\} $ (one of its three satisfying assignments) evaluates it to true.

The formula $ (a) \land (\neg a \lor b) \land (\neg b \lor c) \land (\neg c \lor \neg a) $ is unsatisfiable.
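
Both claims can be checked by brute force, since three variables give only $ 2^3 = 8 $ assignments. The following sketch (hypothetical names; the formulas are written directly as Scala Boolean functions) lists the satisfying assignments of the two example formulas above.

```scala
object TruthTable {
  private val bools = List(false, true)

  /** All assignments (a, b, c) under which `phi` evaluates to true. */
  def satisfying(
      phi: (Boolean, Boolean, Boolean) => Boolean
  ): List[(Boolean, Boolean, Boolean)] =
    for {
      a <- bools
      b <- bools
      c <- bools
      if phi(a, b, c)
    } yield (a, b, c)

  // First example formula: satisfiable.
  val phi1: (Boolean, Boolean, Boolean) => Boolean = (a, b, c) =>
    ((a && b && !c) || (a && !b && c) || (!a && b && c) || (!a && !b && !c)) &&
      (a || b || c)

  // Second example formula: unsatisfiable.
  val phi2: (Boolean, Boolean, Boolean) => Boolean = (a, b, c) =>
    a && (!a || b) && (!b || c) && (!c || !a)
}
```

Here `TruthTable.satisfying(TruthTable.phi1)` contains three assignments, including `(true, false, true)`, while `TruthTable.satisfying(TruthTable.phi2)` is empty. This exhaustive approach needs $ 2^n $ evaluations for $ n $ variables, so it does not scale.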

Definition: $ \Prob{propositional satisfiability} $, $ \SAT $

Instance: a propositional formula $ \phi $.

Question: is the formula $ \phi $ satisfiable?

The propositional satisfiability problem can be understood as the problem of deciding whether an equation $ \phi = \True $ has a solution, i.e. deciding whether the variables in $ \phi $ have some values so that the formula evaluates to true. The problem is in $ \NP $ as we can use a satisfying assignment as the certificate:

its size is linear in the number of variables in the formula, and
evaluating the formula under it to check whether the result is $ \True $ takes only polynomial time in the size of the formula.

To simplify theory and practice, one often uses certain normal forms of formulas:

A literal is a variable $ x_i $ or its negation $ \neg x_i $.
A clause is a disjunction $ (l_1 \lor \ldots \lor l_k) $ of literals.
A formula is in 3-CNF (CNF means conjunctive normal form) if it is a conjunction of clauses and each clause has 3 literals over distinct variables. For instance, $ (x_1 \lor \neg x_2 \lor x_4) \land (\neg x_1 \lor \neg x_2 \lor x_3) \land (x_2 \lor \neg x_3 \lor x_4) $ is in 3-CNF.
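
One reason normal forms simplify practice is that they have a very compact machine representation. The sketch below assumes a DIMACS-style integer encoding, which is a choice made here and not something fixed by the text: the literal $ x_i $ is the integer $ i $ and $ \neg x_i $ is $ -i $. All names are hypothetical.

```scala
object CnfCheck {
  // Assumed encoding: literal i > 0 stands for x_i, and -i stands for ¬x_i.
  type Clause = List[Int]
  type Cnf = List[Clause]

  /** Truth value of a single literal under the assignment `tau`. */
  def literalTrue(lit: Int, tau: Map[Int, Boolean]): Boolean =
    if (lit > 0) tau(lit) else !tau(-lit)

  /** A CNF formula is satisfied iff every clause has at least one true literal. */
  def satisfies(tau: Map[Int, Boolean], phi: Cnf): Boolean =
    phi.forall(clause => clause.exists(lit => literalTrue(lit, tau)))

  /** Checks the 3-CNF shape: each clause has 3 literals over distinct variables. */
  def is3Cnf(phi: Cnf): Boolean =
    phi.forall(c => c.size == 3 && c.map(l => math.abs(l)).distinct.size == 3)

  // (x1 ∨ ¬x2 ∨ x4) ∧ (¬x1 ∨ ¬x2 ∨ x3) ∧ (x2 ∨ ¬x3 ∨ x4)
  val example: Cnf = List(List(1, -2, 4), List(-1, -2, 3), List(2, -3, 4))
}
```

For the example formula, the assignment `Map(1 -> true, 2 -> false, 3 -> false, 4 -> false)` is accepted by `CnfCheck.satisfies`, and the check makes a single pass over the clauses.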

Definition: $ \SATT $

Instance: a propositional formula in 3-CNF.

Question: is the formula satisfiable?

Some other problems in NP

Definition: $ \Prob{Travelling salesperson} $

Instance: An edge-weighted undirected graph $ \Graph $ and an integer $ k $.

Question: Does the graph have a simple cycle visiting all the vertices and having weight $ k $ or less?

Source: Univ. Waterloo
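
For the travelling salesperson problem, a natural certificate is the cycle itself, given as the order in which the vertices are visited. The following sketch shows one way a verifier could look. The graph representation (vertices numbered from 0, weights stored in a map) and all names are assumptions made for this example, and graphs with fewer than three vertices are rejected because a simple graph needs at least three vertices for a simple cycle.

```scala
object TspVerifier {
  /** Undirected weighted graph on vertices 0 until n.
    * Each edge {u, v} is stored once under the key (min(u, v), max(u, v)). */
  case class Graph(n: Int, weights: Map[(Int, Int), Int]) {
    def weight(u: Int, v: Int): Option[Int] =
      weights.get((math.min(u, v), math.max(u, v)))
  }

  /** Checks that `tour` visits every vertex exactly once, uses only existing
    * edges (including the closing edge back to the start), and has total
    * weight at most k. */
  def verify(g: Graph, k: Int, tour: Vector[Int]): Boolean = {
    val visitsAllOnce = tour.size == g.n && tour.toSet == (0 until g.n).toSet
    if (!visitsAllOnce || g.n < 3) false
    else {
      // Consecutive pairs of the tour, plus the step from the last vertex to the first.
      val steps = tour.zip(tour.tail :+ tour.head)
      val ws = steps.map { case (u, v) => g.weight(u, v) }
      ws.forall(_.isDefined) && ws.flatten.map(_.toLong).sum <= k
    }
  }
}
```

The verifier runs in polynomial time in the size of the graph, whereas the number of candidate tours grows factorially with the number of vertices.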

Definition: $ \Prob{Generalized Sudokus} $

Instance: An $ n \times n $ partially filled Sudoku grid ($ n=k^2 $ for some integer $ k \ge 1 $).

Question: Can the Sudoku grid be completed?

Definition: $ \Prob{longest simple path (decision version)} $ problem

Instance: an undirected graph $ \Graph=\Tuple{\Verts,\Edges} $, two vertices $ \Vtx,\Vtx’ \in \Verts $, and an integer $ k $.

Question: Is there a simple path of length $ k $ or more from $ \Vtx $ to $ \Vtx’ $?
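
Here, too, a certificate is simply the path, and checking it is straightforward. The sketch below uses hypothetical names and assumes that the length of a path is its number of edges and that edges are stored as two-element sets of vertices; neither assumption is stated in the definition above.

```scala
object SimplePathVerifier {
  /** Checks that `path` is a simple path from `from` to `to` with at least k edges. */
  def verify(edges: Set[Set[Int]], from: Int, to: Int, k: Int, path: Vector[Int]): Boolean =
    path.nonEmpty &&
      path.head == from && path.last == to &&
      // simple: no vertex occurs twice
      path.distinct.size == path.size &&
      // consecutive vertices must be joined by an edge
      path.zip(path.tail).forall { case (u, v) => edges(Set(u, v)) } &&
      // length is counted in edges
      path.size - 1 >= k
}
```

For a 4-cycle with edges `Set(Set(0, 1), Set(1, 2), Set(2, 3), Set(0, 3))`, the path `Vector(0, 1, 2, 3)` is accepted as a certificate for $ k = 3 $, while the direct path `Vector(0, 3)` is not.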

Definition: $ \MSTDEC $ problem

Instance: A connected edge-weighted and undirected graph $ \Graph=\Tuple{\Verts,\Edges,\UWeights} $ and an integer $ k $.

Question: Does the graph have a spanning tree of weight $ k $ or less?
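
Unlike most of the other problems in this list, this one can be decided without any certificate at all: compute the weight of a minimum spanning tree and compare it to $ k $. The sketch below uses Kruskal's algorithm with a simple union-find structure; the choice of algorithm, the edge-list representation and all names are assumptions made for this example.

```scala
object MstDecision {
  case class Edge(u: Int, v: Int, w: Int)

  /** Weight of a minimum spanning tree of a connected graph on vertices 0 until n. */
  def mstWeight(n: Int, edges: Seq[Edge]): Long = {
    // parent(x) == x means x is the representative of its component.
    val parent = Array.range(0, n)
    def find(x: Int): Int = {
      var r = x
      while (parent(r) != r) r = parent(r)
      r
    }
    var total = 0L
    // Kruskal: scan edges from lightest to heaviest and keep an edge
    // only if it joins two different components.
    for (e <- edges.sortBy(_.w)) {
      val ru = find(e.u)
      val rv = find(e.v)
      if (ru != rv) {
        parent(ru) = rv
        total += e.w
      }
    }
    total
  }

  /** Decides the problem: is there a spanning tree of weight k or less? */
  def hasSpanningTreeOfWeightAtMost(n: Int, edges: Seq[Edge], k: Long): Boolean =
    mstWeight(n, edges) <= k
}
```

The whole computation takes polynomial time, so no search over exponentially many candidate trees is needed.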

Definition: $ \Prob{Minimum Steiner tree (decision version)} $ problem

Instance: A connected edge-weighted and undirected graph $ \Graph=\Tuple{\Verts,\Edges,\UWeights} $, a set $ S \subseteq \Verts $, and an integer $ k $.

Question: Does the graph have a tree that spans all the vertices in $ S $ and has weight $ k $ or less?

And hundreds of others…

P versus NP

For polynomial-time solvable problems, the instance itself can act as the certificate because we can always compute, in polynomial time, the correct yes/no answer from the instance itself. Therefore, $ \Poly \subseteq \NP $. However, we do not know whether $ \Poly = \NP $. In other words, we do not know whether a solution that is small and easy to check is always also relatively easy to find. Most people believe that $ \Poly \neq \NP $, but no one has yet been able to prove this (or that $ \Poly = \NP $). We know efficient algorithms for many problems, such as $ \Prob{minimum spanning tree} $ and $ \Prob{shortest path} $. But we do not know any efficient algorithms for $ \SUBSETSUM $, $ \SATT $, and hundreds of other (practically relevant) problems in $ \NP $, even though lots of extremely clever people have tried to find them for decades.