A practical introduction to the PCP class
This is a post motivating and demonstrating how to write a probabilistically checkable proof for the graph non isomorphism () language. For the sake of practicality and beginner friendliness, definitions will be somewhat handwaved.
Still, this post assumes that you know enough about computational complexity to understand terms as "language", "complexity class" or "turing machine" and that you already heard a bit about the class before.
The definition of PCP
The class is built upon letting a verifier , running in polynomial time, having probabilistic access (i.e. has access to a source of randomness) to a proof string provided by a prover .
Some key definitional points about what the class means:
- A verifier of a statement for a language must always accept any valid proof . If is invalid, accepts it with probability less than
- What mean is that can use random coins and non-adaptive queries to . A non-adaptive query is a query whose content is independent of the answers gets when querying
- The proof is of length at most
- No assumption is made about 's computational power - i.e. is not assumed to be a polynomial time running prover
PCP theorem
A landmark achievement of complexity theory is the theorem. It states that , meaning that for any language in and a statement , a proof system can be built with:
- Completeness and soundness
- A proof length of at most
- running in polynomial time
- needing only a constant amount of non-adaptive queries and a logarithmic amount of coin tosses in the size of the statement .
For a result applying to the whole class, this makes up for a pretty effective out-of-the box verifier! Above all, it characterizes in a whole new light, using probabilistic and interactive turing machines, two notions which were absent from the initial definition of .
The history of the PCP theorem is pretty interesting. At the time, there was some kind of race to be the first to be able to define in the most efficient way possible in terms of the class, using the smallest number of random coins and queries. It was this race that eventually led up to the PCP theorem, published in 92, and that we know today.
However, building PCP proofs for a language is non trivial. There is still a language for which building a PCP proof is actually pretty easy: the graph non-isomorphism language aka .
The graph non-isomorphism language
We say that two graphs are isomorphic if one is a permutation of vertices of the other. In fact, the language of deciding whether two graphs are isomorphic is in : the permutation is a certificate for the isomorphism. Using this certificate, the verifier can verify in polynomial time that the isomorphism claim is correct.
The complement of is , graph non-isomorphism. If you pause for a minute, you will realise that it's pretty darn hard to think of a short proof (of length polynomial in the inputs) when claiming that two graphs are not isomorphic. What certificate can be used to prove that for all the permutations of the vertices of a given graph , none is equivalent to a second graph ? I.e. is not in .
Still, this example is a good a motivation for building PCP proofs. While there might not be a deterministic, short proof based verifier for , the PCP class still lets us trade additional bits in the proof and access to randomness for a polynomial time probabilistic verifier. That is pretty cool!
An astute reader might note that is in since there exists an interactive proof system for a prover to convince a verifier that two graphs , are not isomorphic. The protocol for it is straightforward and consists, in broad strokes, into letting choosing to permute (randomly) either or and asking whether or was permuted.
But, to the difference of , the PCP class even lets us even reduce the interaction phase to a single message between and , using only the proof .
As mentioned above, is not in . Let me now tell you that is in . This means that any non-isomorphism claim about two graphs has a probabilistically checkable proof of length polynomial in the size of the inputs (i.e. the number of vertices in the checked graphs), checkable by using a number of random coins polynomial in the length of the inputs and a constant number of queries.
Again, note that this is quite counter-intuitive for a language with no short proof to have such an efficient verification procedure. The key lays in letting access randomness and allowing to append additional bits to his proof.
A PCP for GNI
The intuition behind 's PCP is simple. First, will build:
- A list of all non isomorphic graphs of vertices.
- A list of all possible graphs of vertices
Recall that no computational assumption has been made regarding . So we assume doing the above is possible for . Also, note that those two lists are different. There exists 1024 possible graphs with 5 vertices, out of which only 34 are not isomorphic to one another.
The proof will hand over to will consist in an array indexed by all possible graphs of vertices. Each value in the array will be set to 0 or 1 following whether the graph at this index is isomorphic to or . If the graph being indexed is not isomorphic to both, will set the value randomly.
Upon receiving , the check that does is straightforward. randomly chooses a bit or , takes , permutes it to obtain a graph and accesses the array at the position of . If the value at this position is then accepts, otherwise rejects.
Recall that by the definition of the PCP class, we should have for proof length something in the order of . That is the case, since for vertices, there exists possible graphs, which is the length of the proof that will send to . You can also see that the number of queries does to obtain the required soundness level is constant.
The soundness of this proof is straightforward. Suppose that and are permutations of each other. After samples a random bit , the permutation of , which is will get to access a position in the array whose value is with probability 0.5, and reject if the two bits differ. If wants to reduce the soundness error, he will just repeat his verification process many times.
For completeness, since if and are not isomorphic, randomly permuting them will make always access the array at an index whose value will be equal to the sampled bit .
Scripting a PCP for GNI
Let's build this PCP system for . I will be omitting imports and functions' definitions when comments are sufficient to understand what's going on. This is running with sage 9.8.
First, we build a list with all the possible graphs composed of vertices. Since sage has a database for non-isomorphic graphs, let's use it directly.
n = 5
vertices = range(n)
possible_edges = list(combinations(vertices, 2))
powerset_all_possible_edges = list(powerset(possible_edges))
all_graphs = []
for g in powerset_all_possible_edges:
d = defaultdict(list, {k: [] for k in vertices})
for edge in g:
d[edge[0]].append(edge[1])
all_graphs.append(Graph(d, immutable=True))
# we have 2^{(n * (n-1))/2} possible graphs with n vertices (
# the proof is of polynomial length in the number of vertices
# n vertices --> n(n-1)/2 pairs of possible points
# all possible subsets of those pairs of points = 2^{n(n-1)/2}
assert(len(all_graphs) == 2**((n**2 - n)/2))
# also, sage has a db of all non isomorphic graphs of n vertices
# list of all non isomorphic graphs with n vertices, using `graphs()`
non_isomorphic_graphs = [g.copy(immutable=True) for g in graphs(len(vertices))]
We will randomly sample two non-isomorphic graphs and build a proof for it.
# generating a proof requires two non-isomorphic graphs and
# a list of all possible graphs of n vertices
def prove(g0, g1, graphs):
pi = {} # the proof is an array of bits, indexed by all possible n-vertex graphs
for g in graphs:
if g.is_isomorphic(g0):
pi[g] = 0
elif g.is_isomorphic(g1):
pi[g] = 1
else:
# the choice does not matter if neither are isomorphic to g
# this bit will never be inspected by the verifier
pi[g] = random.choice([0, 1])
return pi
g0, g1 = sample_graph_pair(non_isomorphic_graphs)
pi = prove(g0, g1, all_graphs)
The verifier can check :
# Soundness is a parameter indicating how many checks the verifier should do on the proof
def verify(pi, g0, g1, soundness):
for i in range(soundness):
b = random.choice([0, 1])
gb = [g0, g1][b]
h = get_permuted_graph(gb)
if pi[h] != b:
return False
return True
proof_is_valid = verify(pi, g0, g1, 10000)
assert(proof_is_valid)
You can also try to cheat the verifier:
# try to cheat verifier with two isomorphic graphs
g1 = get_permuted_graph(g0) # g1 is a permutation of g0
pi = prove(g0, g1, all_graphs)
# if you set soundness to 1, there is a 1/2 prob that this check passes
proof_is_valid = verify(pi, g0, g1, 1)
assert(proof_is_valid == False)