Imagine logging into a secure web server which, instead of asking you to type in your password, merely asks you questions about your password until it’s convinced that you really do know it and therefore are who you say you are. Moreover, imagine that your answers to the server’s questions provide no information whatsoever which could be used by a malicious hacker, even if all communications between you and the server are being intercepted. Finally, imagine that the server in question not only does not store any information about your password, it has never at any point asked you for information about your password.
Sounds too good to be true, right?
In fact, such password schemes do exist, and they’re quite easy to implement. They are known as zero knowledge authentication systems. In this post, I’ll explain the main idea behind such protocols using the notion of a “one-way homomorphism”. Before diving into the technicalities, though, here’s a useful thought experiment which conveys the main idea.
The Parable of the Mismatched Socks
You wake up on April 1st and put on what you believe is a pair of red socks; however, being color-blind you aren’t 100% sure. Your wife takes one look at you and says, “Christmas isn’t until December, you know.” You realize that either you accidentally put on one red sock and one green one, or your wife, who is a well-known enthusiast for April Fool’s pranks, is taking advantage of your color-blindness to play a trick on you. In a flash of brilliance, you come up with the following method for testing your wife’s truthfulness:
First, you take off the socks, place one in each hand, and ask your wife to remember which one is where. Then:
(1) You place your hands behind your back and secretly either switch the socks between your hands or leave the socks where they are.
(2) You bring your hands in front of you and ask, “Did I switch hands?”
She answers correctly. So you repeat steps (1) and (2) again. She gets it right once more. You repeat steps (1) and (2) eighteen more times, for a total of 20 questions, randomly selecting each time whether or not to switch the socks. And she gets it right every single time.
Assuming you haven’t been married so long that she can simply anticipate all of your random choices, there are only two possibilities here: either the socks really are different colors or she was just guessing every time and got extremely lucky. The chances of the latter are 1 in $2^{20}$, or about one in a million. You decide that your wife is not, in fact, taking advantage of your color-blindness as a cruel joke but is genuinely trying to save you from public embarrassment.
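To make the arithmetic concrete, here is a small Python sketch of the sock game. The function names are made up for this illustration and are not part of any library. It computes the exact chance that a pure guesser survives a given number of rounds, then simulates a shorter game to check that guessing behaves as expected.

```python
import random
from fractions import Fraction

def guess_pass_probability(rounds):
    """Exact chance that a pure guesser answers `rounds` questions correctly."""
    return Fraction(1, 2 ** rounds)

def play_sock_game(rounds, can_see_colors, rng):
    """Return True if the observer answers every 'Did I switch?' correctly."""
    for _ in range(rounds):
        switched = rng.random() < 0.5        # your secret choice behind your back
        if can_see_colors:
            answer = switched                # mismatched socks: she sees the swap
        else:
            answer = rng.random() < 0.5      # identical socks: she can only guess
        if answer != switched:
            return False
    return True

rng = random.Random(0)
print(guess_pass_probability(20))            # 1/1048576

# Simulate a 5-round game many times; a guesser should pass about 1 time in 32.
trials = 100_000
wins = sum(play_sock_game(5, False, rng) for _ in range(trials))
print(wins / trials)
```

Twenty rounds are too many to simulate directly, since a guesser would almost never pass, which is exactly the point of the test.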
Zero Knowledge Proofs
Aside from the remarkable fact that you’re completely color-blind and yet your wife has managed to convince you that your socks are mismatched, there’s another interesting feature of this thought experiment: at no point have you been given any information about which sock is red and which one is green. This is — morally speaking, anyway — the idea behind a zero knowledge proof.
A Mathematical Implementation
This is fascinating, but it doesn’t seem to have much to do with our original problem concerning computer passwords. So let’s see how to implement the idea behind the thought experiment in a more abstract mathematical way.
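A group is a set $G$ together with a binary operation $\star$ satisfying a few simple axioms. In particular, there should be an identity element $e \in G$ with $g \star e = e \star g = g$ for all $g \in G$, and for every $g \in G$ there must be an element $g^{-1} \in G$ with $g \star g^{-1} = g^{-1} \star g = e$.
If $G$ and $H$ are groups, a homomorphism from $G$ to $H$ is a function $f : G \to H$ such that $f(g \star g') = f(g) \star f(g')$ and $f(g^{-1}) = f(g)^{-1}$ for all $g, g' \in G$.
We’ll call a group $G$ computable if $g \star g'$ and $g^{-1}$ can be efficiently computed for all $g, g' \in G$.
A one-way homomorphism is a homomorphism $f : G \to H$ such that (a) $f(g)$ can be efficiently computed for all $g \in G$; but (b) $f$ is hard to “invert”, in the sense that given a random element $h \in f(G)$ it is computationally infeasible to compute an element $g \in G$ with $f(g) = h$.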
Here are some classic examples of one-way homomorphisms which are frequently used in cryptography:
(1) Let $p$ be a large prime number, let $a$ be a primitive root modulo $p$, and let $f : (\mathbb{Z}/(p-1)\mathbb{Z}, +) \to ((\mathbb{Z}/p\mathbb{Z})^*, \cdot)$ be the map sending $x$ (considered as an integer modulo $p-1$) to $a^x$ modulo $p$. The value of $f(x)$ can be computed efficiently for every $x$ via the method of successive squaring, but computing $x$ given $f(x)$ is the famous discrete logarithm problem.
(2) Let $N = pq$ where $p, q$ are large prime numbers, and let $f : (\mathbb{Z}/N\mathbb{Z})^* \to (\mathbb{Z}/N\mathbb{Z})^*$ be the map sending $x$ to $x^2$ modulo $N$. This is a one-way homomorphism (assuming that factoring is hard), because if we had an efficient algorithm to compute a square root of any $y$ in the image of $f$ we could use this algorithm to factor $N$ (see Concluding Remark 2 below).
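As a sanity check, both examples can be tried out in a few lines of Python. The parameters below (the prime 23 with primitive root 5, and the modulus 7 · 11) are toy values chosen for this sketch and are far too small to be secure. The modular inverse via `pow(x, -1, p)` assumes Python 3.8 or later.

```python
# Toy parameters only: real systems use primes hundreds of digits long.
p, a = 23, 5            # 5 is a primitive root modulo 23
N = 7 * 11              # N = pq with tiny primes, for illustration

def f_exp(x):
    """Example (1): x in Z/(p-1)Z  ->  a^x mod p."""
    return pow(a, x, p)  # built-in pow does fast modular exponentiation

def f_square(x):
    """Example (2): x in (Z/NZ)^*  ->  x^2 mod N."""
    return pow(x, 2, N)

# Example (1): addition mod p-1 is carried to multiplication mod p.
x, y = 9, 17
assert f_exp((x + y) % (p - 1)) == (f_exp(x) * f_exp(y)) % p
# Inverses: -x mod p-1 is carried to the multiplicative inverse of a^x mod p.
assert f_exp((-x) % (p - 1)) == pow(f_exp(x), -1, p)

# Example (2): multiplication mod N is carried to multiplication mod N.
u, v = 3, 10            # both coprime to N
assert f_square((u * v) % N) == (f_square(u) * f_square(v)) % N
```

Evaluating either map is a single call to `pow`, while nothing comparably simple is known for going backwards once the numbers are large.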
Suppose, then, that we’re given a one-way homomorphism $f : G \to H$ between computable groups and that Peggy selects a random element $g \in G$ as her secret password. (Technical note: We need to assume that we have an efficient algorithm for selecting a random element of $G$; this is certainly the case in the two examples above.) Peggy makes the value $h = f(g)$ public; anyone who cares is allowed to know the value of $h$.
Here’s how Victor (the verifier) can verify that Peggy (the prover) knows the password $g$, without compromising the fact that $g$ must be kept secret:
(1) Peggy generates a random element $r \in G$ and sends $s = f(r)$ to Victor.
(2) Victor flips a fair coin. If it comes up heads, he asks Peggy to send him the value of $r$; if it comes up tails, he asks Peggy to send him $g \star r$.
(3) Once Peggy sends the requested value $z$, Victor verifies (if it was heads) that $f(z) = s$ or (if it was tails) that $f(z) = h \star s$. Since $f$ is a homomorphism, if Peggy is telling the truth then Victor’s verification procedure will work.
Peggy and Victor now repeat steps (1)-(3) some number of times. As we will see, if they run through this procedure $n$ times and the verification checks out each time, the probability that Peggy is in fact a malicious hacker who does not actually know the value of $g$ will be at most $1/2^n$. So if $n = 20$, for example, there is less than a one in a million chance that Peggy does not actually know the password.
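Steps (1)-(3) can be sketched in Python for the discrete logarithm example, where $G$ is the integers modulo $p-1$ under addition and $f(x) = a^x$ modulo $p$. This is only an illustration under assumed toy parameters ($p = 23$, $a = 5$). The function `run_round` and its optional `coin` argument are names invented here, and both parties are simulated inside one function rather than talking over a network.

```python
import secrets

p, a = 23, 5                      # toy prime and primitive root (assumed values)
ORDER = p - 1                     # G = Z/(p-1)Z under addition

def f(x):
    """The one-way homomorphism of Example (1): x -> a^x mod p."""
    return pow(a, x, p)

def run_round(g, h, coin=None):
    """One round of steps (1)-(3); returns True if Victor accepts.

    `coin` may be set to True (heads) or False (tails) to force the flip.
    """
    r = secrets.randbelow(ORDER)              # (1) Peggy picks a random r in G
    s = f(r)                                  #     and sends s = f(r)
    if coin is None:
        heads = secrets.randbelow(2) == 0     # (2) Victor flips a fair coin
    else:
        heads = coin
    z = r if heads else (g + r) % ORDER       #     Peggy sends r or g * r
    expected = s if heads else (h * s) % p    # (3) Victor checks f(z)
    return f(z) == expected

g = secrets.randbelow(ORDER)      # Peggy's secret password
h = f(g)                          # the public value h = f(g)
print(all(run_round(g, h) for _ in range(20)))   # True: honest Peggy always passes
```

Note that the group operation in $G$ is addition here, so $g \star r$ becomes `(g + r) % ORDER`, while the operation in $H$ is multiplication modulo $p$.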
Why is it difficult for someone who doesn’t know the password to answer the queries correctly? And why do Peggy’s responses not provide Victor (or anyone eavesdropping on their interaction) with any information about the value of $g$?
Let’s answer the second question (the “zero knowledge” part) first. If Peggy chooses a random $r \in G$ and sends it to Victor, this clearly does not provide any new information about the value of $g$. And that’s exactly what happens when Victor’s coin flip comes up heads. On the other hand, if the coin flip comes up tails, Peggy sends the value of $g \star r$. However, since $r$ was chosen randomly, the value of $g \star r$ will also be completely random. So in either case, Peggy is not revealing any information about $g$ itself to Victor or anyone else.
For the first question, we need to suppose that Peggy is in fact a malicious hacker and place ourselves in her shoes. How could she attempt to get every question correct without knowing the value of $g$ or having an efficient algorithm to invert the function $f$? Well, if Peggy can anticipate Victor’s coin flips, then whenever he is about to flip “tails”, Peggy can report $h^{-1} \star f(r)$ instead of $f(r)$ and $r$ instead of $g \star r$. This does not require knowledge of $g$, and Victor will certify Peggy’s response as correct since $f(r) = h \star (h^{-1} \star f(r))$.
However, if Victor’s coin flips are truly random then Peggy cannot use this strategy. In order to account for the possibility that Victor might flip heads, she needs to legitimately report $f(r)$ as $s$. But if Victor flips tails, Peggy will then be stuck needing to produce the value of $g \star r$ given $r$, which clearly requires knowledge of $g$.
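The cheating strategy just described is easy to simulate. In the sketch below, again with assumed toy parameters and invented function names, a prover who does not know the password guesses each coin flip in advance and prepares her commitment for that guess. She passes a round exactly when her guess matches the actual flip, so she survives $n$ rounds with probability about $1/2^n$.

```python
import secrets

p, a = 23, 5                      # toy parameters (assumed), as in Example (1)
ORDER = p - 1

def f(x):
    return pow(a, x, p)

def cheating_round(h, guess_tails, coin_tails):
    """One round in which Peggy knows h but not g."""
    r = secrets.randbelow(ORDER)
    if guess_tails:
        s = (pow(h, -1, p) * f(r)) % p    # commit to h^{-1} * f(r)
    else:
        s = f(r)                          # commit honestly
    z = r                                 # the only response she can justify
    expected = (h * s) % p if coin_tails else s
    return f(z) == expected

def cheater_passes(h, rounds):
    """Play `rounds` rounds with independent random guesses and coin flips."""
    return all(
        cheating_round(h, secrets.randbelow(2) == 1, secrets.randbelow(2) == 1)
        for _ in range(rounds)
    )

# Choose the hidden password g != 0 so that h != 1 (otherwise cheating is trivial).
h = f(1 + secrets.randbelow(ORDER - 1))
trials = 20_000
rate = sum(cheater_passes(h, 3) for _ in range(trials)) / trials
print(rate)                               # should be close to 1/2**3 = 0.125
```

In a group this small a cheater could of course just search for $g$ by brute force; the simulation only illustrates the counting argument, not the hardness assumption.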
Concluding Remarks
1. Zero-knowledge proofs were first introduced in this 1989 paper by Goldwasser, Micali, and Rackoff. The abstract protocol introduced here (in the special case where $f(x) = x^2 \bmod N$) is a simplified version of the Feige-Fiat-Shamir identification scheme. The case of our abstract protocol in which $f(x) = a^x \bmod p$ originated (as far as I can tell) with this paper. The thought experiment involving color-blindness is frequently used to illustrate zero-knowledge proofs, but usually with balls rather than socks. I read the “sock” version in this expository paper by Antoine Chambert-Loir.
2. To see that if we had an efficient algorithm to compute a square root of any $y$ in the image of $f$ we could use this algorithm to factor $N$, take a random $x \in (\mathbb{Z}/N\mathbb{Z})^*$, square it, and use the algorithm to find one of the four square roots $x'$ of $x^2$ modulo $N$. After repeating this a number of times, with high probability we will find $x, x'$ such that $x' \neq \pm x$. The GCD of $x - x'$ and $N$ will then be equal to either $p$ or $q$.
3. In practice an interactive “query and response” protocol of the above type can be inconvenient. Using a suitable cryptographic hash function, one can reduce any such protocol to a non-interactive one using the so-called Fiat-Shamir heuristic.
4. Another example of a one-way homomorphism is the map $f : \mathbb{Z} \to E(\mathbb{F}_q)$ sending $n$ to $nP$, where $E(\mathbb{F}_q)$ is the set of $\mathbb{F}_q$-points of a suitable elliptic curve $E$ over a large finite field $\mathbb{F}_q$ and $P$ is a point of large order (e.g. a cyclic generator). Cryptographers have proposed and studied many other interesting examples. I’m not sure if it’s useful to allow the groups $G$ and/or $H$ to be non-abelian, but I noticed while writing up this post that (unlike in discrete logarithm-based key exchange protocols such as Diffie-Hellman) commutativity is not needed for the kind of zero-knowledge identification schemes discussed here.
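Returning to Remark 2, the reduction from factoring to computing square roots can be tried out on a toy modulus. In the Python sketch below, `sqrt_oracle` is a stand-in for the hypothetical efficient square-root algorithm: it simply brute-forces the smallest root, which is feasible only because $N = 101 \cdot 103$ is tiny. All names and parameters are assumptions made for this illustration.

```python
import math
import random

p, q = 101, 103                   # toy primes, used here only to build N
N = p * q

def sqrt_oracle(y):
    """Stand-in for the hypothetical efficient square-root algorithm.

    Returns the smallest square root of y modulo N by brute force,
    which only works because N is tiny.
    """
    for x in range(1, N):
        if (x * x) % N == y:
            return x
    raise ValueError("not a square modulo N")

def factor_with_oracle(rng):
    """Factor N using only N itself and calls to sqrt_oracle."""
    while True:
        x = rng.randrange(1, N)
        if math.gcd(x, N) != 1:
            continue                          # skip the rare x outside (Z/NZ)^*
        x_prime = sqrt_oracle((x * x) % N)    # one of the four roots of x^2
        if x_prime not in (x, N - x):         # found x' != +x and x' != -x
            d = math.gcd(x - x_prime, N)
            return d, N // d

print(sorted(factor_with_oracle(random.Random(0))))   # [101, 103]
```

Because the oracle's answer does not depend on which of the four roots was squared, each attempt succeeds with probability one half, so a handful of tries is enough on average.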