In a previous Math Investor blog, we described the emerging world of blockchain, emphasizing how it might impact the financial services and investment world. Already numerous firms, including several startup organizations, are pursuing blockchain to facilitate and streamline many types of financial transactions.
It is worth taking a brief look at the mathematics behind blockchain. The following is based in part on an article by Eric Rykwalder, one of the founders of Chain.com, a startup blockchain software firm in San Francisco.
The elliptic curve digital signature algorithm
Blockchain is basically a publicly available ledger where participants enter data and certify their acceptance of the transaction via an elliptic curve digital signature algorithm (ECDSA). An elliptic curve is an equation such as y2 = x3 + a x + b. In Bitcoin and most other implementations, a = 0 and b = 7, so this is simply y2 = x3 + 7 (see graph). Elliptic curves have numerous interesting properties, such as the fact that a nonvertical line intersecting two nontangent points will always intersect a third point on the curve. Indeed, one can define “addition” on the curve as finding that third point corresponding to two given points. This is basically what is done in ECDSA, except that the operations are performed modulo some large prime number M.
In particular, in ECDSA, addition of two points (p1,p2) and (q1,q2), and the doubling of (p1,p2), are performed as follows:
Addition of (p1,p2) and (q1,q2):
c = (q2 – p2) / (q1 – p1) mod M
r1 = c2 – p1 – q1 mod M
r2 = c (p1 – r1) – p2 mod M
Doubling of (p1,p2):
c = (3 p12) / (2 p2) mod M
r1 = c2 – 2p1
r2 = c (p1 – r1) – p2
Some readers will note that “division” is indicated in the first line of each algorithm. What this means is the product, modulo M, of the expression to the left of the slash by the multiplicative inverse of the expression to the right of the slash. Since M is a prime, every nonzero integer from 1 to M-1 has a multiplicative inverse. For example, the multiplicative inverse of 5 mod 17 is 7, because 5*7 = 35 = 1 mod 17; in other words, 5-1 mod 17 = 7. In practice, these inverses are rapidly calculated by means of the Euclidean algorithm, where one accumulates the divisors in a particular way. See a note by Nick Korevaar for some examples.
One other preliminary detail is how to “multiply” in this algebraic structure, in particular to calculate an expression such as m * (p1,p2) for some integer m. This can be done by first doubling the input (p1,p2), and then using the addition algorithm repeatedly until m copies of (p1,p2) have been added, but this of course is not practical when m and M are very large, as they are in real blockchain applications. Instead, such “multiply” operations are typically done using the binary algorithm for multiplication, which we will sketch here for ordinary integers but which can be easily adapted to ECDSA:
To compute r = n * b mod M: First set t to be the largest power of two such that t is less than or equal to n, and set r = 1. Then perform:
A: If n is greater than or equal to t, set r = b + r mod M, and set n = n – t; else set t = t / 2.
B: If t is greater than or equal to 1 set r = 2 * r mod M, and go to A (if t < 1, then we are done).
Computer scientists will immediately recognize this as almost identical to the binary algorithm for exponentiation modulo M, except that we are adding and doubling instead of multiplying and squaring. It is implied, for example, when one writes 317 mod 10 = ((((32 mod 10)2 mod 10)2 mod 10)2 mod 10) * 3 mod 10 = 3, thus performing the exponentiation in only five multiplications mod 10 instead of 16 or 17.
Now we may state the ECDSA algorithm (except that we omit some relatively minor details that apply mainly to real-world implementations):
First, select a modulus M, a “base point” (p1,p2), and a private key k1 (integer between 1 and M-1). These are typically selected such that the order of the base point (namely the maximum number of times (p1,p2) can be added to itself before the addition formula above fails due to zero divide) is prime and at least as large as M (this is not required but is normally done, and with this assumption the algorithm below is simpler). This often takes some experimentation, although practical applications can do this very rapidly.
As a concrete example, let us take M = 199 (which is prime), and the base point (p1,p2) = (2,24). For this M and (p1,p2), one can calculate that the order n = 211. Then let us select as our private key k1 = 151. We first need to calculate the public key (r1,r2) corresponding to the private key. This is done by multiplication:
(r1,r2) = k1 * (p1,p2)
where again the multiplication is done either by repeated summation or by the binary algorithm above. If we do this, we find that the public key (r1,r2) = (64,80).
Now select some data z1, say z1 = 104. We shall construct a digital signature of the data. This is done as follows:
1. Choose some integer k2 between 1 and n-1, where n is the order.
2. Calculate (s1,s2) = k2 * (p1,p2). If s1 = 0, return to step 1.
3. Calculate s2 = (z1 + s1 * k1) / k2 mod n. If s2 = 0, return to step 1.
Then the digital signature is (s1,s2). In our specific case, if we select k2 = 115, we calculate (s1,s2) = (99,52).
Now we can test the digital signature, as a third party might to verify that the transaction (which in this example we presume is coded in the data z1 = 104) is valid. This is done as follows:
1. Calculate u1 = s2-1 mod n
2. Calculate u2 = z1 * u1 mod n
3. Calculate u3 = s1 * u1 mod n
4. Calculate (t1,t2) = u2 * (p1,p2) + u3 * (r1,r2)
5. Verify that t1 = s1.
In our case, we find that the result of step 4 is (t1,t2) = (99,44). Since t1 = 99 = s1 (see above), the validity of the signature is confirmed (it is not necessary for t2 to equal s2).
We should emphasize that our example involves extremely modest-sized integers. In a real Bitcoin or blockchain application, these integers are typically 256 bits long, dramatically increasing the cost of performing the above operations, but, on the other hand, very dramatically increasing the cost required for someone to “break” the system, such as by computationally attempting to recover the private key from the public key.
So what conclusions can we draw from this exercise? First of all, it should be clear that the mathematics involved is not trivial, and the necessary computations to implement this scheme certainly are not trivial. Nonetheless what we have is an effective one-way function: it is relatively easy to verify a signature, but it is very difficult to work back from publicly available data, such as the public key, to obtain the critical private key.
ECDSA is the essence of how both Bitcoin and other blockchain applications work. The scheme has resisted some rather extensive testing for weaknesses, both mathematically and computationally. The few failures that have occurred in practice have generally been because users were not careful in protecting their private keys, or else they used a fairly standard pseudorandom number generator to produce the private keys, which attackers then exploited.
As with all the technology we rely on in our digital age, the weakest links are users who are not careful.