9. Improvements on Computations of Discrete Logs#
In this chapter we introduce methods that can improve on Shank’s Baby-Step/Giant-Step algorithm when the order of the base is not prime.
9.1. Solving the Discrete Log for Powers of Prime Orders#
Suppose that \(g \in \mathbb{F}_p^{\times}\) such that \(g\) has order \(q^e\), where \(q\) is some prime and \(e\) is an integer greater than one. Of course, we can use Shank's Baby-Step/Giant-Step algorithm. But there is a more efficient way in this case!
9.1.1. Algorithm#
Algorithm 9.1 (Discrete Log for Power of Prime Order)
Let \(g, h \in \mathbb{F}_p^{\times}\) such that \(|g| = q^e\), where \(q\) is some prime and \(e \geq 2\). To compute \(\log_g(h)\):
Initialize \(N_0 = |g| = q^e\), \(N = N_0\), \(x = 0\).
While \(N > 1\):
Compute \(y = \log_{g^{N/q}}\left( h^{N/q}\right)\) using Shank's Baby-Step/Giant-Step algorithm.
Let \(x \leftarrow x + y N_0/N\), \(h \leftarrow g^{-y}h\), \(g \leftarrow g^q\), and \(N \leftarrow N/q\).
Return \(x\).
(Remember that the \(\leftarrow\) denotes assignment.)
Note that in the loop, every iteration divides \(N\) by \(q\), so it takes a total of \(e\) iterations to get to \(N=1\) and the loop to stop.
9.1.2. Why Does It Work#
Let’s try to justify why the algorithm above indeed gives us the discrete log \(\log_g(h)\).
Before we get into it, let’s remind ourselves of a couple of results we’ve seen in chapter about powers:
Proposition 9.1 (Properties of Powers)
If \(a\) is a unit in \(\mathbb{Z}/n\mathbb{Z}\) of order \(m\), then:
For any integer \(k\) we have:
We have:
So, suppose that \(g \in \mathbb{F}_p^{\times}\) with \(|g| = q^e\) for some prime \(q\) and \(e \geq 2\), and that \(h = g^{x}\), for some \(x \in \{0, 1, 2, \ldots , q^e - 1\}\). (Note that we do not know the value of \(x\), as it is what we are trying to compute. But we are assuming that it exists.). We then can write \(x\) in base \(q\) as
Let’s find the first digit \(d_0\). Since \(g^x = h\), we start by raising both sides to the power \(q^{e-1}\):
Now, by the first part of Proposition 9.1, note that
i.e., \(\left| g^{q^{e-1}} \right| = q\).
Now, since \(g^{q^{e-1}}\) has order \(q\) (prime), we use Shank’s Baby-Step/Giant-Step to find a solution, say \(y\). This means that
and by the second item of Proposition 9.1, we have that
This means that when we solved (9.1), we found \(d_0\)!
We now find \(d_1\) in a similar way. Observe that
Letting \(g' = g^q\), \(h' = g^{-d_0}h\), and \(x' = d_1 + d_2 q + \cdots + d_{e-1}q^{e-2}\), and multiplying both sides of the equation above by \(g^{-d_0}\), we get:
i.e.,
Note that, again by the first part of Proposition 9.1, we have that \(|g'| = q^{e-1}\), so the order went down by a factor of \(q\).
So, to find \(d_1\) we repeat the process to find \(d_0\) above, and repeat until we find all digits \(d_i\) (in \(e\) iterations of the process).
9.1.3. Example#
Let’s now illustrate the method with an example.
We will take \(p = 2{,}116{,}179{,}719\) and \(q = 1{,}019\), both primes:
p = 2116179719
q = 1019
is_prime(p), is_prime(q)
(True, True)
We then have \(p = 2q^3 + 1\):
p == 2 * q^3 + 1
True
We will take \(g = 2\), and indeed, \(|g| = q^3\):
g = Mod(2, p)
g.multiplicative_order() == q^3
True
Finally, let \(h = 805{,}343{,}147\).
h = Mod(805343147, p)
We need to find \(\log_g(h)\), i.e., we need \(x\) such that \(g^x = h\). We follow Algorithm 9.1.
Since the algorithm changes the original values of g and h, let’s store the original values in different variables:
g_original = g
h_original = h
(Note that if the algorithm is in the body of a function, this is not necessary, as the variable outside the function will not be changed!)
First, we can initialize our values:
x = 0
N0 = N = g.multiplicative_order()
Then, we solve a discrete log with a base of order \(q\) (prime). We will use Sage’s own discrete_log function for this computation:
power = N // q # compute it only once!
y = discrete_log(h^power, g^power)
y
617
Now, we update the values:
x += y * (N0 // N)
h *= g^(-y)
g = g^q
N = N // q
Then, if the value of N is not one, we will repeat:
N
1038361
So, repeat: first we compute a new discrete log (with base of order \(q\)):
power = N // q
y = discrete_log(h^power, g^power)
y
6
And update the variables:
x += y * (N0 // N)
h *= g^(-y)
g = g^q
N = N // q
Check N:
N
1019
It’s not one, so we repeat again:
power = N // q
y = discrete_log(h^power, g^power)
y
0
Update the variables:
x += y * (N0 // N)
h *= g^(-y)
g = g^q
N = N // q
Check N again:
N
1
Since it is one now, we are done and x has our discrete log:
x
6731
Let’s check:
g_original^x == h_original
True
It worked!
9.1.4. Number of Operations#
In this situation where \(|g| = q^e\) and we compute \(\log_g(h)\), if we use Shank's Baby-Step/Giant-Step algorithm, as seen in the number of operations for Shank’s algorithm, we need about
multiplications.
How many multiplications do we perform using the method above? In that case we compute \(e\) discrete logs with a base of order \(q\) using Shank’s algorithm, so we perform about
multiplications in total.
Thus, the ratio between the number of multiplications is
Therefore, a straight use of Shank’s algorithm in our example above would use \(q = 1{,}019\) times the number of multiplications of our new method. In practice, for larger primes \(q\) and/or larger powers \(e\), the gains are even higher!
Homework
You will implement this algorithm in your homework.
9.2. The Pohlig-Hellman Algorithm#
Now suppose that \(g \in \mathbb{F}_p^{\times}\) and we have \(|g| = N\), with the prime factorization:
We then have the following algorithm:
Algorithm 9.2 (Pohlig-Hellman Algorithm)
Let \(g, h \in \mathbb{F}_p^{\times}\) and suppose that \(|g| = N = q_1^{e_1} q_2^{e_2} \cdots q_k^{e_k}\), with \(q_i\)’s distinct prime, and \(e_i \geq 1\) for \(i = 1, 2, \ldots, k\). To compute \(\log_g(h)\):
For each \(i \in \{ 1, 2, \ldots, k \}\) let \(N_i = N/q_i^{e_i}\), \(g_i = g^{N_i}\) and \(h_i = h^{N_i}\) and find \(y_i = \log_{g_i} \left( h_i \right)\) using Algorithm 9.1.
Use the Chinese Remainder Theorem to find \(x\) such that
Then, \(\log_g(h) = x\).
9.2.1. Why Does It Work?#
Since we have that \(g_i = g^{N_i}\), we have that
So, if \(x\) satisfies the congruences above, we have that \(x \equiv y_i \pmod{q_i^{e_i}}\), and hence \(g_i^x = g_i^{y_i} = h_i\) in \(\mathbb{F}_p^{\times}\), for all \(i \in \{1, 2, \ldots, k\}\).
Note that \(\gcd(N_1, N_2, \ldots, N_k) = \gcd(N/q_1^{e_1}, N/q_2^{e_2}, \ldots , N/q_k^{e_k}) = 1\), since \(q_i\) does not divide \(N/q_i^{e_i}\). Then, using the Generalized Extended Euclidean Algorithm, we can find integers \(r_1, r_2, \ldots , r_k\) such that
Then:
9.2.2. Number of Operations#
In the algorithm we compute \(2k\) powers, which we can do using Fast Powering. The number of products that this takes is at most:
Then, solving the discrete logs (with bases having order powers of primes) takes about
more multiplications.
Finally, we can solve the system of congruences with about \(\boxed{2 \log_2(N) + 2(k-1)}\) long divisions.
If we were using the straight Shank’s Baby-Step/Giant-Step Algorithm, we would need about \(2\sqrt{N}\) multiplications and \(n \log_2(n) \approx \sqrt{N} \log_2(\sqrt{N}) = (\sqrt{N}/2) \log_2(N)\) comparisons. When \(N\) factors, the process above is a lot more efficient.
For instance, let’s take \(N\) (the order of \(g\)) to be a very large number:
N = 275478590580404620961069223105530214854517114273998667300885664883160258239634122227989362418355423134437149221136296365217903294873136621174602837803
But, it factors:
N_factorization = factor(N)
N_factorization
2309^8 * 2341^10 * 2521^7 * 3847^4 * 4153^9 * 4211^5
If we were to use Shank’s Baby-Step/Giant-Step directly, the number of multiplications would be:
floor(sqrt(N)/2 * log(N, base=2))
130278056880962267509600048076969842842723121576434617757316043508493573783854
On the other hand, using Algorithm 9.2, the number of multiplications would be:
k = len(N_factorization) # number of prime factors
# multiplications from powers + multiplications from discrete logs
floor(2 * (k - 1) * log(N, base=2) + 2 * k) + floor(sum(e * sqrt(q)/2 * log(q, base=2) for (q, e) in N_factorization))
18733
And we also need some long divisions, more precisely, the total number of long divisions is at most:
k = len(N_factorization)
floor(2 * log(N, 2) + 2 * (k - 1))
1002
This is considerably better than the straight use of Shank’s algorithm.
Note
Note that using Pohlig-Hellman’s algorithm, we still use Shank’s Baby-Step/Giant-Step algorithm to compute some discrete logs, but only for bases of prime order.
Homework
You will implement Pohlig-Hellman’s algorithm in your homework.
Important
The Pohlig-Hellman algorithm show how important it is that we choose \(g\) of prime order in the Diffie-Hellman key exchange and ElGamal cryptosystem. If not, Eve has more efficient methods to solve the discrete log problem.
9.2.3. Example#
Let’s illustrate Pohlig-Hellman’s algorithm with an example. We will take:
N = 22974332779312916308087541215025543130953873335484909873
p = 2 * N + 1
Indeed, \(p\) is prime:
is_prime(p)
True
First we need to factor this large \(N\).
N_factorization = factor(N)
N_factorization
583669^2 * 971177^3 * 2929231^4
Important
We did not take this step into account when counting the number of operations for this algorithm. As we soon shall see, factoring a large number \(N\) can be quite difficult and time consuming. If that is the case, the prime factors will be large as well, and Pohlig-Hellman might not be the best choice.
In this case, we have that \(3\) has order \(N\) in \(\mathbb{F}_p^{\times}\):
g = Mod(3, p)
g.multiplicative_order() == N
True
Now, let’s take some \(h \in \mathbb{F}_p^{\times}\):
h = Mod(117619616680834488747814058359345855076997576088312309, p)
Now, we need to find \(x\) such that \(g^x = h\), following the algorithm. First, let’s initialize a lists to have the solutions for the discrete logs corresponding to each power of a prime and the corresponding moduli:
y = []
m = []
Now, we select the first prime and corresponding power:
q, e = N_factorization[0]
q, e
(583669, 2)
We define the elements for each we will solve the discrete log:
NN = N // q^e # power for g and h (compute it only once!)
gg = g^NN
hh = h^NN
Now, we solve a discrete log (with a base that has order q^e), using Algorithm 9.1. In here (as in your homework!) we can use Sage’s discrete_log, which internally does exactly that:
y.append(discrete_log(hh, gg)) # add discrete log to the list
m.append(q^e) # add the modulus to the list
y, m
([312457750089], [340669501561])
Then, we move to the next prime and power:
q, e = N_factorization[1]
q, e
(971177, 3)
And follow the same steps:
NN = N // q^e
gg = g^NN
hh = h^NN
y.append(discrete_log(hh, gg))
m.append(q^e)
y, m
([312457750089, 764093480249851], [340669501561, 915999350837922233])
And, last prime (and power):
q, e = N_factorization[2]
q, e
(2929231, 4)
Repeat the steps:
NN = N // q^e
gg = g^NN
hh = h^NN
y.append(discrete_log(hh, gg))
m.append(q^e)
y, m
([312457750089, 764093480249851, 764093480249851],
[340669501561, 915999350837922233, 73623165508788895650352321])
Now we are done, since we ran out of primes. So, y contains the left-sides of the congruences for the CRT, and m contains the corresponding moduli. We solve it using Sage’s CRT function:
x = crt(y, m)
x
764093480249851
Now x should be our discrete log. Indeed:
g^x == h
True
It worked!
9.3. Further Improvements?#
There is a method for solving the DLP that is even more efficient, called index calculus, which we will learn later.