9. Improvements on Computations of Discrete Logs#

In this chapter we introduce methods that can improve on Shank’s Baby-Step/Giant-Step algorithm when the order of the base is not prime.

9.1. Solving the Discrete Log for Powers of Prime Orders#

Suppose that \(g \in \mathbb{F}_p^{\times}\) such that \(g\) has order \(q^e\), where \(q\) is some prime and \(e\) is an integer greater than one. Of course, we can use Shank's Baby-Step/Giant-Step algorithm. But there is a more efficient way in this case!

9.1.1. Algorithm#

Algorithm 9.1 (Discrete Log for Power of Prime Order)

Let \(g, h \in \mathbb{F}_p^{\times}\) such that \(|g| = q^e\), where \(q\) is some prime and \(e \geq 2\). To compute \(\log_g(h)\):

  1. Initialize \(N_0 = |g| = q^e\), \(N = N_0\), \(x = 0\).

  2. While \(N > 1\):

    1. Compute \(y = \log_{g^{N/q}}\left( h^{N/q}\right)\) using Shank's Baby-Step/Giant-Step algorithm.

    2. Let \(x \leftarrow x + y N_0/N\), \(h \leftarrow g^{-y}h\), \(g \leftarrow g^q\), and \(N \leftarrow N/q\).

  3. Return \(x\).

(Remember that the \(\leftarrow\) denotes assignment.)

Note that in the loop, every iteration divides \(N\) by \(q\), so it takes a total of \(e\) iterations to get to \(N=1\) and the loop to stop.

9.1.2. Why Does It Work#

Let’s try to justify why the algorithm above indeed gives us the discrete log \(\log_g(h)\).

Before we get into it, let’s remind ourselves of a couple of results we’ve seen in chapter about powers:

Proposition 9.1 (Properties of Powers)

If \(a\) is a unit in \(\mathbb{Z}/n\mathbb{Z}\) of order \(m\), then:

  1. For any integer \(k\) we have:

\[|a^k| = \frac{|a|}{\gcd(|a|, k)} = \frac{m}{\gcd(m, k)}.\]
  1. We have:

\[a^k = a^l \quad \text{ if and only if } \quad k \equiv l \pmod{|a|}.\]

So, suppose that \(g \in \mathbb{F}_p^{\times}\) with \(|g| = q^e\) for some prime \(q\) and \(e \geq 2\), and that \(h = g^{x}\), for some \(x \in \{0, 1, 2, \ldots , q^e - 1\}\). (Note that we do not know the value of \(x\), as it is what we are trying to compute. But we are assuming that it exists.). We then can write \(x\) in base \(q\) as

\[x = d_0 + d_1 q + d_2 q^2 + \cdots + d_{e-1} q^{e-1}, \quad d_i \in \{ 0, 1, \ldots, q-1 \} .\]

Let’s find the first digit \(d_0\). Since \(g^x = h\), we start by raising both sides to the power \(q^{e-1}\):

(9.1)#\[\left( g^x \right)^{q^{e-1}} = h^{q^{e-1}} \quad \Longrightarrow \quad \left( g^{q^{e-1}} \right)^x = h^{q^{e-1}}.\]

Now, by the first part of Proposition 9.1, note that

\[\left| g^{q^{e-1}} \right| = \frac{|g|}{\gcd(|g|, q^{e-1})} = \frac{q^e}{\gcd(q^e, q^{e-1})} = \frac{q^e}{q^{e-1}} = q,\]

i.e., \(\left| g^{q^{e-1}} \right| = q\).

Now, since \(g^{q^{e-1}}\) has order \(q\) (prime), we use Shank’s Baby-Step/Giant-Step to find a solution, say \(y\). This means that

\[ \left( g^{q^{e-1}} \right)^{y} = h^{q^{e-1}} = \left( g^{q^{e-1}} \right)^x,\]

and by the second item of Proposition 9.1, we have that

\[y \equiv x \equiv d_0 \pmod{q}.\]

This means that when we solved (9.1), we found \(d_0\)!

We now find \(d_1\) in a similar way. Observe that

\[h = g^x = g^{ d_0 + d_1 q + d_2 q^2 + \cdots + d_{e-1} q^{e-1}} = g^{d_0} \cdot \left( g^q \right)^{d_1 + d_2 q + d^3 q^2 + \cdots + d_{e-1}q^{e-2}}.\]

Letting \(g' = g^q\), \(h' = g^{-d_0}h\), and \(x' = d_1 + d_2 q + \cdots + d_{e-1}q^{e-2}\), and multiplying both sides of the equation above by \(g^{-d_0}\), we get:

\[h' = g^{-d_0} g^x = \left( g^q \right)^{d_1 + d_2 q + d^3 q^2 + \cdots + d_{e-1}q^{e-2}} = (g')^{x'},\]

i.e.,

\[h' = (g')^{x'}.\]

Note that, again by the first part of Proposition 9.1, we have that \(|g'| = q^{e-1}\), so the order went down by a factor of \(q\).

So, to find \(d_1\) we repeat the process to find \(d_0\) above, and repeat until we find all digits \(d_i\) (in \(e\) iterations of the process).

9.1.3. Example#

Let’s now illustrate the method with an example.

We will take \(p = 2{,}116{,}179{,}719\) and \(q = 1{,}019\), both primes:

p = 2116179719
q = 1019
is_prime(p), is_prime(q)
(True, True)

We then have \(p = 2q^3 + 1\):

p == 2 * q^3 + 1
True

We will take \(g = 2\), and indeed, \(|g| = q^3\):

g = Mod(2, p)
g.multiplicative_order() == q^3
True

Finally, let \(h = 805{,}343{,}147\).

h = Mod(805343147, p)

We need to find \(\log_g(h)\), i.e., we need \(x\) such that \(g^x = h\). We follow Algorithm 9.1.

Since the algorithm changes the original values of g and h, let’s store the original values in different variables:

g_original = g
h_original = h

(Note that if the algorithm is in the body of a function, this is not necessary, as the variable outside the function will not be changed!)

First, we can initialize our values:

x = 0
N0 = N = g.multiplicative_order()

Then, we solve a discrete log with a base of order \(q\) (prime). We will use Sage’s own discrete_log function for this computation:

power = N // q  # compute it only once!
y = discrete_log(h^power, g^power)
y
617

Now, we update the values:

x += y * (N0 // N)
h *= g^(-y)
g = g^q
N = N // q

Then, if the value of N is not one, we will repeat:

N
1038361

So, repeat: first we compute a new discrete log (with base of order \(q\)):

power = N // q
y = discrete_log(h^power, g^power)
y
6

And update the variables:

x += y * (N0 // N)
h *= g^(-y)
g = g^q
N = N // q

Check N:

N
1019

It’s not one, so we repeat again:

power = N // q
y = discrete_log(h^power, g^power)
y
0

Update the variables:

x += y * (N0 // N)
h *= g^(-y)
g = g^q
N = N // q

Check N again:

N
1

Since it is one now, we are done and x has our discrete log:

x
6731

Let’s check:

g_original^x == h_original
True

It worked!

9.1.4. Number of Operations#

In this situation where \(|g| = q^e\) and we compute \(\log_g(h)\), if we use Shank's Baby-Step/Giant-Step algorithm, as seen in the number of operations for Shank’s algorithm, we need about

\[\frac{\sqrt{q^e}}{2} \cdot \log_2(q^e) = \frac{q^{e/2}}{2} \cdot e \cdot \log_2(q)\]

multiplications.

How many multiplications do we perform using the method above? In that case we compute \(e\) discrete logs with a base of order \(q\) using Shank’s algorithm, so we perform about

\[\boxed{e \cdot \frac{q^{1/2}}{2} \cdot \log_2(q)}\]

multiplications in total.

Thus, the ratio between the number of multiplications is

\[\frac{q^{1/2}}{q^{e/2}} = \frac{1}{q^{(e-1)/2}} .\]

Therefore, a straight use of Shank’s algorithm in our example above would use \(q = 1{,}019\) times the number of multiplications of our new method. In practice, for larger primes \(q\) and/or larger powers \(e\), the gains are even higher!

Homework

You will implement this algorithm in your homework.

9.2. The Pohlig-Hellman Algorithm#

Now suppose that \(g \in \mathbb{F}_p^{\times}\) and we have \(|g| = N\), with the prime factorization:

\[N = q_1^{e_1} q_2^{e_2} \cdots q_k^{e_k}.\]

We then have the following algorithm:

Algorithm 9.2 (Pohlig-Hellman Algorithm)

Let \(g, h \in \mathbb{F}_p^{\times}\) and suppose that \(|g| = N = q_1^{e_1} q_2^{e_2} \cdots q_k^{e_k}\), with \(q_i\)’s distinct prime, and \(e_i \geq 1\) for \(i = 1, 2, \ldots, k\). To compute \(\log_g(h)\):

  1. For each \(i \in \{ 1, 2, \ldots, k \}\) let \(N_i = N/q_i^{e_i}\), \(g_i = g^{N_i}\) and \(h_i = h^{N_i}\) and find \(y_i = \log_{g_i} \left( h_i \right)\) using Algorithm 9.1.

  2. Use the Chinese Remainder Theorem to find \(x\) such that

\[\begin{split}\begin{align*} x &\equiv y_1 \pmod{q_1^{e_1}},\\ x &\equiv y_2 \pmod{q_2^{e_2}},\\ & \;\; \vdots \\ x &\equiv y_k \pmod{q_k^{e_k}}. \end{align*}\end{split}\]

Then, \(\log_g(h) = x\).

9.2.1. Why Does It Work?#

Since we have that \(g_i = g^{N_i}\), we have that

\[|g_i| = \frac{|g|}{\gcd(|g|, N_i)} = \frac{N}{\gcd(N, N/q_i^{e_i})} = \frac{N}{N/q_i^{e_i}} = q_i^{e_i}.\]

So, if \(x\) satisfies the congruences above, we have that \(x \equiv y_i \pmod{q_i^{e_i}}\), and hence \(g_i^x = g_i^{y_i} = h_i\) in \(\mathbb{F}_p^{\times}\), for all \(i \in \{1, 2, \ldots, k\}\).

Note that \(\gcd(N_1, N_2, \ldots, N_k) = \gcd(N/q_1^{e_1}, N/q_2^{e_2}, \ldots , N/q_k^{e_k}) = 1\), since \(q_i\) does not divide \(N/q_i^{e_i}\). Then, using the Generalized Extended Euclidean Algorithm, we can find integers \(r_1, r_2, \ldots , r_k\) such that

\[r_1 \cdot N_1 + r_2 \cdot N_2 + \cdots + r_k \cdot N_k = 1.\]

Then:

\[\begin{split}\begin{align*} g^x &= \left( g^1 \right)^x \\[1.7ex] &= \left( g^{r_1 \cdot N_1 + \cdots + r_k \cdot N_k} \right)^x \\[1.7ex] &= g^{r_1 \cdot N_1 x + \cdots + r_k \cdot N_kx} \\[1.7ex] &= \left( g^{r_1 N_1 x} \right) \cdots \left( g^{r_n N_k x} \right) \\[1.7ex] &= \left( \left( g^{N_1} \right)^x \right)^{r_1} \cdots \left( \left( g^{N_k} \right)^x \right)^{r_k} \\[1.7ex] &= \left( g_1^x \right)^{r_1} \cdots \left( g_k^x \right)^{r_k} \\[1.7ex] &= \left( g_1^{y_1} \right)^{r_1} \cdots \left( g_k^{y_k} \right)^{r_k} \\[1.7ex] &= \left( h_1 \right)^{r_1} \cdots \left( h_k \right)^{r_k} \\[1.7ex] &= \left( h^{N_1} \right)^{r_1} \cdots \left( h^{N_k} \right)^{r_k} \\[1.7ex] &= h^{r_1 \cdot N_1 + \cdots + r_k \cdot N_k} \\[1.7ex] &= h^1 = h. \end{align*}\end{split}\]

9.2.2. Number of Operations#

In the algorithm we compute \(2k\) powers, which we can do using Fast Powering. The number of products that this takes is at most:

\[\begin{split}\begin{align*} (2 \log_2(N_1) + 2) &+ (2 \log_2(N_2) + 2) + \cdots + (2 \log_2(N_k) + 2) \\[1.7ex] &= 2 \left( \log_2(N_1) + \log_2(N_2) + \cdots + \log_2(N_k)\right) + 2k \\[1.7ex] &= 2 \log_2 \left( N_1 N_2 \cdots N_k \right) + 2k \\[1.7ex] &= 2 \log_2 \left( \frac{N}{q_1^{e_1}} \frac{N}{q_2^{e_2}} \cdots \frac{N}{q_k^{e_k}} \right) + 2k \\[1.7ex] &= 2 \log_2 \left( \frac{N^k}{N} \right) + 2k \\[1.7ex] &= 2 \log_2 \left( N^{k-1} \right) + 2k \\[1.7ex] &= \boxed{2(k-1) \log_2(N) + 2k}. \end{align*}\end{split}\]

Then, solving the discrete logs (with bases having order powers of primes) takes about

\[\boxed{e_1 \frac{q_1^{1/2}}{2} \log_2(q_1) + e_2 \frac{q_2^{1/2}}{2} \log_2(q_2) + \cdots + e_k \frac{q_k^{1/2}}{2} \log_2(q_k)}\]

more multiplications.

Finally, we can solve the system of congruences with about \(\boxed{2 \log_2(N) + 2(k-1)}\) long divisions.

If we were using the straight Shank’s Baby-Step/Giant-Step Algorithm, we would need about \(2\sqrt{N}\) multiplications and \(n \log_2(n) \approx \sqrt{N} \log_2(\sqrt{N}) = (\sqrt{N}/2) \log_2(N)\) comparisons. When \(N\) factors, the process above is a lot more efficient.

For instance, let’s take \(N\) (the order of \(g\)) to be a very large number:

N = 275478590580404620961069223105530214854517114273998667300885664883160258239634122227989362418355423134437149221136296365217903294873136621174602837803

But, it factors:

N_factorization = factor(N)
N_factorization
2309^8 * 2341^10 * 2521^7 * 3847^4 * 4153^9 * 4211^5

If we were to use Shank’s Baby-Step/Giant-Step directly, the number of multiplications would be:

floor(sqrt(N)/2 * log(N, base=2))
130278056880962267509600048076969842842723121576434617757316043508493573783854

On the other hand, using Algorithm 9.2, the number of multiplications would be:

k = len(N_factorization)  # number of prime factors

# multiplications from powers + multiplications from discrete logs
floor(2 * (k - 1) * log(N, base=2) + 2 * k) + floor(sum(e * sqrt(q)/2 * log(q, base=2) for (q, e) in N_factorization))
18733

And we also need some long divisions, more precisely, the total number of long divisions is at most:

k = len(N_factorization)
floor(2 * log(N, 2) + 2 * (k - 1))
1002

This is considerably better than the straight use of Shank’s algorithm.

Note

Note that using Pohlig-Hellman’s algorithm, we still use Shank’s Baby-Step/Giant-Step algorithm to compute some discrete logs, but only for bases of prime order.

Homework

You will implement Pohlig-Hellman’s algorithm in your homework.

Important

The Pohlig-Hellman algorithm show how important it is that we choose \(g\) of prime order in the Diffie-Hellman key exchange and ElGamal cryptosystem. If not, Eve has more efficient methods to solve the discrete log problem.

9.2.3. Example#

Let’s illustrate Pohlig-Hellman’s algorithm with an example. We will take:

N = 22974332779312916308087541215025543130953873335484909873
p = 2 * N + 1

Indeed, \(p\) is prime:

is_prime(p)
True

First we need to factor this large \(N\).

N_factorization = factor(N)
N_factorization
583669^2 * 971177^3 * 2929231^4

Important

We did not take this step into account when counting the number of operations for this algorithm. As we soon shall see, factoring a large number \(N\) can be quite difficult and time consuming. If that is the case, the prime factors will be large as well, and Pohlig-Hellman might not be the best choice.

In this case, we have that \(3\) has order \(N\) in \(\mathbb{F}_p^{\times}\):

g = Mod(3, p)
g.multiplicative_order() == N
True

Now, let’s take some \(h \in \mathbb{F}_p^{\times}\):

h = Mod(117619616680834488747814058359345855076997576088312309, p)

Now, we need to find \(x\) such that \(g^x = h\), following the algorithm. First, let’s initialize a lists to have the solutions for the discrete logs corresponding to each power of a prime and the corresponding moduli:

y = []
m = []

Now, we select the first prime and corresponding power:

q, e = N_factorization[0]
q, e
(583669, 2)

We define the elements for each we will solve the discrete log:

NN = N // q^e  # power for g and h (compute it only once!)
gg = g^NN
hh = h^NN

Now, we solve a discrete log (with a base that has order q^e), using Algorithm 9.1. In here (as in your homework!) we can use Sage’s discrete_log, which internally does exactly that:

y.append(discrete_log(hh, gg))  # add discrete log to the list
m.append(q^e)  # add the modulus to the list
y, m
([312457750089], [340669501561])

Then, we move to the next prime and power:

q, e = N_factorization[1]
q, e
(971177, 3)

And follow the same steps:

NN = N // q^e
gg = g^NN
hh = h^NN
y.append(discrete_log(hh, gg))
m.append(q^e)
y, m
([312457750089, 764093480249851], [340669501561, 915999350837922233])

And, last prime (and power):

q, e = N_factorization[2]
q, e
(2929231, 4)

Repeat the steps:

NN = N // q^e
gg = g^NN
hh = h^NN
y.append(discrete_log(hh, gg))
m.append(q^e)
y, m
([312457750089, 764093480249851, 764093480249851],
 [340669501561, 915999350837922233, 73623165508788895650352321])

Now we are done, since we ran out of primes. So, y contains the left-sides of the congruences for the CRT, and m contains the corresponding moduli. We solve it using Sage’s CRT function:

x = crt(y, m)
x
764093480249851

Now x should be our discrete log. Indeed:

g^x == h
True

It worked!

9.3. Further Improvements?#

There is a method for solving the DLP that is even more efficient, called index calculus, which we will learn later.