8. The Chinese Remainder Theorem and Generalized Bezout’s Lemma#
In this chapter we introduce two mathematical tools to help speed up the computation of discrete logs when the order of the base is not prime: the Chinese Remainder Theorem and the Generalized Extended Euclidean Algorithm.
8.1. The Chinese Remainder Theorem#
In some cases we have algorithms even faster than Shank’s Baby-Step/Giant-Step in computing discrete logs. These depend on \(N\) (the order of \(g\)) not being prime, which is one of the reasons we should always use elements of prime order with Diffie-Hellman key exchange and ElGamal cryptosystem. Here is, roughly, how these methods work: we compute the prime factorization
solving discrete logs with elements of order \(p_i^{r_i}\), which being a smaller modulus usually are considerably faster, and then “patching” these discrete logs for \(i=1, 2, \ldots, k\) into the original discrete log \(\log_g(h)\) (in \(\mathbb{Z}/N\mathbb{Z}\)).
This last step of patching the discrete logs modulo \(p_i^{r_i}\) is done by using the following important result:
Theorem 8.1 (Chinese Remainder Theorem (CRT))
Let \(m_1, m_2, \ldots, m_k\) be pairwise relatively prime moduli (i.e., \(\gcd(m_i, m_j) = 1\) if \(i \neq j\)) and \(a_1, a_2, \ldots, a_k\) be integers. Then the system
has a unique solution modulo \(m = m_1 m_2 \cdots m_k\), meaning that there is an integer \(x=x_0\) simultaneously satisfying all these congruences, and that \(x_1\) is also a common solution if and only if \(x_1 \equiv x_0 \pmod{m}\).
Let’s see why it is true and how we can find a solution.
Proof. Proof of the CRT.
We start with the case when \(k = 2\). So, we want to find \(x\) such that
Using the Extended Euclidean Algorithm we find \(u\) and \(v\) integers such that \(1 = \gcd(m_1, m_2) = um_1 + vm_2\). Note that this means that:
i.e.,
Then, let \(x_0 = a_1vm_2 + a_2um_1\). Then we have:
i.e.,
and so \(x_0\) is a solution to the system (8.2).
Also, clearly, if \(x_1 \equiv x_0 \pmod{m}\), where \(m = m_1m_2\), i.e., \(x_1 = x_0 + tm\) for some integer \(t\), then
so \(x_1\) is also a solution. Hence, any \(x_1 \equiv x_0 \pmod{m}\) is also a solution of (8.2).
Now, suppose that \(x_2\) is some other solution of (8.2). (We need to show that \(x_2 \equiv x_0 \pmod{m}\), i.e., that the solutions above are not missing any particular one.) Then, since both \(x_0\) and \(x_2\) satisfy the system we have
This means that \(m_1\) and \(m_2\) both divide \(x_2 - x_0\). Since \(m_1\) and \(m_2\) are relatively prime, this means that \(m_1m_2 = m\) also divides \(x_2 - x_0\), i.e., that \(x_2 \equiv x_0 \pmod{m}\).
So, note that this gives us that the two congruences of system (8.2) is equivalent to the single congruence
meaning that \(x\) is a solution of (8.2) if and only if \(x\) is a solution of (8.3).
So, when we have \(k\) congruences, as in #eq-crt, we can reduce the number of equations by one by replacing the first two by (8.3):
Then, we proceed: finding the solution modulo \(m_1m_2m_3\) for the first two congruences of this new system, say \(x \equiv x_1 \pmod{m_1m_2m_3}\), we can reduce again the number of congruences by one, by replacing these congruences by the solution:
Proceeding this way, we eventually get the single solution we needed:
8.1.1. Algorithm#
To summarize the process given in the proof of the CRT, we have two main steps that we repeat: solve a pair of congruences, and then replace the two congruences in the system by the congruence from the solution.
8.1.1.1. Solution for a Pair of Congruences#
To solve
when \(\gcd(m_1, m_2) = 1\), we first find \(u\) and \(v\) such that
(we can use xgcd for that!) and then the solution is
Warning
“Bookkeeping” is very important here! Careful to not get these variables mixed!
Note that the solution requires the use of the Extended Euclidean Algorithm, which uses about \(2 \log_2(m) + 2\) long divisions, where \(m = \min(m_1, m_2)\).
8.1.1.2. Reducing Number of Congruences#
We use the previous step to reduce the number of congruences, in each step reducing the first two to a single one by finding a solution:
until we get a single solution
8.1.2. Number of Operations#
We perform \(k-1\) instances of the Extended Euclidean Algorithm. The argument of the log that appears in the number of long divisions of each depends on the minimum between the two moduli, but roughly we can assume that these will be \(m_2\), \(m_3\), \(m_4\). etc. So, we need about
long divisions.
8.1.3. Example#
Let’s solve the system:
(Note that the moduli are distinct primes, so certainly pairwise relatively prime, as required.) We first deal with the first two congruences:
We find \(u\) and \(v\) such that \(2u + 3v = 1\). We can use Sage’s xgcd for that:
_, u, v = xgcd(2, 3)
Then, our solution is \(a_2 u m_1 + a_1 v m_2\):
sol = 0 * u * 2 + 1 * v * 3
sol
3
Let’s check:
(Mod(sol, 2), Mod(sol, 3)) == (Mod(1, 2), Mod(0, 3))
True
Now we replace the first two congruences with \(x \equiv 3 \pmod{6}\), and solve
Again, we focus on the (new) first pair of congruences:
We repeat the same procedure:
_, u, v = xgcd(6, 5)
sol = 3 * u * 6 + 3 * v * 5
sol
3
Let’s check again:
(Mod(sol, 6), Mod(sol, 5)) == (Mod(3, 6), Mod(3, 5))
True
Repeating, we replace the first two congruences (of the new system) by \(x \equiv 3 \pmod{30}\):
We now have only two left and we solve as before:
_, u, v = xgcd(30, 7)
sol = 1 * u * 30 + 3 * v * 7
sol
183
This solution, namely \(x \equiv 183 \pmod{210}\), now should satisfy the original four congruences:
(Mod(sol, 2), Mod(sol, 3), Mod(sol, 5), Mod(sol, 7)) == (Mod(1, 2), Mod(0, 3), Mod(3, 5), Mod(1, 7))
True
Homework
You will implement this algorithm to find solutions in the CRT as an exercise in your homework.
But, of course, Sage can solve the system of congruences in the CRT using the function crt. We give it a list of the left-sides of the congruences (i.e., \(a_1, a_2, \ldots, a_k\)) and a list of the moduli (i.e., \(m_1, m_2, \ldots, m_k\)). So, our previous example can be done with:
crt([1, 0, 3, 1], [2, 3, 5, 7])
183
8.2. Python/Sage Detour: Scope and Mutable Objects#
In the implementation of the CRT in your homework, as with Sage’s crt, we will pass two lists to the function: a list a of the left-sides of the congruences (i.e., \(a_1, a_2, \ldots, a_k\)) and a list m of the moduli (i.e., \(m_1, m_2, \ldots, m_k\)).
In the implementation we need to deal with two congruences at a time. We can do that by “popping” elements out the lists with pop, applying the CRT for two congruences and appending the new left-side and modulus back to the respective lists.
8.2.1. “Popping Elements”#
If we have a list, the .pop method takes (by default) the last element of the list, thus modifying the original list, and return this previous last element:
my_list = [1, 2, 3, 4]
# returns last element and REMOVE it from the list
last_element = my_list.pop()
last_element, my_list
(4, [1, 2, 3])
8.2.2. Mutable Objects#
List and dictionaries in Python/Sage are mutable objects. Let’s illustrate what this means with an example:
a = 2
b = a
a = 3
print(f"We have that {a = }, and {b = }.")
We have that a = 3, and b = 2.
We started setting a as \(2\), used b to store the same value as a, and then changed the value of a. At the end b has the old value of a. Now, let’s look at an example with lists:
list_a = [1, 2, 3]
list_b = list_a
list_a.pop() # remove last element of list_a
print(f"We have that {list_a = }, and {list_b = }.")
We have that list_a = [1, 2], and list_b = [1, 2].
So, you might have been expecting that list_b would have the old/original value of list_a, but both lists are the same!
The reason for that is, basically, that line 2 makes the names list_a and list_b “point” to the same object, the list [1, 2, 3]. (The same is true for line 2 in the previous example: a and b point to the same object, the integer 2.) But, while line 3 of the previous example redefines a, making it to point somewhere else (the integer 3), in the last example line 3 changed (or mutated) the object to which both lists were pointing!
If we were to also redefine (instead of changing) list_a, then this problem would not happen:
list_a = [1, 2, 3]
list_b = list_a
list_a = [1, 2]
print(f"We have that {list_a = }, and {list_b = }.")
We have that list_a = [1, 2], and list_b = [1, 2, 3].
The statement list_a = [1, 2] makes list_a point somewhere else, i.e., it defines list_a, while list_a.pop() simply changes the object to which it was pointing (by removing the last element).
But you might want to have list_b be the same as the old list_a, and not be affected by any changes to list_a, meaning, you might want list_b to be a copy of list_a at its current state, and not be affected to any other changes made to it.
To do that, we can use the list method .copy():
list_a = [1, 2, 3]
list_b = list_a.copy() # NOTE THE .copy()
list_a.pop() # remove last element of list_a
print(f"We have that {list_a = }, and {list_b = }.")
We have that list_a = [1, 2], and list_b = [1, 2, 3].
Objects that can be changed, like lists, are called mutable objects. Dictionaries are also mutable:
my_dict = {"A": 1, "B": 2, "C": 3}
# remove key/value for key B, returning the value
# the dictionary is changed by this operation
poped_value = my_dict.pop("B")
poped_value, my_dict
(2, {'A': 1, 'C': 3})
Sets are also mutable. On the other hand tuples (which use parentheses instead of the square brackets used for lists) are not mutable. There is no method that changes a tuple (since it cannot be changed).
Warning
Forgetting to use .copy when making a copy of a list or dictionary is a common source of bugs! Try to not forget to use it when you want to save the current value of the list or dictionary in a new variable.
8.2.3. Scope of Variables#
Here we briefly discuss the notion of scope of variables. But again, let’s do it with examples.
Here we have a variable a defined outside a function, and another also named a defined inside a function:
a = 1
def my_function(x):
a = x + 1
return x^2 + a
my_function(2)
7
(The function here is “silly”, but it is a simple example to illustrate the point.)
Let’s see if the value of a was changed also outside the function. (If it were, it should have value 3 = 2 + 1 now.)
a
1
As you can see, it is not! Variables defined inside a function are local, meaning, they only have meaning inside the function. This includes the input variables as well.
But let’s look at this example:
def find_list_minimum(input_list):
input_list.sort() # sort list
return input_list[0] # return first element
(Again, this is a very silly example. We can simply use Python/Sage’s min function for this.)
Let’s test it:
my_list = [5, 3, 1, 2, 4]
find_list_minimum(my_list)
1
It seems it worked! But:
my_list
[1, 2, 3, 4, 5]
The original my_list was changed. That’s because inside the function we changed input_list with .sort. This might be surprising, since the change occurred inside the function, but that is the default behavior. This might be good, as we might want to create a function with the specific purpose of modifying the given list. But the original list might contain information that the user wants to keep. For instance, in this example the order of the entries in the list might correspond to some order in real life, like the first five months of the years. If we change the order, we might not now the value for a given month.
We can try to create a new list inside, and change that list instead:
def find_list_minimum(input_list):
new_list = input_list
new_list.sort() # sort list
return new_list[0] # return first element
my_list = [5, 3, 1, 2, 4]
find_list_minimum(my_list)
1
my_list
[1, 2, 3, 4, 5]
As you can see, we again changed the original my_list. Even though new_list was defined inside the function, it is still pointing to the same object as my_list, and changing one of them also changes the other.
So, again, we need .copy():
def find_list_minimum(input_list):
new_list = input_list.copy() # note the .copy()!
new_list.sort() # sort list
return new_list[0] # return first element
my_list = [5, 3, 1, 2, 4]
find_list_minimum(my_list)
1
Now the original remains intact:
my_list
[5, 3, 1, 2, 4]
What would happen if the user passes a tuple, like (5, 3, 1, 2, 4), instead of a list? Since tuples do not have the .copy method (since they cannot be changed, and hence this copy method is not needed), we would get an error:
AttributeError: 'tuple' object has no attribute 'copy'
There is a trick that works as copy and converts all data types that can be converted to lists to a list before continuing: instead of
new_list = input_list.copy()
we use
new_list = list(input_list)
This will copy the list, if input_list is already a list, and convert the object to a list if not:
def find_list_minimum(input_list):
new_list = list(input_list) # note the "list"!
new_list.sort() # sort list
return new_list[0] # return first element
my_list = [5, 3, 1, 2, 4]
find_list_minimum(my_list)
1
my_list
[5, 3, 1, 2, 4]
my_tuple = (5, 3, 1, 2, 4) # a tuple!
find_list_minimum(my_tuple)
1
my_tuple
(5, 3, 1, 2, 4)
This might come hand when getting lists are inputs for functions, like in your homework, when coding the CRT function.
Important
The bottom line is that if you have a function that takes a list, say my_list, as an argument, and you do not want to change the original when performing the actions of your function, create a copy of the list using something like local_list = list(my_list), and use local_list instead of my_list inside your function.
8.3. Generalized Bezout’s Lemma#
The following result is also used in improvements in computations of the discrete log (in specific cases):
Lemma 8.1 (Generalized Bezout’s Lemma)
Let \(a_1, a_2, \ldots, a_k\) be integers and \(d = \gcd(a_1, a_2, \ldots, a_k)\). Then, there integers are \(r_1, r_2, \ldots, r_k\) such that
Proof. Proof of the Generalized Bezout’s Lemma
The idea is to reduce the number of integers inside the GCD until we get only two and can use the Extended Euclidean Algorithm. The key result is that
which has \(n-1\) integers inside the GCD, instead of \(n\). This result is not too hard to prove, and we leave it as an exercise for the more mathematically inclined readers.
We will illustrate the proof with \(n=4\), to make it more concrete, but the logic on how to expand for all \(n\) should be clear. So, we want to find \( r_{1}, r_{2}, r_3 , r_{4}\) such that
Let \(d_3 = \gcd(a_3, a_4)\). By Bezout’s Lemma we have that there are integers \(u_3\) and \(v_3\) such that
So, we have that \(d = \gcd(a_1, a_2, d_3)\), and repeating the argument, we have that \(d = \gcd(a_1, \gcd(a_2, d_3))\). Letting \(d_2 = \gcd(a_2, d_3)\), we can use Bezout’s Lemma again to find integers \(u_2\) \(v_2\) such that
We can now use Bezout’s Lemma one last time and get integers \(u_1\) and \(v_1\) such that
So, we can take \(r_1 = u_1\), \(r_2 = v_1v_2\), \(r_3 = v_1v_2u_3\), and \(r_4 = v_1 v_2 v_3\).
Hopefully, now one can see how we can repeat this process to get the result for five, six, or any number of integers.
8.3.1. Algorithm#
To actually compute the \(r_1\), \(r_2\), etc., above, one can follow the idea of the proof. The key part is to keep track of the list of coefficients we get in each step.
In the first step, we get a list of coefficients \((u_3, v_3)\). In the second, the list becomes \((u_2, v_2u_3, v_2v_3)\). Then, in the third and final step, it becomes, \((u_1, v_1 v_2, v_1v_2u_3, v_1v_2v_3)\). So, in each step, when we use Bezou’t Lemma, we get a pair of integers \((u, v)\) and change the list we had before by adding \(u\) to the beginning of the new list, and by multiplying all elements of the old list by \(v\). We repeat the process until we run out of integers.
Before we describe the algorithm, we need some standard notation:
Definition 8.1 (Assignment Notation)
We will use \(\leftarrow\), as is customary, to denote assignment to a variable. So, we may write, for instance,
to say that we assign to the variable \(a\) the square of its previous value. In Sage this is done simply with a = a^2, but in algebra we cannot use the equal sign, since it has a different meaning. (If we have \(a = a^2\), then that means that \(a\) must be either \(0\) or \(1\).)
With this notation in hand, we can describe the algorithm in more detail:
Algorithm 8.1 (Generalized Extended Euclidean Algorithm)
Given a list \(v = (a_1, a_2, \ldots, a_n)\) of integers, to find \(w = ( r_{1}, r_{2}, \ldots , r_{n})\) such that
we do:
Initialize \(b \leftarrow a_n\) (the last element of \(v\)), \(w \leftarrow (1)\) (a list with only \(1\) in it), and remove the last element of \(v\).
While \(v\) is not empty:
Remove the last element of \(v\) and save it in the variable \(a\).
Use Bezout’s Lemma to find \(r\) and \(s\) such that \(\gcd(a, b) = ra + sb\).
Replace \(w\) to start with \(r\), followed by the previous elements of \(w\) multiplied by \(s\).
Set \(b \leftarrow \gcd(a, b)\) (computed earlier).
Return the final value of \(b\), which has \(\gcd( a_{1}, a_{2}, \ldots , a_{n})\), and the list \(w\), which has the desired coefficients \(( r_{1}, r_{2}, \ldots , r_{n})\).
8.3.2. Example#
As an example, let’s find \(\gcd(123, 573, 942, 3105)\) and \(r_1\), \(r_2\), \(r_3\), \(r_4\) such that
So, our input is:
v = [123, 573, 942, 3105]
Since we need to change this list in the process, let’s use a copy of the original:
v_copy = list(v) # or use v.copy()
Then, we initialize b and w, and remove last element of v:
b = v_copy.pop() # remove last element of v and save it in b
w = [1]
We now start the loop, which runs as long as v is not empty. So, let’s check it:
v_copy
[123, 573, 942]
Since v_copy is not empty, we run the body of the loop. First, remove the last element of v_copy and save it in a:
a = v_copy.pop() # remove last element of v and save it in a
Then, we use the Extended Euclidean Algorithm, using Sage’s own xgcd:
d, r, s = xgcd(a, b) # so d = gcd(a, b)
Then, we need to update w. Here is one way of doing, using list comprehension:
w = [r] + [s * x for x in w]
And we save the old GCD (saved in d above) as b.
b = d
The body of the loop is over, so we check if v_copy is empty:
v_copy
[123, 573]
It is not, so we repeat the body of the loop (all at once now):
a = v_copy.pop()
d, r, s = xgcd(a, b)
w = [r] + [s * x for x in w]
b = d
Check v_copy again:
v_copy
[123]
Since it is not empty, we run the body of the loop:
a = v_copy.pop()
d, r, s = xgcd(a, b)
w = [r] + [s * x for x in w]
b = d
Check v_copy again:
v_copy
[]
Since v_copy is now empty, we are done. The variable b (or d) should have our GCD:
b
3
Let’s check it with Sage’s gcd:
gcd(v)
3
So, it worked!
And w should have the coefficients:
w
[0, 0, 89, -27]
Let’s test it:
sum(x * y for (x, y) in zip(v, w))
3
It worked!
Note that we can also use Sage’s xgcd:
xgcd(v)
(3, 14, -3, 0, 0)
The first element is the GCD, matching our result. But the coefficients are different. Of course, they also work:
w_sage = list(xgcd(v))[1:] # convert result to list, and slice the first element off
w_sage
[14, -3, 0, 0]
Checking:
sum(x * y for (x, y) in zip(v, w_sage))
3
8.3.3. Number of Operations#
Remember that the Euclidean Algorithm, and therefore the Extended Euclidean Algorithm, requires about \(2 \log_2(b) + 2\) long divisions, where \(b\) is the smallest of the two numbers. So, in this case we need to use the Extended Euclidean Algorithm \(n-1\) times. If our list is sorted in decreasing order, this means we perform about \(2(n-1) \log_2(b) + 2(n-1)\) long divisions, where \(b\) now is the least element in the list.
Homework
You will implement this Generalized Extended Euclidean Algorithm in your homework.