17. Finite Fields and the Advanced Encryption Standard (AES)#

17.1. Finite Fields#

As we have seen before in a previous section, if \(p\) is a prime, then \(\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}\) is what we call a field: this means, in particular, that every non-zero element can be multiplied by another element to get \(1\). As we also have observed before, \(\mathbb{Z}/4\mathbb{Z}\) is not a field, since \(a \cdot 2 \neq 1\) in \(\mathbb{Z}/4\mathbb{Z}\), no matter what \(a\) we take:

# test if we can get 1
for a in range(4):
    print(f"{a} * 2 = {Mod(a *2, 4)}")
0 * 2 = 0
1 * 2 = 2
2 * 2 = 0
3 * 2 = 2

But we can construct a field with \(4\) elements! Let \(\mathbb{F}_4 = \{0, 1, a, b\}\), where the addition and multiplication are given by the tables

(17.1)#\[\begin{split}\begin{array}{r|cccc} {}+ & 0 & 1 & a & b \\ \hline 0 & 0 & 1 & a & b \\ 1 & 1 & 0 & b & a \\ a & a & b & 0 & 1 \\ b & b & a & 1 & b \end{array} \qquad \qquad \begin{array}{r|cccc} {}\times & 0 & 1 & a & b \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & a & b \\ a & 0 & a & b & 1 \\ b & 0 & b & 1 & a \end{array}\end{split}\]

Note that indeed, every non-zero elements has an inverse: \(1^{-1} = 1\), \(a^{-1} = b\), and \(b^{-1} =a\) (since \(ab=1\)).

But there is a systematic way of constructing larger finite fields from a field \(\mathbb{F}_p\) (with \(p\) prime). Before we can describe this process, we need to first introduce polynomials.

17.1.1. Polynomials#

Remember that a polynomial is a “function” (or simply an algebraic object) of the form

\[a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0,\]

where \(x\) is the variable and the \(a_i\)’s are “numbers”. In Algebra and Pre-Calculus, these \(a_i\)’s would be real numbers, but we can also make polynomials where the \(a_i\)’s are in \(\mathbb{F}_p\). For instance, in \(\mathbb{F}_2\), we can have the polynomial

(17.2)#\[f = x^8 + x^4 + x^3 + x + 1.\]

(So, the coefficients are \(0, 1 \in \mathbb{F}_2\).) We denote the set of polynomials with coefficients in \(\mathbb{F}_p\) and with variable \(x\) by \(\mathbb{F}_p[x]\), so we would say that the polynomial \(f\) above is in \(\mathbb{F}_2[x]\).

We can easily create polynomials in Sage:

F2 = GF(2)  # or F2 = FiniteField(2)
F2x.<x> = F2[]
F2x
Univariate Polynomial Ring in x over Finite Field of size 2 (using GF2X)

Note

The notation here is somewhat strange. One could also do

F2x.<x> = F2["x"]

or

F2x.<x> = PolynomialRing(F2)

But the first version is shorter. The x in F2.<x> specifies the variable in Sage that will use for the polynomial variable.

Now, Sage will interpret x always as a variable for \(\mathbb{F}_2[x]\):

f = x^8 + x^4 + x^3 + x + 1
f
x^8 + x^4 + x^3 + x + 1
parent(f)
Univariate Polynomial Ring in x over Finite Field of size 2 (using GF2X)

Here is some basic terminology:

Definition 17.1 (Basic Definitions for Polynomials)

If \(g = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0\), with \(n \geq 0\) and \(a_n \neq 0\), then:

  1. The degree of \(g\) is \(n\), i.e., the highest power of the variable that appears in \(g\), and we write \(\deg g = n\).

  2. We define the degree of the zero polynomial \(0\) as \(-\infty\). (Some authors prefer to leave it undefined. But we will not need this degree here, so it makes no real difference to us.)

  3. The leading coefficient of \(g\) is \(a_n\), i.e., the coefficient of the highest power of the variable that appears in \(g\).

  4. For some “number” \(c\), we define \(g(c) = a_n c^n + a_{n-1}c^{n-1} + \cdots + a_1 c + a_0\), the evaluation of \(g\) at \(x=c\).

  5. The element \(c\) is a root or zero of \(g\) if \(g(c)=0\).

Of course, Sage can give you that information:

f
x^8 + x^4 + x^3 + x + 1
f.degree()
8
f.leading_coefficient()
1
f(0)
1
f.roots()
[]

(Note that \(f\) above has no roots in \(\mathbb{F}_2\).)

A non-constant polynomial \(f\) is irreducible if it cannot be written as a product of two polynomials of smaller degrees.

For example, in \(\mathbb{F}_2[x]\), we have that \(x^2 + 1\) is reducible, as it is a product of two polynomial of degree \(1\): \(x^2 + 1 = (x + 1)^2\) (remembering that \(2=0\) in \(\mathbb{F}_2\)). On the other hand, \(x^2 + x + 1\) is irreducible. Indeed, if \(x^2 + x + 1 = g_1 g_2\), then \(g_1\) and \(g_2\) must have degree one, and their leading coefficients must also be \(1\) (the only non-zero element). If we try

\[x^2 + x + 1 = (x + c_1)(x + c_2) = x^2 + (c_1 + c_2)x + c_1c_2,\]

we have that

\[\begin{split}\begin{align*} c_1 + c_2 &= 1, \\ c_1 \cdot c_2 &=1, \end{align*}\end{split}\]

which is impossible with \(c_1, c_2 \in \mathbb{F}_2\), as the second equation implies that \(c_1 = c_2 = 1\), but then \(c_1 + c_2 = 0 \neq 1\).

We can also verify it with Sage:

(x^2 + 1).is_irreducible()
False
(x^2 + 1).factor()
(x + 1)^2
(x^2 + x + 1).is_irreducible()
True

Note that \(f = x^8 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x]\) is also irreducible.

f.is_irreducible()
True

17.1.2. Construction of Larger Finite Fields#

Given a prime \(p\) and an irreducible polynomial \(g \in \mathbb{F}_p[x]\) of degree \(n\), we can construct a field with \(p^n\) elements. The idea is very similar as the construction of \(\mathbb{Z}/p\mathbb{Z}\) from \(\mathbb{Z}\): just as with integers, we have long division of polynomials. Irreducible polynomials correspond to prime numbers, as they cannot be written as products of “smaller” parts. (In the case of polynomials, “smaller” means smaller degrees.)

So, if \(f, g_1, g_2 \in \mathbb{F}_p[x]\), then we can define:

\[g_1 \equiv g_2 \pmod{f} \qquad \text{if } f \mid (g_1 - g_2).\]

Just as \(\mathbb{Z}/m\mathbb{Z}\) contains the possible remainders of division of integers by \(m\) (so numbers between \(0\) and \(m-1\)), we let \(\mathbb{F}_p[x] / f\mathbb{F}_p[x]\) contain the possible remainders of the division of polynomials in \(\mathbb{F}_p[x]\) by \(f\), so polynomials of degree strictly less than \(\deg f\). And, just as with \(\mathbb{Z}/m\mathbb{Z}\), elements congruent modulo \(f\) become equal!

More importantly, we can also perform arithmetic operations in \(\mathbb{F}_p[x] / f\mathbb{F}_p[x]\) by performing them in \(\mathbb{F}_p[x]\) and then reducing the result modulo \(f\).

Also note that if \(\deg f = n\), then \(\mathbb{F}_p[x] / f\mathbb{F}_p[x]\) has \(p^n\) elements, as there are \(p^n\) polynomials in \(\mathbb{F}_p[x]\) with degree less than \(n\): they must be of the form \(a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0\), with \(a_i \in \mathbb{F}_p[x]\), and hence we have \(p\) possibilities for each \(a_i\) (from \(0\) to \(p-1\)).

Example 17.1 (\(\mathbb{F}_4\))

In \(\mathbb{F}_2[x]\), let \(f = x^2 + x + 1\). Then since \(\deg f = 2\), we have

\[\mathbb{F}_2[x]/f \mathbb{F}_2[x] = \{0, 1, x, x+1\},\]

and hence it has \(2^2\) elements.

Note that in \(\mathbb{F}_2[x]/f \mathbb{F}_2[x]\) we have that \(x^2\) is equal to the remainder of the division of \(x^2\) by \(f\), so \(x^2 = x+1\). In fact, we have the following addition and multiplication tables:

\[\begin{split}\begin{array}{r|cccc} {}+ & 0 & 1 & x & x+1 \\ \hline 0 & 0 & 1 & x & x+1 \\ 1 & 1 & 0 & x+1 & x \\ x & x & x+1 & 0 & 1 \\ x+1 & x+1 & x & 1 & x+1 \end{array} \qquad \qquad \begin{array}{r|cccc} {}\times & 0 & 1 & x & x+1 \\ \hline 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & x & x+1 \\ x & 0 & x & x+1 & 1 \\ x+1 & 0 & x+1 & 1 & x \end{array}\end{split}\]

Comparing with #eq-f4-1, we see it is the same table if we replace \(a\) and \(b\) by \(x\) and \(x+1\) respectively. Therefore, we can see that \(x^{-1} = x+1\) and \((x+1)^{-1} =x\) and \(\mathbb{F}_2[x]/(x^2+x+1) \mathbb{F}_2[x]\) is a field with \(4\) elements.

We can create this field with Sage. But instead of using x again, let’s use X for the variable in \(\mathbb{F}_2[x]/f\mathbb{F}_2[x]\) and leave x for the variable in the polynomial ring itself.

F4.<X> = GF(4)  # same as FiniteField(4)

So, the elements of F4 are 0, 1, X, and X + 1:

for element in F4:
    print(element)
0
X
X + 1
1
X^2  # this should give X + 1
X + 1

And we can reproduce the addition and multiplication tables above:

print(f"{"ADDITION TABLE:":^39}\n")

table = f"{"+":^7}"

for a in F4:
    table += f"|{a.__repr__():^7}"

table += "\n"

table += 7 * "-" + 4 * ("|" + 7 * "-") + "\n"

for a in F4:
    table += f"{a.__repr__():^7}"
    for b in F4:
        table += f"|{(a + b).__repr__():^7}"
    table += "\n"

print(table)
            ADDITION TABLE:            

   +   |   0   |   X   | X + 1 |   1   
-------|-------|-------|-------|-------
   0   |   0   |   X   | X + 1 |   1   
   X   |   X   |   0   |   1   | X + 1 
 X + 1 | X + 1 |   1   |   0   |   X   
   1   |   1   | X + 1 |   X   |   0   
print(f"{"MULTIPLICATION TABLE:":^39}\n")

table = f"{"*":^7}"

for a in F4:
    table += f"|{a.__repr__():^7}"

table += "\n"

table += 7 * "-" + 4 * ("|" + 7 * "-") + "\n"

for a in F4:
    table += f"{a.__repr__():^7}"
    for b in F4:
        table += f"|{(a * b).__repr__():^7}"
    table += "\n"

print(table)
         MULTIPLICATION TABLE:         

   *   |   0   |   X   | X + 1 |   1   
-------|-------|-------|-------|-------
   0   |   0   |   0   |   0   |   0   
   X   |   0   | X + 1 |   1   |   X   
 X + 1 |   0   |   1   |   X   | X + 1 
   1   |   0   |   X   | X + 1 |   1   

17.1.3. The Field with \(256\) Elements#

A more relevant example for us, which will be used with the Advanced Encryption Standard (AES) below, is the finite field obtained from \(f = x^8 + x^4 + x^3 + x + 1 \in \mathbb{F}_2[x]\). We get a field with \(2^8=256\) elements that we will denote by \(\mathbb{F}_{256}\). So, its elements are of the form

\[a_7 x^7 + a_6 x^6 + \cdots + a_1x + a_0, \qquad a_i \in \mathbb{F}_2.\]

We can identify elements of \(\mathbb{F}_{256}\) with lists/vectors of length \(8\) and elements in \(\mathbb{F}_2\) and with \(8\)-bit numbers:

(17.3)#\[a_7 x^7 + a_6 x^6 + \cdots + a_1x + a_0 \quad \longleftrightarrow \quad [a_0, a_1, a_2, \ldots, a_7] \quad \longleftrightarrow \quad a_0 + a_1 \cdot 2 + a_2 \cdot 2^2 + \cdots + a_7 \cdot 2^7.\]

Let’s construct this field with Sage. The variable f still contains the irreducible polynomial above:

f, parent(f)
(x^8 + x^4 + x^3 + x + 1,
 Univariate Polynomial Ring in x over Finite Field of size 2 (using GF2X))

To construct \(\mathbb{F}_2[x] /f \mathbb{F}_2[x]\), using X again as the element associated to x:

F256.<X> = GF(256, f)
F256
Finite Field in X of size 2^8

Important

If one omits the polynomial f, as in F256.<X> = GF(256), Sage will construct a finite field with \(256\) elements, but the element X will not satisfy the polynomial f, which is necessary for our implementation of the AES below.

Let’s take a couple of random elements in the field:

element1 = X^7 + X^6 + X^3 + X
element2 = X^7 + X^3 + 1

We can add and multiply:

element1 + element2
X^6 + X + 1
element1 * element2
X^5 + X^3 + 1

We can also convert the elements to lists (with entries in \(\mathbb{F}_2\)):

element1.list()
[0, 1, 0, 1, 0, 0, 1, 1]

And \(8\)-bit integers:

sum([ZZ(coef) * 2^i for i, coef in enumerate(element1.list())])
202

And we have the inverse operations:

F256([0, 1, 1, 0, 0, 1, 0, 1])  # should be X + X^2 + X^5 + X^7
X^7 + X^5 + X^2 + X
F256(132.bits())  # 132 = 4 + 128, so we should get X^2 + X^7
X^7 + X^2

These conversions will be important later on, so let’s implement them as functions:

def f256_to_bits(x):
    """
    Converts an elment of the field with 256 elements to a list of bits.

    INPUT: An element of the field with 256 elements (GF(256)).
    OUTPUT: The list of corresponding bits.  If the element is given by
            a0 + a1 * x + ... + a7 * x^7, with ai in GF(2), returns the
            list [a0, a1, ... , a7].
    """
    return x.list()

def f256_to_int(x):
    """
    Converts an elment of the field with 256 elements to a 8-bit integer.

    INPUT: An element of the field with 256 elements (GF(256)).
    OUTPUT: The list of corresponding bits.  If the element is given by
            a0 + a1 * x + ... + a7 * x^7, with ai in GF(2), returns the
            integer a0 + a1 * 2 +  ... + a7 * 2^7.
    """
    return sum([ZZ(coef) * 2^i for i, coef in enumerate(x.list())])

def int_to_f256(n):
    """
    Converts an 8-bit integer to an elment of the field with 256 elements.

    INPUT: An 8-bit integer.
    OUTPUT: The corresponding element of GF(256): if the integer is given by
            a0 + a1 * 2 +  ... + a7 * 2^7, returns the element of GF(256) given
            by a0 + a1 * x + ... + a7 * x^7.
    """
    PR.<x> = GF(2)[]
    f = x^8 + x^4 + x^3 + x + 1
    F256.<X> = GF(2^8, f)

    return F256(n.bits())

def bits_to_f256(b):
    """
    Converts a list of 8 bits (in GF(2)) to an elment of the field with 256 elements.

    INPUT: A list of 8 bits (in GF(2)).
    OUTPUT: The corresponding element of GF(256): if the list is given by
            [a0, a1, ... , a7], returns the element of GF(256) given by
            a0 + a1 * x + ... + a7 * x^7.
    """
    PR.<x> = GF(2)[]
    f = x^8 + x^4 + x^3 + x + 1
    F256.<X> = GF(2^8, f)

    return F256(b)

We finish this section with some important facts about finite fields that we will not prove, but can be found in algebra references such as Dummit and Foote’s Algebra, Jacobson’s Basic Algebra I, or Lang’s Algebra.

Proposition 17.1 (Facts About Finite Fields)

  1. If \(F\) is a field with \(m\) elements, then \(m=p^n\) for some prime \(p\) and some integer \(n \geq 1\). Moreover, \(F\) can be constructed, as above, using an irreducible polynomial of degree \(n\).

  2. For every prime \(p\) and every integer \(n \geq 1\), there is an irreducible polynomial in \(\mathbb{F}_p[x]\) of degree \(n\), and hence there is a field with \(p^n\) elements.

  3. If \(F_1\) and \(F_2\) are two fields with \(p^n\) elements, they are “basically the same”. (Vaguely, we can identify the elements in such a way that they behave identically in both fields.) So, essentially, for each prime \(p\) and integer \(n \geq 1\) there is one, and only one, field with \(p^n\) elements, which we denote by \(\mathbb{F}_{p^n}\).

  4. If \(F\) is a finite field with \(p^n\) elements, then there is some \(a \in F\) of order \(p^n-1\), and therefore all non-zero elements of \(F\) are powers of this \(a\). Such an element is called a primitive element of \(F\).

17.2. Matrices#

We need to briefly review matrix operations. Recall that a matrix is basically a table containing numerical data. This numerical data can be real numbers or, as in our applications, elements in \(\mathbb{Z}/m\mathbb{Z}\) or a finite field.

If a matrix has \(m\) rows and \(n\) columns, we say that it is an \(m \times n\) (read as “\(m\) by \(n\)”) matrix. If the entries are in, say \(\mathbb{F}_{256}\), we say that the matrix is over \(\mathbb{F}_{256}\).

We can create matrices in Sage in a few different ways. To create

\[\begin{split}\left[ \begin{array}{ccc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array} \right]\end{split}\]

we can give it a list with rows as entries:

matrix([[1, 2, 3], [4, 5, 6]])
[1 2 3]
[4 5 6]

Alternatively, we can give the dimensions and a single list:

matrix(2, 3, srange(1, 7))
[1 2 3]
[4 5 6]

If we want to specify the parent of the entries, we can give it the first argument. For instance, to create

\[\begin{split}\left[ \begin{array}{cc} x^2 + 1 & x^5 + x^3 \\ x^6 + x & 1 \end{array} \right]\end{split}\]

over \(\mathbb{F}_{256}\), we can do:

matrix(F256, [[X^2 + 1, X^5 + X^3], [X^6 + X, 1]])
[  X^2 + 1 X^5 + X^3]
[  X^6 + X         1]

17.2.1. Matrix Addition#

We can add two matrices of the same size by adding entrywise. For example:

\[\begin{split}\left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] + \left[ \begin{array}{cc} x & y \\ z & w \end{array} \right] = \left[ \begin{array}{cc} a + x & b + y \\ c + z & d + w \end{array} \right].\end{split}\]

We can do it directly in Sage, of course:

A = matrix(2, 2, srange(1, 5))
B = matrix(2, 2, srange(5, 9))
A, B
(
[1 2]  [5 6]
[3 4], [7 8]
)
A + B
[ 6  8]
[10 12]

17.2.2. Matrix Multiplication#

On the other hand, matrix multiplication is more complex. Let’s start with how we multiply a row matrix, i.e., a matrix containing a single row, and a column matrix, i.e., a matrix containing a single column. We have:

\[\begin{split}\left[ \begin{array}{cccc} a_1 & a_2 & \cdots & a_n \end{array} \right] \cdot \left[ \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array} \right] = \left[ \begin{array}{c} a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \end{array} \right].\end{split}\]

Note that the number of columns of the row matrix and the number of rows in the column matrix must to be the same (\(n\) in our example) and the result is a matrix with a single row and single column, which we can think as just a number. In other words, the product of a \(1 \times n\) and a \(n \times 1\) matrix results in a \(1 \times 1\) matrix.

In Sage, we use * for multiplication, as usual:

A = matrix(1, 2, [1, 2])
B = matrix(2, 1, [3, 4])

A, B
(
       [3]
[1 2], [4]
)
A * B
[11]

Often times we call (or identify) column matrices as vectors. So, we can also do in Sage:

A = matrix(1, 2, [1, 2])
v = vector([3, 4])

A, v
([1 2], (3, 4))

Note the a vector in Sage is represented “horizontally” (as a row) and using a parentheses instead of square brackets. But they behave as column matrices:

A * v
(11)

Now, let’s multiply a \(m \times n\) matrix by a column matrix (or vector) of size \(n \times 1\). (Note that the number of columns of the first matrix must match the number of rows of the second.) The result is an \(m \times 1\) matrix, whose rows are obtained by multiplying each row of the first matrix by the column matrix, as above. For example,

\[\begin{split}\underbrace{\left[ \begin{array}{ccc} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array} \right]}_{2 \times 3} \cdot \underbrace{\left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right]}_{3 \times 1} = \underbrace{\left[ \begin{array}{cccc} a_1 x_1 + a_2 x_2 + a_3 x_3 \\ b_1 x_1 + b_2 x_2 + b_3 x_3 \end{array} \right]}_{2 \times 1}\end{split}\]

Let’s illustrate it with another example in Sage, now over \(\mathbb{F}_{256}\):

A = matrix(F256, [[1, x, x^2], [x^3, x^4, x^6]])
B = matrix(F256, 3, 1, [x + 1, x^2 + x, x^3 + x^2])

A, B
(
               [    X + 1]
[  1   X X^2]  [  X^2 + X]
[X^3 X^4 X^6], [X^3 + X^2]
)
A * B
[X^5 + X^4 + X^3 + X^2 + X + 1]
[X^6 + X^4 + X^3 + X^2 + X + 1]

Or, using vectors:

A = matrix(F256, [[1, x, x^2], [x^3, x^4, x^6]])
v = vector(F256, [x + 1, x^2 + x, x^3 + x^2])

A, v
(
[  1   X X^2]                             
[X^3 X^4 X^6], (X + 1, X^2 + X, X^3 + X^2)
)
A * v
(X^5 + X^4 + X^3 + X^2 + X + 1, X^6 + X^4 + X^3 + X^2 + X + 1)

And finally, in full generality, we multiply an \(m \times n\) matrix by a \(n \times r\) matrix by multiplying each row of the first by each column of the second, resulting in an \(m \times r\) matrix. (Note that, again, the number of columns of the first matrix must match the number of rows of the second.) So, in the \((i, j)\)-position of the product we have the product of the \(i\)-th row of the first matrix with the \(j\)-th column of the second. For example,

\[\begin{split}\underbrace{\left[ \begin{array}{ccc} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array} \right]}_{2 \times 3} \cdot \underbrace{\left[ \begin{array}{cc} x_1 & y_1 \\ x_2 & y_2\\ x_3 & y_3 \end{array} \right]}_{3 \times 2} = \underbrace{\left[ \begin{array}{cccc} a_1 x_1 + a_2 x_2 + a_3 x_3 & a_1 y_1 + a_2 y_2 + a_3 y_3 \\ b_1 x_1 + b_2 x_2 + b_3 x_3 & b_1 y_1 + b_2 y_2 + b_3 y_3 \end{array} \right]}_{2 \times 2}\end{split}\]

Therefore, a system of equations

\[\begin{split}\begin{align*} ax + by &= u \\ cx + dy &= v \end{align*}\end{split}\]

can be put in matrix form as

\[\begin{split}\left[ \begin{array}{cc} a & b \\ c & d \end{array}\right] \cdot \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} u \\ v \end{array} \right]\end{split}\]

Here is an example in Sage:

A = matrix(F256, [[1, 0, x], [x^3, 1, 0]])
B = matrix(F256, [[1, x], [x^2, x^3], [x^4, x^5]])

A, B
(
               [  1   X]
[  1   0   X]  [X^2 X^3]
[X^3   1   0], [X^4 X^5]
)
A * B
[  X^5 + 1   X^6 + X]
[X^3 + X^2 X^4 + X^3]

Note that we can use the show function to display the result properly formatted:

show(A * B)
\(\displaystyle \left(\begin{array}{rr} X^{5} + 1 & X^{6} + X \\ X^{3} + X^{2} & X^{4} + X^{3} \end{array}\right)\)

Important

When multiplying matrices, order matters! Even when it is possible to switch order, the results might not be the same. For example,

\[\begin{split}\left[ \begin{array}{rr} 1 & 2 \\ 0 & 0 \end{array}\right] \cdot \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array}\right] = \left[ \begin{array}{rr} 1 & -2 \\ 0 & 0 \end{array}\right] \qquad \text{ but } \qquad \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array}\right] \cdot \left[ \begin{array}{rr} 1 & 2 \\ 0 & 0 \end{array}\right] = \left[ \begin{array}{rr} 1 & 2 \\ 0 & 0 \end{array}\right]\end{split}\]

17.2.3. The Identity Matrix#

A square matrix (i.e., an \(n \times n\) matrix) that has \(1\)’s at the diagonal and zeros everywhere else is the \(n \times n\) identity matrix, often times denoted by \(I_n\) or simply \(I\) if the size is clear from the context.

It has an important property that one can easily verify: if \(A\) is an \(m \times n\) matrix, then \(A \cdot I_n = A\) and \(I_m \cdot A = A\).

One can easily produce these matrices in Sage:

I_3 = identity_matrix(F256, 3)
I_3
[1 0 0]
[0 1 0]
[0 0 1]

17.2.4. The Inverse of a Matrix#

An \(n \times n\) matrix is invertible (note that it must be a square matrix, i.e., the numbers of rows and columns must be equal) if there exists another \(n \times n\) matrix such that \(AB = BA = I_n\). In this case, we say that \(B\) is the inverse of \(A\), and denote it by \(A^{-1}\).

Not all matrices are invertible, though. It is clear that a matrix \(O\) made entirely of zeros is not invertible, as no matter what matrix \(B\) we take, we have that \(OB = BO = O \neq I_n\).

Sage can find it a matrix is invertible, and in the positive case, find the inverse:

O = zero_matrix(F256, 3)
O
[0 0 0]
[0 0 0]
[0 0 0]
O.is_invertible()
False
A = matrix(F256, 2, [[X, 1], [X^3 + 1, X^6 + X + 1]])
A
[          X           1]
[    X^3 + 1 X^6 + X + 1]
A.is_invertible()
True
B = A^(-1)
B
[                      X^7 + X^3 + 1     X^7 + X^6 + X^5 + X^4 + X^2 + 1]
[                      X^3 + X^2 + X X^7 + X^6 + X^5 + X^4 + X^2 + X + 1]
A * B
[1 0]
[0 1]
B * A
[1 0]
[0 1]

Note that if we have a linear system

\[\begin{split}\left[ \begin{array}{cc} a & b \\ c & d \end{array}\right] \cdot \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} u \\ v \end{array} \right]\end{split}\]

and if the matrix of coefficients, say \(A\), is invertible, then multiplying both sides by \(A^{-1}\) yields:

\[\begin{split}\left[ \begin{array}{c} x \\ y \end{array} \right] = I_2 \cdot \left[ \begin{array}{c} x \\ y \end{array} \right] = A^{-1} \cdot A \left[ \begin{array}{c} x \\ y \end{array} \right] = A^{-1} \cdot \left[ \begin{array}{c} u \\ v \end{array} \right]\end{split}\]

Hence, the solution is given by

\[\begin{split}\left[ \begin{array}{c} x \\ y \end{array} \right] = A^{-1} \cdot \left[ \begin{array}{c} u \\ v \end{array} \right]\end{split}\]

17.3. The Advanced Encryption Standard (AES)#

In 1997, the National Institute of Standards and Technology (NIST) put a call for proposals for a new encryption algorithm that would become the new “standard”. The winner, originally named Rijndael, is now, after minor changes, known as the Advanced Encryption Standard. It is not only a very safe method, it is also quite fast.

Here are some important observations about the AES:

  1. The AES is the most widely used encryption method in use today!

  2. The American National Security Agency allows the use of AES (in its stronger variants) for top secret classified data. (This is a very strong testament for its security!)

  3. The AES is a symmetrical cryptosystem, meaning that the same key is used both for encryption and decryption. Therefore, it is usually used with another public-key cryptsystem, such as the Elliptic Curve ElGamal, to exchange the key.

  4. Unlike all other cryptosytems covered in this book, the AES is quantum resistant!

Important

Although we will implement the AES here using Sage, we will do it in a naive direct way. There are more efficient ways to implement it, but the code presented should illustrate the process and ideas behind it.

17.3.1. Confusion and Diffusion#

Two key ideas of the AES are confusion and diffusion.

Confusion means that each binary bit of the encoded text should depend on several parts of the key, obscuring the connections between the two.

Diffusion means that if we change a single bit of the original text, then about half of the bits in the encoded text should change, and similarly, if we change one bit of the encoded text, then about half of the original text bits should change. If makes the method less susceptible to statistical attacks.

17.3.2. General Idea#

The AES is a block cipher, meaning the encryption algorithm works with chunks (or blocks) of data. So, the full data is broken into smaller blocks which are then individually encoded. This process to break the data in blocks follows the idea introduced in the section Encrypting Large Numbers.

There block size is \(128\) bits, while there are three possible key lengths for the AES: \(128\) (standard), \(192\) or \(256\) bits. (The NSA accepts only \(192\) and \(256\) block sizes for top secret classified data.) You often see these specified as in AES-128, AES-192, and AES-256.

The data is encrypted in rounds, and each round, up to the last, has four layers, which will be described in more details below:

  1. Byte substitution (ByteSub). (Provides confusion].)

  2. Shift row (ShiftRow). (Provides diffusion).

  3. Mix column (MixCol). (Provides diffusion]).

  4. Key addition.

The last round does not have the mix column layer, and at the beginning we an extra key addition.

The number of rounds depends on the block size:

Block size

Number of rounds

\(128\)

\(10\)

\(192\)

\(12\)

\(256\)

\(14\)

For the sake of simplicity, we will only illustrate the \(128\)-bit key size, i.e., AES-128.

17.3.3. Blocks#

The \(128\) bits of the block to be encoded and of the key are split into \(16\) bytes, meaning sub-blocks of \(8\) bits. So, if the original data is \(A\), we break it into sub-blocks \([A_0, A_1, \ldots, A_{15}]\), where each \(A_i\) has \(8\) bits (or one byte). And similarly for the key \(K\). We usually put these sub-blocks along the columns of matrices:

\[\begin{split}A = \left[\begin{array}{cccc} A_0 & A_4 & A_8 & A_{12} \\ A_1 & A_5 & A_9 & A_{13} \\ A_2 & A_6 & A_{10} & A_{14} \\ A_3 & A_7 & A_{11} & A_{15} \end{array}\right] \qquad \text{ and } \qquad K = \left[\begin{array}{cccc} K_0 & K_4 & K_8 & K_{12} \\ K_1 & K_5 & K_9 & K_{13} \\ K_2 & K_6 & K_{10} & K_{14} \\ K_3 & K_7 & K_{11} & K_{15} \end{array}\right]\end{split}\]

We can see each \(A_i\) and \(K_i\) as elements of \(\mathbb{F}_{256}\): if \(A_i = [a_0, a_1, \ldots, a_7]\), with \(a_i \in \{0, 1\}\), we can see it as the element \(a_0 + a_1 x + \cdots + a_7 x^7 \in \mathbb{F}_{256}\).

The first key addition, before the rounds, will simply add \(A\) and \(K\):

\[\begin{split}A + K = \left[\begin{array}{cccc} A_{0} + K_{0} & A_{4} + K_{4} & A_{8} + K_{8} & A_{12} + K_{12} \\ A_{1} + K_{1} & A_{5} + K_{5} & A_{9} + K_{9} & A_{13} + K_{13} \\ A_{2} + K_{2} & A_{6} + K_{6} & A_{10} + K_{10} & A_{14} + K_{14} \\ A_{3} + K_{3} & A_{7} + K_{7} & A_{11} + K_{11} & A_{15} + K_{15} \end{array}\right]\end{split}\]

Note that the addition \(A_{i} + K_{i}\) is done in \(\mathbb{F}_{256}\). Let’s denote \(B_i = A_i + K_i\) and \(B\) the corresponding matrix.

17.3.4. Round Keys#

We use the given key \(K\) to produce \(10\) additional keys, say \(K^{(1)}, K^{(2)}, \ldots, K^{(10)}\), that will be used in each layer. We will describe how these keys are obtained below, but for now let’s describe the other steps.

17.3.5. Byte Substitution#

We use the Rijndael S-box \(S\) to transform every single byte of the input matrix \(B\), which we now describe.

Each byte \(B_i\) from the input block/matrix \(B\), thought as an element of \(\mathbb{F}_{256}\), is by its inverse \(B_i^{-1}\), if it’s non-zero, or left unchanged otherwise. If \(C_i\) is the result, then we write each \(C_i = [c_0, c_1, \ldots, c_7]\), with \(c_i \in \mathbb{F}_2\), i.e., we get its bit-representation.

We then perform the following operation (over \(\mathbb{F}_2\)):

\[\begin{split}\left[\begin{array}{rrrrrrrr} 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \end{array}\right] \cdot \left[ \begin{array}{c} c_0 \\ c_1 \\ c_2 \\ c_3 \\ c_4 \\ c_5 \\ c_6 \\ c_7 \end{array} \right] + \left[ \begin{array}{c} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1\\ 0 \end{array} \right] = \left[ \begin{array}{c} d_0 \\ d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \\ d_6 \\ d_7 \end{array} \right]\end{split}\]

This process will then take the \(8\)-bit \(C_i = [c_0, c_1, \ldots, c_7]\) into some \(8\)-bit \(D_i = [d_0, d_1, \ldots, d_7]\). The whole process is the Rijndael \(S\)-box, so in this case we would write \(S(B_i) = D_i\).

Let’s implement this step in Sage:

def byte_subs(x):
    """
    Given an element of GF(256), applies the Rijndael S-box.

    INPUT: An element of GF(256).
    OUTPUT: The corresponding output of the S-box, also an element of GF(256).
    """
    S_M = matrix(
        GF(2),
        [
            [1, 0, 0, 0, 1, 1, 1, 1],
            [1, 1, 0, 0, 0, 1, 1, 1],
            [1, 1, 1, 0, 0, 0, 1, 1],
            [1, 1, 1, 1, 0, 0, 0, 1],
            [1, 1, 1, 1, 1, 0, 0, 0],
            [0, 1, 1, 1, 1, 1, 0, 0],
            [0, 0, 1, 1, 1, 1, 1, 0],
            [0, 0, 0, 1, 1, 1, 1, 1],
        ],
    )
    S_v = vector(GF(2), [1, 1, 0, 0, 0, 1, 1, 0])

    if x == 0:
        return bits_to_f256(S_v)

    c = vector(f256_to_bits(x^(-1)))

    return bits_to_f256(S_M * c + S_v)

Note that since the function \(S\) is defined on \(8\) bits, instead of performing the computations, we could just have a “look-up table”, where we have the output for all \(256\) possible inputs. In Sage, this could be done with a dictionary, where the keys are the inputs and the values the associated outputs through \(S\).

To decrypt, we need a function to undo what byte_subs does, which means we need to undo each step in reverse order. So, first we need to invert the matrix S_M:

S_M = matrix(
    GF(2),
    [
        [1, 0, 0, 0, 1, 1, 1, 1],
        [1, 1, 0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0, 1, 1],
        [1, 1, 1, 1, 0, 0, 0, 1],
        [1, 1, 1, 1, 1, 0, 0, 0],
        [0, 1, 1, 1, 1, 1, 0, 0],
        [0, 0, 1, 1, 1, 1, 1, 0],
        [0, 0, 0, 1, 1, 1, 1, 1],
    ],
)

S_M_inv = S_M^(-1)
S_M_inv
[0 0 1 0 0 1 0 1]
[1 0 0 1 0 0 1 0]
[0 1 0 0 1 0 0 1]
[1 0 1 0 0 1 0 0]
[0 1 0 1 0 0 1 0]
[0 0 1 0 1 0 0 1]
[1 0 0 1 0 1 0 0]
[0 1 0 0 1 0 1 0]

We will hard-code it in our function. Also, note that since we are working in \(\mathbb{F}_2\), we have that adding and subtracting are the same operation (since \(1=-1\) in \(\mathbb{F}_2\)). Therefore, to undo the addition of S_v, we simply need to add it again.

def byte_subs_inv(x):
    """
    Given an element of GF(256), applies the inverse Rijndael S-box.

    INPUT: An element of GF(256).
    OUTPUT: The corresponding output of the inverse S-box, also an element of
            GF(256).
    """
    # create the field
    PR.<xx> = GF(2)[]
    f = xx^8 + xx^4 + xx^3 + xx + 1
    F256.<X> = GF(2^8, f)

    # the inverse matrix
    S_M_inv = matrix(
        GF(2),
        8,
        [
            [0, 0, 1, 0, 0, 1, 0, 1],
            [1, 0, 0, 1, 0, 0, 1, 0],
            [0, 1, 0, 0, 1, 0, 0, 1],
            [1, 0, 1, 0, 0, 1, 0, 0],
            [0, 1, 0, 1, 0, 0, 1, 0],
            [0, 0, 1, 0, 1, 0, 0, 1],
            [1, 0, 0, 1, 0, 1, 0, 0],
            [0, 1, 0, 0, 1, 0, 1, 0],
        ],
    )
    S_v = vector(GF(2), [1, 1, 0, 0, 0, 1, 1, 0])

    d = vector(GF(2), f256_to_bits(x))
    if d == S_v:
        return F256(0)

    c = S_M_inv * (d + S_v)

    y = bits_to_f256(c.list())

    return y^(-1)

Let’s test to check that byte_subs_inv indeed inverts byte_subs:

x = F256.random_element()

byte_subs_inv(byte_subs(x)) == x
True

17.3.6. Shift Row#

The shift row step is quite simple:

\[\begin{split}\left[\begin{array}{llll} D_0 & D_4 & D_8 & D_{12} \\ D_1 & D_5 & D_9 & D_{13} \\ D_2 & D_6 & D_{10} & D_{14} \\ D_3 & D_7 & D_{11} & D_{15} \end{array}\right] \mapsto \left[\begin{array}{llll} D_0 & D_4 & D_8 & D_{12} \\ D_{13} & D_1 & D_5 & D_9 \\ D_{10} & D_{14} & D_2 & D_6 \\ D_7 & D_{11} & D_{13} & D_3 \end{array}\right]\end{split}\]

Basically, the first row is left intact, the second is shifted by \(1\) (and wrapped back), the third by \(2\), and the fourth by \(3\). Let’s denote the resulting matrix \(E\), with the corresponding entries \(E_i\).

Let’s implement it in Sage:

def shift_rows(M):
    """
    Performs the shift rows step of the AES on a 4x4 matrix with entries in GF(256).

    INPUT: A 4x4 matrix M with entries in GF(256).
    OUTPUT: A 4x4 matrix with entries in GF(256), obtained by rotating the second
            row by 1 to the left, the third by 2, and the fourth by 3.
    """
    N = matrix(M)  # copy of M!
    for i in range(1, 4):
        N[i] = N[i][i:].concatenate(N[i][:i])
    return N

Again, for decryption we need the inverse process:

def shift_rows_inv(M):
    """
    Performs the inverse of the shift rows step of the AES on a 4x4 matrix with
    entries in GF(256).

    INPUT: A 4x4 matrix M with entries in GF(256).
    OUTPUT: A 4x4 matrix with entries in GF(256), obtained by rotating the second
            row by 3 to the left, the third by 2, and the fourth by 1.
    """
    N = matrix(M)
    for i in range(1, 4):
        N[i] = N[i][4 - i :].concatenate(N[i][: 4 - i])
    return N

Let’s again test that these are inverses:

M = random_matrix(F256, 4)

shift_rows_inv(shift_rows(M)) == M
True

17.3.7. Mix Columns#

In the mix columns step, we replace the input matrix \(E\), by the matrix \(F\) given by:

\[\begin{split}\left[\begin{array}{cccc} F_0 & F_4 & F_8 & F_{12} \\ F_1 & F_5 & F_9 & F_{13} \\ F_2 & F_6 & F_{10} & F_{14} \\ F_3 & F_7 & F_{11} & F_{15} \end{array}\right] = \left[\begin{array}{cccc} x & x+1 & 1 & 1 \\ 1 & x & x+1 & 1 \\ 1 & 1 & x & x+1 \\ x+1 & 1 & 1 & x \end{array}\right] \cdot \left[\begin{array}{cccc} E_0 & E_4 & E_8 & E_{12} \\ E_1 & E_5 & E_9 & E_{13} \\ E_2 & E_6 & E_{10} & E_{14} \\ E_3 & E_7 & E_{11} & E_{15} \end{array}\right],\end{split}\]

where, as usual, we see the entries in \(\mathbb{F}_{256}\).

def mix_cols(M):
    """
    Performs the mix columns step of the AES on a 4x4 matrix with entries in
    GF(256).

    INPUT: A 4x4 matrix M with entries in GF(256).
    OUTPUT: A 4x4 matrix with entries in GF(256), obtained by multiplying it on
            the left by
            matrix(GF(256), [[X, X + 1, 1, 1], [1, X, X + 1, 1], [1, 1, X, X + 1], [X + 1, 1, 1, X]])
            where X^8 + X^4 + X^3 + X + 1 = 0.
    """
    F256 = parent(M[0, 0])
    X = F256.0
    N = matrix(F256, [[X, X + 1, 1, 1], [1, X, X + 1, 1], [1, 1, X, X + 1], [X + 1, 1, 1, X]])
    return N * M

Again, to undo this process, we need to invert the given matrix:

matrix(
    F256, [[X, X + 1, 1, 1], [1, X, X + 1, 1], [1, 1, X, X + 1], [X + 1, 1, 1, X]]
)^(-1)
[X^3 + X^2 + X   X^3 + X + 1 X^3 + X^2 + 1       X^3 + 1]
[      X^3 + 1 X^3 + X^2 + X   X^3 + X + 1 X^3 + X^2 + 1]
[X^3 + X^2 + 1       X^3 + 1 X^3 + X^2 + X   X^3 + X + 1]
[  X^3 + X + 1 X^3 + X^2 + 1       X^3 + 1 X^3 + X^2 + X]

And, again, we hard-code it:

def mix_cols_inv(M):
    """
    Performs the inverse of the mix columns step of the AES on a 4x4 matrix with
    entries in GF(256).

    INPUT: A 4x4 matrix M with entries in GF(256).
    OUTPUT: A 4x4 matrix with entries in GF(256), obtained by multiplying it on
            the left by 
             matrix(
                GF(256),
                4,
                [
                    X ^ 3 + X ^ 2 + X,
                    X ^ 3 + X + 1,
                    X ^ 3 + X ^ 2 + 1,
                    X ^ 3 + 1,
                    X ^ 3 + 1,
                    X ^ 3 + X ^ 2 + X,
                    X ^ 3 + X + 1,
                    X ^ 3 + X ^ 2 + 1,
                    X ^ 3 + X ^ 2 + 1,
                    X ^ 3 + 1,
                    X ^ 3 + X ^ 2 + X,
                    X ^ 3 + X + 1,
                    X ^ 3 + X + 1,
                    X ^ 3 + X ^ 2 + 1,
                    X ^ 3 + 1,
                    X ^ 3 + X ^ 2 + X,
                ],
            )
            where X^8 + X^4 + X^3 + X + 1 = 0.
    """
    PR.<x> = GF(2)[]
    f = x^8 + x^4 + x^3 + x + 1
    F256.<X> = GF(2^8, f)

    N = matrix(
        F256,
        4,
        [
            X ^ 3 + X ^ 2 + X,
            X ^ 3 + X + 1,
            X ^ 3 + X ^ 2 + 1,
            X ^ 3 + 1,
            X ^ 3 + 1,
            X ^ 3 + X ^ 2 + X,
            X ^ 3 + X + 1,
            X ^ 3 + X ^ 2 + 1,
            X ^ 3 + X ^ 2 + 1,
            X ^ 3 + 1,
            X ^ 3 + X ^ 2 + X,
            X ^ 3 + X + 1,
            X ^ 3 + X + 1,
            X ^ 3 + X ^ 2 + 1,
            X ^ 3 + 1,
            X ^ 3 + X ^ 2 + X,
        ],
    )

    return N * M

And, again, we test it:

M = random_matrix(F256, 4)

mix_cols_inv(mix_cols(M)) == M
True

17.3.8. Key Addition#

In this step, we just add the corresponding round key, say \(K^{(i)}\) to the input matrix \(F\), similar to the first step before the rounds:

\[\begin{split}\left[\begin{array}{cccc} F_0 & F_4 & F_8 & F_{12} \\ F_1 & F_5 & F_9 & F_{13} \\ F_2 & F_6 & F_{10} & F_{14} \\ F_3 & F_7 & F_{11} & F_{15} \end{array}\right] + \left[ \begin{array}{cccc} K^{(i)}_0 & K^{(i)}_4 & K^{(i)}_8 & K^{(i)}_{12} \\ K^{(i)}_1 & K^{(i)}_5 & K^{(i)}_9 & K^{(i)}_{13} \\ K^{(i)}_2 & K^{(i)}_6 & K^{(i)}_{10} & K^{(i)}_{14} \\ K^{(i)}_3 & K^{(i)}_7 & K^{(i)}_{11} & K^{(i)}_{15} \end{array}\right] = \left[\begin{array}{cccc} F_{0} + K^{(i)}_0 & F_{4} + K^{(i)}_4 & F_{8} + K^{(i)}_8 & F_{12} + K^{(i)}_{12} \\ F_{1} + K^{(i)}_1 & F_{5} + K^{(i)}_5 & F_{9} + K^{(i)}_9 & F_{13} + K^{(i)}_{13} \\ F_{2} + K^{(i)}_2 & F_{6} + K^{(i)}_6 & F_{10} + K^{(i)}_{10} & F_{14} + K^{(i)}_{14} \\ F_{3} + K^{(i)}_3 & F_{7} + K^{(i)}_7 & F_{11} + K^{(i)}_{11} & F_{15} + K^{(i)}_{15} \end{array}\right].\end{split}\]

(Again, the addition is performed in \(\mathbb{F}_{256}\).)

Note that since we are working in \(\mathbb{F}_{256}\), which is constructed over \(\mathbb{F}_2\), we have that adding and subtracting are the same operation (since \(1=-1\) in \(\mathbb{F}_2\)). Therefore, to undo a key addition, we just add the key again.

17.3.9. Round Key Generation#

We now describe how we produce the round keys. The process is recursive: start with a matrix with \(4\) columns, the original matrix for the key \(K\). Then, we will recursively add more columns to this matrix, until we get \(44\) columns. Columns \(5\) to \(8\) form \(K^{(1)}\), columns \(9\) to \(12\) from \(K^{(2)}\), etc.

So, suppose the first columns (from \(K\)) are \(W(0)\), \(W(1)\), \(W(2)\), and \(W(3)\). If we have \(W(i)\) computed, we obtain \(W(i+1)\) in the following way: if \(4 \nmid (i+1)\), then we simply define \(W(i+1) = W(i - 3) + W(i)\), where, again, the addition of the columns is done entrywise and in \(\mathbb{F}_{256}\).

When \(4 \mid (i+1)\), the operation is more complicated: we define \(W(i + 1) = W(i - 3) + T(W(i))\), where \(T\) is given by

\[\begin{split}T \left( \left[ \begin{array}{c} a \\ b \\ c \\ d \end{array} \right] \right) = \left[ \begin{array}{c} x^{(i - 3)/4} + S(b) \\ S(c) \\ S(d) \\ S(a) \end{array} \right],\end{split}\]

where \(S\) is the Rijndael \(S\)-box above. (Note that since \(4 \mid (i+1)\) in this case, we have that \((i-3)/4\) is an integer.)

Let’s implement this process:

def round_key_gen(K):
    """
    Given a 4x4 matrix K that represents the key for the AES, generates all the
    ten extra round keys.

    INPUT: A 4x4 matrix M with entries in GF(256).
    OUTPUT: A list of 11 4x4 matrices in GF(256) corresponding to K and the ten
            extra round keys for the layers of the AES.
    """
    F256 = parent(K[0, 0])
    X = F256.0
    KK = matrix(K)
    for i in range(3, 43):
        if (i + 1) % 4 != 0:
            new_col = KK[:, i - 3] + KK[:, i]
            KK = KK.augment(new_col)
        else:
            new_col = KK[:, i]

            new_col.swap_rows(0, 1)
            new_col.swap_rows(1, 2)
            new_col.swap_rows(2, 3)

            for j in range(4):
                new_col[j, 0] = byte_subs(new_col[j, 0])

            new_col[0, 0] += X^((i - 3) // 4)

            new_col += KK[:, i - 3]

            KK = KK.augment(new_col)
    return [KK[:, i : (i + 4)] for i in range(0, 41, 4)]

17.3.10. Encryption and Decryption#

Now we can use the functions introduced above to perform the encryption and decryption.

Both the \(128\)-bit key and the \(128\)-bit number to be encoded will be given simply as integers, so we will need functions to convert them to \(4 \times 4\) matrices with entries in \(\mathbb{F}_{256}\) and back.

def int128_to_f256_mat(n):
    """
    Given a 128-bit integer, break into 16 bytes [b_0, b_1, ..., b_15], convert
    each byte b_i to an element of GF(256) A_i, and return the matrix
    [A_0  A_4  A_8  A_12]
    [A_1  A_5  A_9  A_13]
    [A_2  A_6  A_10 A_14]
    [A_3  A_7  A_11 A_15]

    INPUT: A 128-bit integer n.
    OUTPUT: A 4x4 matrix with entries in GF(256) that correspond to bytes of n,
            arragned by columns.
    """
    n = ZZ(n)  # make sure it is a Sage integer

    # create GF(256)
    PR.<x> = GF(2)[]
    f = x^8 + x^4 + x^3 + x + 1
    F256.<X> = GF(2^8, f)

    list_bytes = n.digits(base=256)
    list_bytes.extend((16 - len(list_bytes)) * [0])  # add extra zeros if necessary
    list_f256 = [int_to_f256(x) for x in list_bytes]

    return matrix(F256, 4, 4, list_f256).transpose()  # transpose to arrange in cols

def f256_mat_to_int128(M):
    """
    Given a 4x4 matrix with entries in GF(246), convert each entry to a byte, then
    make a 128-bit interger aby putting the bytes together by columns.

    INPUT: A 4x4 matrix M with entries in GF(256).
    OUTPUT: A 128-bit integer obtained by converting each entry of M to a byte, then
    making a 128-bit interger by putting the bytes together by columns.
    """
    list_bytes = [f256_to_int(x) for x in M.transpose().list()]
    return sum(x * 256^i for i, x in enumerate(list_bytes))

Let’s make sure these conversions are inverses:

n = randint(0, 2^128)
f256_mat_to_int128(int128_to_f256_mat(n)) == n
True
M = random_matrix(F256, 4, 4)
int128_to_f256_mat(f256_mat_to_int128(M)) == M
True

We now can implement the algorithms:

def aes128_enc(n, k):
    """
    Given a 128-bit integer n to be encrypted and a 128-bit key, use the AES to
    encrypt n.

    INPUTS:
    * A 128-bit integer to be encrypted (the plaintext).
    * A 128-bit key (also an integer).

    OUTPUT: The encrypted n (the ciphertext), also a 128-bit integer.
    """
    # create GF(256)
    PR.<x> = GF(2)[]
    f = x^8 + x^4 + x^3 + x + 1
    F256.<X> = GF(2^8, f)

    # generate round keys
    list_K = round_key_gen(int128_to_f256_mat(k))

    # convert n to matrix
    M = int128_to_f256_mat(n)

    # start
    M += list_K[0]

    # rounds
    for i in range(1, 10):
        M = matrix(F256, 4, 4, [byte_subs(x) for x in M.list()])  # byte subst.
        M = shift_rows(M)  # shift rows
        M = mix_cols(M)  # mix cols
        M += list_K[i]  # add key

    # last round
    M = matrix(F256, 4, 4, [byte_subs(x) for x in M.list()])  # byte subst.
    M = shift_rows(M)  # shift row
    M += list_K[10]  # add key

    return f256_mat_to_int128(M)
def aes128_dec(n, k):
    """
    Given a 128-bit integer n to be decrypted and a 128-bit key, use the AES to
    decrypt n.

    INPUTS:
    * A 128-bit integer to be decrypted (the ciphertext).
    * A 128-bit key (also an integer).

    OUTPUT: The decrypted n (the plaintext), also a 128-bit integer.
    """
    # create GF(256)
    PR.<x> = GF(2)[]
    f = x^8 + x^4 + x^3 + x + 1
    F256.<X> = GF(2^8, f)

    # generate round keys
    list_K = round_key_gen(int128_to_f256_mat(k))

    # convert n to matrix
    M = int128_to_f256_mat(n)

    # undo the last round
    M += list_K[10]  # undo add key
    M = shift_rows_inv(M)  # undo shift rows
    M =  matrix(F256, 4, 4, [byte_subs_inv(x) for x in M.list()])  # undo byte subs

    # rounds
    for i in range(9, 0, -1):
        M += list_K[i]  # undo add round key
        M = mix_cols_inv(M)  # undo mix cols
        M = shift_rows_inv(M)  # undo shit rows
        M =  matrix(F256, 4, 4, [byte_subs_inv(x) for x in M.list()])  #undo byte subs

    M += list_K[0]  # undo initial add key

    return f256_mat_to_int128(M)

Let’s test them. First we encrypt a random \(128\)-bit integer using a random \(128\)-bit key:

n = randint(0, 2^128)
k = randint(0, 2^128)

n, k
(104975788207245459983207123896300672914,
 79619222898248013625203864634893601905)
encrypted_n = aes128_enc(n, k)
encrypted_n
127826812792101144147716652175730753189

We now should be able to decrypt it:

aes128_dec(encrypted_n, k) == n  # use same key!
True

It works!