One way to symmetrically encrypt a message is to generate a sequence of bits of the same length as the message, and perform a bitwise exclusive-or between the message and the sequence. The recipient then performs the same operation, and recovers the original message. (This is one way to construct a *stream cipher*.)
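For instance, here is a minimal sketch in plain Python (the helper name `xor_bytes` is ours; `secrets` supplies the truly random bits):

```python
import secrets

def xor_bytes(data, keystream):
    # bitwise XOR, byte by byte (data and keystream have equal length)
    return bytes(d ^ k for d, k in zip(data, keystream))

message = b"attack at dawn"
keystream = secrets.token_bytes(len(message))  # truly random bits
ciphertext = xor_bytes(message, keystream)
# applying the same XOR again recovers the message
assert xor_bytes(ciphertext, keystream) == message
```

Since `x ^ k ^ k == x`, encryption and decryption are literally the same operation.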

Clearly, the security level provided by such an encryption scheme depends entirely on the way in which the sequence was generated. If it is truly random, then we obtain a so-called *one-time pad*, which, as long as it is used only once, guarantees perfect secrecy. At the other extreme, if the sequence consists of only 0-bits, then the encryption process does nothing at all and an adversary can directly read the message.

So why do we not just use one-time pads and call it a day? Because they are not practical: they require us to first generate truly random data of the same size as the message, and then to transmit it securely between the sender and the recipient, since we don't want an adversary to obtain the key. But if we can transmit the key securely, we might as well transmit the message itself, since they have the same size.

So we must find some kind of compromise: we generate a small amount of truly random data (called the *seed*), and then starting from the seed we generate a *pseudo-random sequence*. A pseudo-random sequence is deterministic, but in such a way that it is still difficult for an attacker to predict its behavior. A common construction is to iterate a function $f : E \to E$, where $E$ is some finite set with $n$ elements. Then we choose an element $x_0 \in E$ truly randomly, and the sequence is $x_0$, $x_1 = f(x_0)$, $x_2 = f(x_1)$, etc. Of course since our set is finite, after a while we will obtain a value $x_j = x_i$ for some $i < j$, and the sequence will repeat the cycle $x_i, x_{i+1}, \dots, x_{j-1}$. Thus our sequence will be periodic, and surely it is desirable to have as long a period as possible (ideally, equal to $n$).
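This "tail into a cycle" behavior is easy to measure directly: iterate $f$ from $x_0$ until a value repeats, and read off the lengths. A minimal sketch in plain Python (the function name is ours):

```python
def tail_and_cycle(f, x0):
    # iterate f from x0 until some value repeats; return
    # (tail length before entering the cycle, cycle length)
    seen = {}          # value -> index of first occurrence
    x, i = x0, 0
    while x not in seen:
        seen[x] = i
        x = f(x)
        i += 1
    return seen[x], i - seen[x]

# a toy f on E = {0, ..., 9}: a pure rotation, so no tail and period 10
print(tail_and_cycle(lambda x: (x + 1) % 10, 0))  # -> (0, 10)
```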
How should we choose our function $f$ to achieve this? One is tempted to make up a function which does seemingly random things. Von Neumann proposed using $E = \{0,\dots,10^m-1\}$ for some $m$ and $f$ defined as follows: to compute $f(x)$, first square $x$, pad $x^2$ to $2m$ digits, and then take the $m$ middle digits as the result. Unfortunately, as von Neumann realized when he tried to implement it, this doesn't really work too well. If we implement it in Sage like this:

```
def f(x, m):
    # square x, pad the result to 2*m digits,
    # and keep the m middle digits
    x2 = str(x^2)
    while len(x2) < 2*m:
        x2 = "0" + x2
    start = m // 2
    res = x2[start:start+m]
    return int(res)

def vonneumann(x0, m, n):
    x = x0
    for i in range(n):
        print(x)
        x = f(x, m)
```

we see that if the value of $x_0$ is chosen at random, the generator will very often have a short period. Somewhat paradoxically, a pseudo-random generator chosen at random will be bad. In order to ensure good pseudo-randomness, we must design our generator carefully. Enter the theory of finite fields.
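To see this concretely, here is a plain-Python version of the generator above (using `**`, since `^` means XOR outside Sage) together with a helper, `cycle_length` (our name), measuring the length of the cycle a seed eventually falls into. The seed $2500$ is even a fixed point for $m = 4$:

```python
def middle_square(x, m):
    # square x and keep the m middle digits, after padding to 2*m digits
    digits = str(x ** 2).zfill(2 * m)
    start = m // 2
    return int(digits[start:start + m])

def cycle_length(x0, m):
    # length of the cycle the orbit of x0 eventually enters
    seen = {}
    x, i = x0, 0
    while x not in seen:
        seen[x] = i
        x = middle_square(x, m)
        i += 1
    return i - seen[x]

print(middle_square(2500, 4))  # -> 2500: the generator is stuck
print(cycle_length(2500, 4))   # -> 1
```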

## Construction

We construct a bit sequence $(a_i)$ (*i.e.*, a sequence with values in the finite field $\mathbf{F}_2$) as follows: the first $m$ bits $a_0, \dots, a_{m-1}$ are chosen truly randomly, not all zero, and after that each bit is computed from the $m$ preceding bits with a relation of the form

\[ a_i = h_{m-1}a_{i-1} + h_{m-2}a_{i-2} + \dots + h_0a_{i-m}, \]

where the $h_i$ are coefficients in $\mathbf{F}_2$. Another way to look at this construction, in the framework of the previous paragraph, is to take $E = \mathbf{F}_2^m \setminus \{0\}$ and to consider the function $f:E \to E$ which gives

\[ f(a_{i-m}, \dots, a_{i-1}) = (a_{i-m+1},\dots,a_{i}). \]

So the function computes the next bit, appends it to the input, and discards the oldest bit. In terms of linear algebra, we have

\[ \begin{pmatrix}
a_{k} \\ a_{k+1} \\ \vdots \\ a_{k+m-1}
\end{pmatrix} = U \begin{pmatrix}
a_{k-1} \\ a_{k} \\ \vdots \\ a_{k+m-2}
\end{pmatrix} = U^k \begin{pmatrix}
a_0 \\ a_1 \\ \vdots \\ a_{m-1}
\end{pmatrix}, \]

where $U$ is the $m\times m$ matrix

\[ U = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
h_0 & h_1 & h_2 & \cdots & h_{m-1}
\end{pmatrix}. \]
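A direct transcription of the step function $f$ in plain Python (the name `lfsr_step` is ours) shifts the window by one bit:

```python
def lfsr_step(state, taps):
    # state = (a_{i-m}, ..., a_{i-1}); taps = (h_0, ..., h_{m-1})
    new_bit = sum(h * a for h, a in zip(taps, state)) % 2
    return state[1:] + (new_bit,)

# example with m = 4 and (hypothetical) coefficients h = (1, 1, 0, 0)
state = (1, 0, 0, 0)
for _ in range(3):
    state = lfsr_step(state, (1, 1, 0, 0))
print(state)  # -> (0, 1, 0, 0)
```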

Now, our set $E = \mathbf{F}_2^m\setminus \{0\}$ has cardinality $2^m-1$, and we know that this is the maximal possible period for our sequence, so our goal is to choose the coefficients $h_i$ of the recurrence relation so as to obtain a sequence of period $2^m-1$. We note that the period does not depend on the initial values $(a_0, \dots, a_{m-1})$: if the sequence has period $2^m-1$, every possible non-zero $m$-tuple occurs, and so we can choose any one of them as our starting point without changing the overall behavior of the sequence.

## Obtaining a sequence of maximal period

Taking $\mathbf{a} = (a_0, \dots, a_{m-1})$ as our starting values, the period of the sequence is the smallest positive integer $k$ such that $U^k\mathbf{a} = \mathbf{a}$. In other words, such that $\mathbf{a}$ is an eigenvector of $U^k$ associated to the eigenvalue $1$. We start by searching for the values of $k$ for which $U^k$ has $1$ as an eigenvalue. It is easily seen that the characteristic polynomial of $U$ (often called the *feedback polynomial* of the sequence) is

\[ h(x) = x^m + h_{m-1}x^{m-1} + \dots + h_0. \]

Since it depends only on the coefficients $h_i$ in the recurrence relation of our sequence, we can choose the $h_i$ so that $h(x)$ is irreducible (over $\mathbf{F}_2$). (The theory of finite fields tells us that over a finite field, there always exist irreducible polynomials of any degree.) Thus we will assume that $h(x)$ is irreducible.
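For small $m$ we can check irreducibility over $\mathbf{F}_2$ by brute force, representing a polynomial as an integer bitmask (bit $i$ = coefficient of $x^i$) and trial-dividing by every polynomial of degree at most $m/2$. A sketch in plain Python (the names and the bitmask encoding are ours):

```python
def poly_rem(a, b):
    # remainder of polynomial division over F_2 (bitmask encoding)
    db = b.bit_length() - 1
    while a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def is_irreducible(p):
    # p has degree m; check all candidate divisors of degree 1 .. m // 2
    m = p.bit_length() - 1
    return all(poly_rem(p, d) != 0 for d in range(2, 1 << (m // 2 + 1)))

print(is_irreducible(0b10011))  # x^4 + x + 1    -> True
print(is_irreducible(0b10101))  # x^4 + x^2 + 1  -> False (= (x^2+x+1)^2)
```

(In Sage itself one would of course work in a polynomial ring over `GF(2)` instead.)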

Now, $h(x)$ determines in the usual way a field extension $K = \mathbf{F}_2(\alpha)$ of degree $m$, where $\alpha$ is a root of $h(x)$. We know also that $h(x)$ splits completely in $K$, and that its roots are given by $\alpha^{2^i}$, for $i = 0,\dots,m-1$ (obtained by repeated application of the Frobenius automorphism $x \mapsto x^2$ on $\alpha$). Being the roots of the characteristic polynomial of $U$ in $K$, they are its eigenvalues, and so finally the eigenvalues of $U^k$ are of the form $\alpha^{2^ik}$, for $i = 0,\dots,m-1$. We claim that if one of them is $1$, then all of them are, which implies that $U^k = I$. Indeed,

\[ (\alpha^{2^ik})^{2^{m-i}} = \alpha^{2^ik2^{m-i}} = \alpha^{2^mk} = \alpha^k, \]

since $\alpha^{2^m} = \alpha$, and so if $\alpha^{2^ik} = 1$, then $\alpha^k = 1$ and all the eigenvalues are $1$. It is clear that in this case, all vectors are eigenvectors associated with $1$, so we obtain finally that the period of the sequence is the smallest positive integer $k$ such that $U^k = I$. In other words, the order of $U$ in the group $\mathrm{GL}_m(\mathbf{F}_2)$.
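We can compute this order numerically for small $m$. The following sketch (helper names are ours) multiplies matrices over $\mathbf{F}_2$ and finds the smallest $k$ with $U^k = I$:

```python
def mat_mul(A, B):
    # product of square matrices over F_2
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % 2
             for j in range(n)] for i in range(n)]

def mat_order(U):
    # smallest k >= 1 such that U^k is the identity matrix
    n = len(U)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    P, k = U, 1
    while P != I:
        P = mat_mul(P, U)
        k += 1
    return k

# companion matrix of x^4 + x + 1, i.e. h = (1, 1, 0, 0)
U = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 1, 0, 0]]
print(mat_order(U))  # -> 15, the maximal period 2^4 - 1
```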

The last thing we want to show is that the order of $U$ in the group $\mathrm{GL}_m(\mathbf{F}_2)$ is equal to the order of $\alpha$ in the group $K^\times$. (We remark that this is independent of our choice of $\alpha$ as a root of $h(x)$: since $h(x)$ is irreducible, all its roots are images of each other under conjugation automorphisms, and thus have the same order.)

Recall that $U$ diagonalises (in $K$) to

\[ \begin{pmatrix}
\alpha & 0 & 0 & \cdots & 0 \\
0 & \alpha^2 & 0 & \cdots & 0 \\
0 & 0 & \alpha^4 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \alpha^{2^{m-1}}
\end{pmatrix}, \]

and so $U^k$ diagonalises to

\[ \begin{pmatrix}
\alpha^k & 0 & 0 & \cdots & 0 \\
0 & \alpha^{2k} & 0 & \cdots & 0 \\
0 & 0 & \alpha^{4k} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \alpha^{2^{m-1}k}
\end{pmatrix}, \]

and it is clear that if $k$ is the order of $\alpha$ in $K^\times$ (or, more generally, if $\alpha^k = 1$), then $U^k$ diagonalises to the identity matrix, meaning we have $U^k = I$. Conversely, if $U^k$ is the identity matrix, then all its eigenvalues are $1$ and in particular we have $\alpha^k = 1$. Thus $U^k = I$ if and only if $\alpha^k = 1$, meaning the order of $U$ in $\mathrm{GL}_m(\mathbf{F}_2)$ is equal to the order of $\alpha$ in $K^\times$. Since the multiplicative group of a finite field is cyclic, it is always possible to find an element $\alpha$ of order $2^m-1$ (in fact, there are $\varphi(2^m-1)$ of them), and letting $h(x)$ be its minimal polynomial we obtain what we want: a sequence of maximal period. Such a polynomial, whose roots are generators of the multiplicative group of the field extension it induces, is called a *primitive polynomial* in finite field theory. (This is different from primitive polynomials in ring theory.) Thus, if the feedback polynomial is primitive (in particular, irreducible), the sequence will have maximal period.
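We can check this on the sequence itself for $m = 4$: $x^4 + x + 1$ is primitive, while $x^4 + x^3 + x^2 + x + 1$ is irreducible but not primitive (its roots have order $5$). A small sketch in plain Python, with our own helper name:

```python
def lfsr_period(taps):
    # taps = (h_0, ..., h_{m-1}); start from any nonzero state
    m = len(taps)
    start = (1,) + (0,) * (m - 1)
    state, k = start, 0
    while True:
        new_bit = sum(h * a for h, a in zip(taps, state)) % 2
        state = state[1:] + (new_bit,)
        k += 1
        if state == start:
            return k

print(lfsr_period((1, 1, 0, 0)))  # x^4 + x + 1:             period 15 = 2^4 - 1
print(lfsr_period((1, 1, 1, 1)))  # x^4 + x^3 + x^2 + x + 1: period 5
```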

## But…

This is not the end of the story, however. First, we have the natural question of whether the converse of the previous statement holds. The answer is yes, although we need to develop some more theory and use different mathematical machinery (namely, representations of our sequence as a formal power series) in order to prove it.

Next we have the question of cryptographic strength: our sequence has maximal period all right, but how random does it really look? From a statistical point of view, it is excellent. For example, one can prove that a period of the sequence contains $2^{m-1}$ ones and $2^{m-1}-1$ zeroes, and more generally that all $k$-tuples, for any $k \le m$, occur essentially the same number of times. From a non-predictability point of view, however, it is horrendous. If an attacker knows as few as $2m$ consecutive bits of the sequence, they can recover the coefficients $h_i$ of the recurrence relation, and thus reconstruct the whole sequence, just by solving a system of linear equations.
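To make the attack concrete: from $2m$ consecutive bits $a_0, \dots, a_{2m-1}$, each of $a_m, \dots, a_{2m-1}$ gives one linear equation in the unknowns $h_0, \dots, h_{m-1}$, and Gaussian elimination over $\mathbf{F}_2$ solves the system. A sketch in plain Python (the helper name is ours):

```python
def recover_taps(bits, m):
    # row i encodes a_{m+i} = h_0 a_i + h_1 a_{i+1} + ... + h_{m-1} a_{i+m-1}
    rows = [bits[i:i + m] + [bits[i + m]] for i in range(m)]  # augmented rows
    # Gaussian elimination over F_2 (full reduction to the identity)
    for col in range(m):
        piv = next(r for r in range(col, m) if rows[r][col])
        rows[col], rows[piv] = rows[piv], rows[col]
        for r in range(m):
            if r != col and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    return [rows[i][m] for i in range(m)]

# 8 bits produced by the recurrence with h = (1, 1, 0, 0), i.e. x^4 + x + 1
print(recover_taps([1, 0, 0, 0, 1, 0, 0, 1], 4))  # -> [1, 1, 0, 0]
```

(For a maximal-length sequence the system is always solvable, since any $m$ consecutive states are linearly independent.)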

So, a challenge is to find ways to make the sequence less predictable, without hurting its good statistical properties…