# Polynomial factorisation over finite fields, part 3: Final splitting (Cantor-Zassenhaus)

(Part of this.)

We are left with the following problem: given a polynomial $A \in \mathbf{F}_p[X]$ which is known to be squarefree and the product of irreducible factors which are all of degree $d$, find these factors. One algorithm to do this was introduced by Cantor and Zassenhaus in 1981.

## In odd characteristic

We may assume that $A$ has at least two factors (if it has only one, it is already irreducible, so we have nothing to do). We also assume $p > 2$ for now. Then for any $T \in \mathbf{F}_p[X]$, we have the following equality:
$A = \gcd(A,T)\times \gcd(A,T^{(p^d-1)/2}+1)\times \gcd(A,T^{(p^d-1)/2}-1).$

This is easy to see: since all the elements of $\mathbf{F}_{p^d}$ are roots of $X^{p^d}-X$ and since $T \in \mathbf{F}_p[X]$, we have that for any $x \in \mathbf{F}_{p^d}$, $T(x) \in \mathbf{F}_{p^d}$ is again a root of $X^{p^d}-X$, and so $x$ is a root of $T^{p^d}-T$. Thus $X^{p^d}-X$ divides $T^{p^d}-T$. This means that every monic irreducible polynomial of degree $d$ divides $T^{p^d}-T$, and so in the end $A$ divides $T^{p^d}-T$, since $A$ is squarefree. Finally, we clearly have the equality
$T^{p^d}-T = T(T^{(p^d-1)/2}+1)(T^{(p^d-1)/2}-1),$
and since those three factors are pairwise relatively prime, the equality follows.

Let now $T$ be monic of degree $e \le 2d-1$. According to the above equality, the common factors of $A$ and $T^{p^d}-T$, which are just the factors of $A$ since $A$ divides $T^{p^d}-T$, are “spread” among the three polynomials $T$, $T^{(p^d-1)/2}+1$, and $T^{(p^d-1)/2}-1$. It seems reasonable to expect that each of the second and third factors will contain about half of them, since the degree of $T^{(p^d-1)/2}\pm 1$ is about half that of $T^{p^d}-T$. Indeed, one can show that if $T$ is chosen at random, the probability that $T^{(p^d-1)/2}-1$ contains at least one, but not all, of the common factors of $A$ and $T^{p^d}-T$ (and so, that $\gcd(A,T^{(p^d-1)/2}-1)$ is a non-trivial factor of $A$) is more than $4/9$.

We will thus normally not have to try too many different $T$s in order to obtain a non-trivial factor $U$, and we can then apply the algorithm recursively to $U$ and $A/U$. We obtain

def czodd(A, d):
if A.degree() == d:
return [A]
p = A.base_ring().characteristic()
while True:
T = parent(A).random_element((1, 2*d-1))
if T.degree() < 1:
continue
T = T.monic()
U = gcd(A, T^((p^d-1)//2)-1)
if U.degree() > 0 and U.degree() < A.degree():
return czodd(U, d) + czodd(A//U, d)


This works as expected:

sage: R.<x> = PolynomialRing(GF(5))
sage: czodd( (x+1)*(x+2)*(x+3)*(x+4), 1)
[x + 2, x + 1, x + 3, x + 4]
sage: czodd( (x^2+x+1)*(x^2+2), 2)
[x^2 + 2, x^2 + x + 1]


However, this naive algorithm has the same problem as the one for distinct degree factorisation: we have to manipulate a polynomial ($T^{(p^d-1)/2}-1$) whose degree grows exponentially with $d$, and this slows down our algorithm a lot: it takes several seconds to factor a polynomial with two irreducible factors of degree $8$, when Sage's factor() again takes less than a tenth of a second:

sage: f
x^16 + 2*x^13 + x^12 + 4*x^11 + 2*x^10 + x^9 + 3*x^7 + 4*x^6 + 2*x^5 + 2*x^4 + x^3 + 3*x^2 + 2
sage: t = cputime()
sage: factor(f)
(x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1) * (x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2)
sage: cputime(t)
0.01883699999999955
sage: t = cputime()
sage: czodd(f, 8)
[x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2,
x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1]
sage: cputime(t)
2.0217479999999988


Again, we are not interested in $T^{(p^d-1)/2}-1$ itself, only in its GCD with $A$, and since $A$ is of much smaller degree, we work modulo $A$. This time, however, we need to explicitly construct the ring $\mathbf{F}_p[X]/(A)$ and compute $T^{(p^d-1)/2}-1$ in it (in the previous case, we could just reduce modulo $A$ on the fly). This gives:

def czodd(A, d):
if A.degree() == d:
return [A]
R = parent(A)
RmodA = R.quotient_by_principal_ideal(A)
p = R.characteristic()
while True:
T = parent(A).random_element((1, 2*d-1))
if T.degree() < 1:
continue
T = RmodA(T.monic())
U = gcd(A, lift(T^((p^d-1)//2))-1)
if U.degree() > 0 and U.degree() < A.degree():
return czodd(U, d) + czodd(A//U, d)


which works much better:

sage: t = cputime()
sage: czodd(f, 8)
[x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2,
x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1]
sage: cputime(t)
0.036122999999999905


### In characteristic $2$

The algorithm for splitting in odd characteristic is based on the fact that for any $T \in \mathbf{F}_p[X]$ we have
$A = \gcd(A,T)\times \gcd(A,T^{(p^d-1)/2}+1)\times \gcd(A,T^{(p^d-1)/2}-1),$
and that if $T$ is chosen at random with sufficiently small degree, then there is a good chance that the second and third GCDs each will contain about half of the factors of $A$, and will thus be non-trivial divisors of $A$. This fails in characteristic $2$ because then $T^{(p^d-1)/2}+1 = T^{(p^d-1)/2}-1$, and so the equality becomes
$A = \gcd(A,T)\times \gcd(A, T^{p^d-1}-1),$
which is useless because then it is almost certain that all the factors of $A$ will be in the second GCD, which means that the equality will just give us the trivial factorisation $A = 1\times A$.

What can we do then? For a polynomial $T \in \mathbf{F}_2[X]$, consider
$W = T + T^2 + T^4 + \dots + T^{2^{d-1}}.$
Then we have
$A = \gcd(A, W)\times \gcd(A, W + 1).$
This is easy to see, we have
$W^2 = T^2 + T^4 + T^8 + \dots + T^{2^d}$
(remember that we are in characteristic $2$), and so
$W\times (W+1) = W^2+W = T^{2^d}+T,$
and the same argument as in odd characteristic shows that $A$ divides $T^{2^d}-T$, and the asserted equality follows. Since obviously $W$ and $W+1$ have the same degree, with good probability they will each contain about half of the common factors of $A$ and $T^{2^d}-T$, and $\gcd(A,W)$ will give a non-trivial divisor of $A$. We obtain the following algorithm:

def cz2(A, d):
if A.degree() == d:
return [A]
R = parent(A)
while True:
T = R.random_element((1, 2*d-1))
if T.degree() < 1:
continue
W = T
for i in range(d-1):
T = T^2 % A
W = W + T
U = gcd(A, W)
if U.degree() > 0 and U.degree() < A.degree():
return cz2(U, d) + cz2(A//U, d)


which works as expected:

sage: R. = PolynomialRing(GF(2))
sage: A = x^16 + x^14 + x^10 + x^5 + x^3 + x + 1
sage: t = cputime()
sage: factor(A)
(x^8 + x^4 + x^3 + x^2 + 1) * (x^8 + x^6 + x^4 + x^3 + x^2 + x + 1)
sage: cputime(t)
0.03154299999999921
sage: t = cputime()
sage: cz2(A,8)
[x^8 + x^6 + x^4 + x^3 + x^2 + x + 1, x^8 + x^4 + x^3 + x^2 + 1]
sage: cputime(t)
0.023383000000002596


### Final algorithm

We can combine the three previous algorithms to obtain a full factorisation algorithm in $\mathbf{F}_p[X]$:

def polfactor(A):
if A.base_ring().characteristic() == 2:
cz = cz2
else:
cz = czodd
factors = []
for P, e in sqfreefact(A):
for q, d in ddfact(P):
for r in cz(q, d):
factors.append((r, e))
return factors


We can test it for example like this:

def testfactor(p, pmax, n, d):
if p is None:
R.<x> = PolynomialRing(GF(random_prime(pmax)))
else:
R.<x> = PolynomialRing(GF(p))
for i in range(n):
A = R.random_element(d)
while A.degree() < 1:
A = R.random_element(d)
A = A.monic()
if set(polfactor(A)) != set(factor(A)):
print "Failed on %s" % A
return False
return True


to check that our function returns the same factorisation as Sage's factor() function on random polynomials. It seems to work:

sage: testfactor(2, None, 100, 100)
True
sage: testfactor(None, 50, 100, 100)
True


## 4 thoughts on “Polynomial factorisation over finite fields, part 3: Final splitting (Cantor-Zassenhaus)”

1. Frederic Bautista

Thanks, this actually was helpful :)
As well as your other posts on the subject.

2. Lucio Santi

Thanks for this great article. A very minor comment, though: when discussing the splitting algorithm in characteristic 2, in the definition of $W$, the last $T$ should have exponent $2^{d-1}$ instead of $2^d – 1$.

3. Lucio Santi

Hi, Firas,

One more thought about your post. I was playing around with polynomials over GF(2^k) for a given k > 1 (better not to use LaTeX tags as they seem to be disabled for comments), but the algorithm as it is stated above only works for GF(2)[X]. However, it can be generalized for GF(2^k) with a very small change in W, as follows:

W = T + T^2 + … + T^{2^{kd – 1}}

This way, W(W+1) = W^2 + W = T^{2^{kd}} + T = T^{(2^k)^d} + T, which implies that A | W(W+1).

So, the loop that computes W should be run for i in range(k*d-1).

Thanks again,

Lucio