*(Part of this.)*

We are left with the following problem: given a polynomial $A \in \mathbf{F}_p[X]$ which is known to be squarefree and the product of irreducible factors which are all of degree $d$, find these factors. One algorithm to do this was introduced by Cantor and Zassenhaus in 1981.

## In odd characteristic

We may assume that $A$ has at least two factors (if it has only one, it is already irreducible, so we have nothing to do). We also assume $p > 2$ for now. Then for any $T \in \mathbf{F}_p[X]$, we have the following equality:

\[ A = \gcd(A,T)\times \gcd(A,T^{(p^d-1)/2}+1)\times \gcd(A,T^{(p^d-1)/2}-1). \]

This is easy to see: since all the elements of $\mathbf{F}_{p^d}$ are roots of $X^{p^d}-X$ and since $T \in \mathbf{F}_p[X]$, we have that for any $x \in \mathbf{F}_{p^d}$, $T(x) \in \mathbf{F}_{p^d}$ is again a root of $X^{p^d}-X$, and so $x$ is a root of $T^{p^d}-T$. Thus $X^{p^d}-X$ divides $T^{p^d}-T$. This means that every monic irreducible polynomial of degree $d$ divides $T^{p^d}-T$, and so in the end $A$ divides $T^{p^d}-T$, since $A$ is squarefree. Finally, we clearly have the equality

\[ T^{p^d}-T = T(T^{(p^d-1)/2}+1)(T^{(p^d-1)/2}-1), \]

and since those three factors are pairwise relatively prime, the equality follows.

Let now $T$ be monic of degree $e \le 2d-1$. According to the above equality, the common factors of $A$ and $T^{p^d}-T$, which are just the factors of $A$ since $A$ divides $T^{p^d}-T$, are “spread” among the three polynomials $T$, $T^{(p^d-1)/2}+1$, and $T^{(p^d-1)/2}-1$. It seems reasonable to expect that each of the second and third factors will contain about half of them, since the degree of $T^{(p^d-1)/2}\pm 1$ is about half that of $T^{p^d}-T$. Indeed, one can show that if $T$ is chosen at random, the probability that $T^{(p^d-1)/2}-1$ contains at least one, but not all, of the common factors of $A$ and $T^{p^d}-T$ (and so, that $\gcd(A,T^{(p^d-1)/2}-1)$ is a non-trivial factor of $A$) is more than $4/9$.

We will thus normally not have to try too many different $T$s in order to obtain a non-trivial factor $U$, and we can then apply the algorithm recursively to $U$ and $A/U$. We obtain

def czodd(A, d): if A.degree() == d: return [A] p = A.base_ring().characteristic() while True: T = parent(A).random_element((1, 2*d-1)) if T.degree() < 1: continue T = T.monic() U = gcd(A, T^((p^d-1)//2)-1) if U.degree() > 0 and U.degree() < A.degree(): return czodd(U, d) + czodd(A//U, d)

This works as expected:

sage: R.<x> = PolynomialRing(GF(5)) sage: czodd( (x+1)*(x+2)*(x+3)*(x+4), 1) [x + 2, x + 1, x + 3, x + 4] sage: czodd( (x^2+x+1)*(x^2+2), 2) [x^2 + 2, x^2 + x + 1]

However, this naive algorithm has the same problem as the one for distinct degree factorisation: we have to manipulate a polynomial ($T^{(p^d-1)/2}-1$) whose degree grows exponentially with $d$, and this slows down our algorithm a lot: it takes several seconds to factor a polynomial with two irreducible factors of degree $8$, when Sage's factor() again takes less than a tenth of a second:

sage: f x^16 + 2*x^13 + x^12 + 4*x^11 + 2*x^10 + x^9 + 3*x^7 + 4*x^6 + 2*x^5 + 2*x^4 + x^3 + 3*x^2 + 2 sage: t = cputime() sage: factor(f) (x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1) * (x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2) sage: cputime(t) 0.01883699999999955 sage: t = cputime() sage: czodd(f, 8) [x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2, x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1] sage: cputime(t) 2.0217479999999988

Again, we are not interested in $T^{(p^d-1)/2}-1$ itself, only in its GCD with $A$, and since $A$ is of much smaller degree, we work modulo $A$. This time, however, we need to explicitly construct the ring $\mathbf{F}_p[X]/(A)$ and compute $T^{(p^d-1)/2}-1$ in it (in the previous case, we could just reduce modulo $A$ on the fly). This gives:

def czodd(A, d): if A.degree() == d: return [A] R = parent(A) RmodA = R.quotient_by_principal_ideal(A) p = R.characteristic() while True: T = parent(A).random_element((1, 2*d-1)) if T.degree() < 1: continue T = RmodA(T.monic()) U = gcd(A, lift(T^((p^d-1)//2))-1) if U.degree() > 0 and U.degree() < A.degree(): return czodd(U, d) + czodd(A//U, d)

which works much better:

sage: t = cputime() sage: czodd(f, 8) [x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2, x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1] sage: cputime(t) 0.036122999999999905

### In characteristic $2$

The algorithm for splitting in odd characteristic is based on the fact that for any $T \in \mathbf{F}_p[X]$ we have

\[ A = \gcd(A,T)\times \gcd(A,T^{(p^d-1)/2}+1)\times \gcd(A,T^{(p^d-1)/2}-1), \]

and that if $T$ is chosen at random with sufficiently small degree, then there is a good chance that the second and third GCDs each will contain about half of the factors of $A$, and will thus be non-trivial divisors of $A$. This fails in characteristic $2$ because then $T^{(p^d-1)/2}+1 = T^{(p^d-1)/2}-1$, and so the equality becomes

\[ A = \gcd(A,T)\times \gcd(A, T^{p^d-1}-1), \]

which is useless because then it is almost certain that *all* the factors of $A$ will be in the second GCD, which means that the equality will just give us the trivial factorisation $A = 1\times A$.

What can we do then? For a polynomial $T \in \mathbf{F}_2[X]$, consider

\[ W = T + T^2 + T^4 + \dots + T^{2^{d-1}}. \]

Then we have

\[ A = \gcd(A, W)\times \gcd(A, W + 1). \]

This is easy to see, we have

\[ W^2 = T^2 + T^4 + T^8 + \dots + T^{2^d} \]

(remember that we are in characteristic $2$), and so

\[ W\times (W+1) = W^2+W = T^{2^d}+T, \]

and the same argument as in odd characteristic shows that $A$ divides $T^{2^d}-T$, and the asserted equality follows. Since obviously $W$ and $W+1$ have the same degree, with good probability they will each contain about half of the common factors of $A$ and $T^{2^d}-T$, and $\gcd(A,W)$ will give a non-trivial divisor of $A$. We obtain the following algorithm:

def cz2(A, d): if A.degree() == d: return [A] R = parent(A) while True: T = R.random_element((1, 2*d-1)) if T.degree() < 1: continue W = T for i in range(d-1): T = T^2 % A W = W + T U = gcd(A, W) if U.degree() > 0 and U.degree() < A.degree(): return cz2(U, d) + cz2(A//U, d)

which works as expected:

sage: R.= PolynomialRing(GF(2)) sage: A = x^16 + x^14 + x^10 + x^5 + x^3 + x + 1 sage: t = cputime() sage: factor(A) (x^8 + x^4 + x^3 + x^2 + 1) * (x^8 + x^6 + x^4 + x^3 + x^2 + x + 1) sage: cputime(t) 0.03154299999999921 sage: t = cputime() sage: cz2(A,8) [x^8 + x^6 + x^4 + x^3 + x^2 + x + 1, x^8 + x^4 + x^3 + x^2 + 1] sage: cputime(t) 0.023383000000002596

### Final algorithm

We can combine the three previous algorithms to obtain a full factorisation algorithm in $\mathbf{F}_p[X]$:

def polfactor(A): if A.base_ring().characteristic() == 2: cz = cz2 else: cz = czodd factors = [] for P, e in sqfreefact(A): for q, d in ddfact(P): for r in cz(q, d): factors.append((r, e)) return factors

We can test it for example like this:

def testfactor(p, pmax, n, d): if p is None: R.<x> = PolynomialRing(GF(random_prime(pmax))) else: R.<x> = PolynomialRing(GF(p)) for i in range(n): A = R.random_element(d) while A.degree() < 1: A = R.random_element(d) A = A.monic() if set(polfactor(A)) != set(factor(A)): print "Failed on %s" % A return False return True

to check that our function returns the same factorisation as Sage's factor() function on random polynomials. It seems to work:

sage: testfactor(2, None, 100, 100) True sage: testfactor(None, 50, 100, 100) True

Frederic BautistaThanks, this actually was helpful :)

As well as your other posts on the subject.

Lucio SantiThanks for this great article. A very minor comment, though: when discussing the splitting algorithm in characteristic 2, in the definition of [latex]W[/latex], the last [latex]T[/latex] should have exponent [latex]2^{d-1}[/latex] instead of [latex]2^d – 1[/latex].

FirasPost authorThank you, that’s corrected.

Lucio SantiHi, Firas,

One more thought about your post. I was playing around with polynomials over GF(2^k) for a given k > 1 (better not to use LaTeX tags as they seem to be disabled for comments), but the algorithm as it is stated above only works for GF(2)[X]. However, it can be generalized for GF(2^k) with a very small change in W, as follows:

W = T + T^2 + … + T^{2^{kd – 1}}

This way, W(W+1) = W^2 + W = T^{2^{kd}} + T = T^{(2^k)^d} + T, which implies that A | W(W+1).

So, the loop that computes W should be run for i in range(k*d-1).

Thanks again,

Lucio