Polynomial factorisation over finite fields, part 3: Final splitting (Cantor-Zassenhaus)

(Part of this.)

We are left with the following problem: given a polynomial $A \in \mathbf{F}_p[X]$ which is known to be squarefree and the product of irreducible factors which are all of degree $d$, find these factors. One algorithm to do this was introduced by Cantor and Zassenhaus in 1981.

In odd characteristic

We may assume that $A$ has at least two factors (if it has only one, it is already irreducible, so we have nothing to do). We also assume $p > 2$ for now. Then for any $T \in \mathbf{F}_p[X]$, we have the following equality:
\[ A = \gcd(A,T)\times \gcd(A,T^{(p^d-1)/2}+1)\times \gcd(A,T^{(p^d-1)/2}-1). \]

This is easy to see: since all the elements of $\mathbf{F}_{p^d}$ are roots of $X^{p^d}-X$ and since $T \in \mathbf{F}_p[X]$, we have that for any $x \in \mathbf{F}_{p^d}$, $T(x) \in \mathbf{F}_{p^d}$ is again a root of $X^{p^d}-X$, and so $x$ is a root of $T^{p^d}-T$. Thus $X^{p^d}-X$ divides $T^{p^d}-T$. This means that every monic irreducible polynomial of degree $d$ divides $T^{p^d}-T$, and so in the end $A$ divides $T^{p^d}-T$, since $A$ is squarefree. Finally, we clearly have the equality
\[ T^{p^d}-T = T(T^{(p^d-1)/2}+1)(T^{(p^d-1)/2}-1), \]
and since those three factors are pairwise relatively prime, the equality follows.

Let now $T$ be monic of degree $e \le 2d-1$. According to the above equality, the common factors of $A$ and $T^{p^d}-T$, which are just the factors of $A$ since $A$ divides $T^{p^d}-T$, are “spread” among the three polynomials $T$, $T^{(p^d-1)/2}+1$, and $T^{(p^d-1)/2}-1$. It seems reasonable to expect that each of the second and third factors will contain about half of them, since the degree of $T^{(p^d-1)/2}\pm 1$ is about half that of $T^{p^d}-T$. Indeed, one can show that if $T$ is chosen at random, the probability that $T^{(p^d-1)/2}-1$ contains at least one, but not all, of the common factors of $A$ and $T^{p^d}-T$ (and so, that $\gcd(A,T^{(p^d-1)/2}-1)$ is a non-trivial factor of $A$) is more than $4/9$.

We will thus normally not have to try too many different $T$s in order to obtain a non-trivial factor $U$, and we can then apply the algorithm recursively to $U$ and $A/U$. We obtain

def czodd(A, d):
    if A.degree() == d:
        return [A]
    p = A.base_ring().characteristic()
    while True:
        T = parent(A).random_element((1, 2*d-1))
        if T.degree() < 1:
            continue
        T = T.monic()
        U = gcd(A, T^((p^d-1)//2)-1)
        if U.degree() > 0 and U.degree() < A.degree():
            return czodd(U, d) + czodd(A//U, d)

This works as expected:

sage: R.<x> = PolynomialRing(GF(5))
sage: czodd( (x+1)*(x+2)*(x+3)*(x+4), 1)
[x + 2, x + 1, x + 3, x + 4]
sage: czodd( (x^2+x+1)*(x^2+2), 2)
[x^2 + 2, x^2 + x + 1]

However, this naive algorithm has the same problem as the one for distinct degree factorisation: we have to manipulate a polynomial ($T^{(p^d-1)/2}-1$) whose degree grows exponentially with $d$, and this slows down our algorithm a lot: it takes several seconds to factor a polynomial with two irreducible factors of degree $8$, when Sage's factor() again takes less than a tenth of a second:

sage: f
x^16 + 2*x^13 + x^12 + 4*x^11 + 2*x^10 + x^9 + 3*x^7 + 4*x^6 + 2*x^5 + 2*x^4 + x^3 + 3*x^2 + 2
sage: t = cputime()
sage: factor(f)
(x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1) * (x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2)
sage: cputime(t)
0.01883699999999955
sage: t = cputime()
sage: czodd(f, 8)
[x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2,
 x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1]
sage: cputime(t)
2.0217479999999988

Again, we are not interested in $T^{(p^d-1)/2}-1$ itself, only in its GCD with $A$, and since $A$ is of much smaller degree, we work modulo $A$. This time, however, we need to explicitly construct the ring $\mathbf{F}_p[X]/(A)$ and compute $T^{(p^d-1)/2}-1$ in it (in the previous case, we could just reduce modulo $A$ on the fly). This gives:

def czodd(A, d):
    if A.degree() == d:
        return [A]
    R = parent(A)
    RmodA = R.quotient_by_principal_ideal(A)
    p = R.characteristic()
    while True:
        T = parent(A).random_element((1, 2*d-1))
        if T.degree() < 1:
            continue
        T = RmodA(T.monic())
        U = gcd(A, lift(T^((p^d-1)//2))-1)
        if U.degree() > 0 and U.degree() < A.degree():
            return czodd(U, d) + czodd(A//U, d)

which works much better:

sage: t = cputime()
sage: czodd(f, 8)
[x^8 + 4*x^7 + 4*x^6 + 4*x^3 + 3*x^2 + 3*x + 2,
 x^8 + x^7 + 2*x^6 + 3*x^4 + 3*x^3 + x^2 + x + 1]
sage: cputime(t)
0.036122999999999905

In characteristic $2$

The algorithm for splitting in odd characteristic is based on the fact that for any $T \in \mathbf{F}_p[X]$ we have
\[ A = \gcd(A,T)\times \gcd(A,T^{(p^d-1)/2}+1)\times \gcd(A,T^{(p^d-1)/2}-1), \]
and that if $T$ is chosen at random with sufficiently small degree, then there is a good chance that the second and third GCDs each will contain about half of the factors of $A$, and will thus be non-trivial divisors of $A$. This fails in characteristic $2$ because then $T^{(p^d-1)/2}+1 = T^{(p^d-1)/2}-1$, and so the equality becomes
\[ A = \gcd(A,T)\times \gcd(A, T^{p^d-1}-1), \]
which is useless because then it is almost certain that all the factors of $A$ will be in the second GCD, which means that the equality will just give us the trivial factorisation $A = 1\times A$.

What can we do then? For a polynomial $T \in \mathbf{F}_2[X]$, consider
\[ W = T + T^2 + T^4 + \dots + T^{2^{d-1}}. \]
Then we have
\[ A = \gcd(A, W)\times \gcd(A, W + 1). \]
This is easy to see, we have
\[ W^2 = T^2 + T^4 + T^8 + \dots + T^{2^d} \]
(remember that we are in characteristic $2$), and so
\[ W\times (W+1) = W^2+W = T^{2^d}+T, \]
and the same argument as in odd characteristic shows that $A$ divides $T^{2^d}-T$, and the asserted equality follows. Since obviously $W$ and $W+1$ have the same degree, with good probability they will each contain about half of the common factors of $A$ and $T^{2^d}-T$, and $\gcd(A,W)$ will give a non-trivial divisor of $A$. We obtain the following algorithm:

def cz2(A, d):
    if A.degree() == d:
        return [A]
    R = parent(A)
    while True:
        T = R.random_element((1, 2*d-1))
        if T.degree() < 1:
            continue
        W = T
        for i in range(d-1):
            T = T^2 % A
            W = W + T
        U = gcd(A, W)
        if U.degree() > 0 and U.degree() < A.degree():
            return cz2(U, d) + cz2(A//U, d)

which works as expected:

sage: R. = PolynomialRing(GF(2))
sage: A = x^16 + x^14 + x^10 + x^5 + x^3 + x + 1
sage: t = cputime()
sage: factor(A)
(x^8 + x^4 + x^3 + x^2 + 1) * (x^8 + x^6 + x^4 + x^3 + x^2 + x + 1)
sage: cputime(t)
0.03154299999999921
sage: t = cputime()
sage: cz2(A,8)
[x^8 + x^6 + x^4 + x^3 + x^2 + x + 1, x^8 + x^4 + x^3 + x^2 + 1]
sage: cputime(t)
0.023383000000002596

Final algorithm

We can combine the three previous algorithms to obtain a full factorisation algorithm in $\mathbf{F}_p[X]$:

def polfactor(A):
    if A.base_ring().characteristic() == 2:
        cz = cz2
    else:
        cz = czodd
    factors = []
    for P, e in sqfreefact(A):
        for q, d in ddfact(P):
            for r in cz(q, d):
                factors.append((r, e))
    return factors

We can test it for example like this:

def testfactor(p, pmax, n, d):
    if p is None:
        R.<x> = PolynomialRing(GF(random_prime(pmax)))
    else:
        R.<x> = PolynomialRing(GF(p))
    for i in range(n):
        A = R.random_element(d)
        while A.degree() < 1:
            A = R.random_element(d)
        A = A.monic()
        if set(polfactor(A)) != set(factor(A)):
            print "Failed on %s" % A
            return False
    return True

to check that our function returns the same factorisation as Sage's factor() function on random polynomials. It seems to work:

sage: testfactor(2, None, 100, 100)
True
sage: testfactor(None, 50, 100, 100)
True

4 thoughts on “Polynomial factorisation over finite fields, part 3: Final splitting (Cantor-Zassenhaus)

  1. Lucio Santi

    Thanks for this great article. A very minor comment, though: when discussing the splitting algorithm in characteristic 2, in the definition of [latex]W[/latex], the last [latex]T[/latex] should have exponent [latex]2^{d-1}[/latex] instead of [latex]2^d – 1[/latex].

    Reply
  2. Lucio Santi

    Hi, Firas,

    One more thought about your post. I was playing around with polynomials over GF(2^k) for a given k > 1 (better not to use LaTeX tags as they seem to be disabled for comments), but the algorithm as it is stated above only works for GF(2)[X]. However, it can be generalized for GF(2^k) with a very small change in W, as follows:

    W = T + T^2 + … + T^{2^{kd – 1}}

    This way, W(W+1) = W^2 + W = T^{2^{kd}} + T = T^{(2^k)^d} + T, which implies that A | W(W+1).

    So, the loop that computes W should be run for i in range(k*d-1).

    Thanks again,

    Lucio

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *