Polynomial factorisation over finite fields, part 1: Squarefree factorisation

(Continues this post.)

We wish to factor a polynomial $A$ with coefficients in the finite field $\mathbf{F}_p$ where $p$ is prime. As we stated in the previous post, the first step is to obtain a squarefree factorisation of $A$. Namely, we wish to obtain polynomials $A_1,A_2,\dots,A_k$, which are all squarefree and relatively prime, and such that $A = A_1A_2^2\dots A_k^k$. Ultimately, $A_i$ will be precisely the product of all the irreducible factors of $A$ with exponent $i$, which will ensure that all the above conditions are satisfied. We first need some preliminary results.

Derivative of a polynomial

It is well-known from calculus that the derivative of $X^n$ is $nX^{n-1}$, and that the derivative is linear in respect to addition of functions and multiplication by a scalar. This allows us to define the derivative of the polynomial $A = \sum_{i=0}^n a_iX^i$ as $A’ = \sum_{i=1}^n ia_iX^{i-1}$. In calculus, the derivative was defined using limits, distances and suchlike. We cannot use this definition when working over a finite field, because we cannot define the distance between two elements in a sensible way. However, the final expression of the derivative of a polynomial still makes sense over a finite field (or any ring, for that matter), since it involves only the basic ring operations of addition and multiplication.

So we just define the derivative of a polynomial with coefficients in $\mathbf{F}_p$ as in the expression above. Our derivative still has the familiar properties:

  • Linearity: for all $f,g \in \mathbf{F}_p[X]$ and $a \in \mathbf{F}_p$, we have $(f+g)’ = f’+g’$, and $(af)’ = af’$.
  • Product rule: for all $f,g \in \mathbf{F}_p[X]$, we have $(fg)’ = f’g + fg’$. This formula generalises by induction to any number of factors:
    \[ \left( \prod_i f_i \right)’ = \sum_i \left( f_i’ \prod_{j \ne i} f_j \right). \]
  • Power rule: for all $f \in \mathbf{F}_p[X]$ and integers $n > 0$, we have $(f^n)’ = nf’f^{n-1}$.

A simpler case: in characteristic $0$

To better grasp the idea of the squarefree factorisation algorithm, it is worthwhile to first consider the simpler case of a ring of characteristic $0$, such as $\mathbf{Z}$. The key is to consider the GCD of $A$ and $A’$. The reader might be aware from study of polynomials over $\mathbf{R}$ that in that case, a factor $P$ of $A$ is also a factor of $A’$ if and only if its exponent (in the decomposition of $A$) is more than $1$. This remains true in general over any ring of characteristic $0$.

More precisely, if $A = \prod P_i^{e_i}$, then $A’ = Q\prod P_i^{e_i-1}$, where none of the $P_i$ divide $Q$. We thus obtain finally that $T = \gcd(A,A’) = \prod P_i^{e_i-1}$; in other words, $T$ is the polynomial whose decomposition is the same as $A$, but with all exponents decreased by $1$. In particular, this means that all the factors of $A$ which had exponent $1$ have disappeared and are not factors of $T$. This means that we know how to obtain the factors of exponent $1$ (which constitute $A_1$ in our squarefree factorisation): we must “isolate” all the factors which are in $A$ but not in $T$.

We thus compute $V = A/T = \prod P_i$, which is the product of all factors of $A$, but each with exponent $1$, and $V_2 = \gcd(T,V)$, which is the product of all factors of $A$ which are also in $T$ (i.e., which have exponent more than $1$ in $A$). Since we want the factors which are not in $T$, we obtain finally $A_1 = V/V_2$. If we then let $T_2 = T/V_2$, this is the polynomial whose decomposition is the same as $A$, but with all exponents decreased by $2$, and so $A_2$ is the product of factors which are in $T$ but not in $T_2$. Continuing this process until $V_k$ is constant (which means all the factors have been accounted for), we obtain the following algorithm (in Sage):

def sqfreefactz(A):
    factors = []
    T = gcd(A, A.derivative())
    k = 1
    Tk = T
    Vk = A//T
    while Vk.degree() > 0:
        Vkplus1 = gcd(Tk, Vk)
        Tkplus1 = Tk//Vkplus1
        factors.append((Vk//Vkplus1, k))
        k = k+1
        Vk = Vkplus1
        Tk = Tkplus1
    return factors

which works as expected:

sage: R.<x> = PolynomialRing(ZZ)
sage: sqfreefactz( (x+1) * (x+2)^3 * (x+3)^3 * (x+4)^4 * (x+10)^10 )
[(x + 1, 1),
 (1, 2),
 (x^2 + 5*x + 6, 3),
 (x + 4, 4),
 (1, 5),
 (1, 6),
 (1, 7),
 (1, 8),
 (1, 9),
 (x + 10, 10)]

It does not handle non-monic polynomials:

sage: sqfreefactz( 100*(x+1) * (x+2)^3 * (x+3)^3 * (x+4)^4 * (x+10)^10 )
[(x + 1, 1),
 (1, 2),
 (x^2 + 5*x + 6, 3),
 (x + 4, 4),
 (1, 5),
 (1, 6),
 (1, 7),
 (1, 8),
 (1, 9),
 (x + 10, 10)]

but since we eventually want to work over a field, this is not a concern.

Over $\mathbf{F}_p$

Over a finite field (or in general any field of non-zero characteristic), this algorithm does not work, as we can easily see:

sage: R.<x> = PolynomialRing(GF(5))
sage: sqfreefactz( (x+1)^5 )
[]

What went wrong? The previous algorithm was based on the fact that if $A = \prod P_i^{e_i}$, then $\gcd(A,A’) = \prod P_i^{e_i-1}$. This is no longer true in non-zero characteristic, for example with $A = (X+1)^5 \in \mathbf{F}_5[X]$ as above, we have $A’ = 0$ and so $\gcd(A,A’) = A$ (and of course in the previous case we had $\gcd(A,A’) \ne A$). So we must determine the correct expression for $\gcd(A,A’)$, which in this case is more complicated.

We start from the squarefree factorisation $A = \prod A_i^i$, where the $A_i$ are all squarefree and relatively prime, and we have
\[ A’ = \sum_i \left(\prod_{j \ne i} A_j^j\right) iA_i’A_i^{i-1}. \]
We want to obtain the factorisation of $T = \gcd(A,A’)$. Let $P$ be an irreducible factor of $T$. Then $P$ is an irreducible factor of $A$, and so it is a factor of $A_m$ for some $m$ (note that $m$ is unique, since the $A_i$ are all relatively prime), and its exponent in the factorisation of $A$ is $m$.

To obtain its exponent in the factorisation of $A’$, we determine its exponent in the factorisation of each summand in the expression above. For all summands $i \ne m$, there is a factor $A_m^m$, so the exponent of $P$ is at least $m$. For the summand $i=m$, $P$ divides none of the $A_j$ (since all the $A_i$ are relatively prime), and it also does not divide $A_m’$ (since $A_m$ is squarefree). Thus the exponent of $P$ in the summand is exactly the exponent of $P$ in $mA_m^{m-1}$, and we obtain finally that the exponent is $m-1$ if $p$ does not divide $m$ (meaning that $m \ne 0$ in $\mathbf{F}_p$), and otherwise the summand is $0$. This means that if $p$ does not divide $m$, then the exponent of $P$ is $m-1$ (the behavior in this case is identical to that in characteristic $0$), and otherwise the exponent of $P$ is at least $m$. In that latter case, since the exponent of $P$ in $A$ is $m$, its exponent in $T$ is $m$ also, and we obtain
\[ T = \prod_{p\nmid i}A_i^{i-1} \prod_{p|i}A_i^i. \]
This means that contrary to the previous case where the exponents of all factors were decreased by $1$, in this case only the factors whose exponent is not a multiple of $p$ have their exponents decreased, and the others are unchanged. We can see that when we try to run the previous algorithm, only the factors whose exponent is not a multiple of $p$ are accounted for:

sage: sqfreefactz( (x+2)^4 * (x+1)^5)
[(1, 1), (1, 2), (1, 3), (x + 2, 4)]
sage: sqfreefactz( (x+2)^4 * (x+1)^5 * (x+3)^7 * (x+4)^15)
[(1, 1), (1, 2), (1, 3), (x + 2, 4), (1, 5), (1, 6), (x + 3, 7)]

Since in this case the algorithm is more complicated, we describe it more formally. As above we contruct two sequences of polynomials $(T_k)$ and $(V_k)$. We let $T_1 = T = \gcd(A,A’)$, and $V_1 = A/T = \prod_{p\nmid i} A_i$. The subsequent terms are defined by induction as $V_{k+1} = \gcd(T_k,V_k)$ if $p\nmid k$, $V_{k+1} = V_k$ if $p|k$, and $T_{k+1} = T_k/V_{k+1}$. It is easily checked by induction that
\[ V_k = \prod_{i\ge k,\ p\nmid i} A_i, \]
and that
\[ T_k = \prod_{i>k,\ p\nmid i} A_i^{i-k} \prod_{p|i} A_i^i. \]

As before, we have $A_k = V_{k+1}/V_k$ if $p\nmid k$. How can we obtain the others? If we continue as above until $V_k$ becomes constant, in the end the left factor of $T_k$ becomes constant also, which means that we have $T_k = \prod_{p|i} A_i^i$. Since all the exponents of $T_k$ are multiples of $p$, we have $T_k = W^p$ for some $W$. It is then easy to recover $W$ (divide all exponents by $p$), and we can apply the algorithm recursively to obtain the squarefree factorisation of $W$. We then multiply back the exponents by $p$ to obtain the exponents in $A$, and this finally gives the following algorithm (with an added test to discard unit factors):

def sqfreefact(A, pmult=0):
    p = A.base_ring().characteristic()
    x = A.variables()[0]
    T = gcd(A, A.derivative())
    factors = []
    k = 1
    Tk = T
    Vk = A//T
    while Vk.degree() > 0:
        if k % p != 0:
            Vkplus1 = gcd(Tk, Vk)
            if Vk//Vkplus1 != 1:
                factors.append((Vk//Vkplus1, p^(pmult)*k))
        else:
            Vkplus1 = Vk
        Tkplus1 = Tk//Vkplus1
        k = k+1
        Vk = Vkplus1
        Tk = Tkplus1
    # Now Tk = W^p, so we recover W and continue (unless T is
    # actually constant).
    if Tk.degree() == 0:
        return factors
    newA = 0*x
    for i in range(Tk.degree()//p + 1):
        newA = newA + (Tk.coeffs()[p*i])*x^i
    return factors + sqfreefact(newA, pmult+1)

which works as expected:

sage: sqfreefact((x+2)^4 * x^4 * (x+1)^5 * (x+3)^7 * (x+4)^15)
[(x^2 + 2*x, 4), (x + 3, 7), (x + 1, 5), (x + 4, 15)]

One thought on “Polynomial factorisation over finite fields, part 1: Squarefree factorisation

  1. Divya Grover

    Hey,

    Thanks for this wonderful post !! This could not have been more intuitively explained.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *