# Polynomial factorisation over finite fields, part 1: Squarefree factorisation

(Continues this post.)

We wish to factor a polynomial $A$ with coefficients in the finite field $\mathbf{F}_p$ where $p$ is prime. As we stated in the previous post, the first step is to obtain a squarefree factorisation of $A$. Namely, we wish to obtain polynomials $A_1,A_2,\dots,A_k$, which are all squarefree and relatively prime, and such that $A = A_1A_2^2\dots A_k^k$. Ultimately, $A_i$ will be precisely the product of all the irreducible factors of $A$ with exponent $i$, which will ensure that all the above conditions are satisfied. We first need some preliminary results.

### Derivative of a polynomial

It is well-known from calculus that the derivative of $X^n$ is $nX^{n-1}$, and that the derivative is linear in respect to addition of functions and multiplication by a scalar. This allows us to define the derivative of the polynomial $A = \sum_{i=0}^n a_iX^i$ as $A’ = \sum_{i=1}^n ia_iX^{i-1}$. In calculus, the derivative was defined using limits, distances and suchlike. We cannot use this definition when working over a finite field, because we cannot define the distance between two elements in a sensible way. However, the final expression of the derivative of a polynomial still makes sense over a finite field (or any ring, for that matter), since it involves only the basic ring operations of addition and multiplication.

So we just define the derivative of a polynomial with coefficients in $\mathbf{F}_p$ as in the expression above. Our derivative still has the familiar properties:

• Linearity: for all $f,g \in \mathbf{F}_p[X]$ and $a \in \mathbf{F}_p$, we have $(f+g)’ = f’+g’$, and $(af)’ = af’$.
• Product rule: for all $f,g \in \mathbf{F}_p[X]$, we have $(fg)’ = f’g + fg’$. This formula generalises by induction to any number of factors:
$\left( \prod_i f_i \right)’ = \sum_i \left( f_i’ \prod_{j \ne i} f_j \right).$
• Power rule: for all $f \in \mathbf{F}_p[X]$ and integers $n > 0$, we have $(f^n)’ = nf’f^{n-1}$.

## A simpler case: in characteristic $0$

To better grasp the idea of the squarefree factorisation algorithm, it is worthwhile to first consider the simpler case of a ring of characteristic $0$, such as $\mathbf{Z}$. The key is to consider the GCD of $A$ and $A’$. The reader might be aware from study of polynomials over $\mathbf{R}$ that in that case, a factor $P$ of $A$ is also a factor of $A’$ if and only if its exponent (in the decomposition of $A$) is more than $1$. This remains true in general over any ring of characteristic $0$.

More precisely, if $A = \prod P_i^{e_i}$, then $A’ = Q\prod P_i^{e_i-1}$, where none of the $P_i$ divide $Q$. We thus obtain finally that $T = \gcd(A,A’) = \prod P_i^{e_i-1}$; in other words, $T$ is the polynomial whose decomposition is the same as $A$, but with all exponents decreased by $1$. In particular, this means that all the factors of $A$ which had exponent $1$ have disappeared and are not factors of $T$. This means that we know how to obtain the factors of exponent $1$ (which constitute $A_1$ in our squarefree factorisation): we must “isolate” all the factors which are in $A$ but not in $T$.

We thus compute $V = A/T = \prod P_i$, which is the product of all factors of $A$, but each with exponent $1$, and $V_2 = \gcd(T,V)$, which is the product of all factors of $A$ which are also in $T$ (i.e., which have exponent more than $1$ in $A$). Since we want the factors which are not in $T$, we obtain finally $A_1 = V/V_2$. If we then let $T_2 = T/V_2$, this is the polynomial whose decomposition is the same as $A$, but with all exponents decreased by $2$, and so $A_2$ is the product of factors which are in $T$ but not in $T_2$. Continuing this process until $V_k$ is constant (which means all the factors have been accounted for), we obtain the following algorithm (in Sage):

def sqfreefactz(A):
factors = []
T = gcd(A, A.derivative())
k = 1
Tk = T
Vk = A//T
while Vk.degree() > 0:
Vkplus1 = gcd(Tk, Vk)
Tkplus1 = Tk//Vkplus1
factors.append((Vk//Vkplus1, k))
k = k+1
Vk = Vkplus1
Tk = Tkplus1
return factors


which works as expected:

sage: R.<x> = PolynomialRing(ZZ)
sage: sqfreefactz( (x+1) * (x+2)^3 * (x+3)^3 * (x+4)^4 * (x+10)^10 )
[(x + 1, 1),
(1, 2),
(x^2 + 5*x + 6, 3),
(x + 4, 4),
(1, 5),
(1, 6),
(1, 7),
(1, 8),
(1, 9),
(x + 10, 10)]

It does not handle non-monic polynomials:

sage: sqfreefactz( 100*(x+1) * (x+2)^3 * (x+3)^3 * (x+4)^4 * (x+10)^10 )
[(x + 1, 1),
(1, 2),
(x^2 + 5*x + 6, 3),
(x + 4, 4),
(1, 5),
(1, 6),
(1, 7),
(1, 8),
(1, 9),
(x + 10, 10)]


but since we eventually want to work over a field, this is not a concern.

### Over $\mathbf{F}_p$

Over a finite field (or in general any field of non-zero characteristic), this algorithm does not work, as we can easily see:

sage: R.<x> = PolynomialRing(GF(5))
sage: sqfreefactz( (x+1)^5 )
[]

What went wrong? The previous algorithm was based on the fact that if $A = \prod P_i^{e_i}$, then $\gcd(A,A’) = \prod P_i^{e_i-1}$. This is no longer true in non-zero characteristic, for example with $A = (X+1)^5 \in \mathbf{F}_5[X]$ as above, we have $A’ = 0$ and so $\gcd(A,A’) = A$ (and of course in the previous case we had $\gcd(A,A’) \ne A$). So we must determine the correct expression for $\gcd(A,A’)$, which in this case is more complicated.

We start from the squarefree factorisation $A = \prod A_i^i$, where the $A_i$ are all squarefree and relatively prime, and we have
$A’ = \sum_i \left(\prod_{j \ne i} A_j^j\right) iA_i’A_i^{i-1}.$
We want to obtain the factorisation of $T = \gcd(A,A’)$. Let $P$ be an irreducible factor of $T$. Then $P$ is an irreducible factor of $A$, and so it is a factor of $A_m$ for some $m$ (note that $m$ is unique, since the $A_i$ are all relatively prime), and its exponent in the factorisation of $A$ is $m$.

To obtain its exponent in the factorisation of $A’$, we determine its exponent in the factorisation of each summand in the expression above. For all summands $i \ne m$, there is a factor $A_m^m$, so the exponent of $P$ is at least $m$. For the summand $i=m$, $P$ divides none of the $A_j$ (since all the $A_i$ are relatively prime), and it also does not divide $A_m’$ (since $A_m$ is squarefree). Thus the exponent of $P$ in the summand is exactly the exponent of $P$ in $mA_m^{m-1}$, and we obtain finally that the exponent is $m-1$ if $p$ does not divide $m$ (meaning that $m \ne 0$ in $\mathbf{F}_p$), and otherwise the summand is $0$. This means that if $p$ does not divide $m$, then the exponent of $P$ is $m-1$ (the behavior in this case is identical to that in characteristic $0$), and otherwise the exponent of $P$ is at least $m$. In that latter case, since the exponent of $P$ in $A$ is $m$, its exponent in $T$ is $m$ also, and we obtain
$T = \prod_{p\nmid i}A_i^{i-1} \prod_{p|i}A_i^i.$
This means that contrary to the previous case where the exponents of all factors were decreased by $1$, in this case only the factors whose exponent is not a multiple of $p$ have their exponents decreased, and the others are unchanged. We can see that when we try to run the previous algorithm, only the factors whose exponent is not a multiple of $p$ are accounted for:

sage: sqfreefactz( (x+2)^4 * (x+1)^5)
[(1, 1), (1, 2), (1, 3), (x + 2, 4)]
sage: sqfreefactz( (x+2)^4 * (x+1)^5 * (x+3)^7 * (x+4)^15)
[(1, 1), (1, 2), (1, 3), (x + 2, 4), (1, 5), (1, 6), (x + 3, 7)]


Since in this case the algorithm is more complicated, we describe it more formally. As above we contruct two sequences of polynomials $(T_k)$ and $(V_k)$. We let $T_1 = T = \gcd(A,A’)$, and $V_1 = A/T = \prod_{p\nmid i} A_i$. The subsequent terms are defined by induction as $V_{k+1} = \gcd(T_k,V_k)$ if $p\nmid k$, $V_{k+1} = V_k$ if $p|k$, and $T_{k+1} = T_k/V_{k+1}$. It is easily checked by induction that
$V_k = \prod_{i\ge k,\ p\nmid i} A_i,$
and that
$T_k = \prod_{i>k,\ p\nmid i} A_i^{i-k} \prod_{p|i} A_i^i.$

As before, we have $A_k = V_{k+1}/V_k$ if $p\nmid k$. How can we obtain the others? If we continue as above until $V_k$ becomes constant, in the end the left factor of $T_k$ becomes constant also, which means that we have $T_k = \prod_{p|i} A_i^i$. Since all the exponents of $T_k$ are multiples of $p$, we have $T_k = W^p$ for some $W$. It is then easy to recover $W$ (divide all exponents by $p$), and we can apply the algorithm recursively to obtain the squarefree factorisation of $W$. We then multiply back the exponents by $p$ to obtain the exponents in $A$, and this finally gives the following algorithm (with an added test to discard unit factors):

def sqfreefact(A, pmult=0):
p = A.base_ring().characteristic()
x = A.variables()
T = gcd(A, A.derivative())
factors = []
k = 1
Tk = T
Vk = A//T
while Vk.degree() > 0:
if k % p != 0:
Vkplus1 = gcd(Tk, Vk)
if Vk//Vkplus1 != 1:
factors.append((Vk//Vkplus1, p^(pmult)*k))
else:
Vkplus1 = Vk
Tkplus1 = Tk//Vkplus1
k = k+1
Vk = Vkplus1
Tk = Tkplus1
# Now Tk = W^p, so we recover W and continue (unless T is
# actually constant).
if Tk.degree() == 0:
return factors
newA = 0*x
for i in range(Tk.degree()//p + 1):
newA = newA + (Tk.coeffs()[p*i])*x^i
return factors + sqfreefact(newA, pmult+1)


which works as expected:

sage: sqfreefact((x+2)^4 * x^4 * (x+1)^5 * (x+3)^7 * (x+4)^15)
[(x^2 + 2*x, 4), (x + 3, 7), (x + 1, 5), (x + 4, 15)]


## 1 thought on “Polynomial factorisation over finite fields, part 1: Squarefree factorisation”

1. Divya Grover

Hey,

Thanks for this wonderful post !! This could not have been more intuitively explained.

This site uses Akismet to reduce spam. Learn how your comment data is processed.