# Finitely generated modules over a P.I.D. and the Smith Normal Form

I’m teaching Graduate Algebra this semester, and I wanted to record here the proof I gave in class of the (existence part of the) structure theorem for finitely generated modules over a PID. It’s a standard argument, based on the existence of the Smith Normal Form for a matrix with entries in a PID, but it’s surprisingly hard to find a concise and accessible reference.

We assume familiarity with basic definitions in the theory of modules over a (commutative) ring. Our goal is to prove the following:

Theorem: Let $R$ be a principal ideal domain and let $M$ be a finitely generated $R$-module. Then $M$ is isomorphic to a direct sum of cyclic $R$-modules. More precisely, there are a non-negative integer $r$ (called the rank of $M$) and elements $d_1,\ldots,d_n \in R$ (called the invariant factors of $M$) such that $d_i \mid d_{i+1}$ for all $i=1,\ldots,n-1$ and $M \cong R^r \oplus R/(d_1) \oplus R/(d_2) \oplus \cdots \oplus R/(d_n)$.

Remark: The rank of $M$ is uniquely determined, and the invariant factors of $M$ are uniquely determined up to multiplication by a unit. There are many references for this uniqueness assertion, see for example Section 12.1 in Dummit and Foote’s “Abstract Algebra”. Alternatively, see my next blog post on Fitting ideals for a proof of a more general assertion.

The proof of the theorem will use the following two results, which will be established later:

Proposition 1: Let $R$ be a PID and let $M$ be a finitely generated free $R$-module of rank $m \geq 0$ (i.e., $M \cong R^m$ ). Then any $R$-submodule of $M$ is free of rank at most $m$.

Proposition 2: Let $R$ be a PID, let $n \leq m$ be nonnegative integers, and let $A$ be an $m \times n$ matrix with entries in $R$. Then there exist an invertible $m \times m$ matrix $B$ and an invertible $n \times n$ matrix $C$, both with coefficients in $R$, such that $\tilde{A} := BAC$ is in Smith Normal Form. (This means that the last $m-n$ rows of $\tilde{A}$ are zero, and the square submatrix given by the first $n$ rows is diagonal, with diagonal entries (from top left to bottom right) $d_1,\ldots,d_n \in R$ such that $d_i \mid d_{i+1}$ for all $i=1,\ldots,n-1$. By abuse of terminology, we call $d_1,\ldots,d_n$ the invariant factors of $A$.)

Assuming these two results, we show how to prove the main theorem.

Proof of the Theorem: Suppose $M$ is generated by $x_1,\ldots,x_m$. Then the map $\phi : R^m \to M$ given by $(r_1,\ldots,r_m) \mapsto r_1x_1 + \cdots + r_m x_m$ is a surjective $R$-module homomorphism, hence (by the first isomorphism theorem) $M \cong R^m / N$ for some $R$-submodule $N$ of $R^m$. By Proposition 1, $N$ is free of rank $n \leq m$, i.e., $N \cong R^n$.

Let $e_1,\ldots,e_m$ be an ordered basis for $R^m$ and let $f_1,\ldots,f_n$ be an ordered basis for $N$. Write $f_j = \sum_i a_{ij} e_i$ with $A = (a_{ij})$. Let $\tilde{A} = BAC$ be the Smith Normal Form of $A$, and let $d_1,\ldots,d_n$ be the invariant factors of $A$. Then there exist ordered bases $e'_1,\ldots,e'_m$ for $R^m$ and $f'_1,\ldots,f'_n$ for $N$ (corresponding to the change-of-basis matrices $B^{-1}$ and $C$, respectively) such that $f'_i = d_i \cdot e'_i$ for $i = 1,\ldots,n$ (exercise).

In other words, we have $M \cong R^m / N \cong (Re'_1 \oplus \cdots \oplus Re'_m) / (Rd_1e'_1 \oplus \cdots \oplus Rd_n e'_n \oplus (0) \oplus \cdots \oplus (0)).$

But this latter quotient module is easily verified to be isomorphic to $R/(d_1) \oplus \cdots \oplus R/(d_n) \oplus R^{m-n}.$ Q.E.D.
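As a concrete illustration over $R = {\mathbb Z}$, the whole pipeline can be carried out by computer. Here is a minimal sketch using sympy's `smith_normal_form` (the matrix $A$ below is our own example, not taken from the text):

```python
# Columns of A generate a submodule N of Z^2, so M = Z^2 / N is the cokernel.
from sympy import Matrix
from sympy.matrices.normalforms import smith_normal_form

A = Matrix([[2, 4],
            [6, 8]])

S = smith_normal_form(A)              # diagonal, with d_1 | d_2
d = [abs(S[i, i]) for i in range(2)]  # invariant factors, up to units
# By the theorem, M = Z^2 / A(Z^2) is isomorphic to Z/(d[0]) + Z/(d[1]).
```

Here the change-of-basis matrices $B$ and $C$ are absorbed into the computation; only the diagonal matrix $\tilde{A}$ is returned.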

Proof of Proposition 1: Without loss of generality we may assume that $M = R^m$. The proof is by induction on $m$.

The case $m=1$ is true because then $M=R$ and $N$ is an ideal in $R$, which is principal since $R$ is a PID. Thus $N = R\cdot a$ for some $a \in R$. If $a = 0$ then $N$ is free of rank zero, and if $a \neq 0$ then the map $R \to N$ given by $r \mapsto ra$ is an $R$-module isomorphism (since $R$ is a domain), so $N$ is free of rank 1.

For the inductive step, let $\pi : R^m \to R$ be the $R$-module homomorphism given by projection onto the last coordinate. Then ${\rm ker}(\pi)$ is isomorphic to $R^{m-1}$. If $N$ is a submodule of $M$, we define $N' = N \cap {\rm ker}(\pi)$. By induction, $N'$ is free of rank at most $m-1$.

In addition, $\pi(N)$ is an ideal of $R$, hence $\pi(N) = R \cdot x$ for some $x \in R$. Choose $y \in N$ with $\pi(y)=x$ and let $N'' = R \cdot y$. If $x = 0$ then $N = N'$ and we’re done. So we may assume, without loss of generality, that $x \neq 0$.

Claim: $N = N' \oplus N''$.

Given the claim, it follows (since $N'$ is free of rank at most $m-1$ and $N''$ is free of rank 1) that $N$ is free of rank at most $m$, which completes the inductive step and proves the proposition.

To prove the claim, it suffices to show that (a) $N = N' + N''$ and (b) $N' \cap N'' = (0)$. For (a), let $n \in N$ and choose $r \in R$ such that $\pi(n) = rx$. Then $\pi(n-ry)= \pi(n) - r \pi(y) = \pi(n) - rx = 0$, hence $n-ry \in N'$. Thus $n = n' + ry$ with $n' \in N'$ and $ry \in N''$, proving (a). For (b), apply $\pi$ to some element $n \in N' \cap N''$. Since $n \in N'$ we have $\pi(n)=0$. On the other hand, since $n \in N''$ we have $n = ry$ for some $r \in R$, and then $\pi(n) = r \pi(y) = rx$. Thus $rx = 0$, which implies (since $R$ is a domain) that $r = 0$, and thus $n=0$. Q.E.D.
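For $R = {\mathbb Z}$ and $m = 2$, the decomposition in the Claim is easy to check numerically. A minimal sketch in plain Python (the submodule generators and the helper `decompose` are our own example):

```python
# N is the submodule of Z^2 generated by (4, 6) and (2, 0);
# pi is projection onto the last coordinate, so ker(pi) = Z x {0}.
from math import gcd

y = (4, 6)       # a generator with pi(y) = x
x = gcd(6, 0)    # pi(N) is generated by pi of the generators: pi(N) = 6Z
# Here N' = N ∩ ker(pi) is generated by (2, 0).

def decompose(n):
    """Write n in N as n' + r*y with n' in N', as in part (a) of the claim."""
    r = n[1] // x                          # pi(n) = r * x
    n_prime = (n[0] - r * y[0], n[1] - r * y[1])
    assert n_prime[1] == 0                 # n' lies in ker(pi)
    return n_prime, r

n_prime, r = decompose((6, 6))             # (6, 6) = (4, 6) + (2, 0) is in N
```

Running `decompose((6, 6))` returns `((2, 0), 1)`, matching $n = n' + 1 \cdot y$.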

Proof of Proposition 2: Recalling that every PID is a UFD, define the “length” function $\ell : R \to {\mathbb N} \cup \{ +\infty \}$ by setting $\ell(0) = +\infty$ and $\ell(a) = t$ iff $a = u p_1 p_2 \cdots p_t$ with $u \in R^\times$ and $p_1,\ldots,p_t \in R$ non-zero prime (= irreducible) elements of $R$. In particular, $\ell(a)=0$ iff $a$ is a unit. It is easy to see, by unique factorization, that for all $a,b \in R$ we have:

(1) If $a \mid b$ then $\ell(a) \leq \ell(b)$.

(2) If $a \mid b$ and $\ell(a) = \ell(b)$ then $a$ and $b$ are associate, i.e., they differ by multiplication by some unit.
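Over $R = {\mathbb Z}$, the length of a nonzero integer is just the number of prime factors counted with multiplicity. A minimal sketch in plain Python (trial division; the function name `length` is ours):

```python
# length(a) = number of prime factors of a counted with multiplicity;
# the units +-1 have length 0, and length(0) = infinity.
import math

def length(a: int) -> float:
    if a == 0:
        return math.inf
    a, t, p = abs(a), 0, 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            t += 1
        p += 1
    return t + (1 if a > 1 else 0)   # any leftover factor is prime

# Property (1): 6 | 12 and length(6) = 2 <= 3 = length(12).
# Property (2): 12 | -12 with equal lengths, and indeed -12 = (-1) * 12.
```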

We say that two $m \times n$ matrices $A,A'$ with entries in $R$ are equivalent if there exist $B,C$ as above with $A' = BAC$.

We proceed by induction on $m$. The case $m=1$ is trivial, since a $1 \times 1$ (or $1 \times 0$) matrix is already in Smith Normal Form. For the inductive step, note that if $A = 0$ then $A$ is already in Smith Normal Form, so we may assume that $A = (a_{ij}) \neq 0$. Let $A' = (a'_{ij})$ be a matrix equivalent to $A$ such that $\ell(a'_{11})$ is minimal among the upper-left entries of all matrices equivalent to $A$.

If we swap two rows, or two columns, of any matrix we obtain an equivalent matrix (exercise). It follows easily that $\ell(a'_{11}) \leq \ell(a'_{ij})$ for all $i,j$.

Claim 1: $a'_{11} \mid a'_{1j}$ for all $j$.

By swapping columns as needed, it suffices to prove that $a'_{11} \mid a'_{12}$. Let $d$ be a generator for the ideal generated by $a'_{11}$ and $a'_{12}$. Then there exist $x,y \in R$ such that $a'_{11}x + a'_{12} y = d$. If $a'_{11}=d\alpha$ and $a'_{12}=d\beta$ with $\alpha,\beta \in R$ then $\alpha x + \beta y = 1$.

Define $A'' = A' \cdot C'$, where $C'$ is the $n \times n$ matrix which agrees with the identity matrix except in its upper-left $2 \times 2$ block:

$$C' = \begin{pmatrix} x & -\beta & & \\ y & \alpha & & \\ & & 1 & \\ & & & \ddots \end{pmatrix}.$$

(Note that $C'$ is invertible, since its determinant is $\alpha x + \beta y = 1$.)

Then $A'' = (a''_{ij})$ is equivalent to $A$ and $a''_{11} = d$. By the definition of $A'$, we must have $\ell(d) \geq \ell(a'_{11})$. But also $d \mid a'_{11}$, so that $\ell(d) \leq \ell(a'_{11})$ by (1) above. Thus $\ell(d) = \ell(a'_{11})$, hence by (2) $d$ and $a'_{11}$ are associate. But then $a'_{11} \mid a'_{12}$ as desired, proving Claim 1.
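Over ${\mathbb Z}$, the entries $x, y$ of $C'$ can be produced by the extended Euclidean algorithm, and the effect on a row beginning $(a, b)$ can be checked directly. A minimal sketch in plain Python (the helper names `ext_gcd` and `clear_pair` are ours):

```python
# Right-multiplying by the 2x2 block [[x, -beta], [y, alpha]] replaces a row
# beginning (a, b) with (d, 0), where d = gcd(a, b) = a*x + b*y,
# a = d*alpha, b = d*beta, and the determinant is alpha*x + beta*y = 1.
def ext_gcd(a, b):
    """Return (d, x, y) with a*x + b*y = d = gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    d, x, y = ext_gcd(b, a % b)
    return (d, y, x - (a // b) * y)

def clear_pair(a, b):
    d, x, y = ext_gcd(a, b)
    alpha, beta = a // d, b // d
    det = alpha * x + beta * y          # determinant of the block
    new_row = (a * x + b * y,           # = d
               -a * beta + b * alpha)   # = 0
    return d, det, new_row

d, det, row = clear_pair(12, 8)         # gcd(12, 8) = 4
```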

By a similar argument using left-multiplication by a suitable $m \times m$ matrix $B'$ instead of right-multiplication by $C'$, we obtain $a'_{11} \mid a'_{i1}$ for all $i$ (exercise).

By performing elementary row and column operations (which yield equivalent matrices), we arrive at a matrix $A^*$ equivalent to $A$ which has all entries in the first row and first column equal to zero, except for the upper-left entry which is equal to $d_1 := a'_{11}$.

Let $A^{**}$ be the $(m-1) \times (n-1)$ matrix obtained from $A^*$ by removing the first row and first column. By the inductive hypothesis, $A^{**}$ has a Smith Normal Form with diagonal entries $d_2,\ldots,d_n \in R$ such that $d_i \mid d_{i+1}$ for all $i=2,\ldots,n-1$.

Claim 2: $d_1 \mid d_2$.

To see this, replace the block $A^{**}$ inside $A^*$ with its Smith Normal Form, and then add the second row of the resulting matrix to the first row. This yields a matrix equivalent to $A$ whose first two entries (in the first row) are $d_1$ and $d_2$. Applying the argument from the proof of Claim 1 (and recalling that $d_1 = a'_{11}$) gives $d_1 \mid d_2$, proving Claim 2.

Proposition 2 now follows, since Claim 2 implies that the matrix $\tilde{A}$ given by replacing the block $A^{**}$ inside $A^*$ with its Smith Normal Form is itself in Smith Normal Form. Q.E.D.

## Concluding remarks

(1) One can prove the main theorem without Proposition 1 by first showing instead that if $R$ is a noetherian ring (e.g., a PID) then any submodule of a finitely generated module is finitely presented. This implies that for any finitely generated $R$-module $M$, there is an $m \times n$ matrix $A$ with coefficients in $R$ such that $M$ is isomorphic to $R^m / A(R^n)$. The rest of the proof now proceeds more or less as above, with the only differences being that the matrix $A$ no longer necessarily has rank $n$ and we don’t know that $n \leq m$ (but these facts are not really crucial in the above argument).

(2) If $R$ is a Euclidean domain (which implies that $R$ is a PID, but not conversely), we can get any $m \times n$ matrix $A$ as above into Smith Normal Form using just elementary row and column operations. To see this, imitate the proof above but use the Euclidean norm $\| a \|$ in place of the length function $\ell$. In our proof of Claim 1, we had to right-multiply by $C'$, which doesn’t correspond in general to an elementary column operation. However, if $R$ is Euclidean then we can instead write $a'_{12} = q a'_{11} + r$ with $r = 0$ or $\| r \| < \| a'_{11} \|$. If we add $-q$ times column 1 of $A'$ to column 2 and then swap columns 1 and 2, we obtain a matrix equivalent to $A$ whose upper-left entry is $r$. This contradicts the choice of $A'$ unless $r=0$, which means that $a'_{11} \mid a'_{12}$ as claimed. (An example of a PID which is not a Euclidean domain is ${\mathbb Z}[\frac{1 + \sqrt{-19}}{2}]$, see e.g. this paper.) Note that this proof can easily be turned into an algorithm for computing the Smith Normal Form of a matrix over a Euclidean domain.
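The Euclidean version of the algorithm is short to implement over ${\mathbb Z}$. A minimal sketch in plain Python (the function name `smith_z` is ours); it uses only row/column swaps, adding an integer multiple of one row or column to another, and multiplying a row by the unit $-1$:

```python
def smith_z(A):
    """Smith Normal Form of an integer matrix via elementary operations."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    k = 0
    while k < min(m, n):
        # Move a nonzero entry of minimal absolute value to position (k, k).
        pivots = [(abs(A[i][j]), i, j) for i in range(k, m)
                  for j in range(k, n) if A[i][j] != 0]
        if not pivots:
            break                              # trailing block is zero
        _, i, j = min(pivots)
        A[k], A[i] = A[i], A[k]                # swap rows k and i
        for row in A:
            row[k], row[j] = row[j], row[k]    # swap columns k and j
        # Euclidean reduction of column k and row k by the pivot.
        dirty = False
        for i in range(k + 1, m):
            q = A[i][k] // A[k][k]
            A[i] = [a - q * b for a, b in zip(A[i], A[k])]
            dirty = dirty or A[i][k] != 0
        for j in range(k + 1, n):
            q = A[k][j] // A[k][k]
            for i in range(m):
                A[i][j] -= q * A[i][k]
            dirty = dirty or A[k][j] != 0
        if dirty:
            continue                           # remainders left: retry with a smaller pivot
        # Enforce d_k | (every remaining entry), as in Claim 2.
        bad = [(i, j) for i in range(k + 1, m) for j in range(k + 1, n)
               if A[i][j] % A[k][k] != 0]
        if bad:
            i, _ = bad[0]
            A[k] = [a + b for a, b in zip(A[k], A[i])]   # fold row i into row k
            continue
        if A[k][k] < 0:
            A[k] = [-a for a in A[k]]          # normalize by the unit -1
        k += 1
    return A
```

For example, `smith_z([[2, 0], [0, 3]])` returns `[[1, 0], [0, 6]]`.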

(3) The idea behind the proof of uniqueness of the rank and invariant factors given in Dummit and Foote is as follows. The rank of $M$ can be recovered as the maximal number of $R$-linearly independent elements of $M$. By the Chinese Remainder Theorem, if $a = p_1^{k_1} \cdots p_t^{k_t}$ in $R$ with the $p_i$ distinct nonzero prime elements, then $R / aR \cong R / (p_1^{k_1}) \oplus \cdots \oplus R / (p_t^{k_t}).$ Thus we can write $M$ as a direct sum of $R^r$ and some number of cyclic $R$-modules, each isomorphic to $R / (p^{k})$ for some $p$ and some $k$. The nonzero prime-powers $p^{k}$ which occur are called the elementary divisors of $M$; they are defined only up to associates. One can recover the invariant factors from the elementary divisors (exercise), so it suffices to prove that the elementary divisors are unique up to reordering and multiplying by units. For this, one shows that if $p$ is a nonzero prime element, we have $M[p^{k+1}]/M[p^k] \cong \left(R/(p)\right)^s$, where $s = s_k(p)$ is the number of elementary divisors $p^j$ with $j > k$. It follows easily that for each $p$ and $k$, the number of elementary divisors of the form $p^k$ is uniquely determined by $M$.

For a different proof, which works directly with the invariant factors without first converting to elementary divisors, see my next blog post on Fitting ideals.
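The conversion from elementary divisors back to invariant factors (the exercise mentioned above) amounts to grouping prime powers greedily. A minimal sketch in plain Python (inputs are (prime, exponent) pairs; the function name `invariant_factors` is ours):

```python
# Recover the invariant factors from the elementary divisors: group the prime
# powers by prime, sort exponents in decreasing order, and let the j-th largest
# invariant factor absorb the j-th largest power of each prime.
from collections import defaultdict

def invariant_factors(elem_divisors):
    """elem_divisors: list of (p, k) pairs representing prime powers p**k."""
    by_prime = defaultdict(list)
    for p, k in elem_divisors:
        by_prime[p].append(k)
    for exps in by_prime.values():
        exps.sort(reverse=True)
    n = max(len(exps) for exps in by_prime.values())
    ds = []
    for j in range(n):                 # build d_n, d_{n-1}, ... in turn
        d = 1
        for p, exps in by_prime.items():
            if j < len(exps):
                d *= p ** exps[j]
        ds.append(d)
    return ds[::-1]                    # ascending, so ds[i] | ds[i+1]

# Elementary divisors 2^2, 2, 3^2, 3  ->  invariant factors 6, 36.
facs = invariant_factors([(2, 2), (2, 1), (3, 2), (3, 1)])
```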

(4) An integral domain $R$ is called a Bézout domain if every finitely generated ideal is principal. It is an open question whether every matrix over a Bézout domain $R$ is equivalent to a matrix in Smith Normal Form. (The ring of all algebraic integers is an example of a Bézout domain which is not a PID.)

(5) Two $n \times n$ matrices $A$ and $A'$ over a field $F$ are similar over $F$ if and only if the characteristic matrices $xI-A$ and $xI-A'$ have the same Smith Normal Form over $F[x]$. This is a well-known consequence of the structure theorem for finitely generated modules over a PID. Note that the minimal polynomial of $A$ is equal to the largest invariant factor of $xI-A$ and the characteristic polynomial is the product of all the invariant factors of $xI-A$.

(6) Richard Stanley has written this survey paper about applications of the Smith Normal Form to combinatorics, one of which concerns the Jacobian group (a.k.a. the critical group) of a graph.

(7) The Smith Normal Form is named for Henry John Stephen Smith (1826-1883), an Irish mathematician who also discovered a version of the Cantor set before Cantor did.

## 3 thoughts on “Finitely generated modules over a P.I.D. and the Smith Normal Form”

1. Antoine Chambert-Loir says:

Dear Matt, I agree that it’s hard to find a convenient reference for this. I spent a lot of time writing something for my “(Mostly) commutative algebra” book that I would myself find reasonable. In the end, I gave many proofs… Best, Antoine

â£– UniversitÃ© de Paris â Diderot Institut de mathÃ©matiques de Jussieu Paris Rive Gauche BÃ¢timent Sophie Germain, 8 place AurÃ©lie Nemours, 75205 Paris cedex 13