I’m teaching Graduate Algebra this semester, and I wanted to record here the proof I gave in class of the (existence part of the) structure theorem for finitely generated modules over a PID. It’s a standard argument, based on the existence of the Smith Normal Form for a matrix with entries in a PID, but it’s surprisingly hard to find a concise and accessible reference.
We assume familiarity with basic definitions in the theory of modules over a (commutative) ring. Our goal is to prove the following:
Theorem: Let be a principal ideal domain and let be a finitely generated -module. Then is isomorphic to a direct sum of cyclic -modules. More precisely, there are a non-negative integer (called the rank of ) and elements (called the invariant factors of ) such that for all and .
Remark: The rank of is uniquely determined, and the invariant factors of are uniquely determined up to multiplication by a unit. There are many references for this uniqueness assertion, see for example Section 12.1 in Dummit and Foote’s “Abstract Algebra”. Alternatively, see my next blog post on Fitting ideals for a proof of a more general assertion.
The proof of the theorem will use the following two results, which will be established later:
Proposition 1: Let be a PID and let be a finitely generated free -module of rank (i.e., ). Then any -submodule of is free of rank at most .
Proposition 2: Let be a PID, let be nonnegative integers, and let be an matrix with entries in . Then there exist an invertible matrix and an invertible matrix , both with coefficients in , such that is in Smith Normal Form. (This means that the last rows of are zero, and the square submatrix given by the first rows is diagonal, with diagonal entries (from top left to bottom right) such that for all . By abuse of terminology, we call the invariant factors of .)
Assuming these two results, we show how to prove the main theorem.
Proof of the Theorem: Suppose is generated by . Then the map given by is a surjective $R$-module homomorphism, hence (by the first isomorphism theorem) for some -submodule of . By Proposition 1, is free of rank , i.e., .
Let be an ordered basis for and let be an ordered basis for . Write with . Let be the Smith Normal Form of , and let be the invariant factors of . Then there exist ordered bases for and for (corresponding to the change-of-basis matrices and , respectively) such that for (exercise).
In other words, we have
But this latter quotient module is easily verified to be isomorphic to Q.E.D.
Proof of Proposition 1: Without loss of generality we may assume that . The proof is by induction on .
The case is true because then and is an ideal in , which is principle since is a PID. Thus for some . If then is free of rank zero, and if then the map given by is an -module isomorphism (since is a domain), so is free of rank 1.
For the inductive step, let be the -module homomorphism given by projection onto the last coordinate. Then is isomorphic to . If is a submodule of , we define . By induction, is free of rank at most .
In addition, is an ideal of , hence for some . Choose with and let . If then and we’re done. So we may assume, without loss of generality, that .
Given the claim, it follows (since is free of rank at most and is free of rank 1) that is free of rank at most , which proves the theorem.
To prove the claim, it suffices to show that (a) and (b) . For (a), let and choose such that . Then , hence . Thus with and , proving (a). For (b), apply to some element . Since we have . On the other hand, since we have for some , and then . Thus , which implies (since is a domain) that , and thus . Q.E.D.
Proof of Proposition 2: Recalling that every PID is a UFD, define the “length” function by setting and iff with and non-zero prime (= irreducible) elements of . In particular, iff is a unit. It is easy to see, by unique factorization, that for all we have:
(1) If then .
(2) If and then and are associate, i.e., they differ by multiplication by some unit.
We say that two matrices with entries in are equivalent if there exist as above with .
We proceed by induction on . The case is trivial, and we may assume without loss of generality that . Let be a matrix equivalent to such that is minimal among all matrices equivalent to .
If we swap two rows, or two columns, of any matrix we obtain an equivalent matrix (exercise). It follows easily that for all .
Claim 1: for all .
By swapping columns as needed, it suffices to prove that . Let be a generator for the ideal generated by and . Then there exist such that . If and with then .
Define , where is the following matrix:
(Note that is invertible, since its determinant is .)
Then is equivalent to and . By the definition of , we must have . But also , so that by (1) above. So , hence by (2) and are associate. But then as desired, proving Claim 1.
By a similar argument using left-multiplication by a suitable matrix instead of right-multiplication by , we obtain for all (exercise).
By performing elementary row and column operations (which yield equivalent matrices), we arrive at a matrix equivalent to which has all entries in the first row and first column equal to zero, except for the upper-left entry which is equal to .
Let be the matrix obtained from by removing the first row and first column. By induction on , has a Smith Normal Form with diagonal entries such that for all .
Claim 2: .
To see this, add the second row of to the first row. The resulting matrix is equivalent to and its first two entries are and . Applying the proof of Claim 1 (and recalling that ) gives , proving Claim 2.
Proposition 2 now follows, since Claim 2 implies that the matrix given by replacing the block inside with its Smith Normal Form is itself in Smith Normal Form. Q.E.D.
(1) One can prove the main theorem without Proposition 1 by first showing instead that if is a noetherian ring (e.g., a PID) then any submodule of a finitely generated module is finitely presented. This implies that for any finitely generated -module , there is an matrix with coefficients in such that is isomorphic to . The rest of the proof now proceeds more or less as above, with the only differences being that the matrix no longer necessarily has rank and we don’t know that (but these facts are not really crucial in the above argument).
(2) If is a Euclidean domain (which implies that is a PID, but not conversely), we can get any matrix as above into Smith Normal Form using just elementary row and column operations. To see this, imitate the proof above but use the Euclidean norm in place of the length function . In our proof of Claim 1, we had to right-multiply by , which doesn’t correspond in general to an elementary column operation. However, if is Euclidean then we can instead write with or . If we add times column 1 of to column 2 and then swap columns 1 and 2, we obtain a matrix equivalent to whose upper-left entry is . This contradicts the choice of unless , which means that as claimed. (An example of a PID which is not a Euclidean domain is , see e.g. this paper.) Note that this proof can easily be turned into an algorithm for computing the Smith Normal Form of a matrix over a Euclidean domain.
(3) The idea behind the proof of uniqueness of the rank and invariant factors given in Dummit and Foote is as follows. The rank of can be recovered as the maximal number of -linearly independent elements of . By the Chinese Remainder Theorem, if in with the distinct nonzero prime elements, then Thus we can write as a direct sum of and some number of cyclic -modules, each isomorphic to for some and some . The nonzero prime-powers which occur are called the elementary divisors of ; they are defined only up to associates. One can recover the invariant factors from the elementary divisors (exercise), so it suffices to prove that the elementary divisors are unique up to reordering and multiplying by units. For this, one shows that if is a nonzero prime element, we have , where is the number of elementary divisors with . It follows easily that for each and , the number of elementary divisors of the form is uniquely determined by .
For different proof, which works directly with the invariant factors without first converting to elementary divisors, see my next blog post on Fitting ideals.
(4) An integral domain is called a Bézout domain if every finitely generated ideal is principal. It is an open question whether every matrix over a Bézout domain is equivalent to a matrix in Smith Normal Form. (The ring of all algebraic integers is an example of a Bézout domain which is not a PID.)
(5) Two matrices and over a field are similar over if and only if the characteristic matrices and have the same Smith Normal Form over . This is a well-known consequence of the structure theorem for finitely generated modules over a PID. Note that the minimal polynomial of is equal to the largest invariant factor of and the characteristic polynomial is the product of all the invariant factors of .
(6) Richard Stanley has written this survey paper about applications of the Smith Normal Form to combinatorics, one of which concerns the Jacobian group (a.k.a. the critical group) of a graph.