# Fitting ideals of modules

In my previous post, I presented a proof of the existence portion of the structure theorem for finitely generated modules over a PID based on the Smith Normal Form of a matrix. In this post, I’d like to explain how the uniqueness portion of that theorem is actually a special case of a more general result, called Fitting’s Lemma, which holds for arbitrary commutative rings.

We begin by proving that one can characterize the diagonal entries in the Smith Normal Form of a matrix $A$ over a PID in an intrinsic way by relating them to the GCD of the $k \times k$ minors of $A$ for all $k$. Actually, since the GCD isn’t defined for general rings, we will instead consider the ideal generated by the $k \times k$ minors (which makes sense for any ring, and is the ideal generated by the GCD in the case of a PID).

Throughout this post, $R$ will be a non-zero commutative ring with identity.

Determinantal ideals

Definition: Let $A$ be an $m \times n$ matrix with entries in $R$. For $1 \leq k \leq \min(m,n)$, we define the $k^{\rm th}$ determinantal ideal ${\mathcal D}_k(A)$ of $A$ to be the ideal generated by all of the $k \times k$ minors of $A$. By convention, we set ${\mathcal D}_k(A) = R$ if $k \leq 0$ and ${\mathcal D}_k(A) = 0$ if $k > \min(m,n)$.

By the Laplace expansion for determinants, the ideals ${\mathcal D}_k(A)$ form a decreasing nested sequence: $R = {\mathcal D}_0(A) \supseteq {\mathcal D}_1(A) \supseteq \cdots \supseteq (0).$

As mentioned above, we have:

Proposition 1: Suppose $R$ is a PID and $n \leq m$. Then ${\mathcal D}_k(A) = (d_1 d_2 \cdots d_k)$ for all $1 \leq k \leq n$, where $d_1,d_2,\ldots,d_n$ are the invariant factors of $A$ (the diagonal entries in the Smith Normal Form).

Thus, when $R$ is a PID, knowledge of the invariant factors of $A$ is equivalent to knowledge of the determinantal ideals.
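To make Proposition 1 concrete, here is a minimal sketch for $R = {\mathbb Z}$, where ${\mathcal D}_k(A)$ is generated by the GCD of the $k \times k$ minors. The helper names `det` and `det_ideal_gen` and the sample matrix are illustrative assumptions, not part of the original argument, and the brute-force enumeration of minors is only sensible for small matrices.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def det(M):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if not M:
        return 1  # empty determinant
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def det_ideal_gen(A, k):
    """Non-negative generator of D_k(A) for an integer matrix A."""
    m, n = len(A), len(A[0])
    if k <= 0:
        return 1  # convention: D_k(A) = R for k <= 0
    if k > min(m, n):
        return 0  # convention: D_k(A) = 0 for k > min(m, n)
    minors = (
        det([[A[i][j] for j in cols] for i in rows])
        for rows in combinations(range(m), k)
        for cols in combinations(range(n), k)
    )
    return reduce(gcd, minors, 0)


A = [[2, 4], [6, 8], [10, 12]]  # m = 3 rows, n = 2 columns
g = [det_ideal_gen(A, k) for k in range(3)]  # generators of D_0, D_1, D_2
invariant_factors = [g[k] // g[k - 1] for k in (1, 2)]
print(g, invariant_factors)  # [1, 2, 8] [2, 4]
```

Here ${\mathcal D}_1(A) = (2)$ and ${\mathcal D}_2(A) = (8)$, so Proposition 1 forces the invariant factors to be $d_1 = 2$ and $d_2 = 4$.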

Proposition 1 is a consequence of Lemma 2 below, which itself is a consequence of:

Lemma 1: Let $A$ be an $m \times n$ matrix with entries in a ring $R$, and suppose $B$ (resp. $C$) is an $m \times m$ (resp. $n \times n$) matrix over $R$. Then ${\mathcal D}_k(AC) \subseteq{\mathcal D}_k(A)$ and ${\mathcal D}_k(BA) \subseteq{\mathcal D}_k(A)$ for all $k$.

Proof: The columns of $AC$ are linear combinations of the columns of $A$. Since the determinant is multilinear, it follows that each $k \times k$ minor of $AC$ is an $R$-linear combination of $k \times k$ minors of $A$. Therefore ${\mathcal D}_k(AC) \subseteq{\mathcal D}_k(A)$ for all $k$. By a similar consideration involving rows, ${\mathcal D}_k(BA) \subseteq{\mathcal D}_k(A)$. Q.E.D.

Lemma 2: Let $A$ be an $m \times n$ matrix with entries in a ring $R$, and suppose $B$ (resp. $C$) is an invertible $m \times m$ (resp. $n \times n$) matrix over $R$. Let $A' = BAC$. Then ${\mathcal D}_k(A') = {\mathcal D}_k(A)$ for all $k$.

Proof: By Lemma 1, ${\mathcal D}_k(A') = {\mathcal D}_k(BAC) \subseteq{\mathcal D}_k(AC) \subseteq {\mathcal D}_k(A)$. Since $B$ and $C$ are invertible, we can write $A = B^{-1} A' C^{-1}$ and apply the same argument to obtain ${\mathcal D}_k(A) \subseteq {\mathcal D}_k(A')$. Thus ${\mathcal D}_k(A') = {\mathcal D}_k(A)$. Q.E.D.

In particular, performing elementary row operations on a matrix $A$ (adding a multiple of one row to another, permuting the rows, or multiplying some row by a unit) does not change the ideals ${\mathcal D}_k(A)$, and the same holds for elementary column operations.
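Lemmas 1 and 2 can be sanity-checked numerically over ${\mathbb Z}$. The sketch below (helper names and sample matrices are assumptions chosen for illustration) compares GCDs of minors before and after multiplying by unimodular matrices, and then by a non-invertible matrix, where only the containment of Lemma 1 is expected.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def det(M):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if not M:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def det_ideal_gen(A, k):
    """Non-negative generator of D_k(A) for an integer matrix A."""
    m, n = len(A), len(A[0])
    if k <= 0:
        return 1
    if k > min(m, n):
        return 0
    minors = (
        det([[A[i][j] for j in cols] for i in rows])
        for rows in combinations(range(m), k)
        for cols in combinations(range(n), k)
    )
    return reduce(gcd, minors, 0)


def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def gens(A):
    """Generators of D_0(A), ..., D_3(A)."""
    return [det_ideal_gen(A, k) for k in range(4)]


A = [[2, 4], [6, 8], [10, 12]]
B = [[1, 2, 0], [0, 1, 0], [3, 0, 1]]  # det(B) = 1, so B is invertible over Z
C = [[1, 1], [0, 1]]                   # det(C) = 1, so C is invertible over Z
C2 = [[2, 0], [0, 3]]                  # det(C2) = 6, not invertible over Z

BAC = matmul(matmul(B, A), C)
AC2 = matmul(A, C2)

print(gens(A) == gens(BAC))  # Lemma 2: the ideals agree
# Lemma 1: D_k(A C2) is contained in D_k(A), i.e. the generator of D_k(A)
# divides the generator of D_k(A C2).
print([det_ideal_gen(AC2, k) % det_ideal_gen(A, k) == 0 for k in (1, 2)])
```

In the non-invertible case the containment is strict for this sample: ${\mathcal D}_1(AC_2) = (4)$ while ${\mathcal D}_1(A) = (2)$.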

Fitting ideals of modules

The theory of Fitting ideals can be developed in the context of finitely generated modules, but it’s slightly simpler to explain in the context of finitely presented modules, so I’ll restrict my discussion to that case. (Note that if $R$ is noetherian then every finitely generated $R$-module is finitely presented.)

Let $M$ be a finitely presented $R$-module with presentation $R^m \to R^n \to M \to 0$, and let $A$ be the $n \times m$ matrix with entries in $R$ encoding the map $R^m \to R^n$. (In down-to-earth terms, $M$ is generated by the image of the standard basis vectors $e_1,\ldots,e_n$ under the map $R^n \to M$, and the columns of $A$ encode the $R$-linear relations between these generators.)

Definition: For $k \geq 0$, define the $k^{\rm th}$ Fitting ideal of $M$ to be the ideal generated by the $(n-k) \times (n-k)$ minors of $A$, i.e., ${\rm Fit}_k(M) = {\mathcal D}_{n-k}(A)$.

The key result in the theory is the following:

Fitting’s Lemma: The ideal ${\rm Fit}_k(M)$ is independent of the choice of presentation.

It therefore makes sense to talk about the Fitting ideals of a module without reference to any particular presentation.

Note that, by construction, the ideals ${\rm Fit}_k(M)$ form an increasing sequence: ${\rm Fit}_0(M) \subseteq {\rm Fit}_1(M) \subseteq \cdots$

When $R$ is a PID and $M$ is a finitely generated torsion module, Fitting ideals encode the same information as the invariant factors (or elementary divisors) of $M$, because if $M$ has invariant factors $d_1,d_2,\ldots,d_n$ with $d_i \mid d_{i+1}$ for all $i$, we have ${\rm Fit}_{n-k}(M) = (d_1 d_2 \cdots d_k)$ (cf. Proposition 1 and the previous post). Fitting’s Lemma therefore generalizes the uniqueness of the invariant factors in the structure theorem for finitely generated modules over a PID.
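As an illustration of Fitting’s Lemma, the following sketch computes the Fitting ideals of $M = {\mathbb Z}/2 \oplus {\mathbb Z}/4$ from two different presentations. The second presentation, which adds a redundant generator $x_3 = x_1 + x_2$, and all function names are assumptions made up for this example.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def det(M):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if not M:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def det_ideal_gen(A, k):
    """Non-negative generator of D_k(A) for an integer matrix A."""
    m, n = len(A), len(A[0])
    if k <= 0:
        return 1
    if k > min(m, n):
        return 0
    minors = (
        det([[A[i][j] for j in cols] for i in rows])
        for rows in combinations(range(m), k)
        for cols in combinations(range(n), k)
    )
    return reduce(gcd, minors, 0)


def fitting_gens(A):
    """Generators of Fit_0, ..., Fit_n, where the n rows of A index generators."""
    n = len(A)
    return [det_ideal_gen(A, n - k) for k in range(n + 1)]


# Presentation 1: generators x1, x2 with relations 2*x1 = 0 and 4*x2 = 0.
P1 = [[2, 0], [0, 4]]
# Presentation 2: extra generator x3 and extra relation x1 + x2 - x3 = 0.
# Columns are relations, rows are generators.
P2 = [[2, 0, 1], [0, 4, 1], [0, 0, -1]]

print(fitting_gens(P1))  # [8, 2, 1]
print(fitting_gens(P2))  # [8, 2, 1, 1]
```

Both presentations give ${\rm Fit}_0(M) = (8)$, ${\rm Fit}_1(M) = (2)$, and ${\rm Fit}_k(M) = {\mathbb Z}$ for $k \geq 2$, matching the invariant factors $d_1 = 2$, $d_2 = 4$.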

Proof of Fitting’s Lemma

Our proof will follow this short note by Mel Hochster. Some readers may prefer the exposition given on this page of The Stacks Project.

Proof: We first show that the ideals ${\rm Fit}_k(M)$ depend only on the kernel of the surjection $\phi : R^n \to M$, and not on the choice of a particular set of relations generating this kernel (which correspond to the columns of the matrix $A$). Given two finite sets of vectors in $R^n$ generating ${\rm ker}(\phi)$, we can compare each with the union. Therefore, it suffices to consider the case where one set of relations is included in the other. In terms of matrices, this means that we have two presentations for $M$, one given by an $n \times m$ matrix $A$ and one given by an $n \times (m+m')$ matrix $A'$ whose first $m$ columns are the same as those of $A$ and whose last $m'$ columns are linear combinations of the first $m$. By subtracting linear combinations of the first $m$ columns from the last $m'$ (which does not change the determinantal ideals), we may assume that the last $m'$ columns are all zero, in which case the result is clear.

It remains to show that the ideals ${\rm Fit}_k(M)$ are independent of the choice of generators for $M$, i.e., of the choice of a surjection $\phi : R^n \to M$. Once again, we can compare each of two different sets of generators with their union, and so we may assume that one set of generators is contained in the other. By induction, it suffices to consider the case where there is just one additional generator. By the previous paragraph, we may assume that, included among the list of relations for the second set of generators, there is a relation expressing the additional generator as a linear combination of the others. By relabeling the generators and relations (i.e., permuting the rows and columns of the presentation matrix), we may assume that the matrix $A'$ with the additional generator present has a 1 in the last row and column. By performing elementary column operations (subtracting multiples of the last column from the others), we can assume that all other entries in the last row of $A'$ are zero. In other words,

$A' = \begin{pmatrix} A & * \\ 0 & 1 \end{pmatrix},$

where $A$ is $n \times m$, $*$ denotes an arbitrary column vector in $R^n$, and the last row of $A'$ is $(0,0,\ldots,0,1)$. Note that $A$ is a relations matrix for the presentation using the first $n$ generators.

We now show that for all $k$, we have ${\mathcal D}_{n+1-k}(A') = {\mathcal D}_{n-k}(A)$, which will finish the proof. Let $t = n-k$. Each $(t+1)\times (t+1)$ minor of $A'$ involving the 1 in the lower right-hand corner is the same, up to sign, as a $t \times t$ minor of $A$, and every $t \times t$ minor of $A$ arises in this way. It remains to check that the other $(t+1)\times (t+1)$ minors of $A'$ also belong to ${\mathcal D}_{t}(A)$. If such a minor involves the last row of $A'$, it is zero. Otherwise, it has at least $t$ columns in $A$, and thus its expansion by minors with respect to the remaining column belongs to ${\mathcal D}_{t}(A)$. Q.E.D.

A generalization of the Cayley-Hamilton theorem

The ideal ${\rm Fit}_0(M)$ is called the initial Fitting ideal of $M$. If $R$ is a PID and $M$ is a finitely generated torsion $R$-module, ${\rm Fit}_0(M)$ is generated by the product of all the invariant factors, which when $R = {\mathbb Z}$ is the order of $M$ and when $R = F[x]$ for some field $F$ is the characteristic polynomial of $T$ (if $M$ corresponds to $(V,T)$ in the usual way).

Proposition 2: The initial Fitting ideal of $M$ annihilates $M$.

This is just the Cayley-Hamilton theorem when $M$ is a finitely generated torsion module over $R = F[x]$, and Lagrange’s theorem (for finite abelian groups) when $R = {\mathbb Z}$.

Proof: Suppose $A$ is an $n \times m$ matrix representing some presentation $R^m \to R^n \to M \to 0$ for $M$. Let $x_i = \phi(e_i)$ be the corresponding generators of $M$, where $\phi : R^n \to M$ is the given surjection. Let $B$ be any $n \times n$ submatrix of $A$; in particular, the columns of $B$ represent certain $R$-linear relations between the generators, so they lie in ${\rm ker}(\phi)$. We want to show that $d := {\rm det}(B)$ annihilates $M$. This is a simple consequence of the identity (which holds for any square matrix over any commutative ring) $dI = B \cdot {\rm adj}(B) = {\rm adj}(B) \cdot B$, where ${\rm adj}(B)$ is the adjugate matrix of $B$. Indeed, $d e_i = B \cdot {\rm adj}(B) e_i$ is an $R$-linear combination of the columns of $B$, so $d x_i = \phi(d e_i) = 0$. Since the $x_i$ generate $M$, it follows that $dM = 0$. Q.E.D.
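The adjugate identity at the heart of this proof is easy to check by machine. The sketch below does so for one integer matrix; the function names and the sample matrix $B$ (a presentation matrix for ${\mathbb Z}/2 \oplus {\mathbb Z}/4$ with a redundant generator) are assumptions for illustration only.

```python
def det(M):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if not M:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def adjugate(B):
    """Adjugate of B: the transpose of its cofactor matrix."""
    n = len(B)

    def cofactor(r, c):
        # Delete row r and column c, then take the signed determinant.
        sub = [row[:c] + row[c + 1:] for i, row in enumerate(B) if i != r]
        return (-1) ** (r + c) * det(sub)

    return [[cofactor(j, i) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


B = [[2, 0, 1], [0, 4, 1], [0, 0, -1]]
d = det(B)
adjB = adjugate(B)
dI = [[d if i == j else 0 for j in range(3)] for i in range(3)]

print(d)                       # -8
print(matmul(B, adjB) == dI)   # True
print(matmul(adjB, B) == dI)   # True
```

Column $i$ of `adjB` records the coefficients that express $d e_i$ as a combination of the columns of $B$, which is exactly the relation used to conclude $d x_i = 0$.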

We mention, without proof, some additional properties of Fitting ideals. Proofs can be found, for example, in David Eisenbud’s book “Commutative Algebra with a View Toward Algebraic Geometry”, D.G. Northcott’s monograph “Finite Free Resolutions”, Antoine Chambert-Loir’s “(Mostly) Commutative Algebra”, or the Stacks Project page mentioned above.

(1) If $M$ can be generated by $n$ elements then ${\rm Fit}_n(M) = R$ (this is clear from the definitions), and if $R$ is a local ring then the converse holds as well. We can therefore view the $n^{\rm th}$ Fitting ideal as measuring, in a certain precise sense, the obstruction to a module being generated by $n$ elements.

(2) Fitting ideals commute with localization: if $S \subset R$ is multiplicative then ${\rm Fit}_k(S^{-1}M) = S^{-1}{\rm Fit}_k(M)$.

(3) More generally, Fitting ideals commute with base change: if $R \to R'$ is a ring homomorphism then ${\rm Fit}_k(M \otimes_R R')$ is the ideal generated by the image of ${\rm Fit}_k(M)$.

(4) If $0 \to M' \to M \to M'' \to 0$ is a short exact sequence of $R$-modules then ${\rm Fit}_i(M') {\rm Fit}_{j}(M'') \subseteq {\rm Fit}_{i+j}(M)$ for all $i,j$. If the sequence is split, so that $M \cong M' \oplus M''$, then ${\rm Fit}_k(M)$ is the ideal generated by all products ${\rm Fit}_i(M') {\rm Fit}_{j}(M'')$ with $i+j=k$.

(5) As discussed above, if $R$ is a PID then two finitely generated $R$-modules $M,M'$ are isomorphic if and only if they have the same Fitting ideals. If we consider only torsion modules, this remains true for Dedekind domains. However, for non-torsion modules the result fails: if $I$ is a non-principal ideal in a Dedekind domain $R$ then $I$ and $R$ are not isomorphic as $R$-modules but they have the same Fitting ideals (namely, the zeroth Fitting ideal is $(0)$ and all higher Fitting ideals are equal to $R$). If $R$ is not a Dedekind ring, it is possible for two non-isomorphic torsion $R$-modules to have the same Fitting ideals. For example (see pp. 40-42 in this thesis for details), if $R = {\mathbb Z}[\sqrt{-3}]$ and $J$ is the ideal generated by $2$ and $1+\sqrt{-3}$, the torsion $R$-modules $R/(2) \oplus R/J$ and $R/J \oplus R/J$ have the same Fitting ideals but are not isomorphic. As another example, let $R$ be the unique factorization domain ${\mathbb Z}[t]$ and let $I = (2,t)$ and $J=(4,t^2)$. Then the torsion $R$-modules $R/I \oplus R/J$ and $R/I \oplus R/I^2$ have the same Fitting ideals but are not isomorphic.
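The direct-sum formula in property (4) can be tested numerically over ${\mathbb Z}$, where a sum of principal ideals is generated by the GCD of the generators. The sketch below is only a sanity check under assumed sample modules $M' = {\mathbb Z}/2 \oplus {\mathbb Z}/6$ and $M'' = {\mathbb Z}/4$; the helper names are hypothetical.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def det(M):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if not M:
        return 1
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def det_ideal_gen(A, k):
    """Non-negative generator of D_k(A) for an integer matrix A."""
    m, n = len(A), len(A[0])
    if k <= 0:
        return 1
    if k > min(m, n):
        return 0
    minors = (
        det([[A[i][j] for j in cols] for i in rows])
        for rows in combinations(range(m), k)
        for cols in combinations(range(n), k)
    )
    return reduce(gcd, minors, 0)


def fitting_gens(A):
    """Generators of Fit_0, ..., Fit_n, where the n rows of A index generators."""
    n = len(A)
    return [det_ideal_gen(A, n - k) for k in range(n + 1)]


def block_diag(X, Y):
    """Presentation matrix of the direct sum of the modules presented by X and Y."""
    top = [row + [0] * len(Y[0]) for row in X]
    bottom = [[0] * len(X[0]) + row for row in Y]
    return top + bottom


def fit_at(gens, k):
    """Fit_k from a list of generators; Fit_k = R beyond the end of the list."""
    return gens[k] if k < len(gens) else 1


def direct_sum_formula(f1, f2, k):
    """Generator of the sum of Fit_i(M') * Fit_j(M'') over i + j = k."""
    return reduce(gcd, (fit_at(f1, i) * fit_at(f2, k - i) for i in range(k + 1)), 0)


A1 = [[2, 0], [0, 6]]  # presents M'  = Z/2 + Z/6
A2 = [[4]]             # presents M'' = Z/4
f1, f2 = fitting_gens(A1), fitting_gens(A2)

direct = fitting_gens(block_diag(A1, A2))
via_formula = [direct_sum_formula(f1, f2, k) for k in range(4)]
print(direct, via_formula)  # [48, 4, 2, 1] [48, 4, 2, 1]
```

The two lists agree, as property (4) predicts for a split sequence.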

A glimpse of Iwasawa theory

I first heard about Fitting ideals in the context of Iwasawa theory, a rich area of study within modern number theory. Iwasawa was interested in studying the behavior of the $p$-power torsion in the ideal class group of the cyclotomic field $K = {\mathbb Q}(\zeta_p)$, where $p$ is a prime number, because Kummer had established a close connection between this problem and Fermat’s Last Theorem. Iwasawa’s audacious and perspicacious idea was that it is in fact easier, in many ways, to study the ideal class groups in the entire tower of number fields $K = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots$ all at once, where $K_n = {\mathbb Q}(\zeta_{p^{n+1}})$. Each $K_n / K_0$ is a Galois extension with Galois group $G_n$ isomorphic to $\mathbb Z /p^n \mathbb Z$, and the $p$-part of the ideal class group of each $K_n$ is naturally a ${\mathbb Z}_p[G_n]$-module. Iwasawa considered the inverse limit of these groups as a module over the Iwasawa ring $\Lambda := {\mathbb Z}_p[[\Gamma]]$ (the completed group ring), where $\Gamma \cong {\mathbb Z}_p$ is the inverse limit of the $G_n$. Using class field theory, Iwasawa constructed a closely related module $X_\infty$ which is a finitely generated torsion module over $\Lambda$, and he used the structure of such modules to draw conclusions about the entire tower, including eventually the class group of $K$ itself!

The point is that $\Lambda$ is one of the simplest kinds of rings that isn’t a PID, namely it’s a complete 2-dimensional regular local ring. A version of the structure theorem for finitely generated torsion modules over $\Lambda$ can be stated as follows. We say that two $\Lambda$-modules $M,M'$ are pseudo-isomorphic if there is a homomorphism $M \to M'$ with finite kernel and finite cokernel.

Theorem (Iwasawa, Serre): The Fitting ideals of a finitely generated torsion $\Lambda$-module determine the module up to pseudo-isomorphism.

The initial Fitting ideal ${\rm Fit}_0(M)$ is called the characteristic ideal of $M$. The Main Conjecture of Iwasawa Theory, which was proved by Mazur and Wiles many years before Wiles’ revolutionary work on Fermat’s Last Theorem, relates the characteristic ideal of $X_\infty$ (or, more precisely, its eigenspaces under the action of ${\rm Gal}(K/{\mathbb Q})$) to $p$-adic L-functions, yielding a far-reaching generalization of the work of Kummer. See this survey paper by Romyar Sharifi for further details.

Concluding remarks

(1) One can also prove Lemma 1 using exterior algebra. Let $\Lambda^k(A)$ denote the $k^{\rm th}$ exterior power of an $m \times n$ matrix $A$ over a commutative ring $R$, i.e., the matrix whose $(I,J)$-entry is the determinant of the $k \times k$ submatrix $A_{IJ}$, where $I,J$ range over all $k$-element subsets of $\{ 1,\ldots,m \}$ and $\{ 1,\ldots, n \}$, respectively. If $A$ represents a homomorphism $f : M \to N$ of free $R$-modules, then $\Lambda^k(A)$ represents the induced map $\Lambda^k(f) : \Lambda^k(M) \to \Lambda^k(N)$ on exterior powers. Since exterior powers are functorial, one has $\Lambda^k(AB) = \Lambda^k(A)\Lambda^k(B)$ (a generalization of the multiplicativity of the determinant), from which Lemma 1 follows easily.

(2) The initial Fitting ideal can be used to give a definition for the image of a morphism of schemes which behaves well in families. The idea is as follows: by the naturality of the construction of Fitting ideals of a module, it makes sense to attach a Fitting ideal sheaf to any sufficiently nice sheaf of modules on a scheme. Accordingly, the Fitting image of a morphism $f : X \to Y$ is defined to be the closed subscheme of $Y$ associated to the sheaf of ideals ${\rm Fit}_0(f_*({\mathcal O}_X))$. This point of view is explored in detail in the book “The Geometry of Syzygies” by Eisenbud.

(3) The Alexander polynomial of a knot can be defined as a generator of the initial Fitting ideal of the first homology (with integer coefficients) of the infinite cyclic cover of the complement of the knot, considered as a module over ${\mathbb Z}[t,t^{-1}]$.

(4) For more information on Hans Fitting (1906-1938), who was a student of Emmy Noether and died of bone cancer at age 31, see this biographical page. Among other things, he introduced the Fitting decomposition of a vector space with respect to an endomorphism, which I discussed in this post on the Jordan Canonical Form. His father Friedrich Fitting, who was also a mathematician, is best known today for his 1931 proof that there are exactly 880 magic squares of order 4.