# Linear algebra over rings

Test your intuition: is the following true or false?

Assertion 1: If $A$ is a square matrix over a commutative ring $R$, the rows of $A$ are linearly independent over $R$ if and only if the columns of $A$ are linearly independent over $R$.

(All rings in this post will be nonzero commutative rings with identity.)

And how about the following generalization?

Assertion 2: If $A$ is an $m \times n$ matrix over a commutative ring $R$, the row rank of $A$ (the maximum number of $R$-linearly independent rows) equals the column rank of $A$ (the maximum number of $R$-linearly independent columns).

If you want to know the answers, read on…

## A criterion for surjectivity of an $R$-linear map

My previous post explored the theory of Fitting ideals. A simple consequence of this theory is the following:

Theorem 1: Let $A$ be an $m \times n$ matrix with entries in a nonzero commutative ring $R$ with identity. The corresponding map $A : R^n \to R^m$ is surjective if and only if $n \geq m$ and ${\mathcal D}_m(A) = R$.

(Recall that ${\mathcal D}_k(A)$ denotes the ideal generated by all $k \times k$ minors of $A$.)

Proof: Let $M$ be the cokernel of $A$. On the one hand, $A$ is surjective iff $M = 0$ iff ${\rm Fit}_0(M) = R$, where the last equivalence follows from the fact that ${\rm Fit}_0(M)$ annihilates $M$. On the other hand, $A$ is a presentation for $M$, so ${\rm Fit}_0(M) = {\mathcal D}_m(A)$. Q.E.D.
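To see the criterion in action, here is a brute-force sanity check over $R = {\mathbb Z}/6{\mathbb Z}$ (a toy Python script of my own; the matrix $A$ below is an arbitrary example, not one from the text). It compares actual surjectivity, computed by enumerating the image, with the condition ${\mathcal D}_m(A) = R$:

```python
from itertools import product, combinations

N = 6  # work over R = Z/6Z, a ring with zero-divisors

def det(M):
    """Determinant mod N via Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0] % N
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M))) % N

def minors(A, k):
    """All k x k minors of A, reduced mod N."""
    m, n = len(A), len(A[0])
    return [det([[A[i][j] for j in cols] for i in rows])
            for rows in combinations(range(m), k)
            for cols in combinations(range(n), k)]

def ideal(gens):
    """The ideal of Z/NZ generated by gens, computed by brute force."""
    return {sum(r * g for r, g in zip(rs, gens)) % N
            for rs in product(range(N), repeat=len(gens))}

def is_surjective(A):
    """Brute-force check that A : R^n -> R^m is onto."""
    m, n = len(A), len(A[0])
    image = {tuple(sum(A[i][j] * x[j] for j in range(n)) % N for i in range(m))
             for x in product(range(N), repeat=n)}
    return len(image) == N ** m

A = [[1, 2, 0], [0, 5, 3]]  # an arbitrary 2 x 3 example
criterion = ideal(minors(A, 2)) == set(range(N))  # is D_2(A) all of R?
print(is_surjective(A), criterion)  # the two answers agree (both True)
```

Here the $2 \times 2$ minors are $5$, $3$, and $0$; since $5$ is a unit in ${\mathbb Z}/6{\mathbb Z}$, they generate the unit ideal, so the map is surjective.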

The rest of this post will be about a complementary criterion for deciding when an $R$-linear map $A : R^n \to R^m$ is injective.

## McCoy’s theorem

We have the following useful complement to Theorem 1:

Theorem 2 (McCoy): Let $A$ be an $m \times n$ matrix with entries in a nonzero commutative ring $R$ with identity. The corresponding map $A : R^n \to R^m$ is injective if and only if $n \leq m$ and the annihilator of ${\mathcal D}_n(A)$ is zero.

(Recall that the annihilator of an $R$-module $M$ is the set of $r \in R$ such that $r \cdot m = 0$ for all $m \in M$.)
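Here is an analogous brute-force illustration of Theorem 2 over ${\mathbb Z}/6{\mathbb Z}$ (again a toy Python script of my own, with hand-picked matrices): the $2 \times 1$ matrix with entries $2, 3$ gives an injective map because the annihilator of the ideal $(2,3)$ is zero, while the $1 \times 1$ matrix $(2)$ does not, since $2 \cdot 3 = 0$.

```python
from itertools import product

N = 6  # R = Z/6Z

def annihilator(S):
    """Elements r of Z/NZ with r * s = 0 for every s in S."""
    return {r for r in range(N) if all(r * s % N == 0 for s in S)}

def is_injective(A):
    """Brute-force: does Ax = 0 over Z/NZ force x = 0?"""
    m, n = len(A), len(A[0])
    for x in product(range(N), repeat=n):
        if any(x) and all(sum(A[i][j] * x[j] for j in range(n)) % N == 0
                          for i in range(m)):
            return False
    return True

# Here n = 1, so D_1(A) is just the ideal generated by the entries of A.
A = [[2], [3]]  # injective: Ann({2, 3}) = {0}
B = [[2]]       # not injective: 2 * 3 = 0 in Z/6Z
print(is_injective(A), annihilator({2, 3}) == {0})  # True True
print(is_injective(B), annihilator({2}) == {0})     # False False
```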

In particular:

Corollary 1: If $n > m$, a system $Ax = 0$ of $m$ homogeneous linear equations in $n$ variables with coefficients in a commutative ring $R$ has a nonzero solution over $R$.

If $R$ is an integral domain, the corollary follows from the corresponding fact in linear algebra by passing to the field of fractions, but for rings with zero-divisors it’s not at all clear how to reduce such a statement to the case of fields.

Some people (looking at you, Sheldon Axler) like to minimize the role of determinants in linear algebra, but I’m not aware of any method for proving Corollary 1 without using determinants. So while linear algebra over fields can largely be developed in a determinant-free manner, over rings determinants appear to be absolutely essential.

From Corollary 1 we immediately obtain:

Corollary 2 (“Invariance of domain”): If $R^n \cong R^m$ as $R$-modules, then $n = m$.

## Additional consequences of McCoy’s theorem

Another corollary of McCoy’s theorem is the following, which shows that Assertion 1 at the beginning of this post is true:

Corollary 3: If $A$ is a square matrix over a commutative ring $R$, then the rows of $A$ are linearly independent over $R$ if and only if the columns of $A$ are linearly independent over $R$.

Indeed, by applying the theorem to $A$ and $A^T$, both conditions are equivalent to the assertion that ${\rm det}(A)$ is not a zero-divisor of $R$.

However, Assertion 2 is false: it is not true in general that the row rank of a matrix equals the column rank. For example, the row rank of the matrix $[2 \; 3]$ over $R = {\mathbb Z}/6{\mathbb Z}$ is 1 but the column rank is 0.
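This example is small enough to verify exhaustively. The following toy Python script (my own, not part of the original post) confirms that the single row $(2, 3)$ is linearly independent over ${\mathbb Z}/6{\mathbb Z}$ while each column is linearly dependent:

```python
from itertools import product, combinations

N = 6  # R = Z/6Z

def independent(vectors):
    """Are the given vectors in (Z/NZ)^d linearly independent?"""
    for coeffs in product(range(N), repeat=len(vectors)):
        if any(coeffs) and all(
                sum(c * v[i] for c, v in zip(coeffs, vectors)) % N == 0
                for i in range(len(vectors[0]))):
            return False
    return True

def rank(vectors):
    """Maximum size of a linearly independent subset."""
    return max((k for k in range(1, len(vectors) + 1)
                for sub in combinations(vectors, k) if independent(list(sub))),
               default=0)

rows = [(2, 3)]           # the single row of [2 3]
cols = [(2,), (3,)]       # its two columns
print(rank(rows), rank(cols))  # 1 0
```

The row is independent because $2r \equiv 3r \equiv 0 \pmod 6$ forces $r = 0$, while $3 \cdot 2 = 2 \cdot 3 = 0$ kills each column separately.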

Corollary 4: Suppose $M$ is a finitely generated $R$-module. Then the maximal size of a linearly independent set in $M$ is at most the minimum size of a generating set of $M$, with equality if and only if $M$ is free.

Proof: It suffices to prove that if $\beta_1,\ldots,\beta_n$ generate $M$ and $\gamma_1,\ldots,\gamma_n \in M$ are linearly independent, then $\beta_1,\ldots,\beta_n$ are linearly independent (i.e., they form a basis for $M$). Since the $\beta_i$ generate, we have $(\gamma_1,\ldots,\gamma_n) = (\beta_1,\ldots,\beta_n) A$ for some $n \times n$ matrix $A$ over $R$. If $Ax = 0$ for some nonzero column vector $x \in R^n$, then $(\gamma_1,\ldots,\gamma_n) x = (\beta_1,\ldots,\beta_n) Ax = 0$, contradicting the independence of the $\gamma_i$. By Theorem 2, it follows that $d := {\rm det}(A)$ is not a zero-divisor in $R$.

Consider the localization $R' = R[d^{-1}]$. Over $R'$, the matrix $A$ has an inverse $B$. If $(\beta_1,\ldots,\beta_n) y = 0$ for some column vector $y \in R^n$, then (with the obvious notation) $\gamma B y = \beta y = 0$. But the $\gamma_i$ remain independent over $R'$, since we can clear the denominator in any $R'$-linear dependence to get a dependence over $R$ (this is where we use the fact that $d$ is not a zero-divisor). Thus $By = 0$, and hence $y = ABy = 0$. It follows that the $\beta_i$ are linearly independent over $R$ as claimed. Q.E.D.

## Proof of McCoy’s Theorem

The proof will use the following well-known identities which are valid for any $n \times n$ square matrix $A = (a_{ij})$ over any commutative ring $R$:

Laplace’s formula: For all $i=1,\ldots,n$, ${\rm det}(A) = \sum_{j=1}^n (-1)^{i+j} a_{ij}M_{ij}$, where $M_{ij}$ is the minor of rank $n-1$ obtained from $A$ by deleting row $i$ and column $j$.

Cramer’s formula: There is an $n \times n$ matrix $B$ such that $AB = BA = {\rm det}(A) I$.
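Cramer’s formula is easy to test computationally: one can take $B$ to be the classical adjugate of $A$, whose $(i,j)$ entry is $(-1)^{i+j}$ times the minor obtained by deleting row $j$ and column $i$. Here is a short Python sketch (the $3 \times 3$ integer matrix is an arbitrary example of mine):

```python
def det(M):
    """Determinant via Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def adjugate(M):
    """adj(M)[i][j] = (-1)^(i+j) * det(M with row j and column i deleted)."""
    n = len(M)
    return [[(-1) ** (i + j) *
             det([row[:i] + row[i + 1:] for k, row in enumerate(M) if k != j])
             for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]  # an arbitrary integer example
B = adjugate(A)
d = det(A)  # -3
scalar = [[d if i == j else 0 for j in range(3)] for i in range(3)]
print(matmul(A, B) == scalar and matmul(B, A) == scalar)  # True
```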

Proof of the theorem: The “if” direction is easy: suppose $m \geq n$ and $Ab = 0$ with $b \in R^n$. Then for every rank $n$ minor $M = {\rm det}(A')$ of $A$, we have $A' b = 0$ and thus, by Cramer’s formula, $Mb = 0$. Since this holds for all $M$, the hypothesis of the theorem implies that $b = 0$. Thus $A$ is injective.

For the “only if” direction, suppose $A$ is injective. We may assume that $m \geq n$: once the theorem has been proved for square matrices, if $m < n$ we can add $n - m$ zero rows to $A$, and the resulting $n \times n$ matrix is again injective but has determinant zero, a contradiction (since the annihilator of the zero ideal is all of $R$, which is nonzero).

Let $a \in R$ be such that $aM = 0$ for every rank $n$ minor $M$ of $A$. We will prove by backward induction on $k$ that for all $1 \leq k \leq n$,

$(*)_k \; \; aM = 0$ for every rank $k$ minor $M$ of $A$.

By assumption, $(*)_n$ holds. Assume $(*)_{k+1}$ holds and let $M$ be a minor of $A$ of rank $k$.

We may assume, without loss of generality, that $M$ is the minor corresponding to the first $k$ rows and first $k$ columns of $A$. For $1 \leq j \leq k+1$, let $M_j$ be the rank $k$ minor of $A$ obtained by deleting column $j$ from the matrix given by the first $k$ rows and first $k+1$ columns of $A$. (So, in particular, $M_{k+1}=M$.)

Define $b \in R^n$ by $b_j = (-1)^{j+1} a M_j$ for $1 \leq j \leq k+1$ and $b_j = 0$ for $j > k+1$; that is, $b = a \cdot (M_1, -M_2, \ldots, (-1)^{k} M_{k+1}, 0, \ldots, 0)^T.$

Fix $1 \leq i \leq m$. By Laplace’s formula (expanding along the last row),

$(Ab)_i = a \sum_{j=1}^{k+1} (-1)^{j+1} a_{ij} M_j = (-1)^{k} \, a \cdot {\rm det}(A'),$

where $A'$ is the $(k+1) \times (k+1)$ matrix whose first $k$ rows are the first $k$ rows of $A$, restricted to columns $1, \ldots, k+1$, and whose last row is $(a_{i1}, \ldots, a_{i,k+1})$.

If $1 \leq i \leq k$, then ${\rm det}(A') = 0$ because $A'$ has two identical rows. If $k+1 \leq i \leq m$, then ${\rm det}(A')$ is a rank $k+1$ minor of $A$, so $a \cdot {\rm det}(A') = 0$ by the inductive hypothesis $(*)_{k+1}$. In either case $(Ab)_i = 0$. Thus $Ab = 0$, and since $A$ is injective we conclude that $b = 0$. In particular, $aM = aM_{k+1} = 0$.

This completes the inductive step. In particular, $(*)_1$ holds, which means that $aA = 0$ and hence $A (a,a,\ldots,a)^T = 0$. Since $A$ is injective, it follows that $a = 0$ as desired. Q.E.D.

## Coda

Define the determinantal rank of an $m \times n$ matrix $A$ with entries in a nonzero commutative ring $R$ with identity to be the largest nonnegative integer $k$ such that there is a nonzero $k \times k$ minor of $A$.

Motivated by McCoy’s theorem, define the McCoy rank of $A$ to be the largest nonnegative integer $k$ such that the annihilator of ${\mathcal D}_k(A)$ is zero (with the convention ${\mathcal D}_0(A) = R$, whose annihilator is zero since $R$ is nonzero, so the McCoy rank is always defined).

One can deduce from McCoy’s theorem that both the row rank and column rank of $A$ are less than or equal to the McCoy rank of $A$, which is by definition less than or equal to the determinantal rank of $A$.
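For instance, for the matrix $[2 \; 3]$ over ${\mathbb Z}/6{\mathbb Z}$ considered earlier, a brute-force computation (a toy Python script of my own) gives determinantal rank $1$ and McCoy rank $1$; combined with the row rank $1$ and column rank $0$ computed above, this illustrates the chain of inequalities, with the column-rank inequality strict:

```python
from itertools import product, combinations

N = 6  # R = Z/6Z

def det(M):
    """Determinant mod N via Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0] % N
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M))) % N

def minors(A, k):
    """All k x k minors of A, reduced mod N."""
    m, n = len(A), len(A[0])
    return [det([[A[i][j] for j in cols] for i in rows])
            for rows in combinations(range(m), k)
            for cols in combinations(range(n), k)]

def ideal(gens):
    """Ideal of Z/NZ generated by gens, by brute force."""
    return {sum(r * g for r, g in zip(rs, gens)) % N
            for rs in product(range(N), repeat=len(gens))}

def annihilator(S):
    """Elements r of Z/NZ with r * s = 0 for every s in S."""
    return {r for r in range(N) if all(r * s % N == 0 for s in S)}

A = [[2, 3]]
kmax = min(len(A), len(A[0]))
det_rank = max((k for k in range(1, kmax + 1) if any(minors(A, k))), default=0)
mccoy_rank = max((k for k in range(1, kmax + 1)
                  if annihilator(ideal(minors(A, k))) == {0}), default=0)
print(det_rank, mccoy_rank)  # 1 1
```

Here ${\mathcal D}_1(A) = (2,3)$ is the unit ideal (since $2 \cdot 2 + 3 = 1$ in ${\mathbb Z}/6{\mathbb Z}$), so its annihilator is zero.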

The following matrix $P$ over $R = {\mathbb Z}/210{\mathbb Z}$ has row rank and column rank equal to 2, McCoy rank 3, and determinantal rank 4, showing that all of these inequalities can in general be strict:

Over an integral domain, however, all four notions of rank coincide.

## References

McCoy’s theorem was first published in “Remarks on Divisors of Zero” by N. H. McCoy, The American Mathematical Monthly, May 1942, Vol. 49, No. 5, pp. 286–295.

Our proof of McCoy’s theorem follows the exposition in the paper “An Elementary Proof of the McCoy Theorem” by Zbigniew Blocki, Univ. Iagel. Acta Math. 30 (1993), 215–218.

The McCoy rank of a matrix is discussed in detail in D.G. Northcott’s book “Finite Free Resolutions”.

I found the matrix $P$ in Stan Payne’s course notes entitled “A Second Semester of Linear Algebra”.

## 7 thoughts on “Linear algebra over rings”

1. I’m confused by the example above Corollary 4; the lower right 2×2 minor has determinant 1, so shouldn’t that imply that the row and column ranks are both at least 2?

Reply
• Oops, there was a typo! The ring should have been ${\mathbf Z}/210{\mathbf Z}$, not ${\mathbf Z}/10{\mathbf Z}$. I’ve fixed this now. Does this clear things up? The bottom two rows are linearly dependent, for example, because $70 \cdot {\rm Row 3} = 30 \cdot {\rm Row 4} = 0$. And the determinant of the lower right $2 \times 2$ minor is 21, which is now a zero-divisor.

Reply
• craigcitro says:

Phew! Yes, that totally clears things up.

There’s also something counter-intuitive to me about the fact that under the projection map $\mathbf{Z}/210\mathbf{Z} \rightarrow \mathbf{Z}/10\mathbf{Z}$, the rank actually *increases*. It’s clear why that happens here (the coefficients in the linear combination you wrote down are in the kernel of the projection map, as presumably are any coefficients that would work), but is there some more geometric way to picture that?

• I don’t know of a deeper explanation for why base change can increase the rank.

2. These are some of the apocryphal theorems in algebra that no one ever teaches and that get rediscovered by every generation anew. When I was an undergrad, I found it by trying to generalize the (fairly well-known) square matrix case: https://artofproblemsolving.com/community/c7h124137 . The easier “only if” direction of McCoy’s theorem also appears in Keith Conrad’s https://kconrad.math.uconn.edu/blurbs/linmultialg/extmod.pdf (Theorems 5.10 and 6.4).

Note that an easier counterexample to Assertion 2 is the 1×2-matrix [2 3] over Z/6.

Reply
• Thanks, Darij. Yes, one of my goals in this blog is to provide expositions of basic results like this which no one ever learns in a course. And thanks for the easier counterexample, I thought I was being clever by providing just one example to illustrate everything in the post but I now see that this was simply lazy, and I’ve updated the post accordingly.

Reply
• Ah, that’s why! I somehow didn’t notice its second purpose.