In my current position as Director of Undergraduate Studies for the Georgia Tech School of Mathematics, I’ve been heavily involved with revamping our linear algebra curriculum. So I’ve spent a lot of time lately reading through various linear algebra books. The goal of this post is to give a self-contained proof of the existence and uniqueness of the Jordan Canonical Form which is somewhat different from the ‘usual’ proofs one finds in textbooks. I’m not claiming any novelty — I’m sure this approach has been discovered before — but I don’t know a good reference so I thought I’d record the details here.
The proof I give here does not use properties of polynomials (e.g. the Chinese Remainder Theorem), nor does it rely on the classification of finitely generated modules over a PID, so it might be of some pedagogical interest. The proof I give for the Generalized Eigenvector Decomposition is based on an auxiliary result — the Fitting Decomposition — which in my opinion ought to be better known. The proof I give of the structure theorem for nilpotent operators comes from these lecture notes of Curt McMullen (Theorem 5.19). It is particularly concise compared to some other arguments I’ve seen.
Jordan Canonical Form
The theorem we wish to prove is that, over an algebraically closed field , every matrix is similar to a matrix in Jordan Canonical Form, and the latter is unique up to rearranging the elementary Jordan blocks. Thus two matrices are similar over if and only if they have the same Jordan Canonical Forms (up to rearranging the blocks). More canonically, we want to show that if is a finite-dimensional vector space over and is a linear operator, then is direct sum of -invariant subspaces such that the matrix of restricted to (with respect to some ordered basis ) is an elementary Jordan block. The latter condition means that there is and a vector with such that is a basis for .
Let be a finite dimensional vector space over a field , and let be a linear endomorphism.
Theorem 1 (Fitting Decomposition): There is a unique direct sum decomposition with the following properties:
1. (1) Both and are -invariant.
2. (2) is nilpotent on .
3. (3) is invertible on .
Proof: The chain must stabilize to a -invariant subspace and the chain must stabilize to a -invariant subspace .
We claim that is nilpotent on and is invertible on . Indeed, it is easy to see that , which implies that is invertible on by the Rank-Nullity Theorem, and for some and thus is nilpotent on .
Next, we claim that . It is clear from what we have already shown that . So it suffices to show that every can be written as with and . Without loss of generality (replacing by a larger integer if necessary), we may assume that and for the same . Since , we have for some . Then so . Setting gives the desired decomposition of .
Finally, we show that the decomposition is unique. Suppose with nilpotent on and invertible on . Then for some positive integer , and for every positive integer . Thus and . By dimension considerations, we must have and .
Generalized Eigenvector Decomposition
Let be an eigenvalue of . A generalized eigenvector for with eigenvalue is a vector such that for some positive integer . The generalized eigenspace is the subspace of consisting of all generalized eigenvectors for with eigenvalue . Note that can also be characterized as the nilpotent part of the Fitting Decomposition of .
Lemma: If is nilpotent on and is non-zero, then is invertible on .
Proof: Suppose . Then , so for all positive integers . Choosing large enough so that , we have which implies that .
Theorem 2 (Generalized Eigenvector Decomposition): Assume that every eigenvalue of (over some extension of ) is defined over , and let be the distinct eigenvalues of . Then
Proof: Let be an eigenvalue of , and let be the Fitting Decomposition of relative to . Since , it follows by induction that is a direct sum of generalized eigenspaces for . We claim that the generalized eigenspaces of are precisely the for . Given this claim, the Theorem follows immediately. To prove the claim, we make the following observations:
(1) Every eigenvalue of restricted to is equal to for some .
This follows from the easily verified facts that every eigenvalue of is also an eigenvalue of , and is not an eigenvalue of .
(2) For , the generalized eigenspace of with respect to equals the generalized eigenspace of with respect to .
Any generalized eigenvector for is automatically a generalized eigenvector for . Conversely, if for some , then the restriction of to is nilpotent, so by the Lemma the restriction of to is invertible. It follows that for all positive integers , so by the proof of Theorem 1.
Nilpotent Operators and Cyclic Vectors
A subspace of is called cyclic with respect to if there is a vector and a positive integer such that is a basis for .
Theorem 3 (Cyclic Decomposition for Nilpotent Operators): If is nilpotent on , then is a direct sum of cyclic subspaces for .
Proof: Suppose that is nilpotent of exponent , i.e., on but . Choose with and let be the -invariant subspace of spanned by . We claim that has a -invariant complement, i.e., there exists a -invariant subspace of with . Assuming the claim, it follows by induction that is a direct sum of cyclic subspaces for and the Theorem is proved.
To prove the claim, let and let . Note that the restriction of to has exponent , so by induction there is a -invariant decomposition . Let . We make the following two observations:
To see this, note that if we have and for some . As , we have .
Indeed, implies that .
By (1) and (2), there exists a subspace with such that . (Augment a basis for to a basis of using vectors in .) Since , we can take to be the desired -invariant complement.
Combining Theorems 2 and 3, we deduce:
Theorem 4 (Jordan Canonical Form): There is an ordered basis for such that the matrix of with respect to is a block sum of elementary Jordan matrices . The number of Jordan blocks with a given size and given eigenvalue is independent of the choice of .
Proof: By Theorem 2, is a direct sum of generalized eigenspaces for , and by Theorem 3 each is a direct sum of cyclic subspaces for . Amalgamating the various ordered bases coming from these cyclic subspsaces gives an ordered basis for with the required property. The uniqueness part follows from the easily verified observation that the number of Jordan blocks of size at least and eigenvalue is equal to .
Corollary: Assume that is algebraically closed. Then two matrices and are similar over if and only if they have the same eigenvalues (say ) and for all and we have .