
Canonical Forms
Linear Algebra Notes

Satya Mandal

October 25, 2005

1 Introduction

Here F will denote a field and V will denote a vector space of dimension dim(V) = n. (In this note, unless otherwise stated, n = dim(V).)

We will study operators T on V. The goal is to investigate if we can find a basis e1, . . . , en such that

the matrix of T = Diagonal(λ1, λ2, . . . , λn)

is a diagonal matrix. This will mean that

T(ei) = λi ei.

2 Characteristic Values

2.1 Basic Definitions and Facts

Here we will discuss basic facts.

2.1 (Definition.) Let V be a vector space over a field F and T ∈ L(V, V) be a linear operator.

1. A scalar λ ∈ F is said to be a characteristic value of T if T(e) = λe for some e ∈ V with e ≠ 0. A characteristic value is also known as an eigen value.

2. This non-zero element e ∈ V above is called a characteristic vector of T associated to λ. A characteristic vector is also known as an eigen vector.

3. Write

N(λ) = {v ∈ V : T(v) = λv}.

Then N(λ) is a subspace of V and is said to be the characteristic space or eigen space of T associated to λ.

2.2 (Lemma.) Let V be a vector space over a field F and T ∈ L(V, V). Then T is singular if and only if det(T) = 0.

Proof. (⇒): We have T(e1) = 0 for some e1 ∈ V with e1 ≠ 0. We can extend e1 to a basis e1, e2, . . . , en of V. Let A be the matrix of T with respect to this basis. Since T(e1) = 0, the first column of A is zero. Therefore,

det(T) = det(A) = 0.

So, this implication is established.

(⇐): Suppose det(T) = 0. Let e1, e2, . . . , en be a basis of V and A be the matrix of T with respect to this basis. So

det(T) = det(A) = 0.

Therefore, A is not invertible. Hence AX = 0 for some non-zero column

X = (c1, c2, . . . , cn)^t.

Write v = c1e1 + c2e2 + · · · + cnen. Since not all ci are zero, v ≠ 0.

Also,

T(v) = Σ_{i=1}^{n} ci T(ei) = (T(e1), T(e2), . . . , T(en)) X = (e1, e2, . . . , en) AX = 0.

So, T is singular.

The following are some equivalent conditions.


2.3 (Theorem.) Let V be a vector space over a field F and T ∈ L(V, V) be a linear operator. Let λ ∈ F be a scalar. Then the following are equivalent:

1. λ is a characteristic value of T.

2. The operator T − λI is singular (i.e., is not invertible).

3. det(T − λI) = 0.

Proof. ((1) ⇒ (2)): We have (T − λI)(e) = 0 for some e ∈ V with e ≠ 0. So, T − λI is singular and (2) is established.

((2) ⇒ (1)): Since T − λI is singular, we have (T − λI)(e) = 0 for some e ∈ V with e ≠ 0. Therefore, (1) is established.

((2) ⇔ (3)): Immediate from the above lemma.

2.4 (Definition.) Let A ∈ Mn(F) be an n × n matrix with entries in a field F.

1. A scalar λ ∈ F is said to be a characteristic value of A if det(A − λIn) = 0. Equivalently, λ ∈ F is said to be a characteristic value of A if the matrix A − λIn is not invertible.

2. The monic polynomial det(XI − A) is said to be the characteristic polynomial of A. Therefore, the characteristic values of A are the roots of the characteristic polynomial of A.
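For concreteness, here is a small computational check of this definition, sketched in Python with the sympy library (the matrix A below is an example of mine, not one from the text). It computes the characteristic polynomial det(XI − A) and its roots, which are the characteristic values.

    from sympy import Matrix, symbols, roots

    x = symbols('x')

    # An example 2x2 matrix over Q (chosen for illustration).
    A = Matrix([[5, -2],
                [6, -2]])

    # sympy's charpoly uses the same convention det(xI - A).
    q = A.charpoly(x).as_expr()
    print(q)              # x**2 - 3*x + 2
    print(roots(q, x))    # {2: 1, 1: 1}: the characteristic values, with multiplicities
    print(A.eigenvals())  # the same answer, computed directly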

2.5 (Lemma.) Let A, B ∈ Mn(F) be two n × n matrices with entries in a field F. If A and B are similar, then they have the same characteristic polynomial.

Proof. Suppose A, B are similar matrices. Then A = PBP^{-1}. So the characteristic polynomial of A = det(XI − A) = det(XI − PBP^{-1}) = det(P(XI − B)P^{-1}) = det(XI − B) = the characteristic polynomial of B.

2.6 (Definitions and Facts) Let V be a vector space over a field F and T ∈ L(V, V) be a linear operator.

1. Let A be the matrix of T with respect to some basis E of V. We define the characteristic polynomial of T to be the characteristic polynomial of A. Note that this polynomial is well defined by the above lemma.

2. We say that T is diagonalizable if there is a basis e1, . . . , en such that each ei is a characteristic vector of T. In this case, T(ei) = λi ei for some λi ∈ F. Hence, with respect to this basis,

the matrix of T = Diagonal(λ1, λ2, . . . , λn).

Depending on how many of these eigen values λi are distinct, we can rewrite the matrix of T.

3. Also note, if T is diagonalizable as above, the characteristic polynomial of T = (X − λ1)(X − λ2) · · · (X − λn), which is completely factorizable.

4. Suppose T is diagonalizable, as above, and c1, c2, . . . , cr are the distinct eigen values of T. Then the matrix of T with respect to some basis of V looks like:

Diagonal(c1 I_{d1}, c2 I_{d2}, . . . , cr I_{dr}),

where Ik is the identity matrix of order k. So, d1 + d2 + · · · + dr = n = dim(V).

In this case, the characteristic polynomial of

T = (X − c1)^{d1} (X − c2)^{d2} · · · (X − cr)^{dr}.

Further, di = dim(N(ci)) (see 3 of 2.1).
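The following sketch (Python with sympy; the matrices P and D are examples of mine) illustrates this block description: a diagonalizable operator with a repeated eigen value, where the eigen space dimension matches the exponent di.

    from sympy import Matrix, diag, symbols

    x = symbols('x')

    # Build a non-diagonal matrix with eigen values 2, 2, 3 by conjugating
    # Diagonal(2, 2, 3) by an (arbitrarily chosen) invertible matrix P.
    P = Matrix([[1, 1, 0],
                [0, 1, 1],
                [1, 0, 1]])
    D = diag(2, 2, 3)
    A = P * D * P.inv()

    print(A.charpoly(x).as_expr().factor())   # (x - 2)**2 * (x - 3)

    # dim N(2) = 2 and dim N(3) = 1, matching d1 = 2 and d2 = 1.
    for value, mult, basis in A.eigenvects():
        print(value, len(basis))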

2.7 (Read Examples) Read Examples 1 and 2 on page 184.


2.2 Decomposition of V

2.8 (Definition) Suppose V is a vector space over a field F with dim(V) = n. Let f(X) = a0 + a1X + a2X^2 + · · · + arX^r ∈ F[X] be a polynomial and T ∈ L(V, V) be a linear operator. Then, by definition,

f(T) = a0 Id + a1T + a2T^2 + · · · + arT^r ∈ L(V, V)

is an operator. So, L(V, V) becomes a module over F[X].
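In matrix terms, f(T) is computed by substituting the matrix of T into f. A minimal sketch, assuming sympy (the helper name poly_at_operator is mine):

    from sympy import Matrix, Poly, eye, zeros, symbols

    x = symbols('x')

    def poly_at_operator(f, A):
        # Evaluate f(x) at the square matrix A by Horner's rule,
        # i.e. form a0*I + a1*A + ... + ar*A**r.
        n = A.rows
        result = zeros(n, n)
        for c in Poly(f, x).all_coeffs():   # coefficients, highest degree first
            result = result * A + c * eye(n)
        return result

    A = Matrix([[0, 1],
                [1, 0]])
    f = x**2 + 3*x + 1
    print(poly_at_operator(f, A))   # A**2 + 3*A + I = [[2, 3], [3, 2]]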

2.9 (Remark) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Let f(X) be the characteristic polynomial of T. We have an understandable interest in how f(T) works.

2.10 (Lemma) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Let f(X) ∈ F[X] be any polynomial. Suppose

T(v) = λv

for some v ∈ V and λ ∈ F. Then

f(T)(v) = f(λ)v.

The proof is obvious. This means that if λ is an eigen value of T, then f(λ) is an eigen value of f(T).
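A quick numeric check of the lemma (sympy again; the matrix, eigen pair, and polynomial are examples of mine):

    from sympy import Matrix, symbols

    x = symbols('x')

    A = Matrix([[2, 1],
                [0, 3]])
    v = Matrix([1, 1])            # an eigen vector: A*v = 3*v
    f = x**2 - 5*x + 1

    # f(A)*v should equal f(3)*v = -5*v.
    fA = A**2 - 5*A + Matrix.eye(2)
    print(fA * v)                 # Matrix([[-5], [-5]])
    print(f.subs(x, 3) * v)       # the same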

2.11 (Lemma) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Suppose c1, . . . , ck are the distinct eigen values of T. Let

Wi = N(ci)

be the eigen space of T associated to ci. Write

W = W1 + W2 + · · · + Wk.

Then

dim(W) = dim(W1) + dim(W2) + · · · + dim(Wk).

Indeed, if

Ei = {eij ∈ Wi : j = 1, . . . , di}

is a basis of Wi, then

E = {eij : j = 1, . . . , di; i = 1, . . . , k}

is a basis of W.


Proof. We only need to prove the last part. So, let

Σ λij eij = 0 (sum over i = 1, . . . , k and j = 1, . . . , di)

for some scalars λij ∈ F. Write

ωi = Σ_{j=1}^{di} λij eij.

Then ωi ∈ Wi and

ω1 + ω2 + · · · + ωk = 0. (I)

We will first prove that ωi = 0. Since

T(eij) = ci eij,

for any polynomial f(X) ∈ F[X] we have

f(T)(eij) = f(ci) eij.

Therefore,

f(T)(ωi) = Σ_{j=1}^{di} λij f(T)(eij) = Σ_{j=1}^{di} λij f(ci) eij = f(ci) ωi. (II)

Now let

g(X) = Π_{i=2}^{k} (X − ci) / Π_{i=2}^{k} (c1 − ci).

Note g(X) is a polynomial. Also note this definition/expression makes sense because c1, . . . , ck are distinct. And also g(c1) = 1 and g(c2) = g(c3) = · · · = g(ck) = 0.

Apply g(T) to (I) and use (II). We get

0 = g(T)(Σ_{i=1}^{k} ωi) = Σ_{i=1}^{k} g(T)(ωi) = Σ_{i=1}^{k} g(ci) ωi = ω1.

Similarly, ωi = 0 for i = 1, . . . , k. This means

0 = ωi = Σ_{j=1}^{di} λij eij.


Since Ei is a basis, λij = 0 for all i, j, and the proof is complete.

Following is the final theorem in this section.

2.12 (Theorem) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Suppose c1, . . . , ck are the distinct eigen values of T. Let

Wi = N(ci)

be the eigen space of T associated to ci. Then the following are equivalent:

1. T is diagonalizable.

2. The characteristic polynomial of T is

f = (X − c1)^{d1} (X − c2)^{d2} · · · (X − ck)^{dk}

and dim(Wi) = di for i = 1, . . . , k.

3. dim(W1) + dim(W2) + · · · + dim(Wk) = dim(V).

Proof. ((1) ⇒ (2)): This is in fact obvious. If c1, . . . , ck are the distinct eigen values and T is diagonalizable, the matrix of T is as in (4) of (2.6). Therefore, we can compute the characteristic polynomial using this matrix and (2) is established.

((2) ⇒ (3)): We have dim(V) = degree(f). Therefore,

dim(V) = d1 + d2 + · · · + dk = dim(W1) + dim(W2) + · · · + dim(Wk).

Hence (3) is established.

((3) ⇒ (1)): Write W = W1 + · · · + Wk. Then, by lemma 2.11,

dim(W) = dim(W1) + dim(W2) + · · · + dim(Wk).

Therefore, by (3), dim(V) = dim(W), and hence V = W. So the union of bases of the Wi gives a basis of V consisting of eigen vectors, and T is diagonalizable. Hence (1) is established and the proof is complete.

In fact, I would like to restate the "final theorem" 2.12 in terms of direct sums of linear subspaces. So, I need to define the direct sum of vector spaces.


2.13 (Definition) Let V be a vector space over F and V1, V2, . . . , Vk be subspaces of V. We say that V is the direct sum of V1, V2, . . . , Vk if each element x ∈ V can be written uniquely as

x = ω1 + ω2 + · · · + ωk

with ωi ∈ Vi. Equivalently, if

1. V = V1 + V2 + · · · + Vk, and

2. ω1 + ω2 + · · · + ωk = 0 with ωi ∈ Vi implies that ωi = 0 for i = 1, . . . , k.

If V is the direct sum of V1, V2, . . . , Vk then we write

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk.

Following is a proposition on direct sum decompositions.

2.14 (Proposition) Let V be a vector space over F with dim(V) = n < ∞. Let V1, V2, . . . , Vk be subspaces of V. Then

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk

if and only if V = V1 + V2 + · · · + Vk and

dim(V) = dim(V1) + dim(V2) + · · · + dim(Vk).

Proof. (⇒): Obvious.

(⇐): Let Ei = {eij : j = 1, . . . , di} be a basis of Vi. Let E = {eij : j = 1, . . . , di; i = 1, . . . , k}. Since V = V1 + V2 + · · · + Vk, we have V = Span E. Since dim(V) = cardinality(E), E forms a basis of V. Now it follows that if ω1 + · · · + ωk = 0 with ωi ∈ Vi then ωi = 0 for all i. This completes the proof.

Now we restate the final theorem 2.12 in terms of direct sums.


2.15 (Theorem) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Suppose c1, . . . , ck are the distinct eigen values of T. Let

Wi = N(ci)

be the eigen space of T associated to ci. Then the following are equivalent:

1. T is diagonalizable.

2. The characteristic polynomial of T is

f = (X − c1)^{d1} (X − c2)^{d2} · · · (X − ck)^{dk}

and dim(Wi) = di for i = 1, . . . , k.

3. dim(W1) + dim(W2) + · · · + dim(Wk) = dim(V).

4. V = W1 ⊕ W2 ⊕ · · · ⊕ Wk.

Proof. We already proved

(1) ⇐⇒ (2) ⇐⇒ (3).

We will prove (3) ⇐⇒ (4).

((4) ⇒ (3)): This part is obvious because we can combine bases of the Wi to get a basis of V.

((3) ⇒ (4)): Write W = W1 + W2 + · · · + Wk. By (3) and lemma 2.11, dim(W) = Σ dim(Wi) = dim(V). Therefore, V = W = W1 + W2 + · · · + Wk. Since dim(V) = Σ dim(Wi), by proposition 2.14, V = W1 ⊕ W2 ⊕ · · · ⊕ Wk and the proof is complete.
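Condition (3) is easy to test computationally. A sketch with sympy (the test matrices are mine; eigenvects returns, for each eigen value, a basis of the corresponding eigen space):

    from sympy import Matrix

    def diagonalizable_by_eigenspaces(A):
        # T is diagonalizable iff the eigen space dimensions add up to dim(V).
        total = sum(len(basis) for _, _, basis in A.eigenvects())
        return total == A.rows

    print(diagonalizable_by_eigenspaces(Matrix([[2, 0], [0, 3]])))   # True
    print(diagonalizable_by_eigenspaces(Matrix([[2, 1], [0, 2]])))   # False
    print(Matrix([[2, 1], [0, 2]]).is_diagonalizable())              # sympy agrees: False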



3 Annihilating Polynomials

Suppose K is a commutative ring and M is a K-module. For x ∈ M, we define the annihilator of x as

ann(x) = {λ ∈ K : λx = 0}.

Note that ann(x) is an ideal of K. (That means ann(x) + ann(x) ⊆ ann(x) and K · ann(x) ⊆ ann(x).)

We shall consider the annihilator of a linear operator, as follows.

3.1 Minimal (monic) polynomials

3.1 (Facts) Let V be a vector space over a field F with dim(V) = n. Recall, we have seen that M = L(V, V) is an F[X]-module. For f(X) ∈ F[X] and T ∈ L(V, V), the scalar multiplication is defined by

f ∗ T = f(T) ∈ L(V, V).

1. So, for a linear operator T ∈ L(V, V), the annihilator of T,

ann(T) = {f(X) ∈ F[X] : f(T) = 0},

is an ideal of the polynomial ring F[X].

2. Note that ann(T) is a non-zero proper ideal. It is non-zero because dim(L(V, V)) = n^2 and hence

1, T, T^2, . . . , T^{n^2}

is a linearly dependent set.

3. Also recall that any ideal I of F[X] is a principal ideal, which means that I = F[X]p where p is the non-zero monic polynomial in I of least degree.


4. Therefore,

ann(T) = F[X]p(X)

where p(X) is the monic polynomial of least degree such that p(T) = 0. This polynomial p(X) is defined to be the minimal monic polynomial (MMP) of T.

5. Let us consider similar concepts for square matrices.

(a) For an n × n matrix A, we define the annihilator ann(A) of A and the minimal monic polynomial of A in a similar way.

(b) Suppose two n × n matrices A, B are similar and A = PBP^{-1}. Then for a polynomial f(X) ∈ F[X] we have

f(A) = P f(B) P^{-1}.

(c) Therefore ann(A) = ann(B).

(d) Hence A and B have the same minimal monic polynomial.
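Since (by the Cayley-Hamilton theorem, 3.3 below) the MMP divides the characteristic polynomial, one can find it by brute force: factor the characteristic polynomial and test its monic divisors in order of degree. A sketch assuming sympy (the function name and the sample matrix are mine):

    from itertools import product
    from math import prod
    from sympy import Matrix, Poly, symbols, factor_list, eye, zeros

    x = symbols('x')

    def minimal_monic_polynomial(A):
        n = A.rows
        q = A.charpoly(x).as_expr()
        _, factors = factor_list(q, x)   # [(irreducible factor, exponent), ...]
        # All monic divisors of q, sorted by degree.
        divisors = [Poly(prod(f**e for (f, _), e in zip(factors, exps)), x)
                    for exps in product(*[range(e + 1) for _, e in factors])]
        divisors.sort(key=lambda d: d.degree())
        for d in divisors:
            M = zeros(n, n)
            for c in d.all_coeffs():     # Horner evaluation of d at A
                M = M * A + c * eye(n)
            if M == zeros(n, n):
                return d.as_expr()       # least-degree monic d with d(A) = 0

    A = Matrix([[2, 0, 0],
                [0, 2, 0],
                [0, 0, 3]])
    print(minimal_monic_polynomial(A))   # x**2 - 5*x + 6, i.e. (x - 2)*(x - 3)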

3.2 Comparison of minimal monic and characteristic polynomials

Given a linear operator T we can think of two polynomials: the minimal monic polynomial and the characteristic polynomial of T. We will compare them.

3.2 (Theorem) Let V be a vector space over a field F with dim(V) = n, and let T ∈ L(V, V). Suppose p(X) is the minimal monic polynomial of T and g(X) is the characteristic polynomial of T. Then p, g have the same roots in F. (Although the multiplicities may differ.) The same statement holds for matrices.

Proof. We will prove, for c ∈ F,

p(c) = 0 ⇐⇒ g(c) = 0.


Recall g(X) = det(XI − A), where A is the matrix of T with respect to some basis. Also, by theorem 2.3, g(c) = 0 if and only if cI − T is singular.

Now suppose p(c) = 0. So, p(X) = (X − c)q(X) for some q(X) ∈ F[X]. Since degree(q) < degree(p), by the minimality of p we have q(T) ≠ 0. Let v ∈ V be such that e = q(T)(v) ≠ 0. Since p(T) = 0, we have (T − cI)q(T) = 0. Hence 0 = (T − cI)q(T)(v) = (T − cI)(e). So, T − cI is singular and hence g(c) = 0. This establishes this part of the proof.

Now assume that g(c) = 0. Therefore, T − cI is singular. So, there is a vector e ∈ V with e ≠ 0 such that T(e) = ce. Applying lemma 2.10 to p, we have

p(T)(e) = p(c)e.

Since p(T) = 0 and e ≠ 0, we have p(c) = 0 and the proof is complete.

The above theorem raises the question whether these two polynomials are the same. The answer is: not in general. But the MMP divides the characteristic polynomial, as follows.

3.3 (Cayley-Hamilton Theorem) Let V be a vector space over a field F with dim(V) = n. Suppose Q(X) is the characteristic polynomial of T ∈ L(V, V). Then Q(T) = 0. In particular, if p(X) is the minimal monic polynomial of T, then

p | Q.

Proof. Write

K = F[T] = {f(T) : f(X) ∈ F[X]}.

Observe that

F ⊆ K ⊆ L(V, V)

are subrings. Note Q(T) ∈ K and we will prove Q(T) = 0.

Let e1, . . . , en be a basis of V and A = (aij) be the matrix of T. So, we have


(T(e1), T(e2), . . . , T(en)) = (e1, e2, . . . , en)A. (I)

Consider the following matrix, with entries in K:

B =
⎛ T − a11 I    −a12 I      −a13 I     · · ·     −a1n I ⎞
⎜  −a21 I    T − a22 I     −a23 I     · · ·     −a2n I ⎟
⎜  −a31 I     −a32 I     T − a33 I    · · ·     −a3n I ⎟
⎜   · · ·      · · ·        · · ·     · · ·      · · ·  ⎟
⎝  −an1 I     −an2 I       −an3 I     · · ·   T − ann I ⎠

Note that

Q(X) = det(InX − A).

Therefore (I think this is the main point to understand in this proof),

Q(T) = det(InT − A) = det(B).

The above equation (I) says that

(e1, e2, . . . , en)B = (0, 0, . . . , 0).

Multiply this equation by Adj(B), and we get

(e1, e2, . . . , en)B Adj(B) = (0, 0, . . . , 0) Adj(B) = (0, 0, . . . , 0).

Therefore,

(e1, e2, . . . , en)(det(B))In = (0, 0, . . . , 0).

Therefore,

(e1, e2, . . . , en)(Q(T))In = (0, 0, . . . , 0).

This implies that

Q(T)(ei) = 0 for all i = 1, . . . , n.

Hence Q(T) = 0 and the proof is complete.
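A direct numeric verification of Q(T) = 0 on a sample matrix (sympy; the matrix is an arbitrary example of mine):

    from sympy import Matrix, symbols, eye, zeros

    x = symbols('x')

    A = Matrix([[1, 2, 0],
                [3, -1, 4],
                [0, 5, 2]])
    Q = A.charpoly(x)

    M = zeros(3, 3)
    for c in Q.all_coeffs():      # Horner evaluation of Q at A
        M = M * A + c * eye(3)
    print(M == zeros(3, 3))       # True: A satisfies its own characteristic polynomial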



4 Invariant Subspaces

4.1 (Definition) Let V be a vector space over the field F and T : V → V be a linear operator. A subspace W of V is said to be invariant under T if

T(W) ⊆ W.

4.2 (Examples) Let V be a vector space over the field F and T : V → V be a linear operator.

1. (Trivial Examples) V and {0} are invariant under T.

2. Suppose e is an eigen vector of T and W = Fe. Then W is invariant under T.

3. Suppose λ is an eigen value of T and W = N(λ) is the eigen space of λ. Then W is invariant under T.

4.3 (Remark) Let V be a vector space over the field F and T : V → V be a linear operator. Suppose W is an invariant subspace of T. Then the restriction map

T|W : W → W

is a well defined linear operator on W. So, the following diagram commutes (the vertical maps are the inclusions W ⊆ V):

         T|W
    W ---------> W
    |            |
    v            v
    V ---------> V
          T

4.4 (Remark) Let V be a vector space over the field F with dim(V) = n < ∞. Let T : V → V be a linear operator. Suppose W is an invariant subspace of T and

T|W : W → W

is the restriction of T.

1. Let p be the characteristic polynomial of T and q be the characteristic polynomial of T|W. Then q | p.

2. Also let P be the minimal (monic) polynomial of T and Q be the minimal (monic) polynomial of T|W. Then Q | P.

Proof. The proof of (2) is easier. Since P(T) = 0 we also have P(T|W) = 0. Therefore

P(X) ∈ ann(T|W) = F[X]Q(X).

Hence Q | P and the proof of (2) is complete.

To prove (1), let {e1, e2, . . . , er} be a basis of W. Extend this basis to a basis E = {e1, e2, . . . , er, er+1, . . . , en} of V. Let A be the matrix of T with respect to E and B be the matrix of T|W with respect to {e1, . . . , er}. So, we have

(T(e1), . . . , T(er)) = (e1, . . . , er)B

and

(T(e1), . . . , T(er), T(er+1), . . . , T(en)) = (e1, . . . , er, er+1, . . . , en)A.

Therefore, A can be written in blocks as follows:

A = ⎛ B  C ⎞
    ⎝ 0  D ⎠

So,

p(X) = det(InX − A) = det(IrX − B) det(In−rX − D) and q(X) = det(IrX − B).

Therefore q | p. The proof is complete.

4.5 (Definitions and Remarks)

1. Suppose F is a field. Recall an n × n matrix A = (aij) is called an upper triangular matrix if aij = 0 for all i, j with 1 ≤ j < i ≤ n (i.e., all entries below the diagonal are zero). Similarly, we define lower triangular matrices.

2. Now let V be a vector space over F with dim V = n < ∞. A linear operator T : V → V is said to be triangulable if there is a basis E = {e1, . . . , en} of V such that the matrix of T is an (upper) triangular matrix. (Note that it does not make a difference if we say "upper" or "lower" triangular. To avoid confusion, we will assume upper triangular.)

3. Now suppose a linear operator T is triangulable. So, for a basis E = {e1, . . . , en} we have (T(e1), . . . , T(en)) = (e1, . . . , en)A for some triangular matrix A = (aij). We assume that A is upper triangular. For 1 ≤ r ≤ n, write Wr = Span(e1, . . . , er). Then Wr is invariant under T.

4. (Factorization.) Suppose T ∈ L(V, V) is triangulable. So, the matrix of T, with respect to a basis e1, . . . , en, is an upper triangular matrix A = (aij). Note that the characteristic polynomial q of T is given by

q(X) = det(IX − A) = (X − a11)(X − a22) · · · (X − ann).

Therefore, q is completely factorizable. So, we have

q(X) = (X − c1)^{d1} (X − c2)^{d2} · · · (X − ck)^{dk},

where d1 + d2 + · · · + dk = dim V and c1, . . . , ck are the distinct eigen values of T.

Also, since the minimal monic polynomial p of T divides q, it follows that p is also completely factorizable. Therefore,

p(X) = (X − c1)^{r1} (X − c2)^{r2} · · · (X − ck)^{rk},

where ri ≤ di for i = 1, . . . , k.

4.6 (Theorem) Let V be a vector space over F with finite dimension dim V = n and T : V → V be a linear operator on V. Then T is triangulable if and only if the minimal polynomial p of T is a product of linear factors.

Proof. (⇒): We have already shown in (4) of Remark 4.5 that if T is triangulable then the MMP p factors into linear factors.


(⇐): Now assume that the MMP p factors as

p(X) = (X − c1)^{r1} (X − c2)^{r2} · · · (X − ck)^{rk}.

Let q denote the characteristic polynomial of T. Since p and q have the same roots, q(c1) = q(c2) = · · · = q(ck) = 0. Now we will split the proof into several steps.

Step-1: Write λ1 = c1. By (2.3), λ1 is an eigen value of T. So, there is a non-zero vector e1 ∈ V such that T(e1) = λ1e1. Write W1 = Span(e1).

Step-2: Extend e1 to a basis e1, E2, . . . , En of V. Write V1 = Span(E2, . . . , En). Note that

1. e1 is linearly independent and dim W1 = 1.

2. W1 is invariant under T.

3. dim V1 = n − 1 and V = W1 ⊕ V1.

Let v ∈ V1 and T(v) = λ1e1 + λ2E2 + · · · + λnEn for some λ1, . . . , λn ∈ F. Define T1(v) = λ2E2 + · · · + λnEn ∈ V1. Then

T1 : V1 → V1

is a well defined linear operator on V1.

Diagrammatically, T1 is the composite

    T1 : V1 ⊆ V = W1 ⊕ V1 --T--> V = W1 ⊕ V1 --pr--> V1,

where pr : V = W1 ⊕ V1 → V1 is the projection map. Let p1 be the MMP of T1. Now, we proceed to prove that p1 | p.

Claim: ann(T) ⊆ ann(T1).

To prove this claim, let A be the matrix of T with respect to e1, E2, . . . , En and B be the matrix of T1 with respect to E2, . . . , En. Since W1 is invariant under T, we have

A = ⎛ λ1  C ⎞
    ⎝ 0   B ⎠

Therefore, the matrix of T^m is given by

A^m = ⎛ λ1^m  Cm  ⎞
      ⎝ 0     B^m ⎠

for some matrix Cm. Therefore, for a polynomial f(X) ∈ F[X], the matrix f(A) of f(T) is given by

f(A) = ⎛ f(λ1)  C*   ⎞
       ⎝ 0      f(B) ⎠

for some matrix C*. So, if f(X) ∈ ann(T) then f(T) = 0. Hence f(A) = 0. This implies f(B) = 0 and hence f(T1) = 0. So, ann(T) ⊆ ann(T1) and the claim is established.

Therefore, p1 | p. So, p1 is also a product of linear factors, and T1 satisfies the hypothesis of the theorem. So, there is an element e2 ∈ V1 with e2 ≠ 0 such that T1(e2) = λ2e2, where

(X − λ2) | p1 | p.

It also follows that T(e2) = a e1 + λ2e2 for some a ∈ F.

Step-3: Write W2 = Span(e1, e2). Note that

1. e1, e2 are linearly independent and dim W2 = 2.

2. W2 is invariant under T.

3. Also

(T(e1), T(e2)) = (e1, e2) ⎛ λ1  a12 ⎞
                          ⎝ 0   λ2  ⎠

Step-4: If W2 ≠ V (that is, if 2 < n), the process will continue. We extend e1, e2 to a basis e1, e2, E3, . . . , En of V (well, these are different Ei, not the same as in the previous steps). Write V2 = Span(E3, . . . , En). Note

1. dim(V2) = n − 2.

2. V = W2 ⊕ V2.

As in the previous steps, define T2 : V2 → V2 as in the diagram (you should define it explicitly):


    T2 : V2 ⊆ V = W2 ⊕ V2 --T--> V = W2 ⊕ V2 --pr--> V2,

where pr : V = W2 ⊕ V2 → V2 is the projection map.

Let p2 be the MMP of T2. By the same argument, p2 | p. Then we can find λ3 ∈ F and e3 ∈ V2 such that T2(e3) = λ3e3, where (X − λ3) | p2 | p. Therefore T(e3) = a13e1 + a23e2 + λ3e3.

So, we have

(T(e1), T(e2), T(e3)) = (e1, e2, e3) ⎛ λ1  a12  a13 ⎞
                                     ⎜ 0   λ2   a23 ⎟
                                     ⎝ 0   0    λ3  ⎠

Final Step: The process continues for n steps and we get a linearly independent set (basis) e1, e2, . . . , en such that

(T(e1), T(e2), T(e3), . . . , T(en)) = (e1, e2, e3, . . . , en) ⎛ λ1   a12   a13   · · ·   a1n ⎞
                                                               ⎜ 0    λ2    a23   · · ·   a2n ⎟
                                                               ⎜ 0    0     λ3    · · ·   a3n ⎟
                                                               ⎜ · · ·  · · ·  · · ·  · · ·   ⎟
                                                               ⎝ 0    0     0     · · ·   λn  ⎠

This completes the proof.


Recall a field F is said to be an algebraically closed field if every non-constant polynomial f ∈ F[X] has a root in F. It follows that F is an algebraically closed field if and only if every non-constant polynomial f ∈ F[X] is a product of linear polynomials.

4.7 (Theorem) Suppose F is an algebraically closed field. Then every n × n matrix A over F is similar to a triangular matrix.

Proof. Consider the operator

T : F^n → F^n

such that T(X) = AX. Since F is algebraically closed, the MMP of T factors into linear factors; now use the above theorem.
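For instance, over C the rotation matrix below has no real eigen values but is similar to a (here even diagonal) triangular matrix. A sketch with sympy, whose Jordan form (always upper triangular) exhibits the similarity; the matrix is an example of mine:

    from sympy import Matrix, simplify

    A = Matrix([[0, -1],
                [1,  0]])                  # eigen values are i and -i, not in R
    P, J = A.jordan_form()                 # A = P * J * P**(-1), J upper triangular
    print(J)                               # Diagonal(-i, i)
    print(simplify(P * J * P.inv() - A))   # the zero matrix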

4.8 (Theorem) Let V be a vector space over F with finite dimension dim V = n and T : V → V be a linear operator on V. Then T is diagonalizable if and only if the minimal polynomial p of T is of the form

p = (X − c1)(X − c2) · · · (X − ck)

where c1, c2, . . . , ck are the distinct eigen values of T.

Proof. (⇒): Suppose T is diagonalizable. Then there is a basis e1, . . . , en of V such that

(T(e1), T(e2), . . . , T(en)) = (e1, . . . , en) Diagonal(c1 I_{d1}, c2 I_{d2}, . . . , ck I_{dk}).

Write g(X) = (X − c1)(X − c2) · · · (X − ck); we will prove g(T) = 0. For i = 1, . . . , d1 we have (T − c1)(ei) = 0. Since the factors of g(T) commute,

g(T)(ei) = (T − c2) · · · (T − ck)(T − c1)(ei) = 0.

Similarly, g(T)(ei) = 0 for all i = 1, . . . , n. So, g(T) = 0. Hence p | g. Since c1, . . . , ck are roots of both, we have p = g. Hence this part of the proof is complete.



(⇐): We assume that p(X) = (X − c1)(X − c2) · · · (X − ck) and prove that T is diagonalizable. Let Wi = N(ci) be the eigen space of ci. Let W = W1 + W2 + · · · + Wk be the sum of the eigen spaces. Assume that W ≠ V. Now we will repeat some portions of the proof of theorem 4.6 and get a contradiction. Let e1, . . . , em be a basis of W and e1, . . . , em, Em+1, . . . , En be a basis of V. Write V′ = Span(Em+1, . . . , En). Note

1. W is invariant under T.

2. V = W ⊕ V′.

Define T′ : V′ → V′ as the composite

    T′ : V′ ⊆ V = W ⊕ V′ --T--> V = W ⊕ V′ --pr--> V′,

where pr : V = W ⊕ V′ → V′ is the projection map.

As in the proof of theorem 4.6, the MMP p′ of T′ divides p. Therefore, there is a non-zero element e ∈ V′ such that T′(e) = λe for some λ ∈ {c1, c2, . . . , ck}. We assume λ = c1. Hence

T(e) = a1e1 + · · · + amem + c1e

where ai ∈ F. We can rewrite this equation as

T(e) = β + c1e

where β = ω1 + ω2 + · · · + ωk ∈ W and ωi ∈ Wi. So,

(T − c1)(e) = β.

Since T(W) ⊆ W, for h(X) ∈ F[X] we have h(T)(β) ∈ W. Write

p = (X − c1)q and q(X) − q(c1) = h(X)(X − c1).

Then

(q(T) − q(c1))(e) = h(T)(T − c1)(e) = h(T)(β)

is in W. Also

0 = p(T)(e) = (T − c1)q(T)(e).

Therefore q(T)(e) ∈ W1 ⊆ W. So, q(c1)e = q(T)(e) − (q(T) − q(c1))(e) is in W. Since q = (X − c2) · · · (X − ck) and the ci are distinct, q(c1) ≠ 0, so we get e ∈ W ∩ V′ = {0}. This contradicts e ≠ 0, and the proof is complete.
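The standard 2 × 2 example illustrates the theorem: the matrix below has MMP (X − 1)^2, which has a repeated linear factor, so it is triangulable (theorem 4.6) but not diagonalizable. A sympy check:

    from sympy import Matrix

    A = Matrix([[1, 1],
                [0, 1]])
    # (X - 1) does not annihilate A, but (X - 1)**2 does: the MMP is (X - 1)**2.
    print(A - Matrix.eye(2) == Matrix.zeros(2, 2))          # False
    print((A - Matrix.eye(2))**2 == Matrix.zeros(2, 2))     # True
    print(A.is_diagonalizable())                            # False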



5 Simultaneous Triangulation and Diagonalization

Suppose 𝓕 ⊆ L(V, V) is a family of linear operators on a vector space V over a field F. We say that 𝓕 is a commuting family if TU = UT for all U, T ∈ 𝓕.

In this section we try to find a basis E of V so that, for every T in the family 𝓕, the matrix of T is diagonal (or triangular) with respect to E. Following are the main theorems.

5.1 (Theorem) Let V be a finite dimensional vector space with dim V = n over a field F. Let 𝓕 ⊆ L(V, V) be a commuting family of triangulable operators on V. Then there is a basis E = {e1, . . . , en} such that, for every T ∈ 𝓕, the matrix of T with respect to E is a triangular matrix.

Proof. The proof is fairly similar to that of theorem 4.6. We will omit the proof. You can work it out when you need it.

Following is the matrix version of the above theorem.

5.2 (Theorem) Let 𝓕 ⊆ Mnn(F) be a commuting family of triangulable n × n matrices. Then there is an invertible matrix P such that, for every A ∈ 𝓕, the matrix PAP^{-1} is an upper triangular matrix.

5.3 (Theorem) Let V be a finite dimensional vector space with dim V = n over a field F. Let 𝓕 ⊆ L(V, V) be a commuting family of diagonalizable operators on V. Then there is a basis E = {e1, . . . , en} such that, for every T ∈ 𝓕, the matrix of T with respect to E is a diagonal matrix.

Proof. We will omit the proof. You can work it out when you need it.
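A small illustration of theorem 5.3 (sympy; the commuting pair and the common eigen basis are examples of mine):

    from sympy import Matrix

    A = Matrix([[1, 1], [1, 1]])
    B = Matrix([[0, 1], [1, 0]])
    print(A * B == B * A)        # True: {A, B} is a commuting family

    # The eigen vectors (1, 1) and (1, -1) of B also diagonalize A.
    P = Matrix([[1, 1],
                [1, -1]])
    print(P.inv() * A * P)       # Diagonal(2, 0)
    print(P.inv() * B * P)       # Diagonal(1, -1)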



6 Direct Sum

Part of this section we have already touched. We gave the definition 2.13 of the direct sum of subspaces. Note that we can make the same definition for any subspace W of V. Following is an exercise.

6.1 (Exercise) Let V be a finite dimensional vector space over a field F. Let W1, . . . , Wk be subspaces of V. Then V = W1 ⊕ W2 ⊕ · · · ⊕ Wk if and only if V = W1 + W2 + · · · + Wk and, for each j = 2, . . . , k, we have

(W1 + · · · + Wj−1) ∩ Wj = {0}.

6.2 (Examples) (1) R^2 = Re1 ⊕ Re2 where e1 = (1, 0), e2 = (0, 1).

(2) Let V = Mnn(F). Let U be the subspace of all upper triangular matrices. Let L be the subspace of all strictly lower triangular matrices (that means the diagonal entries are zero). Then V = U ⊕ L.

(3) Recall from theorem 2.15 that V is the direct sum of the eigen spaces of a diagonalizable operator T.

We used the word 'projection' before in the context of direct sums. Here we define projections.

6.3 (Definition) Let V be a finite dimensional vector space over a field F. A linear operator E : V → V is said to be a projection if E^2 = E.

6.4 (Observations) Let V be a finite dimensional vector space over a field F. Let E : V → V be a projection. Let R = range(E) and N = N_E be the null space of E. Then

1. For x ∈ V, we have x ∈ R ⇔ E(x) = x.

2. V = N ⊕ R.

3. For v ∈ V, we have v = (v − E(v)) + E(v) ∈ N ⊕ R.

4. Let V = W1 ⊕ W2 ⊕ · · · ⊕ Wk be a direct sum of subspaces Wi. Define operators Ei : V → V by

Ei(v) = vi where v = v1 + · · · + vk, vi ∈ Wi.


Note the Ei are well defined projections with

range(Ei) = Wi and N_{Ei} = Ŵi,

where Ŵi = W1 ⊕ · · · ⊕ Wi−1 ⊕ Wi+1 ⊕ · · · ⊕ Wk.

Following is a theorem on projections.

6.5 (Theorem) Let V be a finite dimensional vector space over a field F. Suppose V = W1 ⊕ W2 ⊕ · · · ⊕ Wk is a direct sum of subspaces Wi. Then there are k linear operators E1, . . . , Ek on V such that

1. each Ei is a projection (i.e. Ei^2 = Ei);

2. EiEj = 0 for all i ≠ j;

3. E1 + E2 + · · · + Ek = I;

4. range(Ei) = Wi.

Conversely, if E1, . . . , Ek are k linear operators on V satisfying conditions (2) and (3) above, then each Ei is a projection (i.e. (1) holds), and with Wi = Ei(V) we have V = W1 ⊕ W2 ⊕ · · · ⊕ Wk.

Proof. The proof is easy and left as an exercise. First, try it with k = 2 operators, if you like.
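A sketch of the first half of the theorem for k = 2 in V = Q^3, assuming sympy (the subspaces W1, W2 and their bases are examples of mine): expressing v in the combined basis and keeping only the Wi-coordinates gives the Ei.

    from sympy import Matrix, diag, zeros

    # W1 = span{(1, 0, 1)}, W2 = span{(0, 1, 0), (1, 0, -1)}.
    basis = Matrix([[1, 0, 1],
                    [0, 1, 0],
                    [1, 0, -1]]).T    # columns: basis of W1, then of W2
    C = basis.inv()                   # coordinates with respect to that basis

    E1 = basis * diag(1, 0, 0) * C    # keep the W1-coordinate, kill the rest
    E2 = basis * diag(0, 1, 1) * C    # keep the W2-coordinates

    print(E1 * E1 == E1, E2 * E2 == E2)   # True True: projections
    print(E1 * E2 == zeros(3, 3))         # True: E1*E2 = 0
    print(E1 + E2 == Matrix.eye(3))       # True: E1 + E2 = I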

Homework: page 213, Exercises 1, 3, 4-7, 9.


7 Invariant Direct Sums

This section deals with some very natural concepts. Suppose V is a vector space over a field F and V = W1 ⊕ W2 ⊕ · · · ⊕ Wk, where the Wi are subspaces. Suppose for each i = 1, . . . , k we are given a linear operator Ti ∈ L(Wi, Wi) on Wi. Then we can define a linear operator T : V → V such that

T(Σ_{i=1}^{k} vi) = Σ_{i=1}^{k} Ti(vi) for vi ∈ Wi.

So the restriction T|Wi = Ti. This means that the diagram

         Ti
    Wi --------> Wi
    |            |
    v            v
    V ---------> V
          T

commutes (the vertical maps are the inclusions).

Conversely, suppose V is a vector space over a field F and V = W1 ⊕ W2 ⊕ · · · ⊕ Wk, where the Wi are subspaces. Let T ∈ L(V, V) be a linear operator. Assume that the Wi are invariant under T. Then we can define linear operators Ti : Wi → Wi by Ti(v) = T(v) for v ∈ Wi. Therefore, the above diagram commutes and T can be reconstructed from T1, . . . , Tk, in the same way as above.
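In matrix terms, choosing a basis of V adapted to the decomposition makes T block diagonal, with blocks the matrices of the Ti. A one-line sympy illustration (the blocks T1, T2 are examples of mine):

    from sympy import Matrix, diag

    T1 = Matrix([[2, 1],
                 [0, 2]])     # an operator on W1
    T2 = Matrix([[5]])        # an operator on W2
    T = diag(T1, T2)          # the block diagonal matrix of T on V = W1 (+) W2
    print(T)                  # [[2, 1, 0], [0, 2, 0], [0, 0, 5]]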



8 Primary Decomposition

We studied linear operators T on V under the assumption that the characteristic polynomial q or the MMP p splits completely into linear factors. In this section we will not have this assumption. Here we will exploit the fact that q, p have unique factorization.

8.1 (Primary Decomposition Theorem) Let V be a vector space over F with finite dimension dim V = n and T : V → V be a linear operator on V. Let p be the minimal monic polynomial (MMP) of T and

p = p1^{r1} p2^{r2} · · · pk^{rk}

where ri > 0 and the pi are distinct irreducible monic polynomials in F[X]. Let

Wi = {v ∈ V : pi(T)^{ri}(v) = 0}

be the null space of pi(T)^{ri}. Then

1. V = W1 ⊕ · · · ⊕ Wk;

2. each Wi is invariant under T;

3. let Ti = T|Wi : Wi → Wi be the operator on Wi induced by T. Then the minimal monic polynomial of Ti is pi^{ri}.

Proof. Write

fi = p / pi^{ri} = Π_{j≠i} pj^{rj}.

Note that f1, f2, . . . , fk have no common factor. So

GCD(f1, f2, . . . , fk) = 1.

Therefore

f1g1 + f2g2 + · · · + fkgk = 1

for some gi ∈ F[X].

For i = 1, . . . , k, let hi = figi and Ei = hi(T) ∈ L(V, V). Then

E1 + E2 + · · · + Ek = Σ_{i=1}^{k} hi(T) = Id. (I)


Also, for i ≠ j, note that p | hihj. Since p(T) = 0, we have

EiEj = hi(T)hj(T) = 0. (II)

Write Wi′ = Ei(V), the range of Ei. By the converse part of theorem 6.5, it follows that V = W1′ ⊕ · · · ⊕ Wk′.

By (I), we have T = TE1 + TE2 + · · · + TEk. So

T(Wi′) = T(Ei(V)) = Σ_{j=1}^{k} TEjEi(V) = TEi^2(V) = TEi(V) = EiT(V) ⊆ Ei(V) = Wi′.

Therefore, Wi′ is invariant under T. We will show that Wi′ = Wi, the null space of pi(T)^{ri}.

We have

pi(T)^{ri}(Wi′) = pi(T)^{ri} fi(T) gi(T)(V) = p(T) gi(T)(V) = 0.

So, Wi′ ⊆ Wi, the null space of pi(T)^{ri}.

Now suppose w ∈ Wi. So, pi(T)^{ri}(w) = 0. For j ≠ i, we have pi^{ri} | fjgj = hj and hence Ej(w) = hj(T)(w) = 0. Therefore w = Σ_{j=1}^{k} Ej(w) = Ei(w) is in Wi′. So, Wi ⊆ Wi′. Therefore Wi = Wi′, and (1) and (2) are established.

Now Ti : Wi → Wi is the restriction of T to Wi. It remains to show that the MMP of Ti is pi^{ri}. It is enough to show this for i = 1, that is, that the MMP of T1 is p1^{r1}.

We have p1(T1)^{r1} = 0, because W1 is the null space of p1(T)^{r1}. Therefore p1^{r1} ∈ ann(T1).

Now suppose g ∈ ann(T1). So, g(T1) = 0. Consider

g(T)f1(T) = g(T) Π_{j=2}^{k} pj(T)^{rj}.

Since g(T)|W1 = g(T1) = 0, g(T) vanishes on W1; and for j = 2, . . . , k, pj(T)^{rj} vanishes on Wj. Therefore, g(T)f1(T) = 0. Hence p | gf1. Hence p1^{r1} = p/f1 divides g. Therefore p1^{r1} is the MMP of T1 and the proof is complete.


Remarks. (1) Note that the projections Ei = hi(T) in the above theorem are polynomials in T.

(2) Also think about what it means if some (or all) of the irreducible factors pi = (X − λi) are linear.
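The projections can be computed exactly as in the proof, from the Bezout identity f1g1 + · · · + fkgk = 1. A sketch for k = 2 with sympy (the matrix A, whose MMP is (X − 2)(X^2 + 1) with X^2 + 1 irreducible over Q, is an example of mine; gcdex returns the Bezout coefficients):

    from sympy import Matrix, Poly, symbols, gcdex, eye, zeros

    x = symbols('x')

    def at(f, A):
        # Horner evaluation of the polynomial f(x) at the matrix A.
        M = zeros(A.rows, A.rows)
        for c in Poly(f, x).all_coeffs():
            M = M * A + c * eye(A.rows)
        return M

    A = Matrix([[0, -1, 0],
                [1,  0, 0],
                [0,  0, 2]])
    f1, f2 = x - 2, x**2 + 1          # fi = p / pi**ri
    g1, g2, one = gcdex(f1, f2, x)    # g1*f1 + g2*f2 = 1
    E1, E2 = at(g1 * f1, A), at(g2 * f2, A)

    print(E1 + E2 == eye(3))          # True: E1 + E2 = Id, as in (I)
    print(E1 * E2 == zeros(3, 3))     # True: E1*E2 = 0, as in (II)
    print(E1)                         # Diagonal(1, 1, 0): the projection onto
                                      # W1 = null space of p1(A) = A**2 + I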

28

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!