
Canonical Forms
Linear Algebra Notes

Satya Mandal

October 25, 2005

1 Introduction

Here F will denote a field and V will denote a vector space of dimension dim(V) = n. (In this note, unless otherwise stated, n = dim(V).)

We will study operators T on V. The goal is to investigate if we can find a basis e1, . . . , en such that

the matrix of T = Diagonal(λ1, λ2, . . . , λn)

is a diagonal matrix. This will mean that

T(ei) = λi ei.

2 Characteristic Values

2.1 Basic Definitions and Facts

Here we will discuss basic facts.

2.1 (Definition.) Let V be a vector space over a field F and T ∈ L(V, V) be a linear operator.

1. A scalar λ ∈ F is said to be a characteristic value of T if T(e) = λe for some e ∈ V with e ≠ 0. A characteristic value is also known as an eigen value.

2. This non-zero element e ∈ V above is called a characteristic vector of T associated to λ. A characteristic vector is also known as an eigen vector.

3. Write

N(λ) = {v ∈ V : T(v) = λv}.

Then N(λ) is a subspace of V and is said to be the characteristic space or eigen space of T associated to λ.

2.2 (Lemma.) Let V be a vector space over a field F and T ∈ L(V, V). Then T is singular if and only if det(T) = 0.

Proof. (⇒): We have T(e1) = 0 for some e1 ∈ V with e1 ≠ 0. We can extend e1 to a basis e1, e2, . . . , en of V. Let A be the matrix of T with respect to this basis. Since T(e1) = 0, the first column of A is zero. Therefore,

det(T) = det(A) = 0.

So, this implication is established.

(⇐): Suppose det(T) = 0. Let e1, e2, . . . , en be a basis of V and A be the matrix of T with respect to this basis. So

det(T) = det(A) = 0.

Therefore, A is not invertible. Hence AX = 0 for some non-zero column

X = (c1, c2, . . . , cn)^t.

Write v = c1e1 + c2e2 + · · · + cnen. Since not all ci are zero, v ≠ 0.

Also,

T(v) = Σ_{i=1}^{n} ci T(ei) = (T(e1), T(e2), . . . , T(en)) X = (e1, e2, . . . , en) AX = 0.

So, T is singular.

The following are some equivalent conditions.


2.3 (Theorem.) Let V be a vector space over a field F and T ∈ L(V, V) be a linear operator. Let λ ∈ F be a scalar. Then the following are equivalent:

1. λ is a characteristic value of T.

2. The operator T − λI is singular (i.e., is not invertible).

3. det(T − λI) = 0.

Proof. ((1) ⇒ (2)): We have (T − λI)(e) = 0 for some e ∈ V with e ≠ 0. So, T − λI is singular and (2) is established.

((2) ⇒ (1)): Since T − λI is singular, we have (T − λI)(e) = 0 for some e ∈ V with e ≠ 0. Therefore, (1) is established.

((2) ⇔ (3)): Immediate from the above lemma.

2.4 (Definition.) Let A ∈ Mn(F) be an n × n matrix with entries in a field F.

1. A scalar λ ∈ F is said to be a characteristic value of A if det(A − λIn) = 0. Equivalently, λ ∈ F is said to be a characteristic value of A if the matrix A − λIn is not invertible.

2. The monic polynomial det(XI − A) is said to be the characteristic polynomial of A. Therefore, the characteristic values of A are the roots of the characteristic polynomial of A.
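For concreteness, here is a small computational check of this definition, sketched in Python with the sympy library (the matrix A below is an example of mine, not one from the text). It computes the characteristic polynomial det(XI − A) and its roots, which are the characteristic values.

    from sympy import Matrix, symbols, roots

    x = symbols('x')

    # An example 2x2 matrix over Q (chosen for illustration).
    A = Matrix([[5, -2],
                [6, -2]])

    # sympy's charpoly uses the same convention det(xI - A).
    q = A.charpoly(x).as_expr()
    print(q)              # x**2 - 3*x + 2
    print(roots(q, x))    # {2: 1, 1: 1}: the characteristic values, with multiplicities
    print(A.eigenvals())  # the same answer, computed directly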

2.5 (Lemma.) Let A, B ∈ Mn(F) be two n × n matrices with entries in a field F. If A and B are similar, then they have the same characteristic polynomial.

Proof. Suppose A, B are similar matrices. Then A = PBP^{-1}. So the characteristic polynomial of A = det(XI − A) = det(XI − PBP^{-1}) = det(P(XI − B)P^{-1}) = det(XI − B) = the characteristic polynomial of B.

2.6 (Definitions and Facts) Let V be a vector space over a field F and T ∈ L(V, V) be a linear operator.

1. Let A be the matrix of T with respect to some basis E of V. We define the characteristic polynomial of T to be the characteristic polynomial of A. Note that this polynomial is well defined by the above lemma.

2. We say that T is diagonalizable if there is a basis e1, . . . , en such that each ei is a characteristic vector of T. In this case, T(ei) = λi ei for some λi ∈ F. Hence, with respect to this basis,

the matrix of T = Diagonal(λ1, λ2, . . . , λn).

Depending on how many of these eigen values λi are distinct, we can rewrite the matrix of T.

3. Also note, if T is diagonalizable as above, the characteristic polynomial of T = (X − λ1)(X − λ2) · · · (X − λn), which is completely factorizable.

4. Suppose T is diagonalizable, as above, and c1, c2, . . . , cr are the distinct eigen values of T. Then the matrix of T with respect to some basis of V looks like:

Diagonal(c1 I_{d1}, c2 I_{d2}, . . . , cr I_{dr}),

where Ik is the identity matrix of order k. So, d1 + d2 + · · · + dr = n = dim(V).

In this case, the characteristic polynomial of

T = (X − c1)^{d1} (X − c2)^{d2} · · · (X − cr)^{dr}.

Further, di = dim(N(ci)) (see 3 of 2.1).
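The following sketch (Python with sympy; the matrices P and D are examples of mine) illustrates this block description: a diagonalizable operator with a repeated eigen value, where the eigen space dimension matches the exponent di.

    from sympy import Matrix, diag, symbols

    x = symbols('x')

    # Build a non-diagonal matrix with eigen values 2, 2, 3 by conjugating
    # Diagonal(2, 2, 3) by an (arbitrarily chosen) invertible matrix P.
    P = Matrix([[1, 1, 0],
                [0, 1, 1],
                [1, 0, 1]])
    D = diag(2, 2, 3)
    A = P * D * P.inv()

    print(A.charpoly(x).as_expr().factor())   # (x - 2)**2 * (x - 3)

    # dim N(2) = 2 and dim N(3) = 1, matching d1 = 2 and d2 = 1.
    for value, mult, basis in A.eigenvects():
        print(value, len(basis))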

2.7 (Read Examples) Read Examples 1 and 2 on page 184.


2.2 Decomposition of V

2.8 (Definition) Suppose V is a vector space over a field F with dim(V) = n. Let f(X) = a0 + a1X + a2X^2 + · · · + arX^r ∈ F[X] be a polynomial and T ∈ L(V, V) be a linear operator. Then, by definition,

f(T) = a0 Id + a1T + a2T^2 + · · · + arT^r ∈ L(V, V)

is an operator. So, L(V, V) becomes a module over F[X].
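In matrix terms, f(T) is computed by substituting the matrix of T into f. A minimal sketch, assuming sympy (the helper name poly_at_operator is mine):

    from sympy import Matrix, Poly, eye, zeros, symbols

    x = symbols('x')

    def poly_at_operator(f, A):
        # Evaluate f(x) at the square matrix A by Horner's rule,
        # i.e. form a0*I + a1*A + ... + ar*A**r.
        n = A.rows
        result = zeros(n, n)
        for c in Poly(f, x).all_coeffs():   # coefficients, highest degree first
            result = result * A + c * eye(n)
        return result

    A = Matrix([[0, 1],
                [1, 0]])
    f = x**2 + 3*x + 1
    print(poly_at_operator(f, A))   # A**2 + 3*A + I = [[2, 3], [3, 2]]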

2.9 (Remark) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Let f(X) be the characteristic polynomial of T. We have an understandable interest in how f(T) works.

2.10 (Lemma) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Let f(X) ∈ F[X] be any polynomial. Suppose

T(v) = λv

for some v ∈ V and λ ∈ F. Then

f(T)(v) = f(λ)v.

The proof is obvious. This means that if λ is an eigen value of T, then f(λ) is an eigen value of f(T).
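A quick numeric check of the lemma (sympy again; the matrix, eigen pair, and polynomial are examples of mine):

    from sympy import Matrix, symbols

    x = symbols('x')

    A = Matrix([[2, 1],
                [0, 3]])
    v = Matrix([1, 1])            # an eigen vector: A*v = 3*v
    f = x**2 - 5*x + 1

    # f(A)*v should equal f(3)*v = -5*v.
    fA = A**2 - 5*A + Matrix.eye(2)
    print(fA * v)                 # Matrix([[-5], [-5]])
    print(f.subs(x, 3) * v)       # the same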

2.11 (Lemma) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Suppose c1, . . . , ck are the distinct eigen values of T. Let

Wi = N(ci)

be the eigen space of T associated to ci. Write

W = W1 + W2 + · · · + Wk.

Then

dim(W) = dim(W1) + dim(W2) + · · · + dim(Wk).

Indeed, if

Ei = {eij ∈ Wi : j = 1, . . . , di}

is a basis of Wi, then

E = {eij : j = 1, . . . , di; i = 1, . . . , k}

is a basis of W.


Proof. We only need to prove the last part. So, let

Σ λij eij = 0 (sum over i = 1, . . . , k and j = 1, . . . , di)

for some scalars λij ∈ F. Write

ωi = Σ_{j=1}^{di} λij eij.

Then ωi ∈ Wi and

ω1 + ω2 + · · · + ωk = 0. (I)

We will first prove that ωi = 0. Since

T(eij) = ci eij,

for any polynomial f(X) ∈ F[X] we have

f(T)(eij) = f(ci) eij.

Therefore,

f(T)(ωi) = Σ_{j=1}^{di} λij f(T)(eij) = Σ_{j=1}^{di} λij f(ci) eij = f(ci) ωi. (II)

Now let

g(X) = Π_{i=2}^{k} (X − ci) / Π_{i=2}^{k} (c1 − ci).

Note g(X) is a polynomial. Also note this definition/expression makes sense because c1, . . . , ck are distinct. And also g(c1) = 1 and g(c2) = g(c3) = · · · = g(ck) = 0.

Apply g(T) to (I) and use (II). We get

0 = g(T)(Σ_{i=1}^{k} ωi) = Σ_{i=1}^{k} g(T)(ωi) = Σ_{i=1}^{k} g(ci) ωi = ω1.

Similarly, ωi = 0 for i = 1, . . . , k. This means

0 = ωi = Σ_{j=1}^{di} λij eij.


Since Ei is a basis, λij = 0 for all i, j, and the proof is complete.

Following is the final theorem in this section.

2.12 (Theorem) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Suppose c1, . . . , ck are the distinct eigen values of T. Let

Wi = N(ci)

be the eigen space of T associated to ci. Then the following are equivalent:

1. T is diagonalizable.

2. The characteristic polynomial of T is

f = (X − c1)^{d1} (X − c2)^{d2} · · · (X − ck)^{dk}

and dim(Wi) = di for i = 1, . . . , k.

3. dim(W1) + dim(W2) + · · · + dim(Wk) = dim(V).

Proof. ((1) ⇒ (2)): This is in fact obvious. If c1, . . . , ck are the distinct eigen values and T is diagonalizable, the matrix of T is as in (4) of (2.6). Therefore, we can compute the characteristic polynomial using this matrix and (2) is established.

((2) ⇒ (3)): We have dim(V) = degree(f). Therefore,

dim(V) = d1 + d2 + · · · + dk = dim(W1) + dim(W2) + · · · + dim(Wk).

Hence (3) is established.

((3) ⇒ (1)): Write W = W1 + · · · + Wk. Then, by lemma 2.11,

dim(W) = dim(W1) + dim(W2) + · · · + dim(Wk).

Therefore, by (3), dim(V) = dim(W), and hence V = W. So the union of bases of the Wi gives a basis of V consisting of eigen vectors, and T is diagonalizable. Hence (1) is established and the proof is complete.

In fact, I would like to restate the "final theorem" 2.12 in terms of direct sums of linear subspaces. So, I need to define the direct sum of vector spaces.


2.13 (Definition) Let V be a vector space over F and V1, V2, . . . , Vk be subspaces of V. We say that V is the direct sum of V1, V2, . . . , Vk if each element x ∈ V can be written uniquely as

x = ω1 + ω2 + · · · + ωk

with ωi ∈ Vi. Equivalently, if

1. V = V1 + V2 + · · · + Vk, and

2. ω1 + ω2 + · · · + ωk = 0 with ωi ∈ Vi implies that ωi = 0 for i = 1, . . . , k.

If V is the direct sum of V1, V2, . . . , Vk then we write

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk.

Following is a proposition on direct sum decompositions.

2.14 (Proposition) Let V be a vector space over F with dim(V) = n < ∞. Let V1, V2, . . . , Vk be subspaces of V. Then

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk

if and only if V = V1 + V2 + · · · + Vk and

dim(V) = dim(V1) + dim(V2) + · · · + dim(Vk).

Proof. (⇒): Obvious.

(⇐): Let Ei = {eij : j = 1, . . . , di} be a basis of Vi. Let E = {eij : j = 1, . . . , di; i = 1, . . . , k}. Since V = V1 + V2 + · · · + Vk, we have V = Span E. Since dim(V) = cardinality(E), E forms a basis of V. Now it follows that if ω1 + · · · + ωk = 0 with ωi ∈ Vi then ωi = 0 for all i. This completes the proof.

Now we restate the final theorem 2.12 in terms of direct sums.


2.15 (Theorem) Suppose V is a vector space over a field F with dim(V) = n. Let T ∈ L(V, V) be a linear operator. Suppose c1, . . . , ck are the distinct eigen values of T. Let

Wi = N(ci)

be the eigen space of T associated to ci. Then the following are equivalent:

1. T is diagonalizable.

2. The characteristic polynomial of T is

f = (X − c1)^{d1} (X − c2)^{d2} · · · (X − ck)^{dk}

and dim(Wi) = di for i = 1, . . . , k.

3. dim(W1) + dim(W2) + · · · + dim(Wk) = dim(V).

4. V = W1 ⊕ W2 ⊕ · · · ⊕ Wk.

Proof. We already proved

(1) ⇐⇒ (2) ⇐⇒ (3).

We will prove (3) ⇐⇒ (4).

((4) ⇒ (3)): This part is obvious because we can combine bases of the Wi to get a basis of V.

((3) ⇒ (4)): Write W = W1 + W2 + · · · + Wk. By (3) and lemma 2.11, dim(W) = Σ dim(Wi) = dim(V). Therefore, V = W = W1 + W2 + · · · + Wk. Since dim(V) = Σ dim(Wi), by proposition 2.14, V = W1 ⊕ W2 ⊕ · · · ⊕ Wk and the proof is complete.
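Condition (3) is easy to test computationally. A sketch with sympy (the test matrices are mine; eigenvects returns, for each eigen value, a basis of the corresponding eigen space):

    from sympy import Matrix

    def diagonalizable_by_eigenspaces(A):
        # T is diagonalizable iff the eigen space dimensions add up to dim(V).
        total = sum(len(basis) for _, _, basis in A.eigenvects())
        return total == A.rows

    print(diagonalizable_by_eigenspaces(Matrix([[2, 0], [0, 3]])))   # True
    print(diagonalizable_by_eigenspaces(Matrix([[2, 1], [0, 2]])))   # False
    print(Matrix([[2, 1], [0, 2]]).is_diagonalizable())              # sympy agrees: False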



3 Annihilating Polynomials

Suppose K is a commutative ring and M is a K-module. For x ∈ M, we define the annihilator of x as

ann(x) = {λ ∈ K : λx = 0}.

Note that ann(x) is an ideal of K. (That means ann(x) + ann(x) ⊆ ann(x) and K · ann(x) ⊆ ann(x).)

We shall consider the annihilator of a linear operator, as follows.

3.1 Minimal (monic) polynomials

3.1 (Facts) Let V be a vector space over a field F with dim(V) = n. Recall, we have seen that M = L(V, V) is an F[X]-module. For f(X) ∈ F[X] and T ∈ L(V, V), the scalar multiplication is defined by

f ∗ T = f(T) ∈ L(V, V).

1. So, for a linear operator T ∈ L(V, V), the annihilator of T,

ann(T) = {f(X) ∈ F[X] : f(T) = 0},

is an ideal of the polynomial ring F[X].

2. Note that ann(T) is a non-zero proper ideal. It is non-zero because dim(L(V, V)) = n^2 and hence

1, T, T^2, . . . , T^{n^2}

is a linearly dependent set.

3. Also recall that any ideal I of F[X] is a principal ideal, which means that I = F[X]p where p is the non-zero monic polynomial in I of least degree.


4. Therefore,

ann(T) = F[X]p(X)

where p(X) is the monic polynomial of least degree such that p(T) = 0. This polynomial p(X) is defined to be the minimal monic polynomial (MMP) of T.

5. Let us consider similar concepts for square matrices.

(a) For an n × n matrix A, we define the annihilator ann(A) of A and the minimal monic polynomial of A in a similar way.

(b) Suppose two n × n matrices A, B are similar and A = PBP^{-1}. Then for a polynomial f(X) ∈ F[X] we have

f(A) = P f(B) P^{-1}.

(c) Therefore ann(A) = ann(B).

(d) Hence A and B have the same minimal monic polynomial.
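Since (by the Cayley-Hamilton theorem, 3.3 below) the MMP divides the characteristic polynomial, one can find it by brute force: factor the characteristic polynomial and test its monic divisors in order of degree. A sketch assuming sympy (the function name and the sample matrix are mine):

    from itertools import product
    from math import prod
    from sympy import Matrix, Poly, symbols, factor_list, eye, zeros

    x = symbols('x')

    def minimal_monic_polynomial(A):
        n = A.rows
        q = A.charpoly(x).as_expr()
        _, factors = factor_list(q, x)   # [(irreducible factor, exponent), ...]
        # All monic divisors of q, sorted by degree.
        divisors = [Poly(prod(f**e for (f, _), e in zip(factors, exps)), x)
                    for exps in product(*[range(e + 1) for _, e in factors])]
        divisors.sort(key=lambda d: d.degree())
        for d in divisors:
            M = zeros(n, n)
            for c in d.all_coeffs():     # Horner evaluation of d at A
                M = M * A + c * eye(n)
            if M == zeros(n, n):
                return d.as_expr()       # least-degree monic d with d(A) = 0

    A = Matrix([[2, 0, 0],
                [0, 2, 0],
                [0, 0, 3]])
    print(minimal_monic_polynomial(A))   # x**2 - 5*x + 6, i.e. (x - 2)*(x - 3)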

3.2 Comparison of minimal monic and characteristic polynomials

Given a linear operator T we can think of two polynomials: the minimal monic polynomial and the characteristic polynomial of T. We will compare them.

3.2 (Theorem) Let V be a vector space over a field F with dim(V) = n, and let T ∈ L(V, V). Suppose p(X) is the minimal monic polynomial of T and g(X) is the characteristic polynomial of T. Then p, g have the same roots in F. (Although the multiplicities may differ.) The same statement holds for matrices.

Proof. We will prove, for c ∈ F,

p(c) = 0 ⇐⇒ g(c) = 0.


Recall g(X) = det(XI − A), where A is the matrix of T with respect to some basis. Also, by theorem 2.3, g(c) = 0 if and only if cI − T is singular.

Now suppose p(c) = 0. So, p(X) = (X − c)q(X) for some q(X) ∈ F[X]. Since degree(q) < degree(p), by the minimality of p we have q(T) ≠ 0. Let v ∈ V be such that e = q(T)(v) ≠ 0. Since p(T) = 0, we have (T − cI)q(T) = 0. Hence 0 = (T − cI)q(T)(v) = (T − cI)(e). So, T − cI is singular and hence g(c) = 0. This establishes this part of the proof.

Now assume that g(c) = 0. Therefore, T − cI is singular. So, there is a vector e ∈ V with e ≠ 0 such that T(e) = ce. Applying lemma 2.10 to p, we have

p(T)(e) = p(c)e.

Since p(T) = 0 and e ≠ 0, we have p(c) = 0 and the proof is complete.

The above theorem raises the question whether these two polynomials are the same. The answer is: not in general. But the MMP divides the characteristic polynomial, as follows.

3.3 (Cayley-Hamilton Theorem) Let V be a vector space over a field F with dim(V) = n. Suppose Q(X) is the characteristic polynomial of T ∈ L(V, V). Then Q(T) = 0. In particular, if p(X) is the minimal monic polynomial of T, then

p | Q.

Proof. Write

K = F[T] = {f(T) : f(X) ∈ F[X]}.

Observe that

F ⊆ K ⊆ L(V, V)

are subrings. Note Q(T) ∈ K and we will prove Q(T) = 0.

Let e1, . . . , en be a basis of V and A = (aij) be the matrix of T. So, we have


(T(e1), T(e2), . . . , T(en)) = (e1, e2, . . . , en)A. (I)

Consider the following matrix, with entries in K:

B =
⎛ T − a11 I    −a12 I      −a13 I     · · ·     −a1n I ⎞
⎜  −a21 I    T − a22 I     −a23 I     · · ·     −a2n I ⎟
⎜  −a31 I     −a32 I     T − a33 I    · · ·     −a3n I ⎟
⎜   · · ·      · · ·        · · ·     · · ·      · · ·  ⎟
⎝  −an1 I     −an2 I       −an3 I     · · ·   T − ann I ⎠

Note that

Q(X) = det(InX − A).

Therefore (I think this is the main point to understand in this proof),

Q(T) = det(InT − A) = det(B).

The above equation (I) says that

(e1, e2, . . . , en)B = (0, 0, . . . , 0).

Multiply this equation by Adj(B), and we get

(e1, e2, . . . , en)B Adj(B) = (0, 0, . . . , 0) Adj(B) = (0, 0, . . . , 0).

Therefore,

(e1, e2, . . . , en)(det(B))In = (0, 0, . . . , 0).

Therefore,

(e1, e2, . . . , en)(Q(T))In = (0, 0, . . . , 0).

This implies that

Q(T)(ei) = 0 for all i = 1, . . . , n.

Hence Q(T) = 0 and the proof is complete.
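A direct numeric verification of Q(T) = 0 on a sample matrix (sympy; the matrix is an arbitrary example of mine):

    from sympy import Matrix, symbols, eye, zeros

    x = symbols('x')

    A = Matrix([[1, 2, 0],
                [3, -1, 4],
                [0, 5, 2]])
    Q = A.charpoly(x)

    M = zeros(3, 3)
    for c in Q.all_coeffs():      # Horner evaluation of Q at A
        M = M * A + c * eye(3)
    print(M == zeros(3, 3))       # True: A satisfies its own characteristic polynomial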



4 Invariant Subspaces

4.1 (Definition) Let V be a vector space over the field F and T : V → V be a linear operator. A subspace W of V is said to be invariant under T if

T(W) ⊆ W.

4.2 (Examples) Let V be a vector space over the field F and T : V → V be a linear operator.

1. (Trivial Examples) V and {0} are invariant under T.

2. Suppose e is an eigen vector of T and W = Fe. Then W is invariant under T.

3. Suppose λ is an eigen value of T and W = N(λ) is the eigen space of λ. Then W is invariant under T.

4.3 (Remark) Let V be a vector space over the field F and T : V → V be a linear operator. Suppose W is an invariant subspace of T. Then the restriction map

T|W : W → W

is a well defined linear operator on W. So, the following diagram commutes (the vertical maps are the inclusions W ⊆ V):

         T|W
    W ---------> W
    |            |
    v            v
    V ---------> V
          T

4.4 (Remark) Let V be a vector space over the field F with dim(V) = n < ∞. Let T : V → V be a linear operator. Suppose W is an invariant subspace of T and

T|W : W → W

is the restriction of T.

1. Let p be the characteristic polynomial of T and q be the characteristic polynomial of T|W. Then q | p.

2. Also let P be the minimal (monic) polynomial of T and Q be the minimal (monic) polynomial of T|W. Then Q | P.

Proof. The proof of (2) is easier. Since P(T) = 0 we also have P(T|W) = 0. Therefore

P(X) ∈ ann(T|W) = F[X]Q(X).

Hence Q | P and the proof of (2) is complete.

To prove (1), let {e1, e2, . . . , er} be a basis of W. Extend this basis to a basis E = {e1, e2, . . . , er, er+1, . . . , en} of V. Let A be the matrix of T with respect to E and B be the matrix of T|W with respect to {e1, . . . , er}. So, we have

(T(e1), . . . , T(er)) = (e1, . . . , er)B

and

(T(e1), . . . , T(er), T(er+1), . . . , T(en)) = (e1, . . . , er, er+1, . . . , en)A.

Therefore, A can be written in blocks as follows:

A = ⎛ B  C ⎞
    ⎝ 0  D ⎠

So,

p(X) = det(InX − A) = det(IrX − B) det(In−rX − D) and q(X) = det(IrX − B).

Therefore q | p. The proof is complete.

4.5 (Definitions and Remarks)

1. Suppose F is a field. Recall an n × n matrix A = (aij) is called an upper triangular matrix if aij = 0 for all i, j with 1 ≤ j < i ≤ n (i.e., all entries below the diagonal are zero). Similarly, we define lower triangular matrices.

2. Now let V be a vector space over F with dim V = n < ∞. A linear operator T : V → V is said to be triangulable if there is a basis E = {e1, . . . , en} of V such that the matrix of T is an (upper) triangular matrix. (Note that it does not make a difference if we say "upper" or "lower" triangular. To avoid confusion, we will assume upper triangular.)

3. Now suppose a linear operator T is triangulable. So, for a basis E = {e1, . . . , en} we have (T(e1), . . . , T(en)) = (e1, . . . , en)A for some triangular matrix A = (aij). We assume that A is upper triangular. For 1 ≤ r ≤ n, write Wr = Span(e1, . . . , er). Then Wr is invariant under T.

4. (Factorization.) Suppose T ∈ L(V, V) is triangulable. So, the matrix of T, with respect to a basis e1, . . . , en, is an upper triangular matrix A = (aij). Note that the characteristic polynomial q of T is given by

q(X) = det(IX − A) = (X − a11)(X − a22) · · · (X − ann).

Therefore, q is completely factorizable. So, we have

q(X) = (X − c1)^{d1} (X − c2)^{d2} · · · (X − ck)^{dk},

where d1 + d2 + · · · + dk = dim V and c1, . . . , ck are the distinct eigen values of T.

Also, since the minimal monic polynomial p of T divides q, it follows that p is also completely factorizable. Therefore,

p(X) = (X − c1)^{r1} (X − c2)^{r2} · · · (X − ck)^{rk},

where ri ≤ di for i = 1, . . . , k.

4.6 (Theorem) Let V be a vector space over F with finite dimension dim V = n and T : V → V be a linear operator on V. Then T is triangulable if and only if the minimal polynomial p of T is a product of linear factors.

Proof. (⇒): We have already shown in (4) of Remark 4.5 that if T is triangulable then the MMP p factors into linear factors.


(⇐): Now assume that the MMP p factors as

p(X) = (X − c1)^{r1} (X − c2)^{r2} · · · (X − ck)^{rk}.

Let q denote the characteristic polynomial of T. Since p and q have the same roots, q(c1) = q(c2) = · · · = q(ck) = 0. Now we will split the proof into several steps.

Step-1: Write λ1 = c1. By (2.3), λ1 is an eigen value of T. So, there is a non-zero vector e1 ∈ V such that T(e1) = λ1e1. Write W1 = Span(e1).

Step-2: Extend e1 to a basis e1, E2, . . . , En of V. Write V1 = Span(E2, . . . , En). Note that

1. e1 is linearly independent and dim W1 = 1.

2. W1 is invariant under T.

3. dim V1 = n − 1 and V = W1 ⊕ V1.

Let v ∈ V1 and T(v) = λ1e1 + λ2E2 + · · · + λnEn for some λ1, . . . , λn ∈ F. Define T1(v) = λ2E2 + · · · + λnEn ∈ V1. Then

T1 : V1 → V1

is a well defined linear operator on V1.

Diagrammatically, T1 is the composite

    T1 : V1 ⊆ V = W1 ⊕ V1 --T--> V = W1 ⊕ V1 --pr--> V1,

where pr : V = W1 ⊕ V1 → V1 is the projection map. Let p1 be the MMP of T1. Now, we proceed to prove that p1 | p.

Claim: ann(T) ⊆ ann(T1).

To prove this claim, let A be the matrix of T with respect to e1, E2, . . . , En and B be the matrix of T1 with respect to E2, . . . , En. Since W1 is invariant under T, we have

A = ⎛ λ1  C ⎞
    ⎝ 0   B ⎠

Therefore, the matrix of T^m is given by

A^m = ⎛ λ1^m  Cm  ⎞
      ⎝ 0     B^m ⎠

for some matrix Cm. Therefore, for a polynomial f(X) ∈ F[X], the matrix f(A) of f(T) is given by

f(A) = ⎛ f(λ1)  C*   ⎞
       ⎝ 0      f(B) ⎠

for some matrix C*. So, if f(X) ∈ ann(T) then f(T) = 0. Hence f(A) = 0. This implies f(B) = 0 and hence f(T1) = 0. So, ann(T) ⊆ ann(T1) and the claim is established.

Therefore, p1 | p. So, p1 is also a product of linear factors, and T1 satisfies the hypothesis of the theorem. So, there is an element e2 ∈ V1 with e2 ≠ 0 such that T1(e2) = λ2e2, where

(X − λ2) | p1 | p.

It also follows that T(e2) = a e1 + λ2e2 for some a ∈ F.

Step-3: Write W2 = Span(e1, e2). Note that

1. e1, e2 are linearly independent and dim W2 = 2.

2. W2 is invariant under T.

3. Also

(T(e1), T(e2)) = (e1, e2) ⎛ λ1  a12 ⎞
                          ⎝ 0   λ2  ⎠

Step-4: If W2 ≠ V (that is, if 2 < n), the process will continue. We extend e1, e2 to a basis e1, e2, E3, . . . , En of V (well, these are different Ei, not the same as in the previous steps). Write V2 = Span(E3, . . . , En). Note

1. dim(V2) = n − 2.

2. V = W2 ⊕ V2.

As in the previous steps, define T2 : V2 → V2 as in the diagram (you should define it explicitly):


    T2 : V2 ⊆ V = W2 ⊕ V2 --T--> V = W2 ⊕ V2 --pr--> V2,

where pr : V = W2 ⊕ V2 → V2 is the projection map.

Let p2 be the MMP of T2. By the same argument, p2 | p. Then we can find λ3 ∈ F and e3 ∈ V2 such that T2(e3) = λ3e3, where (X − λ3) | p2 | p. Therefore T(e3) = a13e1 + a23e2 + λ3e3.

So, we have

(T(e1), T(e2), T(e3)) = (e1, e2, e3) ⎛ λ1  a12  a13 ⎞
                                     ⎜ 0   λ2   a23 ⎟
                                     ⎝ 0   0    λ3  ⎠

Final Step: The process continues for n steps and we get a linearly independent set (basis) e1, e2, . . . , en such that

(T(e1), T(e2), T(e3), . . . , T(en)) = (e1, e2, e3, . . . , en) ⎛ λ1   a12   a13   · · ·   a1n ⎞
                                                               ⎜ 0    λ2    a23   · · ·   a2n ⎟
                                                               ⎜ 0    0     λ3    · · ·   a3n ⎟
                                                               ⎜ · · ·  · · ·  · · ·  · · ·   ⎟
                                                               ⎝ 0    0     0     · · ·   λn  ⎠

This completes the proof.


Recall a field F is said to be an algebraically closed field if every non-constant polynomial f ∈ F[X] has a root in F. It follows that F is an algebraically closed field if and only if every non-constant polynomial f ∈ F[X] is a product of linear polynomials.

4.7 (Theorem) Suppose F is an algebraically closed field. Then every n × n matrix A over F is similar to a triangular matrix.

Proof. Consider the operator

T : F^n → F^n

such that T(X) = AX. Since F is algebraically closed, the MMP of T factors into linear factors; now use the above theorem.
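For instance, over C the rotation matrix below has no real eigen values but is similar to a (here even diagonal) triangular matrix. A sketch with sympy, whose Jordan form (always upper triangular) exhibits the similarity; the matrix is an example of mine:

    from sympy import Matrix, simplify

    A = Matrix([[0, -1],
                [1,  0]])                  # eigen values are i and -i, not in R
    P, J = A.jordan_form()                 # A = P * J * P**(-1), J upper triangular
    print(J)                               # Diagonal(-i, i)
    print(simplify(P * J * P.inv() - A))   # the zero matrix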

4.8 (Theorem) Let V be a vector space over F with finite dimension dim V = n and T : V → V be a linear operator on V. Then T is diagonalizable if and only if the minimal polynomial p of T is of the form

p = (X − c1)(X − c2) · · · (X − ck)

where c1, c2, . . . , ck are the distinct eigen values of T.

Proof. (⇒): Suppose T is diagonalizable. Then there is a basis e1, . . . , en of V such that

(T(e1), T(e2), . . . , T(en)) = (e1, . . . , en) Diagonal(c1 I_{d1}, c2 I_{d2}, . . . , ck I_{dk}).

Write g(X) = (X − c1)(X − c2) · · · (X − ck); we will prove g(T) = 0. For i = 1, . . . , d1 we have (T − c1)(ei) = 0. Since the factors of g(T) commute,

g(T)(ei) = (T − c2) · · · (T − ck)(T − c1)(ei) = 0.

Similarly, g(T)(ei) = 0 for all i = 1, . . . , n. So, g(T) = 0. Hence p | g. Since c1, . . . , ck are roots of both, we have p = g. Hence this part of the proof is complete.



(⇐): We assume that p(X) = (X − c1)(X − c2) · · · (X − ck) and prove that T is diagonalizable. Let Wi = N(ci) be the eigen space of ci. Let W = W1 + W2 + · · · + Wk be the sum of the eigen spaces. Assume that W ≠ V. Now we will repeat some portions of the proof of theorem 4.6 and get a contradiction. Let e1, . . . , em be a basis of W and e1, . . . , em, Em+1, . . . , En be a basis of V. Write V′ = Span(Em+1, . . . , En). Note

1. W is invariant under T.

2. V = W ⊕ V′.

Define T′ : V′ → V′ as the composite

    T′ : V′ ⊆ V = W ⊕ V′ --T--> V = W ⊕ V′ --pr--> V′,

where pr : V = W ⊕ V′ → V′ is the projection map.

As in the proof of theorem 4.6, the MMP p′ of T′ divides p. Therefore, there is a non-zero element e ∈ V′ such that T′(e) = λe for some λ ∈ {c1, c2, . . . , ck}. We assume λ = c1. Hence

T(e) = a1e1 + · · · + amem + c1e

where ai ∈ F. We can rewrite this equation as

T(e) = β + c1e

where β = ω1 + ω2 + · · · + ωk ∈ W and ωi ∈ Wi. So,

(T − c1)(e) = β.

Since T(W) ⊆ W, for h(X) ∈ F[X] we have h(T)(β) ∈ W. Write

p = (X − c1)q and q(X) − q(c1) = h(X)(X − c1).

Then

(q(T) − q(c1))(e) = h(T)(T − c1)(e) = h(T)(β)

is in W. Also

0 = p(T)(e) = (T − c1)q(T)(e).

Therefore q(T)(e) ∈ W1 ⊆ W. So, q(c1)e = q(T)(e) − (q(T) − q(c1))(e) is in W. Since q = (X − c2) · · · (X − ck) and the ci are distinct, q(c1) ≠ 0, so we get e ∈ W ∩ V′ = {0}. This contradicts e ≠ 0, and the proof is complete.
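The standard 2 × 2 example illustrates the theorem: the matrix below has MMP (X − 1)^2, which has a repeated linear factor, so it is triangulable (theorem 4.6) but not diagonalizable. A sympy check:

    from sympy import Matrix

    A = Matrix([[1, 1],
                [0, 1]])
    # (X - 1) does not annihilate A, but (X - 1)**2 does: the MMP is (X - 1)**2.
    print(A - Matrix.eye(2) == Matrix.zeros(2, 2))          # False
    print((A - Matrix.eye(2))**2 == Matrix.zeros(2, 2))     # True
    print(A.is_diagonalizable())                            # False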



5 Simultaneous Triangulation and Diagonalization

Suppose 𝓕 ⊆ L(V, V) is a family of linear operators on a vector space V over a field F. We say that 𝓕 is a commuting family if TU = UT for all U, T ∈ 𝓕.

In this section we try to find a basis E of V so that, for every T in the family 𝓕, the matrix of T is diagonal (or triangular) with respect to E. Following are the main theorems.

5.1 (Theorem) Let V be a finite dimensional vector space with dim V = n over a field F. Let 𝓕 ⊆ L(V, V) be a commuting family of triangulable operators on V. Then there is a basis E = {e1, . . . , en} such that, for every T ∈ 𝓕, the matrix of T with respect to E is a triangular matrix.

Proof. The proof is fairly similar to that of theorem 4.6. We will omit the proof. You can work it out when you need it.

Following is the matrix version of the above theorem.

5.2 (Theorem) Let 𝓕 ⊆ Mnn(F) be a commuting family of triangulable n × n matrices. Then there is an invertible matrix P such that, for every A ∈ 𝓕, the matrix PAP^{-1} is an upper triangular matrix.

5.3 (Theorem) Let V be a finite dimensional vector space with dim V = n over a field F. Let 𝓕 ⊆ L(V, V) be a commuting family of diagonalizable operators on V. Then there is a basis E = {e1, . . . , en} such that, for every T ∈ 𝓕, the matrix of T with respect to E is a diagonal matrix.

Proof. We will omit the proof. You can work it out when you need it.
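A small illustration of theorem 5.3 (sympy; the commuting pair and the common eigen basis are examples of mine):

    from sympy import Matrix

    A = Matrix([[1, 1], [1, 1]])
    B = Matrix([[0, 1], [1, 0]])
    print(A * B == B * A)        # True: {A, B} is a commuting family

    # The eigen vectors (1, 1) and (1, -1) of B also diagonalize A.
    P = Matrix([[1, 1],
                [1, -1]])
    print(P.inv() * A * P)       # Diagonal(2, 0)
    print(P.inv() * B * P)       # Diagonal(1, -1)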



6 Direct Sum

Part of this section we have already touched. We gave the definition 2.13 of the direct sum of subspaces. Note that we can make the same definition for any subspace W of V. Following is an exercise.

6.1 (Exercise) Let V be a finite dimensional vector space over a field F. Let W1, . . . , Wk be subspaces of V. Then V = W1 ⊕ W2 ⊕ · · · ⊕ Wk if and only if V = W1 + W2 + · · · + Wk and, for each j = 2, . . . , k, we have

(W1 + · · · + Wj−1) ∩ Wj = {0}.

6.2 (Examples) (1) R^2 = Re1 ⊕ Re2 where e1 = (1, 0), e2 = (0, 1).

(2) Let V = Mnn(F). Let U be the subspace of all upper triangular matrices. Let L be the subspace of all strictly lower triangular matrices (that means the diagonal entries are zero). Then V = U ⊕ L.

(3) Recall from theorem 2.15 that V is the direct sum of the eigen spaces of a diagonalizable operator T.

We used the word 'projection' before in the context of direct sums. Here we define projections.

6.3 (Definition) Let V be a finite dimensional vector space over a field F. A linear operator E : V → V is said to be a projection if E^2 = E.

6.4 (Observations) Let V be a finite dimensional vector space over a field F. Let E : V → V be a projection. Let R = range(E) and N = N_E be the null space of E. Then

1. For x ∈ V, we have x ∈ R ⇔ E(x) = x.

2. V = N ⊕ R.

3. For v ∈ V, we have v = (v − E(v)) + E(v) ∈ N ⊕ R.

4. Let V = W1 ⊕ W2 ⊕ · · · ⊕ Wk be a direct sum of subspaces Wi. Define operators Ei : V → V by

Ei(v) = vi where v = v1 + · · · + vk, vi ∈ Wi.


Note the Ei are well defined projections with

range(Ei) = Wi and N_{Ei} = Ŵi,

where Ŵi = W1 ⊕ · · · ⊕ Wi−1 ⊕ Wi+1 ⊕ · · · ⊕ Wk.

Following is a theorem on projections.

6.5 (Theorem) Let V be a finite dimensional vector space over a field F. Suppose V = W1 ⊕ W2 ⊕ · · · ⊕ Wk is a direct sum of subspaces Wi. Then there are k linear operators E1, . . . , Ek on V such that

1. each Ei is a projection (i.e. Ei^2 = Ei);

2. EiEj = 0 for all i ≠ j;

3. E1 + E2 + · · · + Ek = I;

4. range(Ei) = Wi.

Conversely, if E1, . . . , Ek are k linear operators on V satisfying conditions (2) and (3) above, then each Ei is a projection (i.e. (1) holds), and with Wi = Ei(V) we have V = W1 ⊕ W2 ⊕ · · · ⊕ Wk.

Proof. The proof is easy and left as an exercise. First, try it with k = 2 operators, if you like.
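A sketch of the first half of the theorem for k = 2 in V = Q^3, assuming sympy (the subspaces W1, W2 and their bases are examples of mine): expressing v in the combined basis and keeping only the Wi-coordinates gives the Ei.

    from sympy import Matrix, diag, zeros

    # W1 = span{(1, 0, 1)}, W2 = span{(0, 1, 0), (1, 0, -1)}.
    basis = Matrix([[1, 0, 1],
                    [0, 1, 0],
                    [1, 0, -1]]).T    # columns: basis of W1, then of W2
    C = basis.inv()                   # coordinates with respect to that basis

    E1 = basis * diag(1, 0, 0) * C    # keep the W1-coordinate, kill the rest
    E2 = basis * diag(0, 1, 1) * C    # keep the W2-coordinates

    print(E1 * E1 == E1, E2 * E2 == E2)   # True True: projections
    print(E1 * E2 == zeros(3, 3))         # True: E1*E2 = 0
    print(E1 + E2 == Matrix.eye(3))       # True: E1 + E2 = I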

Homework: page 213, Exercises 1, 3, 4-7, 9.


7 Invariant Direct Sums

This section deals with some very natural concepts. Suppose V is a vector space over a field F and V = W1 ⊕ W2 ⊕ · · · ⊕ Wk, where the Wi are subspaces. Suppose for each i = 1, . . . , k we are given a linear operator Ti ∈ L(Wi, Wi) on Wi. Then we can define a linear operator T : V → V such that

T(Σ_{i=1}^{k} vi) = Σ_{i=1}^{k} Ti(vi) for vi ∈ Wi.

So the restriction T|Wi = Ti. This means that the diagram

         Ti
    Wi --------> Wi
    |            |
    v            v
    V ---------> V
          T

commutes (the vertical maps are the inclusions).

Conversely, suppose V is a vector space over a field F and V = W1 ⊕ W2 ⊕ · · · ⊕ Wk, where the Wi are subspaces. Let T ∈ L(V, V) be a linear operator. Assume that the Wi are invariant under T. Then we can define linear operators Ti : Wi → Wi by Ti(v) = T(v) for v ∈ Wi. Therefore, the above diagram commutes and T can be reconstructed from T1, . . . , Tk, in the same way as above.
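In matrix terms, choosing a basis of V adapted to the decomposition makes T block diagonal, with blocks the matrices of the Ti. A one-line sympy illustration (the blocks T1, T2 are examples of mine):

    from sympy import Matrix, diag

    T1 = Matrix([[2, 1],
                 [0, 2]])     # an operator on W1
    T2 = Matrix([[5]])        # an operator on W2
    T = diag(T1, T2)          # the block diagonal matrix of T on V = W1 (+) W2
    print(T)                  # [[2, 1, 0], [0, 2, 0], [0, 0, 5]]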



8 Primary Decomposition

We studied linear operators T on V under the assumption that the characteristic polynomial q or the MMP p splits completely into linear factors. In this section we will not have this assumption. Here we will exploit the fact that q, p have unique factorization.

8.1 (Primary Decomposition Theorem) Let V be a vector space over F with finite dimension dim V = n and T : V → V be a linear operator on V. Let p be the minimal monic polynomial (MMP) of T and

p = p1^{r1} p2^{r2} · · · pk^{rk}

where ri > 0 and the pi are distinct irreducible monic polynomials in F[X]. Let

Wi = {v ∈ V : pi(T)^{ri}(v) = 0}

be the null space of pi(T)^{ri}. Then

1. V = W1 ⊕ · · · ⊕ Wk;

2. each Wi is invariant under T;

3. let Ti = T|Wi : Wi → Wi be the operator on Wi induced by T. Then the minimal monic polynomial of Ti is pi^{ri}.

Proof. Write

fi = p / pi^{ri} = Π_{j≠i} pj^{rj}.

Note that f1, f2, . . . , fk have no common factor. So

GCD(f1, f2, . . . , fk) = 1.

Therefore

f1g1 + f2g2 + · · · + fkgk = 1

for some gi ∈ F[X].

For i = 1, . . . , k, let hi = figi and Ei = hi(T) ∈ L(V, V). Then

E1 + E2 + · · · + Ek = Σ_{i=1}^{k} hi(T) = Id. (I)


Also, for i ≠ j, note that p | hihj. Since p(T) = 0, we have

EiEj = hi(T)hj(T) = 0. (II)

Write Wi′ = Ei(V), the range of Ei. By the converse part of theorem 6.5, it follows that V = W1′ ⊕ · · · ⊕ Wk′.

By (I), we have T = TE1 + TE2 + · · · + TEk. So

T(Wi′) = T(Ei(V)) = Σ_{j=1}^{k} TEjEi(V) = TEi^2(V) = TEi(V) = EiT(V) ⊆ Ei(V) = Wi′.

Therefore, Wi′ is invariant under T. We will show that Wi′ = Wi, the null space of pi(T)^{ri}.

We have

pi(T)^{ri}(Wi′) = pi(T)^{ri} fi(T) gi(T)(V) = p(T) gi(T)(V) = 0.

So, Wi′ ⊆ Wi, the null space of pi(T)^{ri}.

Now suppose w ∈ Wi. So, pi(T)^{ri}(w) = 0. For j ≠ i, we have pi^{ri} | fjgj = hj and hence Ej(w) = hj(T)(w) = 0. Therefore w = Σ_{j=1}^{k} Ej(w) = Ei(w) is in Wi′. So, Wi ⊆ Wi′. Therefore Wi = Wi′, and (1) and (2) are established.

Now Ti : Wi → Wi is the restriction of T to Wi. It remains to show that the MMP of Ti is pi^{ri}. It is enough to show this for i = 1, that is, that the MMP of T1 is p1^{r1}.

We have p1(T1)^{r1} = 0, because W1 is the null space of p1(T)^{r1}. Therefore p1^{r1} ∈ ann(T1).

Now suppose g ∈ ann(T1). So, g(T1) = 0. Consider

g(T)f1(T) = g(T) Π_{j=2}^{k} pj(T)^{rj}.

Since g(T)|W1 = g(T1) = 0, g(T) vanishes on W1; and for j = 2, . . . , k, pj(T)^{rj} vanishes on Wj. Therefore, g(T)f1(T) = 0. Hence p | gf1. Hence p1^{r1} = p/f1 divides g. Therefore p1^{r1} is the MMP of T1 and the proof is complete.


Remarks. (1) Note that the projections Ei = hi(T) in the above theorem are polynomials in T.

(2) Also think about what it means if some (or all) of the irreducible factors pi = (X − λi) are linear.
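The projections can be computed exactly as in the proof, from the Bezout identity f1g1 + · · · + fkgk = 1. A sketch for k = 2 with sympy (the matrix A, whose MMP is (X − 2)(X^2 + 1) with X^2 + 1 irreducible over Q, is an example of mine; gcdex returns the Bezout coefficients):

    from sympy import Matrix, Poly, symbols, gcdex, eye, zeros

    x = symbols('x')

    def at(f, A):
        # Horner evaluation of the polynomial f(x) at the matrix A.
        M = zeros(A.rows, A.rows)
        for c in Poly(f, x).all_coeffs():
            M = M * A + c * eye(A.rows)
        return M

    A = Matrix([[0, -1, 0],
                [1,  0, 0],
                [0,  0, 2]])
    f1, f2 = x - 2, x**2 + 1          # fi = p / pi**ri
    g1, g2, one = gcdex(f1, f2, x)    # g1*f1 + g2*f2 = 1
    E1, E2 = at(g1 * f1, A), at(g2 * f2, A)

    print(E1 + E2 == eye(3))          # True: E1 + E2 = Id, as in (I)
    print(E1 * E2 == zeros(3, 3))     # True: E1*E2 = 0, as in (II)
    print(E1)                         # Diagonal(1, 1, 0): the projection onto
                                      # W1 = null space of p1(A) = A**2 + I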

28

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!