---
layout: algebra
title: mathematics/linear algebra
---

# Linear Algebra

$$ℝ^n$$

## Fields and Vector Spaces

A **field** $F$ is a *set* with two *binary operations* $+$ and $×$ satisfying the following axioms:
- $(F, +)$ is a *commutative group*:
  - associativity: $∀a,b,c : a + (b + c) = (a + b) + c$
  - additive identity: $∃0, ∀a : 0 + a = a + 0 = a$
  - additive inverse: $∀a, ∃-a : a + (-a) = 0$
  - commutativity: $∀a,b : a + b = b + a$
- $(F ∖ \\{0\\}, ×)$ is a *commutative group*:
  - associativity: $∀a,b,c : a(bc) = (ab)c$
  - multiplicative identity: $∃1, ∀a : 1a = a1 = a$
  - multiplicative inverse: $∀a ≠ 0, ∃\frac{1}{a} : a\frac{1}{a} = 1$
  - commutativity: $∀a,b : ab = ba$
- $×$ is distributive with respect to $+$
  - distributivity: $∀a,b,c : a(b + c) = (ab) + (ac)$

Intuitively, a field is a set on which addition, subtraction, multiplication and division are defined and behave as they do on $ℝ$. We often omit the multiplication sign, and write $a × a$ as simply $aa$. A field can also be thought of as a *commutative ring* in which every nonzero element has a multiplicative inverse.

A **vector space** $V$ *over* a field $F$ is a set with a binary operation $+$ (vector addition) and a binary function $F × V → V$ (scalar multiplication) satisfying the following axioms:
- $(V, +)$ is a *commutative group*:
  - associativity: $∀u,v,w : u + (v + w) = (u + v) + w$
  - additive identity: $∃0, ∀v : 0 + v = v + 0 = v$
  - additive inverse: $∀v, ∃-v : v + (-v) = 0$
  - commutativity: $∀u,v : u + v = v + u$
- Scalar multiplication is *compatible* with $F$:
  - scalar identity: $∃1 ∈ F, ∀v : 1v = v$
  - compatibility: $∀a,b ∈ F, ∀v : (ab)v = a(bv)$
- The *distributive laws* hold:
  - $∀a ∈ F, ∀u,v ∈ V : a(u + v) = au + av$
  - $∀a,b ∈ F, ∀v ∈ V : (a + b)v = av + bv$
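As a quick numerical sanity check (a sketch, assuming NumPy is available; not part of the formal development), the axioms can be spot-checked on random vectors in $ℝ^n$:

```python
import numpy as np

# Numerical spot-check of a few vector space axioms in R^5.
# Floating-point equality is tested with np.allclose.
rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 5))   # three vectors in R^5
a, b = rng.standard_normal(2)           # two scalars

assert np.allclose(u + (v + w), (u + v) + w)     # associativity of +
assert np.allclose(u + v, v + u)                 # commutativity of +
assert np.allclose(a * (u + v), a * u + a * v)   # a(u + v) = au + av
assert np.allclose((a + b) * v, a * v + b * v)   # (a + b)v = av + bv
assert np.allclose((a * b) * v, a * (b * v))     # (ab)v = a(bv)
assert np.allclose(1.0 * v, v)                   # scalar identity
```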

The definition of a vector space leads us to some immediate facts:
- The zero vector is *unique* and always present.
  - Proof: Suppose there were two zero vectors: $0$ and $0'$. Then $0' = 0 + 0' = 0' + 0 = 0$.
- Vector spaces are *non-empty*.
  - Proof: By definition, the zero vector exists.
- The additive inverse for some $x$ is *unique*.
  - Proof: Suppose $x$ had two additive inverses $-x$ and $-x'$. Then $-x = -x + 0 = -x + (x + (-x')) = (-x + x) + (-x') = 0 + (-x') = -x'$.
- If $V$ is a vector space over an *infinite* field $F$ and $V ≠ \\{0\\}$, then $V$ is an *infinite set*.
  - Proof: Take any $v ≠ 0$. For distinct scalars $a ≠ b$ we have $av ≠ bv$ (otherwise $(a - b)v = 0$ with $a - b ≠ 0$), so the infinitely many scalars of $F$ yield infinitely many distinct vectors.

<details markdown="block">
<summary>Examples</summary>

Let $S = \\{(a_1, a_2) : a_1, a_2 ∈ ℝ\\}$.
For $(a_1, a_2), (b_1, b_2) ∈ S$ and $c ∈ ℝ$, we define:
- $(a_1, a_2) + (b_1, b_2) = (a_1 + b_1, a_2 - b_2)$
- $c(a_1, a_2) = (ca_1, ca_2)$.

This fails commutativity! It is thus not a vector space.

Let $S = \\{(a_1, a_2) : a_1, a_2 ∈ ℝ\\}$. We define:
- $(a_1, a_2) + (b_1, b_2) = (a_1 + b_1, 0)$
- $c(a_1, a_2) = (ca_1, 0)$

This fails the existence of zero! It is thus not a vector space.
</details>

## Subspaces

A subset $W$ of a vector space $V$ over a field 𝔽 is called a **subspace** of $V$ if $W$ is a *vector space* over 𝔽 with the operations of addition and scalar multiplication from $V$.

A subset $W$ of $V$ is a **subspace** of $V$ iff:
- it contains the zero vector of $V$
- it is closed under addition
- it is closed under scalar multiplication

Let $V$ be a vector space over $F$ and $S$ a nonempty subset of $V$. A vector $v ∈ V$ is a **linear combination** of vectors of $S$ if there exist a *finite* number of vectors $u_1, u_2, ..., u_n ∈ S$ and scalars $a_1, a_2, ..., a_n ∈ F$ such that $v = a_1 u_1 + a_2 u_2 + ... + a_n u_n$.

We call $a_1 ... a_n$ the *coefficients* of the linear combination.
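For instance, a concrete linear combination in $ℝ^3$ computed with NumPy (an illustrative sketch; the vectors and coefficients are arbitrary choices):

```python
import numpy as np

# v as the linear combination a1*u1 + a2*u2 + a3*u3.
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
u3 = np.array([1.0, 1.0, 1.0])
a1, a2, a3 = 2.0, -1.0, 3.0

v = a1 * u1 + a2 * u2 + a3 * u3
print(v)  # [5. 2. 3.]
```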



---


## Introduction: The Complex Numbers

A **complex number** is of the form $a + bi$, where $a, b$ are real numbers and $i$ is the imaginary unit.

We denote the set of complex numbers as $ℂ$.
The complex numbers form a *vector space* over $ℝ$:
every element is a *linear combination* of $1$ and $i$ with real coefficients $a, b$.

- ℂ: the set of complex numbers
- ℝ: the set of real numbers
- ℚ: the set of rational numbers

Elementary operations on the complex numbers are as follows:
- $(a + bi) ± (c + di) = (a ± c) + (b ± d)i$
- $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$
- $i^2 = -1$

The **conjugate** of $z = a + bi$ is defined as $\bar{z} = a - bi$.
As long as $a^2 + b^2 ≠ 0$, the inverse of $z = a + bi$ is given by $z^{-1} = \frac{a - bi}{a^2 + b^2} = \frac{\bar{z}}{a^2 + b^2}$.

**Theorem**: Let $z, w ∈ ℂ$. Then:
- $\bar{\bar{z}} = z$: the conjugate of the conjugate is $z$ itself
- $\bar{z ± w} = \bar{z} ± \bar{w}$
- $\bar{zw} = \bar{z}\bar{w}$
- $\bar{\frac{z}{w}} = \frac{\bar{z}}{\bar{w}}$ (if $w ≠ 0$)
  - (where $\frac{z}{w} = z ⋅ w^{-1} = z \frac{1}{w}$)
- $z$ is a *real number* iff $\bar{z} = z$

Let $z = a + bi$, where $a, b ∈ ℝ$. The **absolute value** of $z$ is defined as the real number $\sqrt{a^2 + b^2}$.

The absolute value satisfies several useful properties. For $z, w ∈ ℂ$:
- $z \bar{z} = a^2 + b^2 = \|z\|^2$, where $\|z\| = \sqrt{a^2 + b^2}$
- $\|zw\| = \|z\|\|w\|$, and $\|\frac{z}{w}\| = \frac{\|z\|}{\|w\|}$ where $w ≠ 0$
- $\|z + w\| ≤ \|z\| + \|w\|$
- $\|z\| - \|w\| ≤ \|z + w\|$
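These identities can be spot-checked with Python's built-in `complex` type (an illustrative sketch, not a proof; the values of $z$ and $w$ are arbitrary choices):

```python
import cmath

z, w = 3 + 4j, 1 - 2j

assert z.conjugate() == 3 - 4j
assert abs(z) == 5.0                                      # sqrt(3^2 + 4^2)
assert cmath.isclose(z * z.conjugate(), abs(z) ** 2)      # z z̄ = |z|^2
assert cmath.isclose(abs(z * w), abs(z) * abs(w))         # |zw| = |z||w|
assert cmath.isclose(1 / z, z.conjugate() / abs(z) ** 2)  # z^{-1} = z̄ / |z|^2
assert abs(z + w) <= abs(z) + abs(w)                      # triangle inequality
```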

**The Fundamental Theorem of Algebra**:
Consider the polynomial $p(z) = a_n z^n + a_{n-1}z^{n-1} + ... + a_1 z + a_0$ where all $a$ are complex numbers.
If $n ≥ 1$, then $p(z)$ has a zero (*root*), i.e. there exists a complex number $c$ such that $p(c) = 0$.

If the coefficients $a_i$ are all *real* and $c$ is a root, then $\bar{c}$ is also a root!
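For example, `numpy.roots` finds the roots of a polynomial numerically (a sketch, assuming NumPy; the polynomial is an arbitrary choice); for a real-coefficient cubic, the non-real roots appear as a conjugate pair:

```python
import numpy as np

# p(z) = z^3 - z^2 + z - 1 = (z - 1)(z^2 + 1): real coefficients,
# so the non-real roots i and -i come as a conjugate pair, plus the real root 1.
coeffs = [1, -1, 1, -1]        # a_n ... a_0
roots = np.roots(coeffs)
print(np.sort_complex(roots))  # approximately [-0.-1.j  0.+1.j  1.+0.j]
```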

Let 𝔽 be one of the following sets: ℚ, ℝ, ℂ

https://math.stackexchange.com/questions/3492590/linear-combination-span-independence-and-bases-for-infinite-dimensional-vector

Let $S$ be a nonempty subset of a vector space $V$. The **span** of $S$, denoted $span(S)$, is the set consisting of all linear combinations of vectors in $S$. For convenience, we define $span(∅) = \\{0\\}$.

The span of any subset $S$ of a vector space $V$ is a subspace of $V$.
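One computational way to test membership in a span (a sketch, assuming NumPy; the vectors are arbitrary choices): $v ∈ span(S)$ iff appending $v$ to the columns built from $S$ does not increase the matrix rank.

```python
import numpy as np

u1 = np.array([1.0, 2.0, 3.0])
u2 = np.array([0.0, 1.0, 1.0])
S = np.column_stack([u1, u2])

v_in  = 2 * u1 - u2             # in the span by construction
v_out = np.array([1.0, 0.0, 0.0])

def in_span(S, v):
    # v is in the column span of S iff adjoining v does not raise the rank.
    return np.linalg.matrix_rank(np.column_stack([S, v])) == np.linalg.matrix_rank(S)

print(in_span(S, v_in))   # True
print(in_span(S, v_out))  # False
```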

---

A subset $S$ of a vector space $V$ is **linearly dependent** if there exist a finite number of distinct vectors $u_1, u_2, ..., u_n ∈ S$ and scalars $a_1, a_2, ..., a_n ∈ F$ such that $a_1, a_2, ..., a_n$ are not all zero and $a_1 u_1 + a_2 u_2 + ... + a_n u_n = 0$.

A subset $S$ of a vector space that is not linearly dependent is **linearly independent**.


Example: Consider the set $S = \\{(1, 0, 0, -1), (0, 1, 0, -1), (0, 0, 1, -1), (0, 0, 0, 1)\\} ⊆ ℝ^4$.
Assume that $a_1 v_1 + a_2 v_2 + a_3 v_3 + a_4 v_4 = 0$ for the vectors $v_i$ of $S$. This is a homogeneous system of linear equations in $a_1, ..., a_4$; writing the vectors as the rows of a matrix, the determinant of that matrix is nonzero, so the only solution is $a_1 = a_2 = a_3 = a_4 = 0$, and $S$ is linearly independent.
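A numerical check of this example (a sketch, assuming NumPy):

```python
import numpy as np

# The vectors of S from the example above, as the rows of a matrix.
S = np.array([
    [1, 0, 0, -1],
    [0, 1, 0, -1],
    [0, 0, 1, -1],
    [0, 0, 0,  1],
])

print(np.linalg.det(S))          # 1.0 (nonzero), so the rows are linearly independent
print(np.linalg.matrix_rank(S))  # 4
```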


Let $V$ be a vector space, and let $S_1 ⊆ S_2 ⊆ V$. If $S_1$ is linearly dependent, then $S_2$ is linearly dependent. If $S_2$ is linearly independent, then $S_1$ is also linearly independent.

Let $S$ be a linearly independent subset of a vector space $V$, and let $v ∈ V : v ∉ S$. Then $S ∪ \\{v\\}$ is linearly dependent iff $v ∈ span(S)$.

A basis $B$ for a vector space $V$ is a *linearly independent* subset of $V$ that *spans* $V$. If $B$ is a basis for $V$, we also say that the vectors of $B$ form a basis for $V$.

Let $V$ be a vector space and $β = \\{v_1, ..., v_n\\}$ be a subset of $V$. Then $β$ is a basis for $V$ iff every $v ∈ V$ can be **uniquely expressed** as a linear combination of vectors of $β$: that is, every $v ∈ V$ can be written in the form $v = a_1 v_1 + a_2 v_2 + ... + a_n v_n$ for unique scalars $a_1, ..., a_n$.
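Numerically, the unique coefficients can be found by solving a linear system whose columns are the basis vectors (a sketch, assuming NumPy; the basis and vector are arbitrary choices):

```python
import numpy as np

# Coordinates of v relative to the basis β = {b1, b2, b3} of R^3:
# solve [b1 b2 b3] a = v for the unique coefficient vector a.
b1 = np.array([1.0, 0.0, 0.0])
b2 = np.array([1.0, 1.0, 0.0])
b3 = np.array([1.0, 1.0, 1.0])
B = np.column_stack([b1, b2, b3])

v = np.array([2.0, 3.0, 4.0])
a = np.linalg.solve(B, v)
print(a)                                        # [-1. -1.  4.]
assert np.allclose(a[0]*b1 + a[1]*b2 + a[2]*b3, v)
```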

If a vector space $V$ is spanned by a finite set $S$, then some subset of $S$ is a basis of $V$; hence $V$ has a finite basis.
Proof sketch: If $S = ∅$, then $V = span(S) = span(∅) = \\{0\\}$, in which case $∅$ is a basis for $V$.
If $S ≠ ∅$, then there is some $u_1 ∈ S$ with $u_1 ≠ 0$. If $span(\\{u_1\\}) = V$ we are done; otherwise, keep adjoining vectors of $S$ not already in the span. Since $S$ is finite, this terminates in a linearly independent subset of $S$ that spans $V$.

(Replacement theorem) Let $V$ be a vector space that is spanned by a set $G$ containing exactly $n$ vectors, and let $L$ be a linearly independent subset of $V$ containing exactly $m$ vectors. Then $m ≤ n$. Moreover, there exists a subset $H$ of $G$ containing exactly $n - m$ vectors such that $L ∪ H$ spans $V$.

Corollary: every basis for a finite-dimensional vector space contains the same number of vectors; this number is the **dimension** of $V$, denoted $dim(V)$.

Let $V$ be a vector space with dimension $n$.
- Any finite spanning set for $V$ contains at least $n$ vectors, and a spanning set for $V$ that contains exactly $n$ vectors is a basis for $V$.

Theorem: Let $W$ be a subspace of a finite-dimensional vector space $V$. Then $W$ is also finite-dimensional, and $dim(W) ≤ dim(V)$. Moreover, if $dim(W) = dim(V)$, then $V = W$.

---

## Linear Transformations

Let $V$ and $W$ be vector spaces (over a field $F$).

A function $T: V → W$ is a **linear transformation** from $V$ into $W$ if $∀x,y ∈ V, c ∈ F$ we have $T(cx + y) = cT(x) + T(y)$. Subsequently:
- $T(x + y) = T(x) + T(y)$
- $T(cx) = cT(x)$
- $T(0) = 0$
- $T(\sum_{i=1}^n a_i x_i) = \sum_{i=1}^n a_i T(x_i)$
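Every matrix gives a linear map $x ↦ Ax$; a quick numerical spot-check of these properties (a sketch, assuming NumPy; the matrix is an arbitrary choice):

```python
import numpy as np

# T(x) = Ax is a linear map R^2 -> R^3; check T(cx + y) = cT(x) + T(y).
A = np.array([[1.0,  2.0],
              [0.0,  1.0],
              [3.0, -1.0]])
T = lambda x: A @ x

rng = np.random.default_rng(1)
x, y = rng.standard_normal((2, 2))
c = 2.5

assert np.allclose(T(c * x + y), c * T(x) + T(y))
assert np.allclose(T(np.zeros(2)), np.zeros(3))   # T(0) = 0
```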

Let $T: V → W$ be a linear transformation.

The **kernel** (or null space) $N(T)$ of $T$ is the set of all vectors in $V$ such that $T(x) = 0$: that is, $N(T) = \\{ x ∈ V : T(x) = 0 \\}$.

The **image** (or range) $R(T)$ of $T$ is the subset of $W$ consisting of all images (under $T$) of elements of $V$: that is, $R(T) = \\{ T(x) : x ∈ V \\}$.

Theorem: The kernel $N(T)$ and image $R(T)$ are subspaces of $V$ and $W$, respectively.
<details markdown="block">
<summary>Proof</summary>
We shall denote the zero vector of $V$ and $W$ as $0_v$ and $0_w$, respectively.

Let $x,y ∈ N(T)$ and $c ∈ F$. As $T(0_v) = 0_w$, $0_v ∈ N(T)$. Then $T(cx + y) = cT(x) + T(y) = 0_w + 0_w = 0_w$, as $x$ and $y$ are in the null space. Hence any linear combination of $x$ and $y$ is in the null space. So as $N(T) ⊆ V$ by definition, $N(T)$ is a subspace of $V$. ∎

Let $x,y ∈ R(T)$ and $c ∈ F$. As $T(0_v) = 0_w$, $0_w ∈ R(T)$. Then there exist $v,w ∈ V$ such that $T(v) = x$ and $T(w) = y$. So $T(v + cw) = T(v) + cT(w) = x + cy$. Hence any linear combination of $x$ and $y$ is in the image. So as $R(T) ⊆ W$ by definition, $R(T)$ is a subspace of $W$. ∎

</details>

Theorem: If $β = \\{ v_1, v_2, ... v_n \\}$ is a basis for $V$, then $R(T) = span(\\{ T(v_1), T(v_2), ..., T(v_n) \\})$.
<details markdown="block">
<summary>Proof</summary>

Clearly, $T(v_i) ∈ R(T)$ for all $i$. As $R(T)$ is a subspace of $W$ by the previous theorem, $R(T)$ *contains* $span(\\{ T(v_1), T(v_2), ..., T(v_n) \\}) = span(T(β))$.

Now consider an arbitrary $w ∈ R(T)$. By definition, there exists a $v ∈ V$ such that $w = T(v)$. As $β$ is a basis for $V$, we can consider $v$ as a linear combination of some basis vectors in $β$. As $T$ is linear, it thus must be the case that $T(v) ∈ span(T(β))$. So $R(T) = span(T(β))$. ∎

</details>

For a finite-dimensional $N(T)$ and $R(T)$: the **nullity** and **rank** of $T$ are the dimensions of $N(T)$ and $R(T)$, respectively.

**Rank-Nullity Theorem**: If $V$ is *finite-dimensional*, then $dim(V) = nullity(T) + rank(T)$.
<details markdown="block">
<summary>Proof</summary>

Let $dim(V) = n$, let $nullity(T) = k$, and let $β_N = \\{v_1, v_2, ..., v_k\\}$ be a basis for $N(T)$. As $N(T) ⊆ V$, we may extend $β_N$ to a basis $β = \\{v_1, v_2, ..., v_n\\}$ for $V$.

We assert that $S = \\{T(v_{k+1}), T(v_{k+2}), ..., T(v_n)\\}$ is a basis for $R(T)$. We must prove this.

...

So $S$ spans $R(T)$ and is linearly independent: it is therefore a basis for $R(T)$ containing $n-k$ vectors. Hence $rank(T) = n - k$, and $dim(V) = nullity(T) + rank(T)$. ∎

</details>
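The theorem can be observed numerically for a matrix map $L_A : ℝ^4 → ℝ^3$ (a sketch, assuming NumPy and SciPy for the null-space computation; the matrix is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import null_space   # orthonormal basis of the null space via SVD

# Verify dim(V) = nullity(T) + rank(T) for T = L_A : R^4 -> R^3.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 3.0, 1.0, 2.0]])   # third row = first + second, so rank 2

rank    = np.linalg.matrix_rank(A)
nullity = null_space(A).shape[1]
print(rank, nullity, rank + nullity)   # 2 2 4  (4 = dim of the domain R^4)
```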

Recall that a *function* definitionally maps *each* element of its domain to *exactly* one element of its codomain.
- A function is **injective** (or **one-to-one**) iff each element of its domain maps to a *distinct* element of its codomain, that is, $f(x) = f(y) → x = y$.
- A function is **surjective** (or **onto**) iff each element of the codomain is mapped to by *at least* one element in the domain, that is, $R(T) = W$.
- A function is **bijective** iff it is surjective and injective. Necessarily, a bijective function is invertible, which will be formally stated & proven later.

Theorem: $T$ is injective iff $N(T) = \\{0\\}$.
<details markdown="block">
<summary>Proof</summary>

Suppose that $T$ is injective. Consider some $x ∈ N(T)$. Then $T(x) = 0 = T(0)$. As $T$ is injective, $x = 0$, and so $N(T) = \\{0\\}$. ∎

Now suppose that $N(T) = \\{0\\}$. Consider some $T(x) = T(y)$. Then $T(x) - T(y) = T(x-y) = 0$. So $x-y ∈ N(T) = \\{0\\}$, and so $x-y = 0$ and $x = y$. So $T$ is injective. ∎

</details>

Theorem: For $V$ and $W$ of equal (and finite) dimension: $T$ is injective iff it is surjective.
<details markdown="block">
<summary>Proof</summary>

We have that $T$ is injective iff $N(T) = \\{0\\}$ and thus iff $nullity(T) = 0$. So $T$ is injective iff $rank(T) = dim(R(T)) = dim(W)$, from the Rank-Nullity Theorem. $dim(R(T)) = dim(W)$ is equivalent to $R(T) = W$, which is the definition of surjectivity. So $T$ is injective iff it is surjective. ∎

</details>

Theorem: Suppose that $V$ has a finite basis $\\{ v_1, v_2, ..., v_n \\}$. For any vectors $w_1, w_2, ... w_n$ in $W$, there exists *exactly* one linear transformation $T : V → W$ such that $T(v_i) = w_i$ for $i = 1, 2, ..., n$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: Suppose that $V$ has a finite basis $\\{ v_1, v_2, ..., v_n \\}$.  If $U, T : V → W$ are linear and $U(v_i) = T(v_i)$ for all $i$, then $U = T$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

## Composition of Linear Transformations

Let $V$, $W$, and $Z$ be vector spaces.

Let $T, U : V → W$ be arbitrary functions. We define their **sum** $T + U : V → W$ as $∀x ∈ V : (T + U)(x) = T(x) + U(x)$, and **scalar multiplication** $aT : V → W$ as $∀x ∈ V : (aT)(x) = aT(x)$ for all $a ∈ F$.

Theorem: Let $T, U : V → W$ be linear. Then their sum $T + U$ and scalar multiples $aT$ are also linear.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: The set of all linear transformations (via our definitions of addition and scalar multiplication above) $V → W$ forms a vector space over $F$. We denote this as $\mathcal{L}(V, W)$.
- If $V = W$, we write $\mathcal{L}(V)$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: Let $T : V → W$ and $U : W → Z$ be linear. Then their composition $UT : V → Z$ is linear.
<details markdown="block">
<summary>Proof</summary>
Let $x,y ∈ V$ and $c ∈ F$. Then:

$$UT(cx + y) = U(T(cx + y)) = U(cT(x) + T(y)) = cU(T(x)) + U(T(y)) = c(UT)(x) + (UT)(y)$$

so $UT$ is linear. ∎

</details>

Theorem: Let $T, U_1, U_2 ∈ \mathcal{L}(V)$, and $a ∈ F$. Then:
- $T(U_1 + U_2) = TU_1 + TU_2$
- $(U_1 + U_2)T = U_1 T + U_2 T$
- $T(U_1 U_2) = (TU_1) U_2$
- $TI = IT = T$
- $a(U_1 U_2) = (aU_1) U_2 = U_1 (aU_2)$
<details markdown="block">
<summary>Proof</summary>
...
<!-- A more general result holds for linear transformations with domains unequal to their codomains, exercise 7 -->
</details>

## Invertibility and Isomorphism

- Let $V$ and $W$ be vector spaces.
- Let $T: V → W$ be a linear transformation.
- Let $I_V: V → V$ and $I_W: W → W$ denote the identity transformations within $V$ and $W$, respectively.

A function $U: W → V$ is an **inverse** of $T$ if $TU = I_W$ and $UT = I_V$.<br>
If $T$ has an inverse, then $T$ is **invertible**.

Theorem: Consider a linear function $T: V → W$.
- If $T$ is invertible, it has a *unique* inverse $T^{-1}$.
- If $T$ is invertible, $T^{-1}$ is invertible with the inverse $T$.
- A function is invertible if and only if it is bijective.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: If $T$ is linear and invertible, $T^{-1}$ is linear and invertible.
<details markdown="block">
<summary>Proof</summary>

Let $y_1, y_2 ∈ W$ and $c ∈ F$. As $T$ is bijective, there exist unique vectors $x_1$ and $x_2$ such that $T(x_1) = y_1$ and $T(x_2) = y_2$. So $x_1 = T^{-1}(y_1)$ and $x_2 = T^{-1}(y_2)$. So:

$$T^{-1}(y_1 + cy_2) = T^{-1}[T(x_1) + cT(x_2)] = T^{-1}[T(x_1 + cx_2)] = x_1 + cx_2 = T^{-1}(y_1) + cT^{-1}(y_2)$$

Thus $T^{-1}$ is linear. We may see it is invertible by definition. ∎

</details>

$V$ is **isomorphic** to $W$ if there exists an *invertible* linear $T : V → W$ (an **isomorphism**).

Lemma: For finite-dimensional $V$ and $W$: If $T: V → W$ is invertible, then $dim(V) = dim(W)$.
<details markdown="block">
<summary>Proof</summary>

As $T$ is invertible, it is bijective. So $nullity(T) = 0$, $rank(T) = dim(R(T)) = dim(W)$. So by the Rank-Nullity Theorem $dim(V) = dim(W)$. ∎

</details>

Theorem: $V$ is isomorphic to $W$ iff $dim(V) = dim(W)$. Subsequently:
- $V$ is isomorphic to $F^n$ iff $dim(V) = n$.
<details markdown="block">
<summary>Proof</summary>

Suppose that $V$ is isomorphic to $W$ and that $T : V → W$ is an isomorphism. As $T$ is an isomorphism, it is invertible, and so by our earlier lemma $dim(V) = dim(W)$.

Now suppose that $dim(V) = dim(W)$. Let $β = \\{v_1, v_2, ..., v_n\\}$ and $γ = \\{w_1, w_2, ..., w_n\\}$ be bases for $V$ and $W$, respectively. By an earlier theorem, there exists a linear $T : V → W$ such that $T(v_i) = w_i$ for all $i$. Thus $R(T) = span(T(β)) = span(γ) = W$, and $T$ is surjective. As $V$ and $W$ are of equal dimension, $T$ is also injective. So $T$ is bijective, and so $T$ is an isomorphism. ∎

</details>

## Linear Transformations as Matrices

- Let $V, W$ be finite-dimensional vector spaces (over a field $F$).
- Let $T, U : V → W$ be linear transformations from $V$ to $W$.
- Let $β$ and $γ$ be ordered bases of $V$ and $W$, respectively.
- Let $a ∈ F$ be a scalar.

An **ordered basis** of a finite-dimensional vector space $V$ is a basis for $V$ endowed with a specific order: that is, a finite *sequence* of linearly independent vectors in $V$ that spans $V$. We represent this with exactly the same notation as a standard unordered basis, but will call attention to it whenever necessary.
- For the vector space $F^n$ we call $\\{ e_1, e_2, ..., e_n \\}$ the **standard ordered basis** for $F^n$.
- For the vector space $P_n(F)$ we call $\\{ 1, x, ..., x^n \\}$ the **standard ordered basis** for $P_n(F)$.

Let $β = \\{u_1, u_2, ..., u_n\\}$ be an ordered basis for $V$. For $x ∈ V$, let $a_1, a_2, ..., a_n$ be the unique scalars such that $x = Σ_{i=1}^n a_i u_i$. The **coordinate vector** of $x$ relative to $β$ is the column vector $\begin{pmatrix} a_1 \\\\ a_2 \\\\ ⋮ \\\\ a_n \end{pmatrix}$, denoted $[x]_β$.

Suppose that $β = \\{v_1, v_2, ..., v_n\\}$ and $γ = \\{w_1, w_2, ..., w_m\\}$, and let $a_{ij}$ be the unique scalars such that $T(v_j) = Σ_{i=1}^m a_{ij} w_i$ for $1 ≤ j ≤ n$. The $m × n$ matrix $A$ defined by $A_{ij} = a_{ij}$ is called the **matrix representation of $T$ in the ordered bases $β$ and $γ$**, and denoted as $A = [T]_β^γ$. If $V = W$ and $β = γ$, we write $A = [T]_β$.
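As a concrete sketch (assuming NumPy; the choice of transformation is mine), the differentiation map $D : P_3(ℝ) → P_2(ℝ)$ has the following matrix relative to the standard ordered bases $\\{1, x, x^2, x^3\\}$ and $\\{1, x, x^2\\}$, and multiplying a coordinate vector by it differentiates the polynomial:

```python
import numpy as np

# Column j holds the γ-coordinates of D(x^j): D(1) = 0, D(x) = 1, D(x^2) = 2x, D(x^3) = 3x^2.
D = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])

# p(x) = 4 + 3x + 2x^2 + x^3  ->  p'(x) = 3 + 4x + 3x^2
p_coords = np.array([4.0, 3.0, 2.0, 1.0])   # [p]_β
print(D @ p_coords)                          # [3. 4. 3.]  = [p']_γ
```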

Theorem: $[T + U]_β^γ = [T]_β^γ + [U]_β^γ$ and $[aT]_β^γ = a[T]_β^γ$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

---

Let $A$ be an $n × n$ matrix. Then $A$ is **invertible** iff there exists an $n × n$ matrix $B$ such that $AB = BA = I$.

Theorem: If $A$ is invertible, the matrix $B$ is unique, and denoted $A^{-1}$.
<details markdown="block">
<summary>Proof</summary>
Suppose there existed another inverse matrix $C$. Then $C = CI = C(AB) = (CA)B = IB = B$.
</details>

Theorem: For finite-dimensional $V$ and $W$: $T$ is invertible iff $[T]_β^γ$ is invertible, and $[T^{-1}]_γ^β = ([T]_β^γ)^{-1}$. Subsequently:
- Let $U : V → V$ be linear. $U$ is invertible iff $[U]_β$ is invertible, and $[U^{-1}]_β = ([U]_β)^{-1}$.
- Let $A$ be an $n × n$ matrix. $A$ is invertible iff $L_A$ is invertible, and $(L_A)^{-1} = L_{A^{-1}}$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: Let $V$ and $W$ be of finite dimensions $n$ and $m$ with ordered finite bases $β$ and $γ$, respectively. The function $Φ : \mathcal{L}(V,W) → M_{m×n}(F)$ where $Φ(T) = [T]_β^γ$ for all $T ∈ \mathcal{L}(V,W)$ is an isomorphism.
That is, the set (really *vector space*) of all linear transformations between two vector spaces $V$ and $W$ itself is isomorphic to the vector space of all $m × n$ matrices. Subsequently:
- For $V$ and $W$ of finite dimensions $n$ and $m$, $\mathcal{L}(V,W)$ is of finite dimension $mn$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

The **standard representation** of an $n$-dimensional $V$ with respect to its ordered basis $β$ is the function $ϕ_β : V → F^n$ where $ϕ_β(x) = [x]_β$ for all $x ∈ V$.

Theorem: For any $V$ with ordered basis $β$, $ϕ_β$ is an isomorphism.
<details markdown="block">
<summary>Proof</summary>
...
</details>

## The Change of Coordinate Matrix

- Let $V$ be a finite-dimensional vector space (over a field $F$).
- Let $β$ and $β'$ be two ordered bases for $V$.
- Let $T$ be a linear operator on $V$.

Theorem: Let $Q = [I_V]_{β'}^β$. Then $Q$ is invertible, and for any $v ∈ V$, $[v]_β = Q[v]_{β'}$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

We call such a matrix $Q = [I_V]_{β'}^β$ a **change of coordinate matrix** and say that $Q$ **changes β'-coordinates into β-coordinates**.

Let $Q = [I_V]_{β'}^β$.

Theorem: $Q^{-1}$ changes β-coordinates into β'-coordinates.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: $[T]_{β'} = Q^{-1} [T]_β Q$
<details markdown="block">
<summary>Proof</summary>
...
</details>
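A numerical sketch of the change of coordinate matrix (assuming NumPy; the bases and operator are arbitrary choices): the columns of $Q = [I_V]_{β'}^β$ are the β-coordinates of the β'-basis vectors.

```python
import numpy as np

# β is the standard basis of R^2; β' = {(1, 1), (1, -1)}.
# Q converts β'-coordinates into β-coordinates, and [T]_{β'} = Q^{-1} [T]_β Q.
Q = np.array([[1.0,  1.0],
              [1.0, -1.0]])

T_beta = np.array([[2.0, 1.0],
                   [1.0, 2.0]])           # [T]_β for T(x, y) = (2x + y, x + 2y)

T_beta_prime = np.linalg.inv(Q) @ T_beta @ Q
print(T_beta_prime)                        # diag(3, 1): β' happens to consist of eigenvectors

v_prime = np.array([2.0, 1.0])             # [v]_{β'}
assert np.allclose(Q @ v_prime, 2 * np.array([1.0, 1.0]) + 1 * np.array([1.0, -1.0]))
```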

## Dual Spaces

The **dual space** of a vector space $V$ is the vector space $\mathcal{L}(V,F)$ and is denoted by $V^*$.

- Let $V, W$ be finite-dimensional vector spaces (over a field $F$).
- Let $T, U : V → W$ be linear transformations from $V$ to $W$.
- Let $β$ and $γ$ be ordered bases of $V$ and $W$, respectively.

Theorem: For a finite-dimensional $V$: $dim(V^*) = dim(\mathcal{L}(V,F)) = dim(V) ∙ dim(F) = dim(V)$.

Corollary: $V$ and $V^*$ are isomorphic.
<details markdown="block">
<summary>Proof</summary>
$dim(V) = dim(V^*)$ and so they are isomorphic.
</details>

The **dual basis** of an ordered basis $β = \\{ x_1, x_2, ..., x_n \\}$ of a vector space $V$ is the ordered basis $β^* = \\{ f_1, f_2, ..., f_n \\}$ of $V^*$, where $f_i$ is the $i$th **coordinate function**, satisfying $f_i(x_j) = δ_{ij}$ for $1 ≤ i, j ≤ n$. Recall that $δ_{ij} = 1$ if $i=j$, and $0$ otherwise.

Theorem: Let $β = \\{ x_1, x_2, ..., x_n \\}$. Let $f_i (1 ≤ i ≤ n)$ be the $i$th coordinate function wrt. $β$, and let $β^* = \\{ f_1, f_2, ..., f_n \\}$. Then $β^*$ is an ordered basis for $V^*$, and for any $f ∈ V^*$, we have $f = Σ_{i=1}^n f(x_i) f_i$.

...

Theorem: For any linear transformation $T : V → W$, the mapping $T^t : W^\* → V^\*$ defined by $T^t(g) = gT$ for all $g ∈ W^\*$ is a linear transformation, and $[T^t]_{γ^\*}^{β^\*} = ([T]_β^γ)^t$
<details markdown="block">
<summary>Proof</summary>
...
</details>

<!-- ## Homogeneous Linear Differential Equations -->

## Systems of Linear Equations

This section is mostly review.

<!--

## Determinants

...

Let $Ax = b$ be the matrix form of a system of $n$ linear equations in $n$ unknowns (where $x = (x_1, x_2, ..., x_n)^t$).

Cramer's Rule: If $det(A) ≠ 0$, then the system $Ax = b$ has a *unique* solution, and for each $k$ from $1$ to $n$, $x_k = [det(A)]^{-1} ∙ det(M_k)$, where $M_k$ is the $n × n$ matrix obtained from $A$ by replacing column $k$ of $A$ by $b$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

...

-->

---

## Determinants, summarized

Determinants are important for future sections. We state facts here without proof.

Let $A$ be a matrix containing entries from a field $F$.

The **determinant** of an $n×n$ (square) matrix $A$ is a scalar in $F$, and denoted $\|A\|$. The determinant is calculated as follows:
- For a $1 × 1$ matrix $A = [a]$, $\|A\| = a$
- For a $2 × 2$ matrix $A = \begin{bmatrix} a & b \\\\ c & d \end{bmatrix}$, $\|A\| = ad - bc$
- For an $n × n$ matrix $A = \begin{bmatrix} a_{1,1} & a_{1,2} & ... & a_{1,n} \\\\ a_{2,1} & a_{2,2} & ... & a_{2,n} \\\\ ⋮ & ⋮ & ⋱ & ⋮ \\\\ a_{n,1} & a_{n,2} & ... & a_{n,n} \end{bmatrix}$, $\|A\| = Σ^n_{i=1} (-1)^{i+j} A_{i,j} \|\tilde{A}_{i,j}\|$ for any fixed column $j$, where $\tilde{A}_{i,j}$ denotes the $(n-1) × (n-1)$ matrix obtained from $A$ by deleting row $i$ and column $j$.

The last formula is the *cofactor expansion* along column $j$: each entry of a fixed column is weighted by the determinant of the submatrix obtained by deleting that entry's row and column, with alternating signs. The value obtained is the same for every choice of $j$, and an analogous expansion along any row gives the same result.

The determinant has a number of nice properties that make it of fair interest.
1. $\|B\| = -\|A\|$ if $B$ is a matrix obtained by exchanging any two rows or columns of $A$
2. $\|B\| = k\|A\|$ if $B$ is a matrix obtained by multiplying a single row or column of $A$ by some scalar $k$
3. $\|B\| = \|A\|$ if $B$ is a matrix obtained by adding a multiple of a column or row to a *different* column or row
4. $\|A\| = 1 $ if $A$ is the identity matrix
5. $\|A\| = 0 $ if either the rows or columns of $A$ are not linearly independent
6. $\|AB\| = \|A\|\|B\|$ if $A$ and $B$ are both $n×n$ matrices
7. $\|A\| ≠ 0 $ iff $A$ is invertible
8. $\|A\| = \|A^t\|$
9. $\|A\| = \|B\|$ if $A$ and $B$ are *similar*

Thus, we can say that the determinant *characterizes* square matrices (and thus linear operators) to a fair degree: it is a single scalar that detects invertibility and linear dependence, and it changes predictably under row and column operations and under similarity.
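A few of these properties spot-checked numerically (a sketch, assuming NumPy; the matrices are random):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

det = np.linalg.det
swapped = A[[1, 0, 2, 3], :]                    # exchange rows 0 and 1

assert np.isclose(det(swapped), -det(A))        # property 1
assert np.isclose(det(np.eye(4)), 1.0)          # property 4
assert np.isclose(det(A @ B), det(A) * det(B))  # property 6
assert np.isclose(det(A.T), det(A))             # property 8
```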

## Eigenvalues and Eigenvectors

- Let $V$ be a finite-dimensional vector space over a field $F$.
- Let $T: V → V$ be a linear operator on $V$.
- Let $β$ be an ordered basis on $V$.
- Let $A$ be in $M_{n×n}(F)$ (a square $n×n$ matrix with entries in $F$).

$T$ is **diagonalizable** if there exists an ordered basis $β$ for $V$ such that $[T]_β$ is a *diagonal matrix*.
$A$ is **diagonalizable** if $L_A$ is diagonalizable.

A nonzero vector $v ∈ V$ is an **eigenvector** if $∃λ ∈ F$: $T(v) = λv$.
The corresponding scalar $λ$ is the **eigenvalue** corresponding to the eigenvector $v$.
A nonzero vector $v ∈ F^n$ is an **eigenvector** of $A$ if $v$ is an eigenvector of $L_A$ (that is, $∃λ ∈ F$: $Av = λv$)
The corresponding scalar $λ$ is the **eigenvalue** of $A$ corresponding to the eigenvector $v$.

The terms *characteristic vector* and *proper vector* are also used in place of *eigenvector*.
The terms *characteristic value* and *proper value* are also used in place of *eigenvalue*.

Theorem: $T: V → V$ is diagonalizable if and only if $V$ has an ordered basis $β$ consisting of eigenvectors of $T$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Corollary: If $T$ is diagonalizable, and $β = \\{v_1, v_2, ..., v_n\\}$ is an ordered basis of eigenvectors of $T$, and $D = \[T\]_β$, then $D$ is a diagonal matrix with $D_{i,i}$ being the eigenvalue corresponding to $v_i$ for each $i ≤ n$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

To *diagonalize* a matrix (or a linear operator) is to find a basis of eigenvectors and the corresponding eigenvalues.

Theorem: A scalar $λ$ is an eigenvalue of $A$ if and only if $\|A - λI_n\| = 0$, that is, the eigenvalues of a matrix are exactly the zeros of its characteristic polynomial.
<details markdown="block">
<summary>Proof</summary>
A scalar $λ$ is an eigenvalue of $A$ iff $∃v ≠ 0 ∈ F^n$: $Av = λv$, that is, $(A - λI_n)(v) = 0$. Such a nonzero $v$ exists iff $A - λI_n$ is not invertible, which holds iff $\|A - λI_n\| = 0$. ∎
</details>

The **characteristic polynomial** of $A$ is the polynomial $f(t) = \|A - tI_n\|$.
The **characteristic polynomial** of $T$ is the characteristic polynomial of $[T]_β$, often denoted $f(t) = \|T - tI\|$.
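A numerical sketch (assuming NumPy; the matrix is an arbitrary choice): `np.poly(A)` returns the coefficients of the monic polynomial $\|tI_n - A\|$, which has the same roots as $f(t) = \|A - tI_n\|$, and `np.linalg.eig` returns the eigenvalues with corresponding eigenvectors.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

char_poly = np.poly(A)            # [1, -4, 3], i.e. t^2 - 4t + 3
print(np.roots(char_poly))        # [3. 1.]  — the eigenvalues of A

eigenvalues, eigenvectors = np.linalg.eig(A)
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)   # Av = λv for each eigenpair
```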

Theorem: The characteristic polynomial of $A$ is a polynomial of degree $n$ with leading coefficient $(-1)^n$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Corollary: $A$ has at most $n$ distinct eigenvalues.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: A vector $v ∈ V$ is an eigenvector of $T$ corresponding to an eigenvalue $λ$ iff $v ∈ N(T - λI)$ and $v ≠ 0$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

<!-- look at examples 6 and 7 in chapter 5 -->

## Diagonalizability

- Let $V$ be a finite-dimensional vector space over a field $F$.
- Let $T: V → V$ be a linear operator on $V$.
- Let $A$ be in $M_{n×n}(F)$ (a square $n×n$ matrix with entries in $F$).
- Let $λ$ be an eigenvalue of $T$.
- Let $λ_1, λ_2, ..., λ_k$ be distinct eigenvalues of $T$.

Theorem: Let $λ_1, λ_2, ..., λ_k$ be distinct eigenvalues of $T$. If $v_1, v_2, ..., v_k$ are eigenvectors of $T$ such that $λ_i$ corresponds to $v_i$ (for all $i ≤ k$), then $\\{v_1, v_2, ..., v_k\\}$ is linearly independent. In fewer words, eigenvectors with distinct eigenvalues are all linearly independent from one another.
<details markdown="block">
<summary>Proof</summary>
...
</details>


Corollary: If $T$ has $n$ distinct eigenvalues, where $n = dim(V)$, then $T$ is diagonalizable.
<details markdown="block">
<summary>Proof</summary>
...
</details>

A polynomial $f(t) ∈ P(F)$ **splits over** $F$ if there are scalars $c, a_1, a_2, ..., a_n ∈ F$: $f(t) = c(a_1 - t)(a_2 - t)...(a_n - t)$.

Theorem: The characteristic polynomial of any diagonalizable linear operator splits.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Let $λ$ be an eigenvalue of a linear operator or matrix with characteristic polynomial $f(t)$. The **(algebraic) multiplicity** of $λ$ is the largest positive integer $k$ for which $(t-λ)^k$ is a factor of $f(t)$.

Let $λ$ be an eigenvalue of $T$. The set $E_λ = \\{x ∈ V: T(x) = λx\\} = N(T - λI_V)$ is called the **eigenspace** of $T$ with respect to the eigenvalue $λ$. Similarly, the **eigenspace** of $A$ is the eigenspace of $L_A$.

Theorem: If $λ$ has multiplicity $m$, then $1 ≤ dim(E_λ) ≤ m$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Lemma: Let $S_i$ be a finite linearly independent subset of the eigenspace $E_{λ_i}$ (for all $i ≤ k$). Then $S = S_1 ∪ S_2 ∪ ... ∪ S_k$ is a linearly independent subset of $V$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: If the characteristic polynomial of $T$ splits, then $T$ is diagonalizable iff the multiplicity of $λ_i$ is equal to $dim(E_{λ_i})$ (for all $i ≤ k$).
<details markdown="block">
<summary>Proof</summary>
...
</details>

Corollary: If the characteristic polynomial of $T$ splits, $T$ is diagonalizable, $β_i$ is an ordered basis for $E_{λ_i}$ (for all $i ≤ k$), then $β = β_1 ∪ β_2 ∪ ... ∪ β_k$ is an ordered basis for $V$ consisting of eigenvectors of $T$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

$T$ is diagonalizable iff both of the following conditions hold:
1. The characteristic polynomial of $T$ splits.
2. For each eigenvalue $λ$ of $T$, the multiplicity of $λ$ equals $n - rank(T-λI)$, where $n = dim(V)$.
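A numerical sketch of diagonalization (assuming NumPy; the matrix is an arbitrary choice with distinct eigenvalues, so by the corollary above it is guaranteed to be diagonalizable):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # eigenvalues 5 and 2 (distinct)

eigenvalues, P = np.linalg.eig(A)   # columns of P are eigenvectors of A
D = np.diag(eigenvalues)

# P is invertible and P^{-1} A P = D, i.e. A = P D P^{-1}.
assert np.allclose(np.linalg.inv(P) @ A @ P, D)
assert np.allclose(A, P @ D @ np.linalg.inv(P))
```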

## Direct Sums

- Let $V$ be a finite-dimensional vector space over a field $F$.
- Let $T: V → V$ be a linear operator on $V$.
- Let $W_1, W_2, ..., W_k$ be subspaces of $V$.

The **sum** of some subspaces $W_i$ (for $1 ≤ i ≤ k$) is the set $\\{v_1 + v_2 + ... + v_k : v_i ∈ W_i \\}$, denoted $W_1 + W_2 + ... + W_k$ or $Σ^k_{i=1} W_i$.

The subspaces $W_i$ (for $1 ≤ i ≤ k$) form a **direct sum** of $V$, denoted $W_1 ⊕ W_2 ⊕ ... ⊕ W_k$, if $V = Σ^k_{i=1} W_i$ and $W_j ∩ Σ_{i≠j} W_i = \\{0\\}$ for all $j ≤ k$.

Theorem: The following conditions are equivalent:
1. $V = W_1 ⊕ W_2 ⊕ ... ⊕ W_k$.
2. $V = Σ^k_{i=1} W_i$ and, for any vectors $v_1, v_2, ..., v_k$ with $v_i ∈ W_i$: if $v_1 + v_2 + ... + v_k = 0$, then $v_i = 0$ for all $i$.
3. Every vector $v ∈ V$ can be uniquely written as $v = v_1 + v_2 + ... + v_k$ where $v_i ∈ W_i$.
4. If $γ_i$ is an ordered basis for $W_i$ (for $1 ≤ i ≤ k$), then $γ_1 ∪ γ_2 ∪ ... ∪ γ_k$ is an ordered basis for $V$.
5. There exists an ordered basis $γ_i$ for $W_i$ for every $1 ≤ i ≤ k$ such that $γ_1 ∪ γ_2 ∪ ... ∪ γ_k$ is an ordered basis for $V$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: $T: V → V$ is diagonalizable if and only if $V$ is the direct sum of the eigenspaces of $T$.
<details markdown="block">
<summary>Proof</summary>
...
</details>