---
layout: algebra
title: mathematics/linear algebra
---

# Linear Algebra

$$ℝ^n$$

## Introduction: The Complex Numbers

A **complex number** is of the form $a + bi$, where $a, b$ are real numbers and $i$ is the imaginary unit.

We denote the set of complex numbers as $ℂ$.
The complex numbers form a *vector space*.
Every element of this vector space can be expressed as a *linear combination* of $1$ and $i$ with real coefficients.

- ℂ: the set of complex numbers
- ℝ: the set of real numbers
- ℚ: the set of rational numbers

Elementary operations on the complex numbers are as follows:
- $(a + bi) ± (c + di) = (a ± c) + (b ± d)i$
- $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$
- $i^2 = -1$

The **conjugate** of $z = a + bi$ is defined as $\bar{z} = a - bi$.
As long as $a^2 + b^2 ≠ 0$, the inverse of $z = a + bi$ is given by $z^{-1} = \frac{a - bi}{a^2 + b^2} = \frac{\bar{z}}{a^2 + b^2}$.
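
As a quick numerical sanity check (a sketch using Python's built-in `complex` type, not part of the definitions themselves):

```python
# Sketch: verifying the arithmetic rules above with Python's built-in
# complex type; z.conjugate() computes the conjugate of z.
a, b, c, d = 3, 4, 1, -2
z, w = complex(a, b), complex(c, d)        # z = 3 + 4i, w = 1 - 2i

# multiplication rule: (a + bi)(c + di) = (ac - bd) + (ad + bc)i
assert z * w == complex(a*c - b*d, a*d + b*c)

# inverse: z^{-1} = conj(z) / (a^2 + b^2)
z_inv = z.conjugate() / (a*a + b*b)
assert abs(z * z_inv - 1) < 1e-12
```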

**Theorem**: Let $z, w ∈ ℂ$. Then:
- $\bar{\bar{z}} = z$ (the conjugate of the conjugate)
- $\bar{z ± w} = \bar{z} ± \bar{w}$
- $\bar{zw} = \bar{z}\bar{w}$
- $\bar{\frac{z}{w}} = \frac{\bar{z}}{\bar{w}}$ (if $w ≠ 0$)
  - (where $\frac{z}{w} = z ⋅ w^{-1} = z \frac{1}{w}$)
- $z$ is a *real number* iff $\bar{z} = z$

Let $z = a + bi$, where $a, b ∈ ℝ$. The **absolute value** of $z$ is defined as the real number $\sqrt{a^2 + b^2}$.

This absolute value satisfies the following properties. For $z, w ∈ ℂ$:
- $z \bar{z} = a^2 + b^2 = |z|^2$, where $|z| = \sqrt{a^2 + b^2}$
- $\left|\frac{z}{w}\right| = \frac{|z|}{|w|}$ where $w ≠ 0$
- $|z + w| ≤ |z| + |w|$ (the *triangle inequality*)
- $|z| - |w| ≤ |z + w|$
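
These identities are easy to spot-check numerically; a minimal sketch (floating-point, hence the small tolerance):

```python
# Sketch: spot-checking the modulus identities for two arbitrary
# complex numbers.
z, w = complex(3, 4), complex(-1, 2)
eps = 1e-9

assert abs(z * z.conjugate() - abs(z)**2) < eps    # z·conj(z) = |z|^2
assert abs(abs(z / w) - abs(z) / abs(w)) < eps     # |z/w| = |z|/|w|
assert abs(z + w) <= abs(z) + abs(w) + eps         # triangle inequality
assert abs(z) - abs(w) <= abs(z + w) + eps         # reverse triangle inequality
```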

**The Fundamental Theorem of Algebra**:
Consider the polynomial $p(z) = a_n z^n + a_{n-1}z^{n-1} + ... + a_1 z + a_0$, where all $a_i$ are complex numbers.
If $n ≥ 1$, then $p(z)$ has a zero (*root*), i.e. there exists a complex number $c$ such that $p(c) = 0$.

If the coefficients $a_i$ are all real and $c$ is a root, then $\bar{c}$ is also a root!
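
A minimal sketch of this conjugate-pair behavior, using `numpy.roots` on a polynomial with real coefficients (the particular cubic is an arbitrary choice):

```python
import numpy as np

# p(z) = z^3 - 2z^2 + 4z - 8 = (z - 2)(z^2 + 4), with real coefficients
roots = np.roots([1, -2, 4, -8])     # numerically: 2, 2i, -2i

# the conjugate of every root is (numerically) also a root
for r in roots:
    assert min(abs(np.conj(r) - s) for s in roots) < 1e-6
```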

Let 𝔽 be one of the following sets: ℚ, ℝ, ℂ

A **vector space** $V$ over 𝔽 is a non-empty set on which two operations (addition `+` and scalar multiplication `*`) are defined, such that each pair of elements $x, y ∈ V$ has a unique sum $x + y ∈ V$, and each $a ∈ 𝔽$ and $x ∈ V$ have a unique product $ax ∈ V$, satisfying:
- commutativity: $∀x,y ∈ V : x + y = y + x$
- associativity: $∀x,y,z ∈ V : (x + y) + z = x + (y + z)$
- additive identity: $∃𝕆 ∈ V : ∀x ∈ V, 𝕆 + x = x$
- additive inverse: $∀x ∈ V, ∃y ∈ V : x + y = 𝕆$
- multiplicative identity: $∀x ∈ V : 1x = x$
- associativity of scalar multiplication: $∀a,b ∈ 𝔽, ∀x ∈ V : (ab)x = a(bx)$
- distributivity: $∀a ∈ 𝔽, ∀x,y ∈ V : a(x + y) = ax + ay$
- distributivity: $∀a,b ∈ 𝔽, ∀x ∈ V : (a + b)x = ax + bx$

Another definition, which better motivates the requirement that the set be non-empty, is as follows: A **vector space** is a set $V$ with additive and scalar multiplicative operations such that the following properties hold:
- commutativity: $∀u, v ∈ V : u + v = v + u$
- associativity: $∀u, v, w \in V : (u + v) + w = u + (v + w)$
- additive identity: $∃0 ∈ V : ∀v ∈ V, v + 0 = v$
- additive inverse: $∀v ∈ V, ∃w ∈ V : v + w = 0$
- multiplicative identity: $∀v ∈ V, 1v = v$
- distributive properties: $∀a, b ∈ 𝔽, ∀u, v ∈ V : a(u + v) = au + av, (a + b)v = av + bv$

Our definition of a vector space leads to some facts:
- The zero vector is *unique* and always present.
  - Proof: Suppose there were two zero vectors: 0 and 0'. Then 0' = 0 + 0' = 0' + 0 = 0.
- Vector spaces are *non-empty*.
  - Proof: by definition, the zero vector exists.
- The additive inverse for some $x$ is *unique*.
  - Proof: exercise
- If $V$ is a vector space over $𝔽$ and $V ≠ \\{0\\}$, then $V$ is an *infinite set*. (Note this relies on $𝔽$ being infinite, which holds for ℚ, ℝ, and ℂ.)
  - Proof: take any $x ≠ 0$; the scalar multiples $ax$ are distinct for distinct $a ∈ 𝔽$.

Example: Let $S = \\{(a_1, a_2) | a_1, a_2 ∈ ℝ\\}$.
For $(a_1, a_2), (b_1, b_2) ∈ S$ and $c ∈ ℝ$, we define:
- $(a_1, a_2) + (b_1, b_2) = (a_1 + b_1, a_2 - b_2)$
- $c(a_1, a_2) = (ca_1, ca_2)$.
- This fails commutativity!

Example: Let $S = \\{(a_1, a_2) | a_1, a_2 ∈ ℝ\\}$. We define:
- $(a_1, a_2) + (b_1, b_2) = (a_1 + b_1, 0)$
- $c(a_1, a_2) = (ca_1, 0)$
- This fails the additive identity: every sum has second coordinate $0$, so no element can serve as $𝕆$ for a pair with $a_2 ≠ 0$. (A concrete check of both failing examples follows below.)
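
A sketch checking both failures concretely (the names `add1` and `add2` are ad-hoc labels for the operations defined in the two examples above):

```python
# Example 1: addition subtracts the second coordinates,
# so addition is not commutative.
def add1(u, v):
    return (u[0] + v[0], u[1] - v[1])

assert add1((1, 2), (3, 4)) != add1((3, 4), (1, 2))

# Example 2: every sum has second coordinate 0, so no candidate zero
# vector can satisfy v + 0 = v when v has a nonzero second coordinate.
def add2(u, v):
    return (u[0] + v[0], 0)

v = (1, 1)
assert all(add2(v, z) != v for z in [(0, 0), (0, 1), (-1, 2)])
```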

A subset $W$ of a vector space $V$ over a field 𝔽 is called a **subspace** of $V$ if $W$ is a *vector space* over 𝔽 with the operations of addition and scalar multiplication from $V$.

A subset $W$ of $V$ is a **subspace** of $V$ iff:
- the subset is non-empty
- the subset contains the zero vector
- it is closed under addition and scalar multiplication

Let $V$ be a vector space over $F$ and $S$ a nonempty subset of $V$. A vector $v ∈ V$ is a **linear combination** of vectors of $S$ if there exists a *finite* number of vectors $u_1, u_2, ..., u_n ∈ S$ and scalars $a_1, a_2, ..., a_n ∈ F$ such that $v = a_1 u_1 + a_2 u_2 + ... + a_n u_n$.

We call $a_1 ... a_n$ the *coefficients* of the linear combination.

https://math.stackexchange.com/questions/3492590/linear-combination-span-independence-and-bases-for-infinite-dimensional-vector

Let $S$ be a nonempty subset of a vector space $V$. The **span** of $S$, denoted $span(S)$, is the set consisting of all linear combinations of vectors in $S$. For convenience, we define $span(∅) = \\{0\\}$.

The span of any subset $S$ of a vector space $V$ is a subspace of $V$.
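
As a computational aside: in $F^m$, membership $v ∈ span(S)$ can be tested by checking that appending $v$ to the vectors of $S$ does not increase the rank. A minimal numpy sketch (the vectors are an arbitrary choice):

```python
import numpy as np

# columns of A are the vectors of S ⊆ R^3
A = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [3.0, 1.0]])
v = np.array([1.0, 3.0, 4.0])        # = (col 1) + (col 2), so v ∈ span(S)

aug = np.column_stack([A, v])
assert np.linalg.matrix_rank(aug) == np.linalg.matrix_rank(A)
```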

---

A subset $S$ of a vector space $V$ is **linearly dependent** if there exists a finite number of distinct vectors $u_1, u_2, ..., u_n ∈ S$ and scalars $a_1, a_2, ..., a_n ∈ F$, not all zero, such that $a_1 u_1 + a_2 u_2 + ... + a_n u_n = 0$.

A subset $S$ of a vector space that is not linearly dependent is **linearly independent**.


Example: Consider the set $S = \\{(1, 0, 0, -1), (0, 1, 0, -1), (0, 0, 1, -1), (0, 0, 0, 1)\\} ⊆ ℝ^4$, writing $v_1, ..., v_4$ for its vectors.
Assume that $a_1 v_1 + a_2 v_2 + a_3 v_3 + a_4 v_4 = 0$. This is a homogeneous linear system in $a_1, ..., a_4$ whose coefficient matrix has the $v_i$ as its columns.

As the determinant of that matrix is nonzero, the only solution is $a_1 = a_2 = a_3 = a_4 = 0$, so $S$ is linearly independent.
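
A minimal numpy sketch of this determinant check:

```python
import numpy as np

# rows of M are the vectors of S (det is unchanged by transposing);
# M is upper triangular, so det(M) is the product of the diagonal, namely 1
M = np.array([[1, 0, 0, -1],
              [0, 1, 0, -1],
              [0, 0, 1, -1],
              [0, 0, 0,  1]], dtype=float)

assert abs(np.linalg.det(M)) > 1e-12   # nonzero ⇒ S is linearly independent
```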


Let $V$ be a vector space, and let $S_1 ⊆ S_2 ⊆ V$. If $S_1$ is linearly dependent, then $S_2$ is linearly dependent. If $S_2$ is linearly independent, then $S_1$ is also linearly independent.

Let $S$ be a linearly independent subset of a vector space $V$, and let $v ∈ V : v ∉ S$. Then $S ∪ \\{v\\}$ is linearly *dependent* iff $v ∈ span(S)$.

A basis $B$ for a vector space $V$ is a *linearly independent* subset of $V$ that *spans* $V$. If $B$ is a basis for $V$, we also say that the vectors of $B$ form a basis for $V$.

Let $V$ be a vector space and $β = \\{v_1, ..., v_n\\}$ be a subset of $V$. Then $β$ is a basis for $V$ iff every $v ∈ V$ can be **uniquely expressed** as a linear combination of vectors of $β$: that is, every $v ∈ V$ can be written in the form $v = a_1 v_1 + a_2 v_2 + ... + a_n v_n$ for unique scalars $a_1, ..., a_n$.

If a vector space $V$ is spanned by a finite set $S$, then some subset of $S$ is a basis for $V$. So, $V$ has a finite basis.
Proof: If $S = ∅$, then $V = span(S) = span(∅) = \\{0\\}$, in which case $∅$ is a basis for $V$.
If $S ≠ ∅$, then $∃u_1 ∈ S : u_1 ≠ 𝕆$, and we have two cases: if $span(\\{u_1\\}) = V$ we are done...

(Replacement theorem) Let $V$ be a vector space that is spanned by a set $G$ containing exactly $n$ vectors, and let $L$ be a linearly independent subset of $V$ containing exactly $m$ vectors. Then $m ≤ n$. Moreover, there is a subset $H$ of $G$ containing exactly $n - m$ vectors such that $L ∪ H$ spans $V$.

A vector space is **finite-dimensional** if it has a basis consisting of finitely many vectors; its **dimension** $dim(V)$ is the number of vectors in any such basis (well-defined by the replacement theorem). Let $V$ be a vector space with dimension $n$.
- Any finite spanning set for $V$ contains at least $n$ vectors, and a spanning set for $V$ that contains exactly $n$ vectors is a basis for $V$.

Theorem: Let $W$ be a subspace of a finite-dimensional vector space $V$. Then $W$ is also finite-dimensional, and $dim(W) ≤ dim(V)$. Moreover, if $dim(W) = dim(V)$, then $V = W$.

---

## Linear Transformations

Let $V$ and $W$ be vector spaces (over a field $F$).

A function $T: V → W$ is a **linear transformation** from $V$ into $W$ if $∀x,y ∈ V, c ∈ F$ we have $T(cx + y) = cT(x) + T(y)$.
Subsequently:
- $T(x + y) = T(x) + T(y)$
- $T(cx) = cT(x)$
- $T(0) = 0$
- $T(\sum_{i=1}^n a_i x_i) = \sum_{i=1}^n a_i T(x_i)$
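
The map $x ↦ Ax$ for an $m × n$ matrix $A$ is the prototypical linear transformation (made precise in the matrix representation section below); a minimal numpy sketch spot-checking the properties above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))      # T : R^2 → R^3, T(x) = Ax
x, y = rng.standard_normal(2), rng.standard_normal(2)
c = 2.5

assert np.allclose(A @ (c * x + y), c * (A @ x) + A @ y)   # T(cx + y) = cT(x) + T(y)
assert np.allclose(A @ np.zeros(2), np.zeros(3))           # T(0) = 0
```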

Let $T: V → W$ be a linear transformation.

The **kernel** (or null space) $N(T)$ of $T$ is the set of all vectors in $V$ such that $T(x) = 0$: that is, $N(T) = \\{ x ∈ V : T(x) = 0 \\}$.

The **image** (or range) $R(T)$ of $T$ is the subset of $W$ consisting of all images (under $T$) of elements of $V$: that is, $R(T) = \\{ T(x) : x ∈ V \\}$.
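
As a computational aside, an (orthonormal) basis for the kernel of a matrix map $x ↦ Ax$ can be read off from the SVD; a minimal numpy sketch with an arbitrary rank-1 matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, so the kernel is 2-dimensional
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]                   # right-singular vectors past the rank

assert len(null_basis) == 2
for n in null_basis:
    assert np.allclose(A @ n, 0)         # every kernel basis vector maps to 0
```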

Theorem: The kernel $N(T)$ and image $R(T)$ are subspaces of $V$ and $W$, respectively.
<details markdown="block">
<summary>Proof</summary>
We shall denote the zero vector of $V$ and $W$ as $0_v$ and $0_w$, respectively.

Let $x,y ∈ N(T)$ and $c ∈ F$. As $T(0_v) = 0_w$, $0_v ∈ N(T)$. Then $T(cx + y) = cT(x) + T(y) = 0_w + 0_w = 0_w$, as $x$ and $y$ are in the null space. Hence any linear combination of $x$ and $y$ in the null space is in the null space. So as $N(T) ⊆ V$ by definition, $N(T)$ is a subspace of $V$.

Let $x,y ∈ R(T)$ and $c ∈ F$. As $T(0_v) = 0_w$, $0_w ∈ R(T)$.
...
</details>

Theorem: If $β = \\{ v_1, v_2, ... v_n \\}$ is a basis for $V$, then $R(T) = span(\\{ T(v_1), T(v_2), ..., T(v_n) \\})$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

For finite-dimensional $N(T)$ and $R(T)$: the **nullity** and **rank** of $T$ are the dimensions of $N(T)$ and $R(T)$, respectively.

**Rank-Nullity Theorem**: If $V$ is *finite-dimensional*, then $dim(V) = nullity(T) + rank(T)$.
<details markdown="block">
<summary>Proof</summary>
...
</details>
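
A minimal numpy sketch checking rank-nullity for a concrete matrix map, counting the nullity from an explicit kernel basis rather than assuming the theorem:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))          # T : R^5 → R^3, T(x) = Ax
n = A.shape[1]                           # dim(V) = 5

rank = np.linalg.matrix_rank(A)          # dim R(T)
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[int(np.sum(s > 1e-10)):] # kernel basis from the SVD
assert all(np.allclose(A @ v, 0) for v in null_basis)

assert rank + len(null_basis) == n       # rank(T) + nullity(T) = dim(V)
```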

Recall that a *function* definitionally maps *each* element of its domain to *exactly* one element of its codomain.
A function is **injective** (or **one-to-one**) iff each element of its domain maps to a *distinct* element of its codomain.
A function is **surjective** (or **onto**) iff each element of the codomain is mapped to by *at least* one element in the domain.
A function is **bijective** iff it is surjective and injective. Necessarily, a bijective function is invertible, which will be formally stated & proven later.

Theorem: $T$ is injective iff $N(T) = \\{0\\}$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: For $V$ and $W$ of equal (and finite) dimension: $T$ is injective iff it is surjective.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: Suppose that $V$ is finite-dimensional with a basis $\\{ v_1, v_2, ..., v_n \\}$. For any vectors $w_1, w_2, ..., w_n$ in $W$, there exists *exactly* one linear transformation $T : V → W$ such that $T(v_i) = w_i$ for $i = 1, 2, ..., n$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

## Composition of Linear Transformations

Let $V$, $W$, and $Z$ be vector spaces.

Let $T, U : V → W$ be arbitrary functions. We define **addition** $T + U : V → W$ as $∀x ∈ V : (T + U)(x) = T(x) + U(x)$, and **scalar multiplication** $aT : V → W$ as $∀x ∈ V : (aT)(x) = aT(x)$ for all $a ∈ F$.

Theorem: Let $T, U : V → W$ be linear. Then their sum $T + U$ and scalar products $aT$ are also linear.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: The set of all linear transformations (via our definitions of addition and scalar multiplication above) $V → W$ forms a vector space over $F$. We denote this as $\mathcal{L}(V, W)$. If $V = W$, we write $\mathcal{L}(V)$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: Let $T : V → W$ and $U : W → Z$ be linear. Then their composition $UT : V → Z$ is linear.
<details markdown="block">
<summary>Proof</summary>
Let $x,y ∈ V$ and $c ∈ F$. Then:

$$UT(cx + y)$$
$$= U(T(cx + y)) = U(cT(x) + T(y))$$
$$= cU(T(x)) + U(T(y)) = c(UT)(x) + UT(y)$$
</details>
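
Anticipating the matrix representation section below: for matrix maps, composition is exactly matrix multiplication. A minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((3, 2))     # T : R^2 → R^3
U = rng.standard_normal((4, 3))     # U : R^3 → R^4
x = rng.standard_normal(2)

# applying T then U agrees with applying the single matrix UT
assert np.allclose(U @ (T @ x), (U @ T) @ x)
```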

Theorem: Let $T, U_1, U_2 ∈ \mathcal{L}(V)$. Then:
- $T(U_1 + U_2) = TU_1 + TU_2$ and $(U_1 + U_2)T = U_1 T + U_2 T$
- $T(U_1 U_2) = (TU_1) U_2$
- $TI = IT = T$
- $∀a ∈ F : a(U_1 U_2) = (aU_1) U_2 = U_1 (aU_2)$
<details markdown="block">
<summary>Proof</summary>
...
<!-- A more general result holds for linear transformations with domains unequal to their codomains, exercise 7 -->
</details>

## Invertibility and Isomorphism

- Let $V$ and $W$ be vector spaces.
- Let $T: V → W$ be a linear transformation.
- Let $I_V: V → V$ and $I_W: W → W$ denote the identity transformations within $V$ and $W$, respectively.

A function $U: W → V$ is an **inverse** of $T$ if $TU = I_W$ and $UT = I_V$.<br>
If $T$ has an inverse, then $T$ is **invertible**.

Theorem: Consider a linear function $T: V → W$.
- If $T$ is invertible, it has a *unique* inverse $T^{-1}$.
- If $T$ is invertible, $T^{-1}$ is invertible with the inverse $T$.
- A function is invertible if and only if it is bijective.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: If $T$ is linear and invertible, $T^{-1}$ is linear and invertible.
<details markdown="block">
<summary>Proof</summary>
...
</details>

$V$ is **isomorphic** to $W$ if there exists an *invertible* linear $T : V → W$ (an **isomorphism**).

Lemma: For finite-dimensional $V$ and $W$: If $T: V → W$ is invertible, then $dim(V) = dim(W)$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: $V$ is isomorphic to $W$ iff $dim(V) = dim(W)$.
- Subsequently, $V$ is isomorphic to $F^n$ iff $dim(V) = n$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

## Linear Transformations as Matrices

- Let $V, W$ be finite-dimensional vector spaces (over a field $F$).
- Let $T, U : V → W$ be linear transformations from $V$ to $W$.
- Let $β$ and $γ$ be ordered bases of $V$ and $W$, respectively.
- Let $a ∈ F$ be a scalar.

An **ordered basis** of a finite-dimensional vector space $V$ is, well, a basis of $V$ with a fixed ordering of its vectors. We represent this with exactly the same notation as a standard unordered basis, but will call attention to it whenever necessary.
- For the vector space $F^n$ we call $\\{ e_1, e_2, ..., e_n \\}$ the **standard ordered basis** for $F^n$.
- For the vector space $P_n(F)$ we call $\\{ 1, x, ..., x^n \\}$ the **standard ordered basis** for $P_n(F)$.

Let $β = \\{u_1, u_2, ..., u_n\\}$ be an ordered basis for $V$. For each $x ∈ V$, let $a_1, a_2, ..., a_n$ be the unique scalars such that $x = Σ_{i=1}^n a_i u_i$. The **coordinate vector** of $x$ relative to $β$, denoted $[x]_β$, is $(a_1, ..., a_n)$ written as a column vector.
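
Computationally, $[x]_β$ in $F^n$ is obtained by solving the linear system whose coefficient matrix has the basis vectors of $β$ as its columns; a minimal numpy sketch (the basis here is an arbitrary choice):

```python
import numpy as np

B = np.array([[1.0, 1.0],
              [0.0, 1.0]])           # β = {(1,0), (1,1)}, as columns
x = np.array([3.0, 2.0])

coords = np.linalg.solve(B, x)       # [x]_β = (1, 2)
assert np.allclose(B @ coords, x)    # reassembles x = Σ a_i u_i
```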

Let $β = \\{v_1, ..., v_n\\}$ and $γ = \\{w_1, ..., w_m\\}$ be ordered bases for $V$ and $W$, respectively, and let $a_{ij}$ be the unique scalars such that $T(v_j) = Σ_{i=1}^m a_{ij} w_i$ for $1 ≤ j ≤ n$. The $m × n$ matrix $A$ defined by $A_{ij} = a_{ij}$ is called the **matrix representation of $T$ in the ordered bases $β$ and $γ$**, and denoted as $A = [T]_β^γ$. If $V = W$ and $β = γ$, we write $A = [T]_β$.
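
A concrete example: the derivative map $D : P_3(ℝ) → P_2(ℝ)$ with the standard ordered bases $β = \\{1, x, x^2, x^3\\}$ and $γ = \\{1, x, x^2\\}$. Column $j$ of $[D]_β^γ$ holds the $γ$-coordinates of the image of the $j$-th basis vector; a minimal numpy sketch:

```python
import numpy as np

# D(1) = 0, D(x) = 1, D(x^2) = 2x, D(x^3) = 3x^2, written as columns
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)

p = np.array([5, 4, 3, 2], dtype=float)   # p(x) = 5 + 4x + 3x^2 + 2x^3 in β-coordinates
assert np.allclose(D @ p, [4, 6, 6])      # p'(x) = 4 + 6x + 6x^2 in γ-coordinates
```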

Theorem: $[T + U]_β^γ = [T]_β^γ + [U]_β^γ$ and $[aT]_β^γ = a[T]_β^γ$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

---

Let $A$ be an $n × n$ matrix. Then $A$ is **invertible** iff there exists an $n × n$ matrix $B$ such that $AB = BA = I$.

Theorem: If $A$ is invertible, the matrix $B$ is unique, and denoted $A^{-1}$.
<details markdown="block">
<summary>Proof</summary>
Suppose there existed another inverse matrix $C$. Then $C = CI = C(AB) = (CA)B = IB = B$.
</details>
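
A minimal numpy sketch: `numpy.linalg.inv` computes exactly this unique two-sided inverse (up to floating-point error):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])           # invertible, det = 1
B = np.linalg.inv(A)

assert np.allclose(A @ B, np.eye(2)) # AB = I
assert np.allclose(B @ A, np.eye(2)) # BA = I
```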

Theorem: For finite-dimensional $V$ and $W$: $T$ is invertible iff $[T]_β^γ$ is invertible, and $[T^{-1}]_γ^β = ([T]_β^γ)^{-1}$. Subsequently:
- Let $U : V → V$ be linear. $U$ is invertible iff $[U]_β$ is invertible, and $[U^{-1}]_β = ([U]_β)^{-1}$.
- Let $A$ be an $n × n$ matrix, and $L_A : F^n → F^n$ the left-multiplication map $L_A(x) = Ax$. $A$ is invertible iff $L_A$ is invertible, and $(L_A)^{-1} = L_{A^{-1}}$.
<details markdown="block">
<summary>Proof</summary>
...
</details>

Theorem: Let $V$ and $W$ be of finite dimensions $n$ and $m$ with ordered finite bases $β$ and $γ$, respectively. The function $Φ : \mathcal{L}(V,W) → M_{m×n}(F)$ where $Φ(T) = [T]_β^γ$ for all $T ∈ \mathcal{L}(V,W)$ is an isomorphism.
<details markdown="block">
<summary>Proof</summary>
...
</details>
That is, the set (really *vector space*) of all linear transformations between two vector spaces $V$ and $W$ itself is isomorphic to the vector space of all $m × n$ matrices.
- Subsequently, for $V$ and $W$ of finite dimensions $n$ and $m$, $\mathcal{L}(V,W)$ is of finite dimension $mn$.

Let $β$ be an ordered basis for an $n$-dimensional $V$. The **standard representation** of $V$ with respect to $β$ is the function $ϕ_β : V → F^n$ where $ϕ_β(x) = [x]_β$ for all $x ∈ V$.

Theorem: For any $V$ with ordered basis $β$, $ϕ_β$ is an isomorphism.
<details markdown="block">
<summary>Proof</summary>
...
</details>