author    JJ  2024-04-16 04:20:32 +0000
committer JJ  2024-04-16 04:20:32 +0000
commit    caa723146c6e478767760cb766701fa2aa173e89 (patch)
tree      3d82324e36cd24ec6d76e5c69f14c4cc0836adca
parent    8aa75ae12239bfe325c9770e176ed23130a52765 (diff)
meow
-rw-r--r--  linguistics/syntax.md           34
-rw-r--r--  mathematics/linear-algebra.md  185
2 files changed, 199 insertions(+), 20 deletions(-)
diff --git a/linguistics/syntax.md b/linguistics/syntax.md
index d9cbe7b..17fd901 100644
--- a/linguistics/syntax.md
+++ b/linguistics/syntax.md
@@ -36,18 +36,21 @@ Certainly, not all of syntax can be taught at once. Yet the desire to generalize
- [Minimalism](#minimalism) [n/a]
  - [Merge, Part II](#merge-part-ii)
  - Projection [SKS 5]
-  - Selection
+  - Selection [SKS 8]
  - Move [SKS 8]
    - [Affix Hopping](#affix-hopping)
    - Head Movement [SKS 8.3]
+    - [Subject Raising](#subject-raising) [SKS 12.4]
    - Wh- Movement [SKS 10]
-    - [vP Shells](#vp-shells) [SKS 12.4]
  - Agree
  - Theta Roles [SKS 6.8.1]
  - Locality
  - [Binding](#binding) [SKS 7]
  - Raising & Control [SKS 9]
- Advanced Syntax
+  - Negation
+  - Ellipsis
+  - Parsing
- References
</details>
@@ -84,7 +87,7 @@ Surprisingly (or unsurprisingly), we shall see that these ideas generalize to se

Why are proper names $D$s? Why is it possible to say either *I moved the couches* or *I moved couches*, but only possible to say *I moved the couch* and not *I moved couch*? Why is the infinitive form of a verb identical to the present, in some cases?

-These inconsistencies can all be addressed by one (controversial) concept: the idea of *silent morphemes*, invisible in writing and unpronounceable in speech. We represent such morphemes as ∅, and so may write the earlier strange sentence as *I moved ∅-couches*.
+These inconsistencies can all be addressed by one (strange) concept: the idea of *silent morphemes*, invisible in writing and unpronounceable in speech. We represent such morphemes as ∅, and so may write the earlier strange sentence as *I moved ∅-couches*.

...

@@ -265,9 +268,7 @@ Merge is *the* fundamental underlying aspect of syntax and arguably language as

So far, we have not dealt with tense. We have diagrammed sentences with verbs in present and past forms by entirely ignoring their *-s* and *-ed* affixes. But tense is an aspect of grammar just like anything else, and writing it off as purely semantic does no good to anyone. Indeed, the English future having its tense marker *will* as a free-standing morpheme strongly suggests that we have to treat tense as a syntactic category in its own right, and not just as an inflectional property of verbs.

-A tense needs a *subject*. This is evident in our tree structure below, but is motivated by...
-
-For now, we'll consider the verb to no longer be in charge of selecting the subject. This is not in fact accurate - as we will see at the end of this section - but it is a simplification we shall make for the time being.
+A tense needs a *subject*. This is often stated as the **extended projection principle**, for how fundamentally it influences the structure of sentences. For now, we'll consider the verb to no longer be in charge of selecting the subject, and leave it to the tense. This is not in fact accurate - as we will see at the end of this section - but it is a simplification we shall make for the time being.

![will](tense-will.png)
<details markdown="block">
<summary>...</summary>
@@ -318,6 +319,8 @@ In this section, we introduce the idea of *movement*: that certain portions of s
</details>

+(We say that *-ed* leaves a **trace** when moving to *walk*. We denote this here with *(-ed)*, but another common notation is to write *t*.)
+
English's first-person present does not inflect the verb, and so we must introduce a null $T$. A similar example is given for the present tense in the third person, which does have an explicit tense marker.

![()](tense-null.png)
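The labeled bracket notation used in these notes' tree images (e.g. `[T [D Alice] ...]`) can be rendered mechanically. A minimal sketch using Python's `nltk`, which is an assumption of this aside rather than a tool the notes depend on; `t` stands in for a trace, per the notation above:

```python
from nltk import Tree  # pip install nltk

# The subject-raising tree from later in these notes, with the labeled
# brackets rewritten in the parenthesized form nltk expects; the leaf `t`
# marks the trace left behind by movement.
tree = Tree.fromstring(
    "(T (D Alice) (T_D (T_{D,V} will) (V (D t) (V_D (V_{D,P} speak)"
    " (P (P_D to) (D (D_N the) (N assembly)))))))"
)
tree.pretty_print()  # draws the constituency tree as ASCII art
```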
@@ -400,9 +403,9 @@ Consider the following sentence: *Alice will speak to the assembly*. With our cu

The $D$ *Alice* here is the subject. While replacing it with some $D$s produces grammatical sentences (e.g. *The prime minister will speak to the assembly*), this is not true of all $D$s. Slotting in inanimate $D$s, as in *Time will speak to the assembly* and *Knowledge will speak to the assembly*, produces grammatically unacceptable sentences. So there is some *selection* occurring somewhere in the sentence that wants a particular *feature set* (f-features) from the subject $D$, specifically, animacy.

-Observe, however, that our tree structure suggests that $T$ - and only $T$ - is involved in the selection of $Alice$ as the subject, given locality of selection. But this can't be quite right. Plenty of other sentences involving the $T$ *will* are just fine with inanimate subjects: *Time will pass*, *Knowledge will be passed on*, etc. (Notice that *Alice will pass* and *Alice will be passed on* are similarly ungrammatical). How do we reconcile this?
+Observe, however, that our tree structure suggests that $T$ - and only $T$ - is involved in the selection of $Alice$ as the subject, given locality of selection and the extended projection principle. But this can't be quite right. Plenty of other sentences involving the $T$ *will* are just fine with inanimate subjects: *Time will pass*, *Knowledge will be passed on*, etc. (Notice that *Alice will pass* and *Alice will be passed on* are similarly ungrammatical). How do we reconcile this?

-We now introduce the idea of $vP$ shells and V-to-T movement. Our observations above point towards the $V$ of the sentence rather than the $T$ selecting for the subject $D$ - somehow. This selection would break our guiding principle of locality of selection. But this behavior *does* occur. Can we extend our model to explain this, *without* modifying the locality of selection that has been so useful thus far? We can, indeed, with movement, and illustrate so in the following tree.
+We now introduce the idea of **subject raising** / $vP$ shells. Our observations above point towards the $V$ of the sentence rather than the $T$ selecting for the subject $D$ - somehow. This selection would break our guiding principle of locality of selection. But this behavior *does* occur. Can we extend our model to explain this, *without* modifying the locality of selection that has been so useful thus far? We can indeed, with movement, as illustrated in the following tree.

![`[T [D Alice] [T_D [T_{D,V} will] [V [D (subj)] [V_D [V_{D,P} speak] [P [P_D to] [D [D_N the] [N assembly]]]]]]]`](subject-movement.png)
<details markdown="block">
<summary>...</summary>
@@ -429,9 +432,10 @@ We now introduce the idea of $vP$ shells and V-to-T movement. Our observations a
</details>

-So we say that *Alice* is originally selected by the $V$ and *moves* to its surface position in the $T$. Our head movement principles allow for this. This does mean that every tree diagram we have drawn up until now is inaccurate, and that almost every tree we draw going forward will have to have this somewhat redundant V-to-T movement. This is a fine tradeoff to make in exchange for accurately describing previously-unclear syntactic behavior.
+So we say that *Alice* is originally selected by the $V$ and *moves* to its surface position in the $T$. *Alice* satisfies the projection principle by being selected by the $V$, satisfies the extended projection principle by fulfilling the role of the subject for the $T$, and satisfies locality of selection by being in complement and specifier position for the $V$ and the $T$, respectively. Our concept of movement allows *Alice* to play **both** roles at the same time. This does mean that every tree diagram we have drawn up until now is inaccurate, and that almost every tree we draw going forward will have to have this somewhat redundant subject raising. This is a fine tradeoff to make in exchange for accurately describing previously-unclear syntactic behavior.
+
+This subject raising is an example of **A-movement** (argument movement). A-movement exists in contrast to **A'-movement** (movement to a non-argument position), which is responsible for wh-movement and topicalization: two topics that we shall touch on shortly.

-> Note: this is not called V-to-T movement. What *is* it called?

### small clauses

@@ -449,13 +453,14 @@ How do pronouns work?

The theory of binding operates under three fundamental principles.

- **Principle A**: an anaphor must be bound in its domain.
-- **Principle B**: a pronoun must be free in its domain.
+- **Principle B**: a (personal) pronoun must be free in its domain.
- **Principle C**: an r-expression may never be bound.

Our principles imply various things. Principle A implies that:
-- a reflexive must be coreferential with its antecedent
-- the antecedent of a reflexive must c-command the reflexive
-- the reflexive and its antecedent must be in all the same nodes that have a subject
+- a reflexive must be *coreferential* with its antecedent
+  - (agreeing in person, number, and gender)
+- the antecedent of a reflexive must *c-command* the reflexive
+- the reflexive and its antecedent must be *in all the same nodes* that have a subject

### raising and control

@@ -465,4 +470,5 @@ Our principles imply various things. Principle A implies that:

## References

- ✨ [An Introduction to Syntactic Analysis and Theory](https://annas-archive.org/md5/11bbf70ff9259025bc6985ba3fa4083b)
+- ✨ [The Science of Syntax](https://pressbooks.pub/syntax/)
- MIT 24.902: [2017](https://web.mit.edu/norvin/www/24.902/24902.html), [2015](https://ocw.mit.edu/courses/24-902-language-and-its-structure-ii-syntax-fall-2015/), [2003](https://ocw.mit.edu/courses/24-902-language-and-its-structure-ii-syntax-fall-2003/)
diff --git a/mathematics/linear-algebra.md b/mathematics/linear-algebra.md
index ce24abd..95e0138 100644
--- a/mathematics/linear-algebra.md
+++ b/mathematics/linear-algebra.md
@@ -241,7 +241,7 @@ Theorem: For $V$ and $W$ of equal (and finite) dimension: $T$ is injective iff i
<details markdown="block">
<summary>Proof</summary>

-We have that $T$ is injective iff $N(T) = \\{0\\}$ and thus iff $nullity(T) = 0$. So $T$ is injective iff $rank(T) = dim(W)$ and $dim(R(T)) = dim(W)$ from the Rank-Nullity Theorem. $dim(R(T)) = dim(W)$ is equivalent to $R(T) = W$, which is the definition of surjectivity.
+We have that $T$ is injective iff $N(T) = \\{0\\}$ and thus iff $nullity(T) = 0$. So $T$ is injective iff $rank(T) = dim(R(T)) = dim(W)$, from the Rank-Nullity Theorem (using $dim(V) = dim(W)$). $dim(R(T)) = dim(W)$ is equivalent to $R(T) = W$, which is the definition of surjectivity.

So $T$ is injective iff it is surjective. ∎
</details>

@@ -282,8 +282,11 @@ Theorem: Let $T : V → W$ and $U : W → Z$ be linear. Then their composition $

Let $x,y ∈ V$ and $c ∈ F$.
Then: $$UT(cx + y)$$
+ $$= U(T(cx + y)) = U(cT(x) + T(y))$$
+ $$= cU(T(x)) + U(T(y)) = c(UT)(x) + UT(y)$$
+
</details>
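As a quick numerical check of this argument (a minimal sketch; `numpy` is assumed here, nothing in the notes requires it): representing $T$ and $U$ by matrices, composition becomes matrix multiplication, and the computed equality holds.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((4, 3))   # a linear map T : R^3 -> R^4
U = rng.standard_normal((2, 4))   # a linear map U : R^4 -> R^2
UT = U @ T                        # the composition UT : R^3 -> R^2

x, y = rng.standard_normal(3), rng.standard_normal(3)
c = 2.5

# UT(cx + y) = c(UT)(x) + UT(y), mirroring the proof step by step.
assert np.allclose(U @ (T @ (c * x + y)), UT @ (c * x + y))
assert np.allclose(UT @ (c * x + y), c * (UT @ x) + UT @ y)
```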

Theorem: Let $T, U_1, U_2 ∈ \mathcal{L}(V)$, and $a ∈ F$. Then:
@@ -460,7 +463,7 @@ Theorem: For any linear transformation $T : V → W$, the mapping $T^t : W^* →
...
</details>

-## Homogeneous Linear Differential Equations
+<!-- ## Homogeneous Linear Differential Equations -->

## Systems of Linear Equations

@@ -470,10 +473,6 @@ This section is mostly review.

## Determinants

-Let $A \begin{bmatrix} a & b \\\\ c & d \end{bmatrix}$.
-
-We define the **determinant** of any 2 × 2 matrix $A$ to be the scalar $ad - bc$, and denote it $det(A)$ or $|A|$.
-
...

Let $Ax = b$ be the matrix form of a system of $n$ linear equations in $n$ unknowns (where $x = (x_1, x_2, ..., x_n)^t$).
@@ -487,3 +486,177 @@ Cramer's Rule: If $det(A) ≠ 0$, then the system $Ax = b$ has a *unique* soluti
...
-->
+
+---
+
+## Determinants, summarized
+
+Determinants are important for future sections. We state facts here without proof.
+
+Let $A$ be a matrix containing entries from a field $F$.
+
+The **determinant** of an $n × n$ (square) matrix $A$ is a scalar in $F$, and denoted $|A|$. The determinant is calculated as follows:
+- For a $1 × 1$ matrix $A = [a]$, $|A| = a$
+- For a $2 × 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, $|A| = ad - bc$
+- For an $n × n$ matrix $A = \begin{bmatrix} a_{1,1} & a_{1,2} & ... & a_{1,n} \\ a_{2,1} & a_{2,2} & ... & a_{2,n} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ a_{n,1} & a_{n,2} & ... & a_{n,n} \end{bmatrix}$, $|A| = Σ^n_{i=1} (-1)^{i+j} A_{i,j} |\tilde{A}_{i,j}|$, for any fixed column $j$.
+
+The last one deserves some additional exposition: $\tilde{A}_{i,j}$ denotes the $(n-1) × (n-1)$ matrix obtained by deleting row $i$ and column $j$ of $A$. This is *cofactor expansion* along the $j$th column; the symmetric expansion along any fixed row $i$ (summing over $j$ instead) computes the same value.
+
+The determinant has a number of nice properties that make it of fair interest.
+1. $|B| = -|A|$ if $B$ is a matrix obtained by exchanging any two rows or columns of $A$
+2. $|B| = k|A|$ if $B$ is a matrix obtained by multiplying a single row or column of $A$ by some scalar $k$
+3. $|B| = |A|$ if $B$ is a matrix obtained by adding a multiple of a column or row to a *different* column or row
+4. $|A| = 1$ if $A$ is the identity matrix
+5. $|A| = 0$ if either the rows or columns of $A$ are not linearly independent
+6. $|AB| = |A||B|$ if $A$ and $B$ are both $n × n$ matrices
+7. $|A| ≠ 0$ iff $A$ is invertible
+8. $|A| = |A^t|$
+9. $|A| = |B|$ if $A$ and $B$ are *similar*
+
+Thus, we can say that the determinant *characterizes* square matrices (and thus linear operators), somewhat. It is a scalar value with a deep relation to the core identity of the matrix, and changes regularly as the matrix changes.
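These properties are easy to spot-check numerically. A minimal sketch, assuming `numpy` (an assumption of this aside, not a dependency of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)
B = rng.integers(-3, 4, size=(3, 3)).astype(float)
det = np.linalg.det

# Property 1: exchanging two rows flips the sign.
assert np.isclose(det(A[[1, 0, 2], :]), -det(A))

# Property 2: scaling a single row scales the determinant by the same factor.
scaled = A.copy()
scaled[0] *= 5
assert np.isclose(det(scaled), 5 * det(A))

# Property 3: adding a multiple of one row to a different row changes nothing.
sheared = A.copy()
sheared[1] += 7 * A[0]
assert np.isclose(det(sheared), det(A))

# Properties 4, 6, 8: identity, multiplicativity, transpose-invariance.
assert np.isclose(det(np.eye(3)), 1.0)
assert np.isclose(det(A @ B), det(A) * det(B))
assert np.isclose(det(A.T), det(A))
```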
+
+## Eigenvalues and Eigenvectors
+
+- Let $V$ be a finite-dimensional vector space over a field $F$.
+- Let $T: V → V$ be a linear operator on $V$.
+- Let $β$ be an ordered basis for $V$.
+- Let $A$ be in $M_{n×n}(F)$ (a square $n×n$ matrix with entries in $F$).
+
+$T$ is **diagonalizable** if there exists an ordered basis $β$ for $V$ such that $[T]_β$ is a *diagonal matrix*.
+$A$ is **diagonalizable** if $L_A$ is diagonalizable.
+
+A nonzero vector $v ∈ V$ is an **eigenvector** of $T$ if $∃λ ∈ F$: $T(v) = λv$.
+The corresponding scalar $λ$ is the **eigenvalue** corresponding to the eigenvector $v$.
+A nonzero vector $v ∈ F^n$ is an **eigenvector** of $A$ if $v$ is an eigenvector of $L_A$ (that is, $∃λ ∈ F$: $Av = λv$).
+The corresponding scalar $λ$ is the **eigenvalue** of $A$ corresponding to the eigenvector $v$.
+
+The terms *characteristic vector* and *proper vector* are also used in place of *eigenvector*.
+The terms *characteristic value* and *proper value* are also used in place of *eigenvalue*.
+
+Theorem: $T: V → V$ is diagonalizable if and only if $V$ has an ordered basis $β$ consisting of eigenvectors of $T$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Corollary: If $T$ is diagonalizable, and $β = \{v_1, v_2, ..., v_n\}$ is an ordered basis of eigenvectors of $T$, and $D = [T]_β$, then $D$ is a diagonal matrix with $D_{i,i}$ being the eigenvalue corresponding to $v_i$ for any $i ≤ n$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+To *diagonalize* a matrix (or a linear operator) is to find a basis of eigenvectors and the corresponding eigenvalues.
+
+Theorem: A scalar $λ$ is an eigenvalue of $A$ if and only if $|A - λI_n| = 0$, that is, the eigenvalues of a matrix are exactly the zeros of its characteristic polynomial.
+<details markdown="block">
+<summary>Proof</summary>
+A scalar $λ$ is an eigenvalue of $A$ iff $∃v ≠ 0 ∈ F^n$: $Av = λv$, that is, $(A - λI_n)(v) = 0$. Such a nonzero $v$ exists iff $A - λI_n$ is not invertible, which is the case iff $|A - λI_n| = 0$. ∎
+</details>
+
+The **characteristic polynomial** of $A$ is the polynomial $f(t) = |A - tI_n|$.
+The **characteristic polynomial** of $T$ is the characteristic polynomial of $[T]_β$, often denoted $f(t) = |T - tI|$.
+
+Theorem: The characteristic polynomial of $A$ is a polynomial of degree $n$ with leading coefficient $(-1)^n$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Corollary: $A$ has at most $n$ distinct eigenvalues.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Theorem: A vector $v ∈ V$ is an eigenvector of $T$ corresponding to an eigenvalue $λ$ iff $v ≠ 0 ∈ N(T - λI)$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+<!-- look at examples 6 and 7 in chapter 5 -->
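As a concrete illustration (again a `numpy` sketch, same caveat as above): for $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, the characteristic polynomial is $f(t) = |A - tI_2| = (2-t)^2 - 1 = t^2 - 4t + 3$, and its zeros $1$ and $3$ are exactly the eigenvalues.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.poly(A) returns the monic polynomial whose roots are the eigenvalues of
# A, i.e. det(tI - A); for n = 2 this coincides with f(t) = |A - tI| since
# the leading coefficient (-1)^2 = 1. Here: t^2 - 4t + 3.
coeffs = np.poly(A)
assert np.allclose(coeffs, [1.0, -4.0, 3.0])

# The zeros of the characteristic polynomial are exactly the eigenvalues...
zeros = np.roots(coeffs)

# ...as found directly, and Av = λv holds for each eigenpair.
vals, vecs = np.linalg.eig(A)
for val, vec in zip(vals, vecs.T):
    assert np.allclose(A @ vec, val * vec)
assert np.allclose(np.sort(zeros), np.sort(vals))
```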
+
+## Diagonalizability
+
+- Let $V$ be a finite-dimensional vector space over a field $F$.
+- Let $T: V → V$ be a linear operator on $V$.
+- Let $A$ be in $M_{n×n}(F)$ (a square $n×n$ matrix with entries in $F$).
+- Let $λ$ be an eigenvalue of $T$.
+- Let $λ_1, λ_2, ..., λ_k$ be distinct eigenvalues of $T$.
+
+Theorem: Let $λ_1, λ_2, ..., λ_k$ be distinct eigenvalues of $T$. If $v_1, v_2, ..., v_k$ are eigenvectors of $T$ such that $λ_i$ corresponds to $v_i$ (for all $i ≤ k$), then $\{v_1, v_2, ..., v_k\}$ is linearly independent. In fewer words, eigenvectors with distinct eigenvalues are all linearly independent from one another.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Corollary: If $T$ has $n$ distinct eigenvalues, where $n = dim(V)$, then $T$ is diagonalizable.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+A polynomial $f(t) ∈ P(F)$ **splits over** $F$ if there are scalars $c, a_1, a_2, ..., a_n ∈ F$: $f(t) = c(a_1 - t)(a_2 - t)...(a_n - t)$.
+
+Theorem: The characteristic polynomial of any diagonalizable linear operator splits.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Let $λ$ be an eigenvalue of a linear operator or matrix with characteristic polynomial $f(t)$. The **(algebraic) multiplicity** of $λ$ is the largest positive integer $k$ for which $(t-λ)^k$ is a factor of $f(t)$.
+
+Let $λ$ be an eigenvalue of $T$. The set $E_λ = \{x ∈ V: T(x) = λx\} = N(T - λI_V)$ is called the **eigenspace** of $T$ with respect to the eigenvalue $λ$. Similarly, the **eigenspace** of $A$ is the eigenspace of $L_A$.
+
+Theorem: If $λ$ has multiplicity $m$, then $1 ≤ dim(E_λ) ≤ m$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Lemma: Let $S_i$ be a finite linearly independent subset of the eigenspace $E_{λ_i}$ (for all $i ≤ k$). Then $S = S_1 ∪ S_2 ∪ ... ∪ S_k$ is a linearly independent subset of $V$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Theorem: If the characteristic polynomial of $T$ splits, then $T$ is diagonalizable iff the multiplicity of $λ_i$ is equal to $dim(E_{λ_i})$ (for all $i ≤ k$).
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Corollary: If the characteristic polynomial of $T$ splits, $T$ is diagonalizable, and $β_i$ is an ordered basis for $E_{λ_i}$ (for all $i ≤ k$), then $β = β_1 ∪ β_2 ∪ ... ∪ β_k$ is an ordered basis for $V$ consisting of eigenvectors of $T$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+## Direct Sums
+
+- Let $V$ be a finite-dimensional vector space over a field $F$.
+- Let $T: V → V$ be a linear operator on $V$.
+- Let $W_1, W_2, ..., W_k$ be subspaces of $V$.
+
+The **sum** of some subspaces $W_i$ (for $1 ≤ i ≤ k$) is the set $\{v_1 + v_2 + ... + v_k : v_i ∈ W_i \}$, denoted $W_1 + W_2 + ... + W_k$ or $Σ^k_{i=1} W_i$.
+
+The subspaces $W_i$ (for $1 ≤ i ≤ k$) form a **direct sum** of $V$, denoted $W_1 ⊕ W_2 ⊕ ... ⊕ W_k$, if $V = Σ^k_{i=1} W_i$ and $W_j ∩ Σ_{i≠j} W_i = \{0\}$ for all $j ≤ k$.
+
+Theorem: The following conditions are equivalent:
+1. $V = W_1 ⊕ W_2 ⊕ ... ⊕ W_k$.
+2. $V = Σ^k_{i=1} W_i$ and, for any vectors $v_1, v_2, ..., v_k$ with $v_i ∈ W_i$: if $v_1 + v_2 + ... + v_k = 0$, then $v_i = 0$ for all $i ≤ k$.
+3. Every vector $v ∈ V$ can be uniquely written as $v = v_1 + v_2 + ... + v_k$ where $v_i ∈ W_i$.
+4. If $γ_i$ is an ordered basis for $W_i$ (for $1 ≤ i ≤ k$), then $γ_1 ∪ γ_2 ∪ ... ∪ γ_k$ is an ordered basis for $V$.
+5. There exists an ordered basis $γ_i$ for $W_i$ for every $1 ≤ i ≤ k$ such that $γ_1 ∪ γ_2 ∪ ... ∪ γ_k$ is an ordered basis for $V$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
+
+Theorem: $T: V → V$ is diagonalizable if and only if $V$ is the direct sum of the eigenspaces of $T$.
+<details markdown="block">
+<summary>Proof</summary>
+...
+</details>
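Tying the diagonalizability criterion to the direct-sum picture numerically (one last `numpy` sketch, same caveat as the earlier ones): for a diagonalizable matrix, the eigenvectors returned by `eig` assemble bases of the eigenspaces into a basis of $V$, and in that basis the operator is diagonal.

```python
import numpy as np

# A diagonalizable (here symmetric) operator on V = R^3 with eigenvalues
# 1 and 3, where λ = 3 has multiplicity 2, so V = E_1 ⊕ E_3.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

vals, P = np.linalg.eig(A)  # columns of P are eigenvectors of A

# The union of the eigenspace bases is a basis for V: P is invertible...
assert abs(np.linalg.det(P)) > 1e-9

# ...and in that basis the operator is the diagonal matrix of eigenvalues,
# i.e. D = P⁻¹AP, matching the diagonalizability criterion above.
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag(vals))

# Every v ∈ V decomposes uniquely over the eigenspaces:
v = np.array([1.0, 2.0, 3.0])
coords = np.linalg.solve(P, v)     # coordinates of v in the eigenbasis
assert np.allclose(P @ coords, v)  # v = Σᵢ coordsᵢ · (eigenvector i)
```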