--- layout: algebra title: mathematics/algebra/coding theory --- # MATH 342: Coding Theory Sections 1-9, 11-12, 14 ## Things I forget - $M$ = number of codewords. the sphere packing bound - $n$ = length of codewords - $d$ = distance between codewords. or minimal distance of a code - $k$ = the dimension of the code, number of row vectors in a generator matrix - $q$ = the alphabet - $V(n, q) = F_q^n$ - (n, ?, ?)-code - [?, ?, ?]-code - r = ??? - $C$: a code. - $G$: the generator matrix. - $H$: the parity-check matrix. ## What is coding theory? Coding theory deals with **codes**: ways to represent data robustly & securely. MATH 342 will mostly cover the theory behind *detecting & correcting* errors. Definition: A **code** C is a set of codewords, for example: ``` C = { w1 = 00000 w2 = 01011 w3 = 10101 w4 = 11110 } ``` A goal of codes is to *detect and correct errors*. How can we do this? Coming back to it later. Definition: A **q-ary** (n, M, D) code C is a set of codewords such that: - $q$ = size of alphabet - $n$ = length of codewords (when consistent) - $M$ = number of codewords - $d$ = *minimal distance* of the code A *2-ary code* is a **binary code**, a *10-ary code* is a **decimal code**. etc. Let $(F_q)^n$ denote the set of all ordered $n$-tuples $a = a_1 a_2 ... a_n$, where each $a_i \in F_q$. The elements of $(F_q)^n$ are called *vectors* or *words*. The order of the set $(F_q)^n$ is $q^n$. A $q$-ary code of length $n$ is a subset of $(F_q)^n$. ### Distance functions & Hamming distance Consider the distance: - $d(x, y) = \text{# of places where they differ}$ - $d(C) = min(d(x, y))$ such that $x ≠ y$ Definition: The **Hamming distance** of two codes a, b is the number of *places* where a and b differ. The **distance function** $dist(x, y)$ on a set $S$ satisfies: - $dist(x, y) == dist(y, x)$ - $dist(x, y) ≥ 0$ and $dist(x, y) == 0 ⟺ x == y$ - $dist(x, z) ≤ dist(x, y) + dist(y, z)$ The last one is known as the *triangle inequality*. Aside: A set $S$ plus a distance function $dist$ is known as a **metric space**. ### Error detection & correction Definition: The **minimum distance** between distinct codewords is denoted $d(C)$, and given to be $d(C) = min \{dist(x, y) | x, y ∈ C, x ≠ y\}$, for some code $C$. Given a code $C$, $d(C) = d$ (where $d(C)$ is the minimum distance in the code) 1. The code can *detect* up to $d-1$ errors. 2. The code can *correct* up to $d-1 / 2$ errors. ### Nearest neighbour decoding Just how should we go about *decoding* these codes? Particularly when there is error correction involved. This is obvious, but... the **nearest neighbour decoding** is the decoding such that the distance between the received word and a codeword is as *small* as possible. This *maximizes* the likelihood of correcting errors given the following assumptions: - Every symbol is equally likely to have been received in error - If a symbol is received in error, then each of the $q-1$ possible errors is equally likely. This is also known as a $q$-ary **symmetric channel**. ### What makes a good code? **The main coding theory problem.** A good (n, M, D)-code has: - a small $n$: for compactness & subsequently speed - a large $M$: to fit many codewords & subsequently a variety of messages - a large $d$: to correct many errors What is known as **the main coding theory problem** is a solution of how to balance these conflicting aims, by the way of *optimizing* one of the parameters for *given values* of the other two. The **usual version of the main coding problem** is to find the largest possible $M$, i.e. the code with the most codewords, given a fixed $n, d, q$. We denote $A_q(n, d)$ to be the largest M s.t. there exists a $q$-ary $(n, M, d)-code$. - This is known for $d=1$: $A_q(n, 1) = q^n$ - This is known for $d=n$: $A_q(n, n) = q$ ### Equivalence of codes **Definition**: Two codes are *equivalent* if we can go from one to the other using elementary operations. Elementary operations are: 1. Permute rows. 2. Permute columns. 3. Permute all the symbols in a column. Any $q$-ary $(n,M,d)$-code over an alphabet $\{0,1,...,q-1\}$ is also equivalent to an $(n,M,d)$-code which contains the zero vector. - Proof: This is trivial by ordering some code at the top and permuting the non-zero symbols to zero. ### Weight of a vector The weight of a vector in $(F_2)^n$ is simply the count of $1$ bits. - If $x, y \in (F_2)^n$, then $d(x,y) = w(x+y)$. - If $x, y \in (F_2)^n$, then $d(x,y) = w(x) + w(y) - 2w(x ∩ y)$. Some theorems about oddness and evenness: - If $d$ is odd, then a binary $(n,M,d$-code exists if and only if a binary $(n+1, M, d+1)$-code exists. - If $d$ is odd, then $A_2(n+1, d+1) = A_2(n,d)$. Equivalently, if $d$ is even, then $A_2(n,d) = A_2(n-1,d-1)$. ### Binomial coefficients Note the definition of the binomial: $$\binom{n}{m} = \frac{n!}{m!(n-m)!}$$ An ordered selection of $m$ distinct objects from a set of $n$ distinct objects can be made in $\frac{n!}{(n-m)!}$ ways. An unordered selection of $m$ distinct objects from a set of $n$ distinct objects can be made in $\binom{n}{m} = \frac{n!}{m!(n-m)!}$ ways. todo: binomial coeffients sphere packing bounds perfect codes balanced blocks baby finite fields ### Sphere Packing Bound Consider $(n, M, d)$ for some q-ary code C. - n: length of words - M: number of words - d: minimum distance - q: alphabet set (??) (n, M, d)-codes [n, k, d]-codes are the same as (n, q^k, d)-codes - linear, so M = q^k and linear [n, k]-codes leave d implied (also linear) Ham(r, q) Sphere packing is an application of the main coding theory problem. Fix q, n, d. Then $A_q(n, d) = max(M)$. Assume d is odd, and d = 2t + 1. $F_q^n$ = all words of length n Consider the idea of fitting codewords in a box: we can represent those as spheres, and so the distance does fancy stuff $S(w_i, t)$ = all words of distance ≤ t from $w_i$ - where $w_i$ = the center - t = radius $A_q(n, d)$ = the maximum number of disjoint spheres of radius t that fit into $F_q^n$ $N_t$ = the number of points in a sphere of radius t $F_q^n$ has $q^n$ points **Sphere packing bound** M = number of spheres ≤ (# points in $F_q^n$ / # points in a sphere) = ($q^n/N_t$) The **sphere-packing bound** or **Hamming bound** states that a q-ary $(n, M, 2t+1)$-code satisfies $M ≤ q^n$. A code that achieves the sphere-packing bound is called a **perfect code**. Intuitively, for a perfect code, the $M$ speheres of radius $t$ centred on the words "fill" the whole space $(F_q)^n$ without overlapping. Example: Binary reptition codes: $n=3$, $d=3$, $q=2$: i.e. $C = \{000, 111\}$. - $M ≤ 2^3 / N_t = 8/4 = 2$ All binary repitition codes are perfect, i.e. when n = d Finding $N_t$ for a general n, q, t: choose which $n$ symbol to choose a new symbol: q-1 choices ??? n(q-1) Example: finding the bound for $q=3$, $n=7$, $d=5$. How big of a code can we find for these parameters? $d=5$, so $t=2$: $d=2t+1$ $N_t$ = 1 + (n 1)(q-1) + (n 2)(q-1)^2 = 99$ $M ≤ 3^7/99 = 81*27/99 = 22.9$ $A_3(n=7, d=5) ≤ 22$ ## A Taste of Abstract Algebra Recall: sphere packing. in $ℝ^2$, it's easy: points form a lattice. If you fix the zero point my center is closed under addition A **lattice** is a set of points closed under addition. $ℤ^n ⊆ ℝ^n$ We want to do the same for any code C that is closed under addition. How do we add two vectors? $C ⊆ F_q^n =~ ℝ^n$ ### Rings and Fields A **ring** $R$ is a set of elements with *two* operations + (addition) and • (multiplication) satisfying: - $F$ is *closed* under $+$ and $•$ - + is commutative, associative, and distributive; • is associative and distributive - commutative on addition: $a+b = b+a$ - associative: $(a+b)+c = a+(b+c)$, $a•(b•c) = (a•b)•c$ - distributive: $a•(b+c) = a•b + a•c$ - The additive identity element $0$ must exist - $∀a ∈ R : a + 0 = a$ - The multiplicative identity element $1$ must exist - The additive inverse element $(-a)$ must exist - $∀a ∈ F : a + (-a) = 0$ We will take "ring" to refer to **abelian/commutative rings**. This adds commutativity on multiplication (), and the multiplicative identity element $1$ ($∀a ∈ R : a • 1 = a$). - unital - why with an identity????? do rings not normally have both additive and multiplicative identities????????? in every ring, 0 times something is 0 0 + a = b 0b + ab = ab R = \{0\} is a ring, not a field 2ℤ has no 1 A **field** $F$ is a set of elements with *two* operations + (addition) and • (multiplication) satisfying: - $F$ is *closed* under + and • - + and • are commutative, associative, and distributive - commutative: $a+b = b+a$, $a•b = b•a$ - associative: $(a+b)+c = a+(b+c)$, $a•(b•c) = (a•b)•c$ - distributive: $a•(b+c) = a•b + a•c$ - The additive identity element $0$ must exist - $∀a ∈ F : a + 0 = a$ - The multiplicative identity element $1$ must exist - $∀a ∈ F : a • 1 = a$ - The additive inverse element $(-a)$ must exist - $∀a ∈ F : a + (-a) = 0$ - The multiplicative inverse element $a^{-1}$ must exist, when $a ≠ 0$ - $∀a ∈ F : a • a^{-1} = 1$ - is this true for 0???? Fields behave as standard operations on the rationals and reals. The only difference between a field and an (abelian) ring is the existence of a *multiplicative inverse*. A ring can also be regarded as having *three* operations, with - given by its status as an additive inverse, i.e. $a-b = a+(-b)$. A field can also be regarded as having *four* operations, with - and ÷ given by their definitions as inverses, i.e. $a-b = a+(-b)$ and $a÷b = a(b^{-1})$ for $b ≠ 0$. The operation $a•b$ can be more succienctly written as $ab$. The operation $a÷b$ can be alternatively writtten as $a/b$. ### Euclidean Arithmetic For a given number $q + r/m$, $r$ is the **principle remainder**. $m$ **divides** $n$ $m|n$ if $n/m$ is an integer. A number is **prime** if its divisors are $1, n$ only. Theorem: $gcd(m, n) = gcd(m, n-q•m)$ ### The modular integers Consider $ℤ_m = (\{0,1\}, +, •)$ For example: $ℤ_2 = \{0,1\}$. - $0+1=1$ - $1+1=0$ - $0•1=0$ - $1•1=1$ Another example: $ℤ_4 = \{0,1,2,3\}$ `+` | 0 1 2 3 ----------- 0 | 0 1 2 3 1 | 1 2 3 0 2 | 2 3 0 1 3 | 3 0 1 2 `•` | 0 1 2 3 ----------- 0 | 0 0 0 0 1 | 0 1 2 3 2 | 0 2 0 2 3 | 0 3 2 1 Note the difference between the integers and the reals. In ℤ, $a/b$ may not exist - ℤ is a *ring*. In ℝ, $a/b$ always exists if $b != 0$ - ℝ is a *field*. ### Division in $ℤ_m$ When does $1/n$ exist in $ℤ_m$? - $1/n = a ⟺ a•n ≡ 1 (mod m)$ - $⟺ a•n + b•m = 1$ for some b - $⟺ gcd(n,m) = 1$ Some examples of modular math: - $a•5 + b•7 = 1$, $3•5 + 2•7 = 1$ Ex. $ℤ_4$, $1/2$ does not exist - gcd(2, 4) != 1 Theorem: $ℤ_m$ is a field if and only if $m$ is prime. Proof: - Assume $M$ is prime. - We must show every $n$ in $\{1, 2, 3, ...\}$ is invertible if and only if $gcd(m, n) = 1$. - If m$m is not prime, then $m = m_1*m_2$, $m_1, m_2 > 1$. Then $0 != m$ has no inverse gcd($m_1, m_1) = m_1 > 1$. - ???? The **invertible elements** of $ℤ_m$ (units of ℤ_m) is expressed as $ℤ_m^x$. It is the **group of units**. - ex. $ℤ_7^x = \{0, 1, 2, 3, 4, 5, 6\} = ℤ_7$ - ex. $ℤ_8^x = \{1, 3, 5, 7\} ⊂ ℤ_7$ A **finite or Galois field** $𝔽_p, GF(p)$ is a field that, well, is finite. Every $ℤ_p$, $p$ is prime, is a finite field. Every $ℤ_{p^n}$ is also a field! $𝔽_4$ is a four element field. It is *not* necessarily $ℤ_4$! - clarify: what? ℤ_7 counts as 𝔽_7? exact numbers do not matter? does 𝔽_4 exist, then? Consider $ℤ_6$. $2•3 = 0$. A field has *no* zero divisors. Why? - $a*b=0$, a and b are not zero - $1/a (a*b = 0)$ and $a != 0$. b = $0/a = 0$ The **cancellation property** of a field states that if $a*b = a*c$ and $a != 0$, then $b=c$. ### Back to coding theory We can now bridge our knowledge of fields back to coding theory: consider *alphabets* to be *finite fields*, $𝔽_q^n$ to be a *vector space*, with *codewords* as vectors ($w = (a_1, a_2, ..., a_n))$ An example is an ISBN code. - textbook: $0-19-853803-0$ - language, publisher, book number, checksum - codewords: all vectors $(x_1, ..., x_{10})$ that satisfy the linear equation - $1x_1 + 2x_2 + ... + 10x_{10} = 0$ - because the checksum changes to satisfy that equation checksum: set such that it satisfies the equation why is the distance between isbns two? cause you have it 1 normally and 2 with the checksum The ISBN code can: - Detect a single error - Detect if we switch two numbers - Can correct a single error *if we know where it is* Given an equation $1x_1 + 2x_2 + ... + 10x_{10} = 0$, $x_7 = -\frac{1}{7}(1x_1 + 2x_2 + ... + 10x{10} - 7x_7)$ These linear equations are linear, so linear algebra comes in handy... ... missed a bunch ... ### Vector Spaces on Finite Fields Our finite fields can be viewed as **vector spaces**, with the *elements* as **scalars** and the ordered $n$-tuples over $GF(q)$ as **vectors**. We denote the set of all vectors of length $n$ as $V(n,q)$. We must define two additional operations within $V(n,q)$: - *addition* of vectors: if $x = (x_1, x_2, ..., x_n)$ and $y = (y_1, y_2, ..., y_n)$ and $x,y \in V(n, q)$, then $x+y = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n$. - *multiplication* of a vector by a scalar: if $x = (x_1, x_2, ..., x_n) \in V(n, q)$, $a \in GF(q)$, then $ax = (ax_1, ax_2, ..., ax_n)$. The formal axioms for a vector space are as follows. Given $\textbf{u} = (u_1, u_2, ..., u_n)$, $\textbf{v} = (v_1, v_2, ..., v_n)$, $u,v \in V(n, q)$: - $\textbf{u}+\textbf{v} \in V(n, q)$ - $(\textbf{u}+\textbf{v}) + \textbf{w} = \textbf{u} + (\textbf{v}+\textbf{w})$ - $\textbf{0} = (0, 0, ..., 0) \in V(n, q)$ satisfying $\textbf{u} + \textbf{0} = \textbf{0} + \textbf{u} = \textbf{u}$. - $-\textbf{u} = (-u_1, -u_2, ..., -u_n) \in V(n, q)$ - $\textbf{u} + \textbf{v} = \textbf{v} + \textbf{u}$ - (note that the above show $V(n, q)$ forms a commutative group under addition) - $a\textbf{v} \in V(n, q)$ - $a(\textbf{u} + \textbf{v}) = a\textbf{u} + a\textbf{v}$, $(a+b)\textbf{u} = a\textbf{u} + b\textbf{u}$ - $(ab)\textbf{u} = a(b\textbf{u})$ - $1\textbf{u} = \textbf{u}$ A (non-empty) *subset* $C$ of $V(n,q)$ is called a **subspace** of $V(n,q)$ if and only if $C$ is closed under addition and scalar multiplication. i.e: - if $x,y \in C$, then $x+y \in C$ - if $a \in GF(q)$ and $x \in C$, then $ax \in C$. The set $\{0\}$ and the set $V(n, q)$ are **trivial subspaces** and present for every vector space. A subspace is *non-trivial* if it contains at least one vector other than 0. - q: is the subspace that is just the space trivial? A **linear combination** of $r$ vectors $\vec{v_1}, \vec{v_2}, ..., \vec{v_r}$ in $V(n, q)$ is a vector of the form $a_1\vec{v_1} + a_2\vec{v_2} + ... + a_r\vec{v_r}$ where $a_i$ are scalars. A set of vectors is **linearly dependent** if they can form the zero vector as a linear combination, without all scalars being 0. A set of vectors is **linearly independent** if they *cannot* form the zero vector as a linear combination without all scalars 0. Let $C$ be a subspace of $V(n,q)$. Then a *subset* of $C$ is called a **generating set** (or spanning set) of $C$ if every vector in $C$ can be expressed as a linear combination of the vectors in the subset. A generating set which is linearly independent is also called a **basis**. For a (non-trivial) subspace $C$ of $V(n,q)$, any generating set of $C$ contains a basis of $C$. For a basis of a subspace $C$ of $V(n,q)$, every vector of $C$ can be expressed *uniquely* as a linear combination of the basis vectors, and $C$ contains exactly $q^k$ vectors. It follows that any two bases of $C$ contain the same number $k$ vectors: this is called the **dimension** of $C$ and denoted $dim(C)$. chapter 4 done ## Linear Codes A **linear code** over $GF(q)$ is a subspace of $V(n,q)$ for some positive integer $n$. Thus, a *subset* $C$ of $V(n,q)$ is a linear code if and only if it is: - closed under addition: $u + v \in C$ for all $u,v \in C$ - closed under scalar multiplication: $au \in C$, for all $u \in C$, $a \in GF(q)$ - q: why subset? If $C$ is a k-dimensional subspace of $V(n,q)$, then the linear code $C$ is also called an **[n,k]-code** or a **[n,k,d]-code**. - n: length of codewords - k: dimension of $C$ - d: minimum distance of $C$ Note: a q-ary [n,k,d]-code is also a q-ary (n,q^k,d)-code but not every (n,q^k,d)-code is an [n,k,d]-code? do not understand - also known as group codes The **weight** $w(\vec{x})$ of a vector $\vec{x} \in V(n,q)$ is the number of non-zero entries of $\vec{x}$. For an $x,y \in V(n,q)$, $d(x,y) = w(x-y)$. From this, the **minimum distance** of a linear code is equal to the smallest non-zero weight of the codewords. Linear codes are nice because we can determine the weight by simply examining all $M$ codewords. They're also nice because we can describe them by simply giving a basis. A $k×n$ matrix whose rows form a basis of a linear $[n,k]-code$ is called a **generator matrix** of the code. - ex. the generator matrix $[0 1 1][1 0 1]$ describes a certain $[3,2,2]-code$ Linear $q-ary$ codes *cannot* be defined unless $q$ is a prime power. However, it's pretty easy to transform non-prime-power $q-ary$ codes into prime-power $q-ary$ codes. - q: what's q again? (size of alphabet) what does a prime power entail again? ### Equivalence of Linear Codes Two linear codes over $GF(q)$ are called **equivalent** if one can be obtained from the other by a combination of operations being: - permutation of positions of the code - multiplication of the symbols in a fixed position by a (non-zero) scalar - q: ??? what? Two $k×n$ matrices generate equivalent linear $[n,k]-codes$ over $GF(q)$ if one matrix can be obtained from the other by a sequence of operations: - (R1) Permutation of rows - (R2) Multiplication of a row by a (non-zero) scalar - (R3) Addition of a scalar multiple of one row to another - (C1) Permutation of columns - (C2) Multiplication of any column by a (non-zero) scalar Let $G$ be a generator matrix of an $[n,k]-code$. Then by performing the above operations, $G$ can be transformed to the **standard form** $[I_k | A]$, where $I_k$ is the $k×k$ identity matrix, and $A$ is a $k×(n-k)$ matrix. If two generator matrices can be transformed to the same standard form by *row operations* only, they generate the **same** code. If they must use the column operations to reach the same standard form, they are **equivalent**, but not necessarily the *same* code. - q: not *necessarily* the same? if there is no way without column operations must they be different? chapter 5 done ### Encoding with linear codes A **coset** of an $[n,k]$-code $C$ over $GF(q)$... Suppose that $C$ is an $[n,k]$-code over $GF(q)$ and that $\vec{a}$ is any vector in $V(n, q)$. Then the set $\vec{a} + C$ defined by $\vec{a} + C = \{\vec{a}+\vec{x} | \vec{x} \in C\}$ is called a **coset** of $C$. For an $[n,k]$-code $C$ over $GF(q)$: - *every vector* of $V(n,q)$ is in some coset of $C$ - every coset contains *exactly* $q^k$ vectors - *every coset* is either disjoint or identical (no partial overlap) - Proof: todo... todo: proof The **coset leader** is the vector having minimum weight in a coset. (there may be multiple. it is generally fine to pick one randomly.) A (Slepian) **standard array** for an $[n,k]$-code $C$ is a $q^{n-k} × q^k$ matrix of *all* of the vectors in $V(n,q)$, with the first row consisting of the code $C$, and the following rows consisting of the cosets of $C$, each arranged with the coset leader first ($\vec{0}$ for $C$) and the codewords in consistent order, so that they line up by columns. $$\begin{bmatrix} 0000 & 1011 & 0101 & 1110\\ 1000 & 0011 & 1101 & 0110\\ 0100 & 1111 & 0001 & 1010\\ 0010 & 1001 & 0111 & 1100\\ \end{bmatrix}$$ Simple decoding of linear codes may be performed with the observation that *cosets correspond to distance* with regard to error correction (rephrase). By choosing coset leaders to be minimum weight vectors, the standard array decoding is a nearest neighbour decoding scheme. An entry may be looked up in the standard array, and the corresponding codeword may be found by looking at the *top entry* of its column. ### Probability of error correction ### Probability of error detection ### cosets ### Parity-check matrices ... missed a bunch ... ## Chapter 7: The dual code etc ### Dual Codes The **inner product** $u ⋅ v$ of vectors $u$ and $v$ is the *dot product*. (todo: learn relation) If $u ⋅ v = 0$, $u$ and $v$ are **orthogonal**. - $u ⋅ v = v ⋅ u$ - $(λu + μv) ⋅ w = λ(u ⋅ w) + μ(v ⋅ w)$ For some linear $[n,k]$-code $C$, the **dual code** of $C$ (denoted $C^⊥$) is the set of vectors in $V(n,q)$ which are *orthogonal to every codeword of $C$*. We may generalize this to say: a vector $v ∈ V(n,q)$ belongs to $C^⊥$ if and only if $v$ is orthogonal to every row of $G$, the generator matrix for $C$. - $C^⊥ = \{v ∈ V(n,q) | v ⋅ u = 0\}$, for all $u ∈ C$ $C^⊥$ has dimension $n-k$: in other words, if $C$ is an $[n,k]$-code over $GF(q)$, the dual code $C^⊥$ is a linear $[n,n-k]$-code. We know that $(C^⊥)^⊥ = C$. A **parity-check matrix** $H$ for an $[n,k]$-code $C$ is a generator matrix of $C^⊥$. Note that this means any linear code is completely specified by a parity-check matrix. If $G = [I_k | A]$ is the standard form generator matrix of an $[n,k]$-code $C$, then a parity-check matrix in **standard form** is $H = [-A^T | I_{n-k}]$. ## Syndrome Decoding For a parity-check matrix $H$, a **syndrome** of a given row vector is... ## Hamming Codes The Hamming codes are an important family of single-error-correcting codes which are easy to encode and decode. They are linear and definable over any finite field $GF(q)$. Let $r$ be a positive integer and let $H$ be an $r×(2^r - 1)$ matrix whose columns are the distinct non-zero vectors of $V(r, 2)$. Then the code having $H$ as its parity-check matrix is called a **binary Hamming code** and is denoted by $Ham(r,2)$. To find a Hamming code, you start with the parity-check matrix $H$, then find the generator matrix $G$, then find the code $C$. $$H = [-A^⊤ | I_{n-k}]$$ $$G = [I_k | A]$$ For a binary Hamming code of $r ≥ 2$: - it is a $[2^r - 1, 2^r - 1 - r]$-code - it has minimum distance $d=3$ - it is a perfect code. ### How to find $dist(C)$ from H C = Nul(H), codewords = $[a_1 a_2 ... a_n]$ $H·\begin{bmatrix}a_1\\ a_2\\ ...\\ a_n\end{bmatrix}$ = $\vec{0} \iff [a_1 a_2 ... a_n]·H^t = \vec{0}$ where $H_i$ are the columns of $H$, $H = \begin{bmatrix}1&1&...&1\\H_1&H_2&...&H_n\\1&1&...&1\end{bmatrix}$ $H·\begin{bmatrix}a_1\\ a_2\\ ...\\ a_n\end{bmatrix} = a_1 H_1 + a_2 H_2 + ... a_n H_n = \vec{0}$ A codeword = a linear relation among $H_i$ The smallest weight of a codeword = the shortest linear relation among $H_i$ ex. Consider a codeword of weight = 1, i.e. $\vec{w} = [0 ... 0 a_i 0 ... 0]$ Relation: $a_i H_i = \vec{0} \iff H_i = \vec{0}$ There is a codeword of weight 1 iff some $H_i = \vec{0}$ ex. Consider a codeword of weight 2, i.e. $\vec{w} = [0 a_i a_j 0]$ Relation: $a_i H_i + a_j H_j _ \vec{0} \iff H_i = -\frac{a_j}{a_i} H_j$ ### Theorems about distance For an [n, k] q-ary code, any set of $d-1$ columns in the matrix H (parity check matrix) have to be linearly independent. A Hamming code has to have distance 3. Suppose $C$ is a linear $[n, k]$-code over $GF(q)$ with parity-check matrix $H$. THen the minimum distance of $C$ is $d$ if and only if $d-1$ columns of $H$ are linearly independent but some $d$ columns are linearly dependent. ... missed a bunch... ## Polynomials We will denote the field $GF(q)$ by $F_q$, or frequently $F$ (with $q$ implicit). Then, $F[x]$ is the set of polynomials in $x$ with **coefficients** in $F$. Polynomials in $F[x]$ form a **ring**. They: - Contain the zero vector $\vec{0}$ - Are closed under addition: $(5x^2 + x^1) + (x^3 + 2x^2) = x^3 + 7x^2 + x^1$ - Are closed under subtraction: $(5x^2 + x^1) - (x^3 + 2x^2) = -x^3 + 3x^2 + x^1$ - Are closed under multiplication: $(5x^2 + x^1) · (x^3 + 2x^2)$ = 5x^5 + 11x^4 + 2x^3$ - *Do not* have multiplicative inverses! So they are not a field. Just as $Z_m$ is a ring, so also is $F[x] / f(x)$: it is called For some polynomial $f(x) = f_m x^m + f_{m-1} x^{m-1} + ... f_1 x + f_0$, $m$ is called the **degree** of $f(x)$, and $f_m$ is called the **leading coefficient** of $f(x)$. By convention, the degree of the zero polynomial is $-∞$. (why??) A polynomial is called **monic** if its leading coefficient is 1. Note that if $f(x), g(x) \in F[x]$, $deg(f(x)g(x)) = deg(f(x)) + deg(g(x))$. The division algorithm exists for polynomials just as it exists for the integers. For any pair of polynomails $a(x), b(x) \in F[x]$, there exists a unique pair of polynomials $a(x)$ and $b(x)$ ### The division algorithm for polynomials Some polynomial $a$ in $Z_p$ is a **root* of $f(x)$ ($\in Z_p [x]$) if $f(a) = 0$. A polynomial $a$ is a root of $f(x)$ if and only if $x-a$ divides $f(x)$. A polynomial $f(x) \in F[x]$ is said to be **reducible** if $f(x) = a(x)b(x)$, where $a(x), b(x) \in F[x]$ and $deg(a(x)) < deg(f(x))$ and $deg(b(x)) < deg(f(x))$, i.e. it is a product of smaller degree polynomials. If $f(x)$ is not reducible, it is called **irreducible**. Just as any positive integer can be factorized uniquely into a product of prime numbers, any *monic* polynomial in $F[x]$ can be factorized uniquely into a product of irreducible monic polynomials. Some useful lemmas: - A polynomial $f(x)$ has a linear factor $x-a$ if and only if $f(a) = 0$. - A polynomial $f(x) \in F[x]$ of degree 2 or 3 is irreducible if and only if $f(a) ≠ 0$ for all $a \in F$. - Over any field, $x^n - 1 = (x-1)(x^n-1 + x^n-2 + ... + x + 1)$. Proof for a polynomial of degree 2 or 3 being irreducible: ... The property of a polynomial being irreducible corresponds directly to the property in $ℤ$ of a number being prime. - The ring $Z_m$ is a field if and only if $m$ is prime. - The ring $F[x]/f(x)$ is a field if and only if $f(x)$ is irreducible in $F[x]$. - For any prime number $p$ and any positive integer $h$, there exists an irreducible polynomial over $GF(p)$ of degree $h$. This corresponds to the existence of fields $GF(p^h)$ for all integers $h ≥ 1$. --- Factoring polynomials $f(x) \in Z_p [x]$ $f(x) = C·p_1(x) p_2(x) ... p_m(x)$, $p_1 (x)$ implies it is monic, and irreducible $n \in ℤ$ $n = \pm p_1 p_2 ... p_m$ 4 | irreducible polynomials: x, x+1 0b10 = 2 0b11 = 3 x^2 + x + 1 0b111 = 7 x^3 + x + 1 0b1011 = 11 x^3 + x^2 + 1 0b1101 = 13 but no 5???? huh x^2 + 1 ### Factorization of polynomials Factorization: a polynomial is fully reduced if it is fully a product of irreducible polynomials. How to factor: just, kinda do it... try dividing by every irreducible polynomial All (monic) irreducible polynomials in $Z_2 [x]$: deg. | all. | reducible. | irreducible -----|------|------------|------------ 1 | $x$, $x+1$ | - | $x$, $x+1$ 2 | $x^2$, $x^2 + 1$, $x^2 + x$, $x^2 + x + 1$ | $x^2 = x·x$, $x^2 + x = x(x+1)$, $x^2 + 1 = x^2 + 2x + 1 = (x+1) (x+1)$ | $x^2 + x + 1$ 3 | 8 poly | 6 poly | $x^3 + x + 1$, $x^3 + x^2 + 1$ Note: while these irreducible polynomials can be written as binary numbers, ex. $x+1 ~ 11_2 = 3$, and all of the above examples are prime, there is *not* a correspondence with primes. See $x^2 + 1$: $101_2 = 5$, but $x^2 + 1 ≡ (x+1)(x+1)$ (in $Z_2 [x]$). ex. Factor $$f(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x$$ 1. Check whether an irreducible polynomial divides $f(x)$. - Does $f$ have roots? Yes, $f(0) = 0$. So $x-0$ divides $f$. - $f(x) = x g(x) = x(x^5 + x^4 + x^3 + x^2 + x + 1)$ - $g(x) = (x^5 + x^4 + x^3 + x^2 + x + 1)$ 2. Keep checking whether a linear? polynomial divides $f(x)$. - Does $f$ have roots? Yes, $f(1) = 0$. So $x-1$ divides $f$. - Note that $x-1 ≡ x+1$ mod $2$. - $f(x) = x (x+1) h(x) = x(x+1)(x^4 + x^2 + 1)$ - $h(x) = (x^4 + x^2 + 1)$ 3. Now no linear? polynomials divide $f(x)$. What to do? - $h(x)$ may still be factored... - By an identity in $Z_2$, $(a + b + c)^2 ≡ (a^2 + b^2 + c^2)$ (in $Z_2$) - Verify by long division on $(x^4 + x^2 + 1)$: $(x^2 + x + 1)^2$ - $f(x) = x(x+1)(x^2+x+1)^2$ ### Modular polynomials The idea behind modular polynomials: just modular arithmetic! $Z_p [x] / f(x)$ denotes the set of polynomials in $Z_p [x]$ that are congruent mod $f(x)$, i.e. $f(x) ≡ 0$. Definition: $g(x), f(x) \in Z_p [x]$ are congruent (mod $f(x)$), i.e. $$g ≡ h (\mod f)$$ if $g(x) = h(x) + t(x) f(x)$ for some $t(x)$. Every $g(x)$ is congruent to a unique polynomial $r(x)$ when $\deg r(x) < \deg f(x)$. - Proof: just divide $g(x)$ by $f(x)$! - $\frac{g}{f} = q(x) + \frac{r(x)}{f(x)}$ - $g = f·q + r$ - $f·q ≡ 0$ Note: for $ℤ/(m)$, every $n$ is congruent to an element in $\{0, 1, ..., m-1\}$. i.e. $Z_m = ℤ/(m)$. ??? what is this? Definition: Let $f(x) \in Z_p [x]$, $\deg f = n > 0$. Then $Z_p [x] / f(x)$ = the set of all polynomials $r(x)$ with deg. r < n. Operations `+`, `*`: - Do the usual operations with polynomials - If the result has degree ≥ n, then take its residue mod $f(x)$. - This is a *ring*! We have addition, subtraction (additive inverse), multiplication... no division - *also* a vector space! isomorphic to $Z_p^n$ - In a vector space: need addition and *scalar multiplication* ex. $Z_2 [x] / (x^3 + x)$ (we'll consider $f(x) = x^3 + x$) Addition: polynomial addition - $(1+x^2) + (x+x^2) = 1+x$ (as $2??? ≡ 0$ in $Z_2$) Multiplication: - $(1+x^2)(x+x^2) = x+x^2+x^3+x^4 ≡ r(x)$ (mod $f$) - $x^4 + x^3 + x^2 + x ÷ x^3 + x = x+1$ - by long division - So $(1+x^2)(x+x^2) ≡ x+1$ ### Quotient polynomial rings We've previously? defined quotient rings: $Z_m = ℤ/(m)$, where: - $m ≡ 0$ - $n ≡ r$ (mod $m$) - $r =$ the principle residue - $n/m = q + r/m$, for $0 <= r <= m$ - $Z_m$ then equals the set of *prime residues*, i.e. $\{0, 1, 2, ..., m-1\}$ for $m$ We can also have quotient polynomial rings: $Z_p [x] / f(x)$, where - $f(x) ≡ 0$ - $g(x) ≡ r(x)$ (mod $f$) - $r(x) =$ the prime residue - $g(x)/f(x) = q(x) + r(x)/f(x)$ for some $deg(r) < deg(f)$ - $Z_p [x] / f(x)$ then equals the set of *prime residues*, i.e. $r(x)$ st. $deg(r) < deg(f)$ (???) Operations on quotient rings are the usual: `+`, `·`. But when can we divide? - We can divide when $1/n$ exists, which is equivalent to $\gcd(n, m) = 1$ - To have a field: $m = p$ prime Ex. $Z_2 [x] / (x^3 + x + 1)$ - Note: this is a field! $(x^3 + x + 1)$ is irreducible - Goal: find $1/(x+1)$ - $gcd(x^3 + x + 1, x+1)$ - $gcd(x^3 + x + 1 - (x+1)(x^2 + x), x+1) = gcd(1, x+1)$ - $gcd(1, x+1 - 1·(x+1)) = gcd(1, 0)$ - $gcd(f, g)$ - $gcd(f - g(x^2 + x), g)$ - $1 = f-g(x^2 + x)$ - $1 = -(x^2 + x) g$ (mod $f$) - $1/g = x^2 + x$ Ex. $Z_3 [x] / (x^2 + x + 2)$ - Goal: find $1/x$ - $gcd(x^2 + x + 2, x)$ - $gcd(x^2 + x + 2 - (x+1)·x, x)$ - $gcd(2, x)$ - $gcd(f, g)$ - $gcd(f - (x+1)·g, g)$ - $2 = f - (x+1)·g$ - $1 = 2(f - (x+1)·g)$ - $1/g = -2(x+1) = x+1$ So what about fields? $Z_p [x] / f(x)$, where: - The characteristic $p$ is prime - $f(x)$ (of degree n) is irreducible - $\{a_0 + a_1 x + a_2 x^2 + ... + a_n-1 x^n-1 | a_i \in Z_p\}$ $\simeq$ $Z_p^n$ - The field has $q = p^n$ elements - Fact: every finite field is of this type, $𝔽_q$ + Two fields with the same number of elements turn out to be the same field! They're *isomorphic*. Ex. $Z_2 [x] / (x^3 + x + 1) = 𝔽_8$ -> $Z_2 [x] / (x^3 + x^2 + 1) = 𝔽_8$ - x -> x+1 Ex. $F_4 = Z_2 [x] / (x^2 + x + 1) \neq Z_4$ - $\{0, 1, x, x+1\} \neq \{0, 1, 2, 3\}$ 𝔽_4 | 0 | 1 | x | x+1 ----|---|---|---|---- 0 | 0 | 0 | 0 | 0 1 | 0 | **1** | x | x+1 x | 0 | x | x+1 | **1** x+1 | 0 | x+1 | **1** | x Z_4 | 0 | 1 | 2 | 3 ----|---|---|---|--- 0 | 0 | 0 | 0 | 0 1 | 0 | **1** | 2 | 3 2 | 0 | 2 | 0 | 2 3 | 0 | 3 | 2 | **1** Every $𝔽_q$ has a primitive element $a(x)$: $$a(x), a^2(x), ..., a^{q-1}(x)$$ are all distinct nonzero elements of $𝔽_q$ Ex. In $𝔽_4$, is $x$ primtive? - $x x^2 = x+1$, $x^3 = x(x+1) = 1$ - $b^q-1 = 1$ for any $0 \neq b \in 𝔽_q$ Little Fermat's theorem - $b = [a(x)]^i$, $b^q-1 = [a(x)^q-1]^i = 1^i = 1$ Example from first midterm: - In $Z_m$, $b^m-1 = 1$ holds iff $m=p$ is prime - ex. Z_123$, $b^122 = 1$ - $Z_m^x = \{n \in Z_m | 1/n exists\}$ - $b^{|Z_m^x|} = 1$ if $b \in Z_m^x$ - $Z_4^x = \{1,3\}$, $b^2 = 1$, $3^3 \neq 1$ ## Cyclic codes A code $C$ is **cyclic** if $C$ is a linear code, and any cyclic shift of a codeword is also a codeword. In other words, whenever $a_0 a_1 ... a_n-1 \in C$, $a_n-1 a_0 a_1 ... a_n-2 \in C$. - ex. the binary code $\{0000, 0101, 1010, 1111\}$ is cyclic. Consider the *ring* $R_n = 𝔽_q [x] / (x^n - 1)$: as as a vector space, $\simeq 𝔽_q^n$ A cyclic code $C$ is a (nonempty) $C \subseteq R_n$, where: - $C$ is a subspace closed under `+` - $C$ is closed under scalar multiplication - $C$ is closed under `·` - This is a *linear code* (??) We can consider a cyclic shift to be a *multiplication* by $x$. - ex. $a_0 + a_1 x + a_2 x^2 + ... + a_n-1 x^n-1$ = $(a_0, a_1, ..., a_n-1)$ - $x·(a_0 + a_1 x + a_2 x^2 + ... + a_n-1 x^n-1)$ - $a_0 x + a_1 x^2 + a_2 x^3 + ... + a_n-1 x^n$ - $x^n ≡ 1$, so... - $a_n-1 + a_0 x + a_1 x^2 + ... + a_n-2 x^n-1$ $C$ is a **module** on the ring $R_n$, where a **module** is a vector space working over a ring, not a field ... missed some stuff... Every cyclic code is of the form $$. The polynomial $g(x)$ is uniquely determined by: - $g(x)$ is monic - $g(x)$ divides $x^n - 1$ in $𝔽_q [x]$ As a corollary, cyclic codes correspond with monic divisors of $x^n - 1$. Ex. find all binary cyclic codes of length $n=3$. $$x^3 - 1 = (x+1)(x^2 + x + 1)$$ So our two codes are: - $ = \{\text{all even words}\}$ - $ = \{000, 111\}$ Ex. find all binary cyclic codes of length 4 - $x^4 - 1 = (x-1)^4$ So our codes are: - $<1> = R_n$ - $ = \{\text{even words}\}$ - $(x-1)^2 = ?$ - $(x-1)^3 = \{000, 111\}$ - $(x-1)^4 = 0$ Let $C \subseteq R_n$ be a cyclic code. Let $g(x) \in C$ be the lowest degree polynomial. Then, $ = C$. Proof: - To show $ \subseteq C$: $g(x) \in C \implies f·g \in C$ - To show $ \supseteq C$: we must show that $h(x) = f(x)·g(x)$ for some unknown $h(x) \in C$. - $\frac{h}{g} = f + {r}{g}$, $h = f·g + r$, $r = h-fg \in C$. - $deg(r) < deg(g)$ by the definition of division. - $r=0$ as the degree of $r$ is less than the lowest degree polynomial. Ex. $R_3 = 𝔽_2 [x] / (x^3 - 1)$ - $g(x) = (x^2 + 1) = (x+1)^2$ - $g(x) = (x^3 - 1) = (x+1)(x^2 + x + 1)$ - $ =$ a cyclic code - 0, $1·(x^2 + 1) = x^2 + 1$ - $x(x^2 + 1) = x^3 + x \equiv 1+x$ - $(x+1)(x^2 + 1) = x^3 + x^2 + x + 1 \equiv x^2 + x$ - $x^2 (x^2 + 1) = x^4 + x^2 = x + x^2$ - $(x^2 + 1)(x^2 + 1) = x^4 + 1 = x + 1$ - $(x^2 + x + 1) (x^2 + 1) = x^3 + 1 + x^3 + x \equiv x + 1 + 1 + x \equiv 0$ - $C = \{0, 1+x, x^2 + 1, x^2 + x\} = <1+x>$ Let $g(x)$ divide $x^n - 1$ in $𝔽_q (x)$. Then $dim = dim(C) = n-deg(g)$. - Matrix: $g(x) = a_0 + a_1 x + a_2 x^2 + ... + a_m x^m$ - i.e. $g(x), x·g(x), x^2 g(x), ..., x^{n-m-1} g(x)$ form a basis for the vector space $C \subseteq R_n$. ideal --- ### Vandermonde matrix $𝔽$: field, $ V = [1, 1, 1, ...; x_1, x_2, ..., x_n; x_1^2, x_2^2, ..., x_n^2; x_1^n-1, x_2^n-2, ..., x_n^n-1] If $x_i$ are distinct, then $V$ is invertible (i.e. the determinant is nonzero). The determinant: $dev V = Π_{i>j} (x_i - x_j)$ examples... the determinant of a vandermonde matrix equals its transpose ### Lagrange interpolation ex. a continuous *function*: distinct x, not necessarily distinct y Theorem: If $x_i$ are distinct (for any numbers y_i), then there is a unique $p(x)$ of degree ≤ $n-1$ that interpolates the points (x_i, y_i). Proof: $P_n-1 = \{a_6 + a_1x + ... + a_n x^n-1 | a_i ∈ 𝔽\}$ $P_n-1 kindaeq 𝔽^n$ $p(x) <--> (a_0, a_1, ... a_n-1)$ find a $p(x)$ such that when evaluated at $x_1, x_2, ...$ it equals $y_1, y_2, ...$ theorem is equivalent to saying that T is invertible: for $T : P_n-1 --> 𝔽^n$ invertible means 1-1 and onto (equivalent, if you have one property you have the other) Rank-Nullity theorem: $dim P_n-1 = dim N(T) + dim R(T)$ 1-1 means that the null space is zero - why? onto means that the dimension of the range is all of $dim 𝔽^n = n$ The matrix of T: $P_n-1 -> 𝔽^n$ T = [T(1), T(x), T(x^2)...] ### BCH codes Fix $x_1 x_2... x_n$ in 𝔽_q distinct Fix some $r = redundancy ≤ n$ H = [1, 1, ..., 1; x_1, x_2, ... x_n; ...; x_1^n-1, x_2^n-1, ..., x_n^n-1] dim C^⟂ = r, dim C = n - r the data of the codeword: n - r, the redundancy of the codeword: r \ distance of C: any r columsn of H are independent (r columns form a rxr Vandermonde matrix) ex. construct a length 10 (n) code, distance 5 (r + 1) (can correct 2 errors) Choose 𝔽_q, x_1, ..., x_10 in 𝔽_q Take 𝔽_11 = Z_11, x_1 = 1, x_2 = 2, ..., x_10 = 10 H = [1, 1, ..., 1; 1, 2, ..., 10; 1, 2^2, ... 10^2, 1, 2^3, ..., 10^3] (# of rows: 4) BCH codes are not perfect, usually? but they do achieve the singleton bound ## Singleton bound C: length n, distance d, q-ary, then: M ≤ q^n-d+1 which (if C is linear): <=> dim C ≤ n-d BCH code: d = r+1 singleton bound dim C ≤ n-(d-1) = n-r M ≤ Singleton bound ≤ Sphere packing bound M ≤ ex. 10-5+1 = 11^6 ≤ 11^10 next time: how to decode BCH codes ## The main linear coding theory problem The main linear coding theory problem: What is the "best" linear code? Question: What is the "best" linear code? Answer: we construct H in 𝔽_q such that $$H = [H_1 H_2 \ldots H_4]$$ as n goes to the maximum for some fixed r. How many vectors $H_1 ... H_n$ in 𝔽_q^n, such that any d-1 are linearly independent, n = max(q, r) <--- ???? Last time: for $d=3$: the best codes are $Ham(q, r)$ Today: for $d=4$: we need to think about algebraic geometry?? How many vectors $H_1, ... H_n$ in 𝔽_q^n such that no 3 vectors lie on a planes ### Projective spaces $$ℙ^{n-1} = \{\text{all lines through 0 in 𝔽_q^n}\}\}$$ For r=3: Ex. for $r=3$, $q=2$: we have a cube with its corners as $000, 010, 100, 110, 001, 011, 101, 111$ ``` 010 | | 110 | | 011------111 | | | | | | | | 011------101-----------100 ``` A projective space reduces the dimension by 1. - ex. a line $L$ in $𝔽_q^n$ = point in $ℙ^{n-1}$ - ex. a plane $W$ in $𝔽_q^n$ = a line in $ℙ^{n-1}$ - ex. a 3d subspace in $𝔽_q^n$ = a plane in $ℙ_q^n$ In ℙ^2, any two different lines intersect at a point. Ex. ℙ_2 over 𝔽_2 ``` 010 011 111 110 001 101 100 ``` (the triangle with points intersecting) ### MLCT problem d=4 Find the maximum number of lines L_1, L_2, ..., L_n in 𝔽_q^n such that no 3 lines lie on a plane. This is the same as our earlier problem! This is also the same as finding the maximum number of points in the projective space ℙ^r-1 such that 3 points are on a line. Ex. for r=3, 𝔽_q = 𝔽_2 - Take all 4 finite points r > 3, 𝔽_q = 𝔽_2 - Take all 2^r-1 finite points - Ham(2, r) + parity bit = 2^r - 1 + 1 = 2^r = HamHat(2, r) - for redundancy r+1 Ex. for the case q ≠ 2 q = 3 Cannot take all the finite points We consider a parabola intersecting a line M in a plane, and consider the points intersected with it For q > 2, r = 3, the maximum n is q+1 if q is odd, and q+2 if q is even.