Lecture 2 - 2025/2/25

  1. Infinite communication
  2. A probability distribution over the messages

Goal: Minimize average code length

Prefix-Free Codes

(1) Prefix-free is a sufficient condition (for unique decodability)

(2) Is prefix-free a necessary condition? No; see the example below.
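
For example, $\{0, 01\}$ is uniquely decodable (its reversal $\{0, 10\}$ is prefix-free, so every concatenation parses uniquely from the right) but it is not prefix-free. A minimal Python sketch of the check; the helper `is_prefix_free` is ours, not from the lecture:

```python
def is_prefix_free(code):
    """True iff no codeword is a proper prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in code for b in code)

print(is_prefix_free({"0", "10", "11"}))  # True: a prefix-free code
print(is_prefix_free({"0", "01"}))        # False: "0" is a prefix of "01"
# {"0", "01"} is nevertheless uniquely decodable: its reversal {"0", "10"}
# is prefix-free, so a concatenation can be parsed uniquely from the right.
```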

$A$: prefix-free codes

$B$: uniquely decodable codes

$A \subsetneq B$

$$\min_{C \in B} E(\ell(C)) = \min_{C \in A} E(\ell(C))$$

But why? (Unique decodability implies $\forall L,\ S(L) \le 2^L$, where $S(L)$ is the number of distinct strings of length $L$ formed by concatenating codewords. Bounding $S(L)$ this way forces the Kraft inequality to hold.)
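
This counting argument can be checked numerically. A sketch (the helper `num_sequences` is ours): for a uniquely decodable code, distinct codeword sequences yield distinct strings, so the number of sequences of total length $L$ equals $S(L)$ and must stay $\le 2^L$; if it ever exceeds $2^L$, two sequences collide and the code cannot be uniquely decodable.

```python
from functools import lru_cache

def num_sequences(lengths, total):
    """Count codeword sequences whose lengths sum to `total` (simple DP)."""
    @lru_cache(maxsize=None)
    def n(m):
        # n(m): ways to tile length m with the given codeword lengths
        return 1 if m == 0 else sum(n(m - l) for l in lengths if l <= m)
    return n(total)

# Lengths (1, 2), e.g. the uniquely decodable code {0, 01}:
for L in range(1, 8):
    print(L, num_sequences((1, 2), L), 2 ** L)  # count stays <= 2^L

# Lengths (1, 1, 2), e.g. {0, 1, 01}, Kraft sum 1.25 > 1:
# at L = 2 there are 5 sequences but only 2^2 = 4 binary strings,
# so two sequences produce the same string ("0"+"1" and "01").
print(num_sequences((1, 1, 2), 2))  # 5
```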

$\forall C^* \in B$ such that $E(\ell(C^*))$ achieves the minimum, $\exists \tilde C^* \in A$ with $E(\ell(\tilde C^*)) = E(\ell(C^*))$.

Kraft Inequality for Prefix-free Codes

Theorem. Assume $C = (c_1, \cdots, c_n)$ is prefix-free, and let $\ell_1, \cdots, \ell_n$ be the lengths (numbers of bits) of $c_1, \cdots, c_n$. Then
$$\sum_{i=1}^{n} 2^{-\ell_i} \le 1$$
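
The converse also holds: any lengths satisfying the inequality can be realized by a prefix-free code, via the standard canonical construction sketched below (function names are ours):

```python
def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality."""
    return sum(2.0 ** -l for l in lengths)

def prefix_code_from_lengths(lengths):
    """Build a prefix-free code realizing `lengths` (returned sorted),
    assuming the Kraft inequality holds."""
    assert kraft_sum(lengths) <= 1, "Kraft inequality violated"
    code, next_val, prev_len = [], 0, 0
    for l in sorted(lengths):
        next_val <<= l - prev_len          # descend to depth l in the code tree
        code.append(format(next_val, f"0{l}b"))
        next_val += 1                      # skip the subtree below this codeword
        prev_len = l
    return code

print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```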

Minimal Average Code Length

Setting: messages $M = \{m_1, \cdots, m_n\}$, probability distribution $P = \{p_1, \cdots, p_n\}$.

Goal: Find a prefix-free code $C = \{c_1, \cdots, c_n\}$ with lengths $\ell_1, \cdots, \ell_n$.

$$\min_{\ell_1, \cdots, \ell_n} \sum_{i=1}^{n} p_i \ell_i \quad \text{s.t. } \sum_{i=1}^{n} 2^{-\ell_i} \le 1,\ \ell_i \ge 0$$

Note that $\ell_i$ may not be in $\mathbb{N}$; here we relax the integer constraint and allow real lengths.

WLOG we assume that $\sum_{i=1}^{n} 2^{-\ell_i} = 1$. Let $q_i = 2^{-\ell_i}$; then $(q_1, \cdots, q_n)$ is a PMF, and since $\ell_i = -\log_2 q_i$, minimizing $\sum_i p_i \ell_i$ is equivalent to
$$\max_{q_1, \cdots, q_n} \sum_{i=1}^{n} p_i \log_2 q_i$$

The maximum is attained at $q_i = p_i$; therefore $\ell_i = -\log_2 p_i$.
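
The last step is Gibbs' inequality; filling in the argument (standard, not spelled out in the notes):
$$\sum_{i=1}^{n} p_i \log_2 q_i - \sum_{i=1}^{n} p_i \log_2 p_i = \sum_{i=1}^{n} p_i \log_2 \frac{q_i}{p_i} \le \log_2 \left( \sum_{i=1}^{n} q_i \right) = 0,$$
where the inequality is Jensen's (concavity of $\log_2$), with equality iff $q_i = p_i$ for all $i$.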

Definition (Entropy). Given a random source $X$ (a random variable) with PMF $(p_1, \cdots, p_n)$, the entropy of $X$ is
$$H(X) := \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}$$

  1. Minimal code length (description length)
  2. Quantify information
  3. Uniform distribution: $H(X)$ is maximal ($= \log_2 n$). Deterministic: $H(X) = 0$. (See the numerical check below.)
  4. $H$ measures the uncertainty of $X$
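
A numerical check of interpretations 1 and 3 (a minimal Python sketch; the `entropy` helper is ours):

```python
import math

def entropy(p):
    """Shannon entropy H(X) in bits for a PMF p."""
    return sum(pi * math.log2(1 / pi) for pi in p if pi > 0)

print(entropy([0.25] * 4))         # 2.0 -- uniform over 4 outcomes: log2(4)
print(entropy([1.0]))              # 0.0 -- deterministic source
print(entropy([0.5, 0.25, 0.25]))  # 1.5 -- optimal lengths are (1, 2, 2)

# Interpretation 1 (minimal code length): for p = (1/2, 1/4, 1/4) the
# lengths -log2(p_i) = (1, 2, 2) are integers, and the prefix-free code
# {0, 10, 11} achieves average length 0.5*1 + 0.25*2 + 0.25*2 = 1.5 = H(X).
```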