The table of contents for today's post is as follows.
1. Basic concentration inequalities
2. Azuma-Hoeffding inequality
3. McDiarmid's inequality
1. Basic concentration inequalities
Theorem 1.1 Markov Inequality
$P(X\geq t)\leq\frac{E[X]}{t}=O(\frac{1}{t})$
- $X$ should be non-negative
proof
$E[X]=\int_0^{\infty}xp(x)dx\geq\int^{\infty}_{t}xp(x)dx\geq t\int^{\infty}_tp(x)dx=tP(X\geq t)$
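As a quick numerical sanity check (my own sketch, not part of the original argument), we can compare the Markov bound with the exact tail of an Exponential(1) variable, for which $E[X]=1$ and $P(X\geq t)=e^{-t}$:

```python
import numpy as np

# Markov bound vs. exact tail for X ~ Exponential(1), where E[X] = 1.
rng = np.random.default_rng(0)
X = rng.exponential(scale=1.0, size=1_000_000)

for t in [1, 2, 5, 10]:
    empirical = (X >= t).mean()   # Monte Carlo estimate of P(X >= t)
    markov = X.mean() / t         # Markov bound: E[X] / t
    print(f"t={t:2d}  P(X>=t) ~ {empirical:.5f}  exact={np.exp(-t):.5f}  Markov bound={markov:.4f}")
```

The bound is valid but loose: it decays only as $1/t$, while the exact tail decays exponentially.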
Remark 1.2 Generalization of Markov Inequality
Let $\phi$ be a strictly increasing non-negative function. Then for any $X$,
$P(X\geq t) = P(\phi(X)\geq \phi(t))\leq \frac{E[\phi(X)]}{\phi(t)}$
Setting $\phi(x)=x^2$ in Remark 1.2 (applied to $|X-E[X]|$) yields the Chebyshev Inequality.
Theorem 1.3 Chebyshev Inequality
$P(|X-E[X]|\geq t\sigma)\leq\frac{1}{t^2}=O(\frac{1}{t^2})$ where $Var(X)=\sigma^2$
proof
$P(|X-E[X]|\geq t\sigma)=P((X-E[X])^2\geq t^2\sigma^2)\leq \frac{E[(X-E[X])^2]}{t^2\sigma^2}=\frac{Var(X)}{t^2\sigma^2}=\frac{1}{t^2}$
Example 1.4 Chebyshev Inequality for sample mean
Taking $X=\hat{\mu}=\frac{1}{n}\sum^n_{i=1}X_i$, so that $Var(\hat{\mu})=\frac{\sigma^2}{n}$,
$P(|\hat{\mu}-\mu|\geq \frac{t\sigma}{\sqrt{n}})\leq\frac{Var(\hat{\mu})}{t^2\sigma^2/n}=\frac{1}{t^2}$
Take $t=10$; then with probability at least 0.99, the sample mean is within $\frac{10\sigma}{\sqrt{n}}$ of its expectation.
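A small simulation (my own illustration, assuming i.i.d. $N(\mu,\sigma^2)$ samples with $\mu=0$, $\sigma=1$) confirms that the violation frequency stays below $1/t^2$:

```python
import numpy as np

# Check Chebyshev's bound for the sample mean of n i.i.d. N(0, 1) draws.
rng = np.random.default_rng(1)
n, reps, t, sigma = 100, 100_000, 10, 1.0

means = rng.standard_normal((reps, n)).mean(axis=1)  # each mean has Var = 1/n
violations = (np.abs(means) >= t * sigma / np.sqrt(n)).mean()
print(f"P(|mu_hat - mu| >= {t}*sigma/sqrt(n)) ~ {violations:.5f}  (Chebyshev bound: {1/t**2})")
```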
Corollary 1.5 Moment version of Chebyshev Inequality
If $X$ has a finite $q$-th central moment, then
$P(|X-EX|\geq t)\leq \frac{E[|X-E[X]|^q]}{t^q}$
If we optimize this over $q$, it can provide very sharp estimates of the tail probabilities.
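For instance (a numeric sketch of my own, for standard normal $X$ whose even absolute central moments are $E|X|^{2k}=\frac{(2k)!}{2^k k!}$), increasing $q$ sharpens the bound considerably at $t=4$:

```python
from math import factorial

# Moment bound E|X - EX|^q / t^q for X ~ N(0, 1), using even q where
# E|X|^(2k) = (2k)! / (2^k k!)  (the double factorial (2k-1)!!).
def normal_abs_moment(q):  # q assumed even here
    k = q // 2
    return factorial(2 * k) / (2**k * factorial(k))

t = 4.0
for q in [2, 4, 8, 16]:
    print(f"q={q:2d}  bound = {normal_abs_moment(q) / t**q:.3e}")
```

The bound drops from about $6.3\times 10^{-2}$ at $q=2$ to about $4.7\times 10^{-4}$ at $q=16$, while the true tail $P(|X|\geq 4)$ is about $6.3\times 10^{-5}$.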
Theorem 1.6 Chernoff Method
Let $\mu=E[X]$. Then for any $t>0$,
$P(X-\mu\geq u)=P(\exp(t(X-\mu))\geq \exp(tu))\leq \frac{E[\exp(t(X-\mu))]}{\exp(tu)}$
If we optimize over $t$ in the range where the mgf of $X$ is finite, we can get a tighter upper bound.
The Cramér-Chernoff method is one way to obtain tail bounds by combining the Markov Inequality with the exponential function.
Applying the Markov Inequality to the exponential function gives the following:
$$ P( Z\geq t) \leq e^{-\lambda t} E e^{\lambda Z}$$
Rewriting this in terms of the log moment generating function, $\psi_Z (\lambda ) = \log E e^{\lambda Z}$,
$$ P (Z \geq t ) \leq e^{\psi_Z(\lambda) - \lambda t}$$
Therefore, the tightest tail bound is obtained from $\psi^*_Z (t) = \sup_{\lambda \geq 0 } (\lambda t - \psi_Z(\lambda))$, i.e. $P(Z\geq t)\leq e^{-\psi^*_Z(t)}$.
Here $\psi^*_Z$ is called the Cramér transform of $Z$, and one can show that for $t \geq EZ$ it coincides with the Fenchel-Legendre dual form.
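As a concrete example (my own sketch, not from the original post), for $Z\sim N(0,1)$ we have $\psi_Z(\lambda)=\lambda^2/2$, so $\psi^*_Z(t)=t^2/2$ and the Chernoff bound is $e^{-t^2/2}$. The supremum can also be computed numerically over a grid:

```python
import numpy as np

# Cramér-Chernoff bound for Z ~ N(0, 1): psi_Z(lambda) = lambda^2 / 2,
# so psi*_Z(t) = sup_{lambda >= 0} (lambda*t - lambda^2/2) = t^2 / 2.
def chernoff_bound(t, psi, lambdas):
    psi_star = np.max(lambdas * t - psi(lambdas))  # numerical supremum
    return np.exp(-psi_star)

lambdas = np.linspace(0.0, 20.0, 20_001)
psi_gauss = lambda lam: lam**2 / 2.0

for t in [1.0, 2.0, 3.0]:
    print(f"t={t}:  numeric bound={chernoff_bound(t, psi_gauss, lambdas):.5f}"
          f"  closed form={np.exp(-t**2 / 2):.5f}")
```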
2. Azuma-Hoeffding inequality
Let $X_0,X_1,\dots,X_n$ be a Martingale sequence with $a_i \leq X_i -X_{i-1} \leq b_i,\;1\leq i\leq n$. Then,
$$\mathbb{P}(X_n-X_0\geq t) \leq \exp\left(-\frac{2t^2}{\sum^n_{i=1}(b_i-a_i)^2}\right)$$
The inequality can be derived from the Chernoff Method.
First, define $Z_i=X_i-X_{i-1},\;1\leq i \leq n$.
Then, by the Hoeffding lemma,
$$\mathbb{E}[\exp(\theta Z_k)\mid\mathcal{F}_{k-1}]\leq \exp\left(\frac{\theta^2(b_k-a_k)^2}{8}\right)$$
Hence, by the tower property,
$$\mathbb{E}\left[\exp\left(\theta\sum^n_{i=1}Z_i\right)\right]=\mathbb{E}\left[\exp\left(\theta\sum^{n-1}_{i=1}Z_i\right)\mathbb{E}[\exp(\theta Z_n)\mid\mathcal{F}_{n-1}]\right]\leq \exp\left(\frac{\theta^2(b_n-a_n)^2}{8}\right)\mathbb{E}\left[\exp\left(\theta\sum^{n-1}_{i=1}Z_i\right)\right]$$
Iterating this gives $\mathbb{E}[\exp(\theta\sum^n_{i=1}Z_i)]\leq \exp\left(\frac{\theta^2}{8}\sum^n_{i=1}(b_i-a_i)^2\right)$, and applying the Chernoff Method with the optimal choice $\theta=\frac{4t}{\sum^n_{i=1}(b_i-a_i)^2}$ yields the stated bound.
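The following simulation (my own illustration) checks the resulting bound on a symmetric $\pm 1$ random walk: with $X_0=0$ and $a_i=-1$, $b_i=1$, we get $\sum^n_{i=1}(b_i-a_i)^2=4n$ and the bound $\exp(-t^2/2n)$:

```python
import numpy as np

# Azuma-Hoeffding check on a +/-1 random walk: X_n - X_0 is a sum of n
# independent Rademacher steps, with a_i = -1, b_i = 1, so the bound is
# exp(-2 t^2 / (4 n)) = exp(-t^2 / (2 n)).
rng = np.random.default_rng(2)
n, reps = 100, 100_000

steps = rng.choice([-1, 1], size=(reps, n))
X_n = steps.sum(axis=1)  # martingale value at time n (X_0 = 0)

for t in [10, 20, 30]:
    empirical = (X_n >= t).mean()
    bound = np.exp(-2 * t**2 / (4 * n))
    print(f"t={t:2d}  P(X_n - X_0 >= t) ~ {empirical:.5f}  bound = {bound:.5f}")
```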
3. McDiarmid's inequality
Assume $f:\mathbb{R}^n\to\mathbb{R}$ is coordinate-wise Lipschitz with constants $l_i,\;1\leq i \leq n$, i.e. changing only the $i$-th coordinate changes the value of $f$ by at most $l_i$. Then, for independent $X_1,\dots,X_n$,
$$\mathbb{P}(f(X_1,X_2,\dots,X_n)-\mathbb{E}[f(X_1,X_2,\dots,X_n)]\geq t) \leq \exp \left(-\frac{2t^2}{\sum^n_{i=1}l_i^2} \right)$$
This can be derived easily using the Azuma-Hoeffding inequality. Define
$Z_i =\mathbb{E}[ f(X_1,X_2,\dots,X_n)\mid\mathcal{F}_i]-\mathbb{E}[f(X_1,X_2,\dots,X_n)\mid\mathcal{F}_{i-1}]$, where $\mathcal{F}_i=\sigma(X_1,\dots,X_i)$.
Then
$\sum^n_{i=1}Z_i=f(X_1,\dots,X_n)-\mathbb{E}[f(X_1,\dots,X_n)]$ and $\mathbb{E}[Z_i\mid \mathcal{F}_{i-1}]=0$, so the $Z_i$ form a martingale difference sequence. By the coordinate-wise Lipschitz condition, each $Z_i$ lies in an interval of length at most $l_i$, and applying the Azuma-Hoeffding inequality yields the stated bound.
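A quick check (a minimal sketch, assuming i.i.d. Uniform(0,1) inputs): for $f(x_1,\dots,x_n)=\frac{1}{n}\sum_i x_i$, changing one coordinate moves $f$ by at most $l_i=1/n$, so $\sum_i l_i^2 = 1/n$ and the bound becomes $\exp(-2t^2 n)$:

```python
import numpy as np

# McDiarmid check for f(x_1, ..., x_n) = mean of n i.i.d. Uniform(0, 1):
# changing one coordinate moves f by at most l_i = 1/n, so
# sum l_i^2 = 1/n and the bound is exp(-2 t^2 n).
rng = np.random.default_rng(3)
n, reps = 50, 200_000

f_vals = rng.uniform(size=(reps, n)).mean(axis=1)  # f applied to each sample
deviations = f_vals - 0.5                          # E[f] = 1/2

for t in [0.1, 0.15, 0.2]:
    empirical = (deviations >= t).mean()
    bound = np.exp(-2 * t**2 * n)
    print(f"t={t}  P(f - E[f] >= t) ~ {empirical:.5f}  bound = {bound:.5f}")
```

With $f$ the sample mean, this recovers the Hoeffding inequality for bounded i.i.d. variables as a special case.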