
Concentration Inequalities

Today's post covers the following topics.

1. Basic concentration inequalities

2. Azuma-Hoeffding inequality

3. McDiarmid's inequality

1. Basic concentration inequalities

Theorem 1.1 Markov Inequality

$P(X\geq t)\leq\frac{E[X]}{t}=O(\frac{1}{t})$

  • $X$ should be non-negative

proof

$E[X]=\int_0^{\infty}xp(x)dx\geq\int^{\infty}_{t}xp(x)dx\geq t\int^{\infty}_tp(x)dx=tP(X\geq t)$
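As a quick sanity check, here is a minimal simulation (a sketch assuming NumPy; the Exponential(1) distribution, sample size, and variable names are illustrative choices, not from the original post) comparing the empirical tail of a non-negative variable against the Markov bound $\frac{E[X]}{t}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential(1) is non-negative with E[X] = 1, so Markov's inequality applies.
X = rng.exponential(scale=1.0, size=100_000)

for t in [1, 2, 5, 10]:
    empirical = (X >= t).mean()   # Monte Carlo estimate of P(X >= t)
    markov = X.mean() / t         # E[X] / t, estimated from the same sample
    print(f"t={t:2d}  P(X>=t) ~ {empirical:.4f}  Markov bound: {markov:.4f}")
```

The bound holds but is loose: for Exponential(1) the true tail $e^{-t}$ decays much faster than $O(\frac{1}{t})$.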

Remark 1.2 Generalization of Markov Inequality

Let $\phi$ be a strictly increasing, non-negative function. Then for any random variable $X$,

$P(X\geq t) = P(\phi(X)\geq \phi(t))\leq \frac{E[\phi(X)]}{\phi(t)}$

 

Applying Remark 1.2 with $\phi(x)=x^2$ to the variable $|X-E[X]|$ yields the Chebyshev inequality.

 

Theorem 1.3 Chebyshev Inequality

$P(|X-E[X]|\geq t\sigma)\leq\frac{1}{t^2}=O(\frac{1}{t^2})$ where $Var(X)=\sigma^2$

proof

$P(|X-E[X]|\geq t\sigma)=P((X-E[X])^2\geq t^2\sigma^2)\leq \frac{E[(X-E[X])^2]}{t^2\sigma^2}=\frac{Var(X)}{t^2\sigma^2}=\frac{1}{t^2}$

 

Example 1.4 Chebyshev Inequality for sample mean

Taking $X=\hat{\mu}=\frac{1}{n}\sum^n_{i=1}X_i$, where the $X_i$ are i.i.d. with variance $\sigma^2$ so that $Var(\hat{\mu})=\frac{\sigma^2}{n}$,

$P(|\hat{\mu}-\mu|\geq \frac{t\sigma}{\sqrt{n}})\leq\frac{Var(\hat{\mu})\,n}{t^2\sigma^2}=\frac{1}{t^2}$

Take $t=10$; then with probability at least $0.99$, the sample mean is within $\frac{10\sigma}{\sqrt{n}}$ of its expectation.
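A minimal simulation of this example (a sketch assuming NumPy; the normal distribution, $\mu=0$, $\sigma=1$, and the replication count are illustrative choices, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(1)
n, t, sigma = 1_000, 10, 1.0

# 10,000 replications of the sample mean of n standard normals (mu = 0).
means = rng.normal(loc=0.0, scale=sigma, size=(10_000, n)).mean(axis=1)

# Chebyshev: P(|mu_hat - mu| >= t * sigma / sqrt(n)) <= 1/t^2 = 0.01
threshold = t * sigma / np.sqrt(n)
print("empirical:", (np.abs(means) >= threshold).mean(), " bound:", 1 / t**2)
```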

 

Corollary 1.5 Moment version of Chebyshev Inequality

If $X$ has a finite $q$-th central moment, then

$P(|X-EX|\geq t)\leq \frac{E[|X-E[X]|^q]}{t^q}$

Optimizing this bound over $q$ can provide very sharp estimates of the tail probabilities.
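The effect of optimizing over $q$ can be checked numerically. Below is a sketch (assuming NumPy; the standard normal target, $t=4$, and the grid of $q$ values are illustrative choices) that estimates the moment bound for several even $q$:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=1_000_000)  # standard normal: E[X] = 0
t = 4.0

# Moment bound E|X - E[X]|^q / t^q for several q; larger q can be much sharper.
for q in [2, 4, 6, 8, 10]:
    bound = (np.abs(X) ** q).mean() / t**q
    print(f"q={q:2d}  moment bound: {bound:.2e}")

print("empirical tail P(|X| >= t):", (np.abs(X) >= t).mean())
```

For $t=4$ the $q=2$ (Chebyshev) bound is $\frac{1}{16}$, while $q=10$ already improves it by nearly two orders of magnitude.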

Theorem 1.6 Chernoff Method

Let $\mu=E[X]$. For any $t>0$,

$P(X-\mu\geq u)=P(\exp(t(X-\mu))\geq \exp(tu))\leq \frac{E[\exp(t(X-\mu))]}{\exp(tu)}$

Optimizing over $t$ in the range where the mgf of $X$ is finite yields a tighter upper bound.

 

The Cramér-Chernoff method is one way to obtain tail bounds by combining the Markov inequality with the exponential function.

Applying the Markov inequality after exponentiating, for any $\lambda \geq 0$,

$$ P( Z\geq t) \leq e^{-\lambda t} E e^{\lambda Z}$$

Rewriting this in terms of the log moment generating function $\psi_Z (\lambda ) = \log E e^{\lambda Z}$,

$$ P (Z \geq t )  \leq e^{\psi_Z(\lambda) - \lambda t}$$

Therefore, the tightest tail bound is obtained at $\psi^*_Z (t) = \sup_{\lambda \geq 0 } (\lambda t - \psi_Z(\lambda) )$, i.e. $P(Z \geq t) \leq e^{-\psi^*_Z(t)}$.

Here $\psi^*_Z$ is called the Cramér transform of $Z$; when $t \geq EZ$, it coincides with the Fenchel-Legendre dual form.
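As a concrete worked example (a standard computation, added here for illustration), take $Z\sim N(0,\sigma^2)$. Then

$$\psi_Z(\lambda)=\frac{\lambda^2\sigma^2}{2},\qquad \psi^*_Z(t)=\sup_{\lambda \geq 0}\left(\lambda t-\frac{\lambda^2\sigma^2}{2}\right)=\frac{t^2}{2\sigma^2}\quad\left(\text{attained at } \lambda=\frac{t}{\sigma^2}\right),$$

so the Chernoff bound reads $P(Z\geq t)\leq e^{-t^2/2\sigma^2}$ for $t\geq 0$.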

 

2. Azuma-Hoeffding inequality

Let $X_0,X_1,\dots,X_n$ be a martingale with bounded increments $a_i \leq X_i -X_{i-1} \leq b_i,\;1\leq i\leq n$. Then,

$$\mathbb{P}(X_n-X_0\geq t) \leq \exp\left(-\frac{2t^2}{\sum^n_{i=1}(b_i-a_i)^2}\right)$$

The inequality can be derived from the Chernoff method.

First, define $Z_i=X_i-X_{i-1},\;1\leq i \leq n$.

Then, by Hoeffding's lemma, $$\mathbb{E}[\exp(\theta Z_k)\mid\mathcal{F}_{k-1}]\leq \exp\left(\frac{\theta^2(b_k-a_k)^2}{8}\right)$$ Hence,

$$\mathbb{E}\left[\exp\left(\theta\sum^n_{i=1}Z_i\right)\right]=\mathbb{E}\left[\exp\left(\theta\sum^{n-1}_{i=1}Z_i\right)\mathbb{E}[\exp(\theta Z_n)\mid\mathcal{F}_{n-1}]\right]\leq \exp\left(\frac{\theta^2(b_n-a_n)^2}{8}\right)\mathbb{E}\left[\exp\left(\theta\sum^{n-1}_{i=1}Z_i\right)\right]$$

Iterating this conditioning argument gives $\mathbb{E}[\exp(\theta(X_n-X_0))]\leq \exp\left(\frac{\theta^2}{8}\sum^n_{i=1}(b_i-a_i)^2\right)$, and optimizing the resulting Chernoff bound over $\theta$ (the optimum is $\theta = \frac{4t}{\sum^n_{i=1}(b_i-a_i)^2}$) yields the stated inequality.
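To see the bound in action, here is a small simulation (a sketch assuming NumPy; the walk length, threshold, and replication count are illustrative choices) for the simple $\pm1$ random walk, a martingale with $a_i=-1$ and $b_i=1$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t = 100, 20

# X_n = sum of n independent +/-1 steps is a martingale (X_0 = 0) with
# increments in [-1, 1], so sum_i (b_i - a_i)^2 = 4n.
steps = rng.choice([-1, 1], size=(50_000, n))
X_n = steps.sum(axis=1)

empirical = (X_n >= t).mean()
azuma = np.exp(-2 * t**2 / (4 * n))  # exp(-t^2 / (2n))
print(f"P(X_n >= {t}) ~ {empirical:.5f}   Azuma-Hoeffding bound: {azuma:.5f}")
```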

3. McDiarmid's inequality

Assume $f:\mathbb{R}^n\to\mathbb{R}$ is coordinate-wise $l_i$-Lipschitz, $1\leq i \leq n$, i.e. changing the $i$-th coordinate changes $f$ by at most $l_i$, and let $X_1,\dots,X_n$ be independent. Then,

$$\mathbb{P}(f(X_1,X_2,\dots,X_n)-\mathbb{E}[f(X_1,X_2,\dots,X_n)]\geq t) \leq \exp \left(-\frac{2t^2}{\sum^n_{i=1}l_i^2} \right)$$

This can be derived easily from the Azuma-Hoeffding inequality. Define the Doob martingale differences

$Z_i =\mathbb{E}[ f(X_1,X_2,\dots,X_n)\mid\mathcal{F}_i]-\mathbb{E}[f(X_1,X_2,\dots,X_n)\mid\mathcal{F}_{i-1}]$, where $\mathcal{F}_i=\sigma(X_1,\dots,X_i)$ and $\mathcal{F}_0$ is the trivial $\sigma$-algebra.

Then

$\sum^n_{i=1}Z_i=f(X_1,\dots,X_n)-\mathbb{E}[f(X_1,\dots,X_n)]$ and $\mathbb{E}[Z_i\mid \mathcal{F}_{i-1}]=0$, so $(Z_i)$ is a martingale difference sequence. By the Lipschitz assumption each $Z_i$ lies in an interval of length at most $l_i$, so the Azuma-Hoeffding inequality gives the bound above.
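A quick numerical check (a sketch assuming NumPy; the choice of $f$, $n$, and $t$ are illustrative): take $f$ to be the sample mean of $n$ i.i.d. Uniform$[0,1]$ variables, which is coordinate-wise $\frac{1}{n}$-Lipschitz, so $\sum^n_{i=1}l_i^2=\frac{1}{n}$ and the bound is $\exp(-2nt^2)$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, t = 100, 0.1

# f = sample mean of n Uniform[0, 1] variables; E[f] = 0.5. Changing one
# coordinate moves f by at most 1/n, so McDiarmid gives exp(-2 * n * t^2).
samples = rng.uniform(size=(100_000, n))
f = samples.mean(axis=1)

empirical = (f - 0.5 >= t).mean()
print(f"P(f - E[f] >= {t}) ~ {empirical:.5f}   McDiarmid bound: {np.exp(-2 * n * t**2):.5f}")
```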
