

VC-dimension

The contents of this post are as follows.

1. Growth function and VC-dimension

2. $\epsilon$-net theorem and PAC learnability

1. Growth function and VC-dimension

Definition 1 (Growth function, [1])
 Let \( \mathcal{H} \) be a hypothesis class. Then the growth function of \( \mathcal{H} \), \( \tau_{\mathcal{H}} : \mathbb{N}\to\mathbb{N} \) is defined as:
 \begin{align*}
 \tau_{\mathcal{H}} (m) := \max_{C\subset\mathcal{X}:|C|=m} | \mathcal{H}_C|.
 \end{align*}

 That is, the growth function counts, over all sets $C\subset\mathcal{X}$ of size $m$, the maximum number of distinct labelings of $C$ that hypotheses in $\mathcal{H}$ can produce (here $\mathcal{H}_C$ denotes the restriction of $\mathcal{H}$ to $C$). A few properties follow immediately; for example, for a finite hypothesis class, $\tau_{\mathcal{H}}(m)\leq|\mathcal{H}|$ for every $m$. The VC-dimension of $\mathcal{H}$, $\text{VCdim}(\mathcal{H})$, is the size of the largest set $C\subset\mathcal{X}$ shattered by $\mathcal{H}$, i.e., the largest $m$ for which $\tau_{\mathcal{H}}(m)=2^m$.
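
Since the growth function and the VC-dimension are purely combinatorial, they can be computed by brute force for a toy class. Below is a minimal sketch (my own illustration, not code from [1]) that does this for threshold functions on a small finite domain; the class and the domain are assumptions made only for demonstration.

```python
# Brute-force growth function and VC-dimension for a toy hypothesis class.
# H = threshold functions h_t(x) = 1[x >= t] on X = {0,...,9} (illustrative choice).
from itertools import combinations

X = list(range(10))                        # finite domain
H = [[1 if x >= t else 0 for x in X]       # each hypothesis stored as its label vector on X
     for t in range(11)]

def growth(m):
    """tau_H(m): max number of distinct labelings H induces on any C subset of X with |C| = m."""
    best = 0
    for C in combinations(range(len(X)), m):
        labelings = {tuple(h[i] for i in C) for h in H}
        best = max(best, len(labelings))
    return best

# VC-dimension: the largest m with tau_H(m) = 2^m, i.e. some C of size m is shattered.
vcdim = max(m for m in range(len(X) + 1) if growth(m) == 2 ** m)
print([growth(m) for m in range(1, 5)])    # thresholds give tau_H(m) = m + 1
print("VCdim =", vcdim)                    # 1 for thresholds
```

For this class $\tau_{\mathcal{H}}(m)=m+1$, which falls below $2^m$ as soon as $m\geq 2$, so the brute-force search reports a VC-dimension of 1.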

2. $\epsilon$-net theorem and PAC learnability

Definition 2 ($\epsilon$-net, [1]) Let $\mathcal{X}$ be a domain. $S\subset\mathcal{X}$ is an $\epsilon$-net for $\mathcal{H}\subset2^{\mathcal{X}}$ with respect to a distribution $\mathcal{D}$ over $\mathcal{X}$ if

$$\forall{h}\in\mathcal{H} : \mathcal{D}(h) \geq \epsilon \rightarrow h\cap S \neq \emptyset$$

That is, an $\epsilon$-net $S$ intersects every hypothesis whose probability mass under $\mathcal{D}$ is at least $\epsilon$: no "$\epsilon$-large" set in $\mathcal{H}$ can avoid $S$.
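
The definition can be checked mechanically on a finite domain. The following sketch (the distribution $\mathcal{D}$, the sets in $\mathcal{H}$, and the samples $S$ are all hypothetical toy choices, not taken from [1]) tests whether a given $S$ is an $\epsilon$-net.

```python
# Check whether S is an eps-net for a collection H of subsets of a finite domain,
# under an explicitly given distribution D (toy, illustrative values).
def is_eps_net(S, H, D, eps):
    S = set(S)
    for h in H:
        mass = sum(D[x] for x in h)        # D(h), the probability mass of h
        if mass >= eps and not (set(h) & S):
            return False                   # an "eps-large" set was missed by S
    return True

D = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}       # distribution over X = {0,1,2,3}
H = [{0}, {1, 2}, {3}, {0, 3}]             # hypotheses viewed as subsets of X
print(is_eps_net({0, 1}, H, D, eps=0.25))  # True: every set with mass >= 0.25 meets {0,1}
print(is_eps_net({3}, H, D, eps=0.25))     # False: {0} has mass 0.4 but misses {3}
```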

Theorem 3 ([1]). Let $\mathcal{H}\subset 2^{\mathcal{X}}$ with $\text{VCdim}(\mathcal{H})=d$. There exists $m=m(\epsilon,\delta,d)$ such that for any training set with $|S|\geq m$, with probability at least $1-\delta$ over the choice of $S\sim\mathcal{D}^m$, $S$ is an $\epsilon$-net for $\mathcal{H}$.

Theorem 4 ([1]). Let $\mathcal{H}$ be a hypothesis class over $\mathcal{X}$ with $\text{VCdim}(\mathcal{H})=d$. Let $\mathcal{D}$ be a distribution over $\mathcal{X}$ and let $c\in\mathcal{H}$ be the target hypothesis. For $m\geq m^{\text{PAC}}(\epsilon,\delta)$, with probability at least \( 1-\delta \) over $m$ i.i.d. instances drawn from $\mathcal{D}$ and labeled by $c$, every ERM hypothesis has true error at most $\epsilon$.

Proof

It suffices to show that the VC-dimension of $\mathcal{H}^c:=\{c \bigtriangleup h : h\in\mathcal{H} \}$ equals the VC-dimension of $\mathcal{H}$, so that Theorem 3 applies to $\mathcal{H}^c$.

Since $L_{\mathcal{D}}(h)=\mathcal{D}(h \bigtriangleup c)$ and, with probability at least $1-\delta$, $S$ is an $\epsilon$-net for $\mathcal{H}^c$, any $h$ with $(h \bigtriangleup c)\cap S=\emptyset$ satisfies $L_{\mathcal{D}}(h)<\epsilon$. An ERM hypothesis $h_S$ makes no errors on $S$, so $(h_S \bigtriangleup c)\cap S=\emptyset$ and hence $L_{\mathcal{D}}(h_S)<\epsilon$.

Q.E.D.
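
To make the argument concrete, here is a small numerical sketch (my own illustration under strong simplifying assumptions: $\mathcal{H}$ is the class of thresholds on a finite domain, $\mathcal{D}$ is uniform, and the target is a threshold; none of this is code from [1]). An ERM hypothesis is consistent with $S$, so its symmetric difference with $c$ misses $S$, and whenever $S$ is an $\epsilon$-net for $\mathcal{H}^c$ this forces a true error below $\epsilon$.

```python
# Realizable ERM on a toy class: sample S from D, label it with the target c,
# pick a hypothesis with zero empirical error, and measure its true error.
import random

random.seed(0)
X = list(range(100))
D = {x: 1 / len(X) for x in X}             # uniform distribution over X
c_thr = 37                                  # target concept c(x) = 1[x >= 37]

def h(t, x):                                # threshold hypothesis h_t
    return 1 if x >= t else 0

def true_error(t):                          # L_D(h_t) = D(h_t symmetric-difference c)
    return sum(D[x] for x in X if h(t, x) != h(c_thr, x))

m, eps = 200, 0.05
S = [random.choice(X) for _ in range(m)]    # i.i.d. sample from D
labels = [(x, h(c_thr, x)) for x in S]      # labeled by the target c

# ERM: any threshold with zero empirical error on S (the target itself always qualifies).
erm_t = next(t for t in range(len(X) + 1)
             if all(h(t, x) == y for x, y in labels))
print("ERM threshold:", erm_t, "true error:", true_error(erm_t))  # typically below eps
```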

In the agnostic case, one first proves the uniform convergence property.

Theorem ([1]). $\mathcal{H}$ is agnostic PAC learnable with sample complexity

$$m_{\mathcal{H}}(\epsilon,\delta)\leq C_2 \frac{d+\log(1/\delta)}{\epsilon^2}$$
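
To get a feel for the scale this bound implies, here is a tiny helper that plugs in numbers; the constant $C_2$ is left unspecified in the statement, so it is set to 1 below purely for illustration.

```python
# Evaluate the agnostic sample-complexity upper bound C2 * (d + log(1/delta)) / eps^2.
from math import ceil, log

def m_agnostic_upper(d, eps, delta, C2=1.0):
    return ceil(C2 * (d + log(1 / delta)) / eps ** 2)

print(m_agnostic_upper(d=10, eps=0.1, delta=0.05))  # about 1300 samples for these values
```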


Finally, we can see the relationship between the VC-dimension and PAC learnability.

Theorem ([1]). Let $\mathcal{H}$ be a hypothesis class of functions from $\mathcal{X}$ to $\{0,1\}$ and let $A$ be any learning algorithm. Let $m$ be a training set size, and suppose there exists a set $C\subset\mathcal{X}$ of size $2m$ that is shattered by $\mathcal{H}$. Then there exist a distribution $\mathcal{D}$ over $\mathcal{X}\times\{0,1\}$ and a hypothesis $h\in\mathcal{H}$ with $L_{\mathcal{D}}(h)=0$, but with probability at least $1/7$ over the choice of $S\sim\mathcal{D}^m$ we have $L_{\mathcal{D}}(A(S))\geq 1/8$.


In other words, if the VC-dimension of $\mathcal{H}$ is infinite, then for every sample size $m$ there is a shattered set of size $2m$, so the theorem above shows that $\mathcal{H}$ is not PAC learnable.

References

[1] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
