3  Chapter 2: Optimal Classification

Author

Shao-Ting Chiu

Published

September 2, 2022

3.1 Stoachastic analysis

3.2 Classification without predictors

\[ \hat{Y}= \begin{cases} 0, & P(Y=0) \geq P(Y=1)\\ 1, & P(Y=1) > P(Y=0)\\ \end{cases} \]

  • \(E[Y] = P(Y=1)\)
Posterior-probability function

\[\eta(x) = E[Y|X=x] = P(Y=1|X=x), \quad x\in R^d\]

3.3 Classification error

\[\epsilon[\psi] = p(\psi(X)\neq Y) = p(\{(x,y)|\psi(x)\neq y\})\]

The classification error is determined by the feature-label distribution \(P_{X,Y}\)

3.4 Class-specific errors

3.5 Bayes Classifier

\[\psi^{*} = \arg\min_{\psi\in\mathcal{C}} P(\psi(X) \neq Y) \]

3.6 Feature transformation

\[\epsilon^{*}(x,y) \geq \epsilon^{*}(X', y)\quad \text{with } X = t^{-1}(X')\]