1
$\begingroup$

How can we Prove that the sum of slacks $\sum \xi_n$ from the objective function of the SVM formulation with soft margin is an upper bound on the number of misclassified training examples?

Some references: https://www.cs.colostate.edu/~asa/pdfs/howto.pdf

My current knowledge: Optimization problem is
minimize w,b
1/2 $||w||^2$ + C $\sum \xi_i$

subject to $y_i (w^T x_i +b) \ge 1 - \xi_i, \xi_i \ge 0$

But it uses the problem taken for granted, I am looking how can we actually solve this.

$\endgroup$
3
  • $\begingroup$ Do you understand why $\xi_i > 1$ if and only if the $i^{th}$ training example is misclassified? $\endgroup$ Commented Nov 27, 2017 at 1:25
  • $\begingroup$ @jbowman I know if an example is misclassified, then the slack variable $\xi_i$ is positive and we want to minimize total sum of slacks in SVM. $\endgroup$ Commented Nov 27, 2017 at 1:28
  • $\begingroup$ It's not just positive, it's > 1. The slack variables are always nonnegative. $\endgroup$ Commented Nov 27, 2017 at 1:32

1 Answer 1

1
$\begingroup$

First, let's consider the case $y_i = 1$. In order to be classified correctly, $w^Tx_i + b > 0$, therefore $0 <y_i(w^Tx_i + b)$. In order for this number to be $\geq 1 - \xi_i$, $0 \leq \xi_i < 1$, with $\xi_i = 0$ when $y_i(w^Tx_i + b) \geq 1$.

Now let's assume the same classification occurred, so that $w^Tx_i + b > 0$, but incorrectly, i.e., $y_i = -1$. As $y_i < 0$, it must be that $y_i(w^Tx_i + b) < 0$. In order for this number to be $\geq 1 - \xi_i$, it must be that $\xi_i > 1$.

A similar logic can be worked through for the case where $w^Tx_i + b < 0$. In both cases, misclassification implies that $\xi > 1$.

Since $\xi_i > 1$ for all misclassified cases, it follows that $\Sigma_{i \in \text{misclassified}}\xi_i > \Sigma_{i \in \text{misclassified}}1$, which latter evidently equals the number of misclassified cases.

Since $\xi_i \geq 0$ for all correctly classified cases, it follows that $\Sigma_{i \in \text{correct}}\xi_i \geq 0$.

Therefore the sum over all cases must be greater than or equal to the number of misclassified cases, which makes it an upper bound on the number of misclassified cases.

$\endgroup$

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.