I have been reading about the technique of k-fold cross-validation and I came across this example:
```python
>>> clf = svm.SVC(kernel='linear', C=1)
>>> scores = cross_validation.cross_val_score(
...     clf, iris.data, iris.target, cv=5)
>>> scores
array([ 0.96...,  1.  ...,  0.96...,  0.96...,  1.        ])
```

The mean score and the standard deviation of the score estimate are given by:

```python
>>> print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))
Accuracy: 0.98 (+/- 0.03)
```

(A self-contained version of this example is at the end of the question.)

According to this source:
> When you perform k-fold CV, you get k different estimates of your model's error, say e_1, e_2, e_3, ..., e_k. Since each e_i is an error estimate, it should ideally be zero.
>
> To check your model's bias, find the mean of all the e_i's. If this value is low, it basically means that your model gives low error on average, indirectly ensuring that your model's notions about the data are accurate enough.
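If I follow this correctly, for a classifier each e_i would be a per-fold error rate, i.e. 1 minus the per-fold accuracy that `cross_val_score` returns; that conversion is my own assumption, not something the source states. Here is a minimal sketch of the bias/variance check under that assumption:

```python
import numpy as np

# Hypothetical per-fold accuracy scores, shaped like the output above
# (the real values would come from cross_val_score).
scores = np.array([0.97, 1.00, 0.97, 0.97, 1.00])

# Assumption on my part: each fold's error estimate is e_i = 1 - accuracy_i,
# since cross_val_score reports accuracy, not error.
errors = 1 - scores

print("e_i:", errors)
print("mean of e_i (bias check):    %0.3f" % errors.mean())
print("std of e_i (variance check): %0.3f" % errors.std())
```

Under that reading the mean of the e_i's would be small even though the printed 0.98 is large, which is where I get confused.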
The SVM example with the iris dataset gives a mean of 0.98, so does this mean that our model is not flexible enough?
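For reference, here is a self-contained version of the documentation example; if I'm not mistaken, on current scikit-learn releases `cross_val_score` lives in `sklearn.model_selection` rather than the older `sklearn.cross_validation` module:

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

# Load the iris dataset and set up the same linear SVM as in the example
iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)

# 5-fold cross-validation: one accuracy score per fold
scores = cross_val_score(clf, iris.data, iris.target, cv=5)

print(scores)
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))
```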