The approach you are considering is essentially a multi-class SVM, specifically the one-versus-the-rest approach.
Here is how I would describe the problem. The support vector machine, for example, is fundamentally a two-class classifier.
In practice, however, we often have to tackle problems involving K > 2 classes. Various methods have therefore been proposed for combining multiple two-class SVMs in order to build a multi-class classifier.
One commonly used approach (Vapnik, 1998) is to construct K separate SVMs, in which the kth model y_k(x) is trained using the data from class C_k as the positive examples and the data from the remaining K − 1 classes as the negative examples. This is known as the one-versus-the-rest approach, and a new input x is assigned to the class whose classifier responds most strongly:

y(x) = max_k y_k(x)
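To make the construction concrete, here is a minimal sketch in Python using scikit-learn; the dataset, kernel, and library choice are my assumptions, not something from your question:

```python
# Minimal sketch of one-versus-the-rest: K separate binary SVMs,
# prediction by the largest decision value.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
classes = np.unique(y)  # K = 3 here

# Train the kth model on class C_k vs. the remaining K - 1 classes.
models = []
for k in classes:
    clf = SVC(kernel="linear")
    clf.fit(X, (y == k).astype(int))  # 1 = positive class C_k, 0 = the rest
    models.append(clf)

# y(x) = max_k y_k(x): assign x to the class with the largest response.
scores = np.column_stack([m.decision_function(X) for m in models])
y_pred = classes[np.argmax(scores, axis=1)]
print("training accuracy:", np.mean(y_pred == y))
```

Note that scikit-learn already ships this wiring as `sklearn.multiclass.OneVsRestClassifier`, which is usually preferable to hand-rolling it; the loop above is only meant to mirror the description.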
Unfortunately, this heuristic approach suffers from the problem that the different classifiers were trained on different tasks, and there is no guarantee that the real-valued quantities y_k(x) for different classifiers will have appropriate scales.
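A common mitigation for the scale problem (standard practice, though not part of the excerpt above) is to calibrate each classifier's output into a probability via Platt scaling, so the K scores become directly comparable. Continuing from the snippet above:

```python
# Sketch: Platt-scale each binary SVM so its output is P(class k | x),
# putting all K scores on a common [0, 1] scale.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

calibrated = [
    CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3)
    .fit(X, (y == k).astype(int))
    for k in classes
]
probs = np.column_stack([m.predict_proba(X)[:, 1] for m in calibrated])
y_pred_cal = classes[np.argmax(probs, axis=1)]
```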
Another problem with the one-versus-the-rest approach is that the training sets are imbalanced. For instance, if we have ten classes each with equal numbers of training data points, then the individual classifiers are trained on data sets comprising 90% negative examples and only 10% positive examples, and the symmetry of the original problem is lost.
This is most likely why you are getting poor accuracy.
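If you want to stay with one-versus-the-rest despite these issues, one common fix for the imbalance (again a sketch, assuming scikit-learn) is to reweight the two classes inside each binary subproblem:

```python
# Sketch: counter the 90%/10% imbalance in each binary subproblem by
# penalizing mistakes on the rare positive class more heavily.
from sklearn.svm import SVC

# class_weight="balanced" sets each class's weight inversely proportional
# to its frequency, restoring some of the lost symmetry.
clf = SVC(kernel="linear", class_weight="balanced")
clf.fit(X, (y == 0).astype(int))  # e.g., the class-0-vs-rest subproblem
```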
PS: In most cases, accuracy is not a good measure for evaluating a classifier, particularly when the classes are imbalanced.
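Reusing `y` and `y_pred` from the first snippet, a per-class report and a confusion matrix give a much clearer picture than a single accuracy number:

```python
# Per-class precision/recall/F1 plus the confusion matrix reveal which
# classes are being sacrificed, which a single accuracy figure hides.
from sklearn.metrics import classification_report, confusion_matrix

print(classification_report(y, y_pred))
print(confusion_matrix(y, y_pred))
```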
References:
- Vapnik, V. (1998). Statistical Learning Theory. Wiley-Interscience, New York.
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.