It's hard to think of a more eloquent way of phrasing this question, so I'll just ask it directly: would a classifier trained on data where examples of some of the classes are infrequent/rare be a bad model? I'm mainly interested in decision trees (C4.5).
I think the answer is no, but that you will get a high error rate on the rare classes, because the classifier will tend to label their members as instances of the more frequent classes. This has been my experience so far.
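To make that concrete, here's a minimal sketch of the behaviour I'm describing. It uses scikit-learn's `DecisionTreeClassifier` rather than C4.5 (the split criteria differ, but the imbalance effect seems to be the same), and numeric features from `make_classification` rather than the categorical variables I actually have:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Three classes, one of them rare (about 2% of the data).
X, y = make_classification(
    n_samples=5000,
    n_classes=3,
    n_informative=5,
    weights=[0.59, 0.39, 0.02],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

# Overall accuracy looks fine, but recall on the rare class tends to be
# much lower: its members mostly get absorbed into the majority classes.
print(classification_report(y_test, clf.predict(X_test)))
```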
I'm also wondering when it's acceptable to remove these rare examples and when it's considered bad practice (i.e. doing it just to lower the error). My guess is that it's okay to remove them if there's a good reason to do so, and you explain that reasoning when you report your results.
I'm not really interested in building the best possible classifier; I'm more interested in understanding the relationships between the variables and the structure of the data. All my variables are categorical and the data is non-linear, so decision trees have so far been the best tool I've found for this. (SVMs and ensemble methods are more accurate, but you can't really see the internal model structure, which you do get with decision trees.)
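For what it's worth, by "seeing the internal model structure" I mean something like the following. Continuing the sketch above (`export_text` is scikit-learn's rule printer; the feature names are just placeholders):

```python
from sklearn.tree import export_text

# Print the fitted tree as nested if/else rules; this is the kind of
# visible structure that SVMs and ensemble methods don't expose.
print(export_text(clf, feature_names=[f"x{i}" for i in range(X.shape[1])]))
```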
Thanks.