3.0 Classifier ensemble design issues

There are many decisions that one must make when building classifier ensembles. To begin with, one must decide what types of classifiers will be used to construct the ensemble. Will a single classifier type be used, with each member trained to converge to a different solution, or will several classifier types be employed? How many classifiers are to be incorporated into the ensemble? Will the classifiers operate in a single parallel layer, or will some sort of structure (e.g. hierarchical) be imposed on them?
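To make the first of these decisions concrete, the sketch below (Python with the scikit-learn library, which this report does not otherwise use; the particular classifier types and parameters are purely illustrative assumptions) builds both a homogeneous ensemble, whose members differ only in their random initialization, and a heterogeneous one:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=200, random_state=0)

    # Homogeneous: one classifier type; each member converges to a different
    # solution because it starts from a different random initialization.
    homogeneous = [MLPClassifier(random_state=seed, max_iter=1000).fit(X, y)
                   for seed in range(5)]

    # Heterogeneous: several distinct classifier types in one flat layer.
    heterogeneous = [clf.fit(X, y)
                     for clf in (LogisticRegression(max_iter=1000),
                                 DecisionTreeClassifier(random_state=0),
                                 MLPClassifier(random_state=0, max_iter=1000))]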

Which features each classifier will have access to, and which subset of the training data each classifier will train on, must also be determined. At one extreme, every classifier has access to every feature and is trained on the entire training set. At the other extreme, each classifier is exposed only to a small subset of the features and training data, and no other classifier has access to that subset. There is also a broad intermediate spectrum between these two extremes, in which the information each classifier sees differs from classifier to classifier but partially overlaps. The best approach depends heavily on the particular classifiers and coordination techniques used and, to a lesser degree, on the computational resources available.
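These three regimes can be sketched as follows (plain Python/NumPy; the 60% overlap fraction and all names are illustrative assumptions, not prescriptions from this report). Each classifier is assigned a pair of index arrays, one over features and one over training samples:

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_features, n_classifiers = 600, 30, 3

    # Full overlap: every classifier sees all features and all samples.
    full_views = [(np.arange(n_features), np.arange(n_samples))
                  for _ in range(n_classifiers)]

    # Disjoint: features and samples are partitioned so that no two
    # classifiers share any feature or any training sample.
    feat_parts = np.array_split(rng.permutation(n_features), n_classifiers)
    samp_parts = np.array_split(rng.permutation(n_samples), n_classifiers)
    disjoint_views = list(zip(feat_parts, samp_parts))

    # Intermediate: each classifier draws a random 60% of the features and
    # 60% of the samples, so the views differ but partially overlap.
    partial_views = [(rng.choice(n_features, size=int(0.6 * n_features),
                                 replace=False),
                      rng.choice(n_samples, size=int(0.6 * n_samples),
                                 replace=False))
                     for _ in range(n_classifiers)]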

Bagging and boosting, both discussed in Section 7, are two common approaches to assigning training data. There are also a number of different ways of applying feature selection techniques to classifier ensembles. Although these are beyond the scope of this report, Section 8.1 of Kuncheva’s book (2004) provides a good survey of them.
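To give the flavor of bagging before Section 7 treats it fully: each classifier is trained on a bootstrap replicate of the training set, i.e. n samples drawn uniformly with replacement. A minimal sketch (plain Python/NumPy, with a scikit-learn decision tree standing in for an arbitrary base classifier):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def bagged_ensemble(X, y, n_classifiers=10, seed=0):
        """Train each classifier on a bootstrap replicate of (X, y)."""
        rng = np.random.default_rng(seed)
        ensemble = []
        for _ in range(n_classifiers):
            idx = rng.integers(0, len(X), size=len(X))  # draw with replacement
            ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        return ensemble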

Another important design issue involves how the results of each component classifier are to be combined. This is the emphasis of Sections 4, 5 and 6. The choice of coordination techniques should influence, and should be influenced by, the types of design decisions described above.
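The simplest such combination rule is a plurality vote over the classifiers' hard label outputs. A minimal NumPy sketch (assuming labels are small non-negative integers, as np.bincount requires):

    import numpy as np

    def plurality_vote(predictions):
        """predictions has shape (n_classifiers, n_samples); returns the
        most frequently voted label for each sample."""
        predictions = np.asarray(predictions)
        return np.array([np.bincount(col).argmax() for col in predictions.T])

    votes = [[0, 1, 1],   # classifier 1's labels for three samples
             [0, 1, 2],   # classifier 2
             [1, 1, 2]]   # classifier 3
    print(plurality_vote(votes))   # -> [0 1 2]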

There are two general approaches to designing ensembles. “Decision optimization” involves choosing and optimizing the classifier combination algorithm for a fixed ensemble of base classifiers. “Coverage optimization,” in contrast, involves optimizing the base classifiers for a given classifier combination algorithm. Some approaches, such as the mixture of experts model, discussed in Section 6.3, train the classifiers and combiner simultaneously.
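For example, a simple decision-optimization combiner leaves the trained classifiers untouched and fits only the combination weights; in the sketch below each classifier's vote is weighted by its accuracy on a held-out validation set (one illustrative weighting choice among many, not a method prescribed by this report):

    import numpy as np

    def accuracy_weighted_vote(val_preds, val_labels, test_preds, n_classes):
        """Base classifiers are fixed; only the combiner is fitted, here as
        one validation-accuracy weight per classifier. val_preds and
        test_preds have shape (n_classifiers, n_samples)."""
        val_preds = np.asarray(val_preds)
        test_preds = np.asarray(test_preds)
        weights = (val_preds == val_labels).mean(axis=1)  # per-classifier weight
        scores = np.zeros((test_preds.shape[1], n_classes))
        for w, preds in zip(weights, test_preds):
            scores[np.arange(len(preds)), preds] += w     # weighted vote
        return scores.argmax(axis=1)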
