java.lang.Object
  bodhidharma.classifiers.SupervisedClassifier
public abstract class SupervisedClassifier
An abstract class for designing a classifier that uses supervised learning to classify arbitrary feature sets. This implementation makes it possible to assign more than one label to a single feature set.
Feature sets are fed into the classifier as arrays of doubles. Categories are specified as arrays of Strings.
All classes extending this class should include a constructor that allows the parameters of the particular classifier to be set. This constructor should also set the categories, feature_names and identifier fields.
All classes extending this class should also include a constructor that takes only a String holding the path of an XML file containing the information necessary to reconstruct a trained classifier.
Use the train method to train the classifier (the getFeatureNames method is useful for determining the features that can be fed to the classifier).
Use the classify method to classify feature sets once training has been completed (the getCategories method is useful for determining what categories feature sets can be classified into and for determining the order of the categories when parsing classification results).
Use the save method to save the classifier and its current state to disk.
Use the getClassifierName and getClassifierParameters methods to obtain information about the classifier.
Use the getClassifierIdentifier method to get a name or code that was given to an instantiation of the classifier when it was constructed. This identifier can be used by external classes to identify the instantiation.
Field Summary | |
---|---|
protected java.lang.String[] |
categories
The possible categories into which feature sets can be classified. |
protected java.lang.String[] |
feature_names
The names of the different features which are used to perform classifications. |
protected java.lang.String |
identifier
An identifier that can be associated with the classifier so that outside classes can identify it. |
protected ProgressBarTaskTrainMonitor |
training_monitor
Used to monitor training progress. |
Constructor Summary | |
---|---|
SupervisedClassifier()
|
Method Summary | |
---|---|
abstract double[][] |
classify(double[][] feature_sets,
java.lang.String[] feature_labels)
Returns the relative scores of each of the possible categories when the given sets of features are classified. |
java.lang.String[] |
getCategories()
Returns the contents of the categories field. |
java.lang.String |
getClassifierIdentifier()
Returns a name or code that was given to this instantiation of the classifier when it was constructed. |
abstract java.lang.String |
getClassifierName()
Returns the name of the type of classifier. |
abstract java.lang.String |
getClassifierParameters()
Returns a String describing the parameters of the classifier. |
java.lang.String[] |
getFeatureNames()
Returns the contents of the feature_names field. |
protected boolean[][] |
getModelResults(java.lang.String[][] given_results)
Returns a 2-D array whose first index corresponds to the feature sets specified by the first index of the given_results parameter and whose second index corresponds to each of the categories in the categories field. |
protected double[][] |
getOrderedFeatureSets(double[][] feature_sets,
java.lang.String[] feature_labels)
Returns a 2-D array of doubles holding the contents of the feature_sets parameter, reordered so that the order of the features (as specified in the feature_labels parameter) is the same as in the feature_names field. |
abstract void |
save(java.io.File place_to_save)
Writes the SupervisedClassifier and its current state to the location given by place_to_save. |
void |
setTrainingMonitor(ProgressBarTaskTrainMonitor monitor)
Sets the training_monitor field to the given value. |
abstract double[] |
train(double[][] feature_sets,
java.lang.String[] feature_labels,
java.lang.String[][] model_categories,
int iterations,
double acceptable_threshold,
int consecutive_iterations)
Trains the SupervisedClassifier using the given feature sets. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected java.lang.String[] categories
protected java.lang.String[] feature_names
protected java.lang.String identifier
protected ProgressBarTaskTrainMonitor training_monitor
Constructor Detail |
---|
public SupervisedClassifier()
Method Detail |
---|
public java.lang.String[] getCategories()
public java.lang.String[] getFeatureNames()
public java.lang.String getClassifierIdentifier()
public abstract java.lang.String getClassifierName()
public abstract java.lang.String getClassifierParameters()
public void setTrainingMonitor(ProgressBarTaskTrainMonitor monitor)
public abstract double[] train(double[][] feature_sets, java.lang.String[] feature_labels, java.lang.String[][] model_categories, int iterations, double acceptable_threshold, int consecutive_iterations) throws java.lang.Exception
The feature_labels parameter specifies the names of each of the features in the feature_sets parameter. The features in the feature_sets parameter are automatically matched to the features in the feature_names field based on the content of the feature_labels parameter, unless a value of null is passed for feature_labels. In that case, the feature values in the feature_sets parameter are simply fed into the classifier in the order in which they occur.
The model_categories parameter gives the category or categories of each of the given feature sets. The first index corresponds to the feature set. The second index corresponds to the different possible categories for the given feature set. Only categories to which the feature set belongs should be included.
The iterations parameter specifies the number of training iterations performed. If a negative value is passed here, then the number of iterations to perform is calculated automatically based on the acceptable_threshold parameter, which specifies the absolute rate of change of the classification error below which training will stop, and the consecutive_iterations parameter, which specifies the number of consecutive iterations for which the rate of change must be below this threshold in order for training to stop. The number of iterations performed will never exceed the absolute value of the iterations parameter, regardless of the other parameters.
For example, if a value of 1000 is given for iterations, then 1000 iterations will be performed regardless of the other parameters. If a value of -1000 is given, then training will automatically stop if the absolute value of the rate of change of the classification error from one sample to the next falls below the acceptable_threshold parameter for consecutive_iterations iterations, but no more than 1000 iterations will be performed in any case.
NOTE: the last three parameters are ignored if a type of classifier is used that does not use more than one iteration.
NOTE: the way in which the classification error is calculated varies from implementation to implementation, but 0 always indicates perfect performance, and rising values indicate poorer performance.
The returned array holds the classification error measured after each training iteration. The index into the returned array corresponds to the training iteration that the error is associated with. A value of null is returned if the classifier does not provide this information.
An exception is thrown if the feature_labels do not contain the same names as feature_names (although a different ordering is permitted) or if any of the feature sets in feature_sets have a different number of features than feature_names. An exception is also thrown if feature_sets and model_categories have different sizes in their first dimension. An exception is also thrown if the model_categories parameter contains a name not present in the categories field or if it contains the same category more than once for a single feature set.
Throws: java.lang.Exception
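The iteration-count convention described above can be sketched as a small loop. This is only an illustration under stated assumptions, not the code of any actual subclass: the class name TrainingLoopSketch, the method run and the errorAfterIteration callback are hypothetical, and the "rate of change" is assumed here to be the absolute difference between the errors of two consecutive iterations.

```java
import java.util.Arrays;
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch of the iterations / threshold convention of train(). */
final class TrainingLoopSketch {

    /**
     * Returns the classification error recorded after each iteration performed.
     * errorAfterIteration stands in for one real training pass: it maps the
     * iteration index to the error measured after that pass (an assumption).
     */
    static double[] run(IntToDoubleFunction errorAfterIteration,
                        int iterations,
                        double acceptableThreshold,
                        int consecutiveIterations) {
        int maxIterations = Math.abs(iterations); // hard cap in both modes
        boolean stopEarly = iterations < 0;       // negative value enables early stopping
        double[] errors = new double[maxIterations];
        int performed = 0;
        int quietRun = 0; // consecutive iterations whose change stayed below the threshold

        for (int i = 0; i < maxIterations; i++) {
            errors[i] = errorAfterIteration.applyAsDouble(i);
            performed++;
            if (stopEarly && i > 0) {
                double change = Math.abs(errors[i] - errors[i - 1]);
                quietRun = (change < acceptableThreshold) ? quietRun + 1 : 0;
                if (quietRun >= consecutiveIterations) {
                    break;
                }
            }
        }
        // Trim the array so that its length equals the number of iterations performed.
        return Arrays.copyOf(errors, performed);
    }
}
```

With a positive iterations value the loop always runs to the cap. With a negative value it may stop sooner, but never runs more than the absolute value of iterations.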
public abstract double[][] classify(double[][] feature_sets, java.lang.String[] feature_labels) throws java.lang.Exception
The feature_sets parameter specifies the feature sets to be classified. The first index corresponds to different feature sets. The second index corresponds to different features in the given feature set. Note that all feature sets must use the same features in the same order as given in the feature_labels parameter.
The feature_labels parameter specifies the names of each of the features in the feature_sets parameter. The features in the feature_sets parameter are automatically matched to the features in the feature_names field based on the content of the feature_labels parameter, unless a value of null is passed for feature_labels. In that case, the feature values in the feature_sets parameter are simply fed into the classifier in the order in which they occur.
An exception is thrown if the feature_labels do not contain the same names as feature_names (although a different ordering is permitted) or if any of the feature sets in feature_sets have a different number of features than feature_names.
Throws: java.lang.Exception
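Since classify returns one row of relative scores per feature set, and the class description notes that getCategories gives the order of the categories in those results, a caller might turn a row of scores back into category names roughly as follows. This is a sketch only: ScoreReaderSketch, labelsAbove and the minScore cutoff are hypothetical names introduced for illustration, and the documentation does not prescribe any particular cutoff.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for reading one row of the array returned by classify(). */
final class ScoreReaderSketch {

    /**
     * Returns the names of the categories whose score is at least minScore.
     * scores is one row of the classify() result; categories is assumed to be
     * the array returned by getCategories(), in the same order as the scores.
     */
    static List<String> labelsAbove(double[] scores, String[] categories, double minScore) {
        if (scores.length != categories.length) {
            throw new IllegalArgumentException(
                    "Expected " + categories.length + " scores but got " + scores.length);
        }
        List<String> labels = new ArrayList<>();
        for (int c = 0; c < categories.length; c++) {
            if (scores[c] >= minScore) {
                labels.add(categories[c]); // more than one label may be assigned
            }
        }
        return labels;
    }
}
```

Because a feature set may belong to several categories, the helper returns a list rather than a single best label.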
public abstract void save(java.io.File place_to_save) throws java.lang.Exception
Writes the SupervisedClassifier and its current state to the location given by place_to_save.
If the particular SupervisedClassifier needs to write more than one file, then a directory should be passed to this method's parameter, where the appropriate files will be written. Throws an exception if a problem occurs during saving.
Throws: java.lang.Exception
protected double[][] getOrderedFeatureSets(double[][] feature_sets, java.lang.String[] feature_labels) throws java.lang.Exception
An exception is thrown if the feature_labels parameter contains any labels not found in the feature_names field or if there are any labels found in the feature_names field that are not in the feature_labels parameter. The thrown exception contains a description of the problem encountered. An exception is also thrown if any of the feature sets contain a number of features different from the number of feature labels.
This method is intended for use by the train and classify methods.
The arrays passed as parameters are not altered.
Throws: java.lang.Exception
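The reordering and validation described for getOrderedFeatureSets could look roughly like the following standalone sketch. It is not the library's implementation: FeatureOrderSketch and reorder are hypothetical names, the feature_names field is passed in as a parameter so the example is self-contained, a null feature_labels value is not handled, and IllegalArgumentException is used in place of the plain java.lang.Exception declared by the real method.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of reordering feature sets to match a reference order. */
final class FeatureOrderSketch {

    static double[][] reorder(double[][] featureSets,
                              String[] featureLabels,
                              String[] featureNames) {
        // Map each supplied label to its column in the incoming feature sets.
        Map<String, Integer> columnOfLabel = new HashMap<>();
        for (int i = 0; i < featureLabels.length; i++) {
            if (columnOfLabel.put(featureLabels[i], i) != null) {
                throw new IllegalArgumentException("Duplicate feature label: " + featureLabels[i]);
            }
        }
        if (featureLabels.length != featureNames.length) {
            throw new IllegalArgumentException("Expected " + featureNames.length
                    + " feature labels but got " + featureLabels.length);
        }
        // For every reference name, find the column that holds it in the input.
        int[] sourceColumn = new int[featureNames.length];
        for (int n = 0; n < featureNames.length; n++) {
            Integer column = columnOfLabel.get(featureNames[n]);
            if (column == null) {
                throw new IllegalArgumentException("Missing feature label: " + featureNames[n]);
            }
            sourceColumn[n] = column;
        }
        // Copy into a new array so that the arrays passed in are not altered.
        double[][] ordered = new double[featureSets.length][featureNames.length];
        for (int s = 0; s < featureSets.length; s++) {
            if (featureSets[s].length != featureLabels.length) {
                throw new IllegalArgumentException("Feature set " + s + " has "
                        + featureSets[s].length + " features, expected " + featureLabels.length);
            }
            for (int n = 0; n < featureNames.length; n++) {
                ordered[s][n] = featureSets[s][sourceColumn[n]];
            }
        }
        return ordered;
    }
}
```

For instance, with reference names tempo, pitch, loudness and labels pitch, loudness, tempo, the feature set {1, 2, 3} is returned as {3, 1, 2}.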
protected boolean[][] getModelResults(java.lang.String[][] given_results) throws java.lang.Exception
This method is intended for use by the train method to process its model_categories parameter, which is passed to the given_results parameter of this method. The given_results parameter consists of feature sets (first index) and the category names (second index) that each feature set belongs to.
An exception is thrown if the given_results parameter contains a name not present in the categories field or if it contains the same category more than once.
Throws: java.lang.Exception
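The conversion from category names to a boolean membership array might be sketched as below. Again, this is an illustration under assumptions rather than the library's code: ModelResultsSketch and toMembership are hypothetical names, the categories field is passed in as a parameter, and IllegalArgumentException stands in for the java.lang.Exception declared by the real method.

```java
/** Hypothetical sketch of turning category names into a membership matrix. */
final class ModelResultsSketch {

    /**
     * Returns an array whose first index is the feature set and whose second
     * index is the category, in the order of the categories parameter. An entry
     * is true when the feature set belongs to that category.
     */
    static boolean[][] toMembership(String[][] givenResults, String[] categories) {
        boolean[][] membership = new boolean[givenResults.length][categories.length];
        for (int s = 0; s < givenResults.length; s++) {
            for (String name : givenResults[s]) {
                int index = indexOf(categories, name);
                if (index < 0) {
                    throw new IllegalArgumentException("Unknown category: " + name);
                }
                if (membership[s][index]) {
                    throw new IllegalArgumentException(
                            "Category listed twice for feature set " + s + ": " + name);
                }
                membership[s][index] = true;
            }
        }
        return membership;
    }

    /** Linear search; returns -1 when the name is not a known category. */
    private static int indexOf(String[] categories, String name) {
        for (int c = 0; c < categories.length; c++) {
            if (categories[c].equals(name)) {
                return c;
            }
        }
        return -1;
    }
}
```

With categories Jazz, Rock, Blues, the input {{"Rock"}, {"Blues", "Jazz"}} yields {{false, true, false}, {true, false, true}}.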