java.lang.Object
  ace.CrossValidator
public class CrossValidator
Cross validates a set of Weka Instances.
Instances are partitioned into folds. Different instances are used for training and testing in each fold. A Weka Classifier is trained and tested for each fold. Results are used to evaluate the performance of various classification techniques, including feature selection and classifier ensembles. Methods of this class are used both in the context of a single cross validation and in the context of experimentation where multiple cross validations are performed.
Instances are partitioned in the constructor of this class. An array of integers generated by a call to generatePartitionArray stores the directions for the partitioning. This array is generated either in the constructor (in the context of a single cross validation) or in the Experimenter class (in the context of experimentation).
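The role of the partition array can be illustrated with a minimal sketch. It assumes that entry i of the array holds the index of the fold in which instance i is used for testing, and that the instance is used for training in every other fold. This reading is consistent with the description of generatePartitionArray, but it is an assumption, not a statement of how this class is implemented. The class FoldSplitSketch and its methods are hypothetical and work on plain indices rather than Weka Instances.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the ace package.
final class FoldSplitSketch {

    /** Indices of the instances assigned to the given fold (assumed to be the test set). */
    static List<Integer> testIndices(int[] partition, int fold) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < partition.length; i++) {
            if (partition[i] == fold) {
                result.add(i);
            }
        }
        return result;
    }

    /** Indices of all instances not assigned to the given fold (assumed to be the training set). */
    static List<Integer> trainIndices(int[] partition, int fold) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < partition.length; i++) {
            if (partition[i] != fold) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] partition = {0, 2, 1, 0, 1, 2}; // made-up fold assignment for six instances
        int numFolds = 3;
        for (int fold = 0; fold < numFolds; fold++) {
            System.out.println("fold " + fold
                    + " test=" + testIndices(partition, fold)
                    + " train=" + trainIndices(partition, fold));
        }
    }
}
```

Under this assumption every instance is tested exactly once across the folds, which is why the same array can be shared between several classifiers during experimentation.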
Constructor Summary | |
---|---|
CrossValidator(weka.core.Instances instances, int[] partition, int num_folds) | This constructor will be called in the context of Experimentation when multiple Classifiers and types of dimensionality reduction are being used. |
CrossValidator(weka.core.Instances instances, int num_folds, java.lang.String[] identifiers) | This constructor will be called when a single cross validation is being performed with only one type of Classifier and one type of dimensionality reduction. |
Method Summary | |
---|---|
java.lang.String | crossValidate(TrainedModel trained, CrossValidationResults[] cvres, weka.core.Instances instances, java.io.OutputStream out, java.lang.StringBuffer cv_results, java.lang.String file_name, java.lang.String feature_selector, boolean save_intermediate_arffs, boolean verbose, int i) Cross validates a set of Weka Instances. |
static int[] | generatePartitionArray(int num_folds, int num_instances) Generates an array of evenly distributed random numbers between 0 and num_folds - 1 to be used during the partitioning of Instances into cross validation folds. |
static java.lang.StringBuffer | getClassifications(weka.core.Instances actual, weka.core.Instances predicted, weka.core.Instances training, java.lang.String[][] identifiers) Prints the training instances and testing instances with their corresponding model and predicted classification. |
static java.lang.String[] | getClassNames(weka.core.Instances instances) Gets the names of the possible classes into which an instance of the given data set could be classified. |
double[][] | getOverallConfusionMatrix(double[][][] confusion_matrices) Gets the confusion matrix for the cross validation as a whole from the confusion matrices of each fold. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public CrossValidator(weka.core.Instances instances, int num_folds, java.lang.String[] identifiers)
instances - The Weka Instances to be used for cross validation.
num_folds - The number of folds into which the Instances should be partitioned.
identifiers - Array of unique identifiers for the given Instances. These identifiers will be partitioned alongside the Instances.
public CrossValidator(weka.core.Instances instances, int[] partition, int num_folds)
instances - The instances to be used for cross validation.
partition - An array of numbers that specify the way in which the given Instances should be partitioned. This parameter is necessary in the context of
num_folds - The number of folds into which the Instances should be partitioned.
Method Detail |
---|
public java.lang.String crossValidate(TrainedModel trained, CrossValidationResults[] cvres, weka.core.Instances instances, java.io.OutputStream out, java.lang.StringBuffer cv_results, java.lang.String file_name, java.lang.String feature_selector, boolean save_intermediate_arffs, boolean verbose, int i) throws java.lang.Exception
trained - The Serializable object that stores the Weka Classifier and dimensionality reduction objects.
cvres - Holds the results of the cross validation. In the context of a single cross validation, an array of size one is passed. In the context of experimentation, the array will have a cell for each Classifier that is being tested.
instances - The Weka Instances to use in cross validation.
out - A progress report is printed to this OutputStream.
cv_results - Results of the cross validation are appended to this StringBuffer. In the context of a single cross validation, this will be instantiated in Coordinator and will only be accessed by this method. In the context of experimentation, Experimenter will instantiate and also write to this StringBuffer.
file_name - The file to save the results to. The content of this file will be the same as the returned string. This string will always be null in the context of experimentation because Experimenter writes the results to a file itself.
feature_selector - The name of the feature selector being used for this cross validation. If null, "None" will be printed in the results string as the type of dimensionality reduction performed.
save_intermediate_arffs - Whether or not to save training data to an arff file after parsing, after thinning, and again after feature selection, if any.
verbose - Whether or not to print and save a detailed report of the cross validation, including the partitioning and classification of individual instances and a detailed report of the dimensionality reduction that was performed. Incorrect classifications are marked with an asterisk.
i - The index of the array of CrossValidationResults objects to access. This will always be 0 in the context of a single cross validation.
java.lang.Exception - If a problem occurs.
public static int[] generatePartitionArray(int num_folds, int num_instances)
num_folds - The number of folds into which the instances should be divided.
num_instances - The number of instances to be partitioned.
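One way to obtain evenly distributed random fold numbers is sketched below: assign fold numbers round-robin, then shuffle the array so that fold sizes differ by at most one. This is only an illustration of the idea, not necessarily the algorithm used by this method. The class PartitionArraySketch and the extra Random parameter (added so that results can be reproduced with a seed) are assumptions and do not appear in the documented signature.

```java
import java.util.Random;

// Hypothetical stand-in for the documented method, with an added Random parameter.
final class PartitionArraySketch {

    static int[] generatePartitionArray(int numFolds, int numInstances, Random rng) {
        if (numFolds < 1 || numInstances < 0) {
            throw new IllegalArgumentException("numFolds must be >= 1 and numInstances >= 0");
        }
        int[] partition = new int[numInstances];
        // Round-robin assignment guarantees fold sizes differ by at most one.
        for (int i = 0; i < numInstances; i++) {
            partition[i] = i % numFolds;
        }
        // Fisher-Yates shuffle randomizes which instance lands in which fold.
        for (int i = numInstances - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = partition[i];
            partition[i] = partition[j];
            partition[j] = tmp;
        }
        return partition;
    }

    public static void main(String[] args) {
        int[] partition = generatePartitionArray(3, 10, new Random(42));
        System.out.println(java.util.Arrays.toString(partition));
    }
}
```

With 10 instances and 3 folds, one fold receives four instances and the other two receive three each, whatever the seed.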
public double[][] getOverallConfusionMatrix(double[][][] confusion_matrices)
confusion_matrices - 3D array containing the confusion matrices for each fold. First index will be the fold.
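A plausible way to combine per-fold matrices is an element-wise sum, sketched below. The documentation does not state how the overall matrix is computed, so the summation is an assumption. The class ConfusionMatrixSketch is hypothetical, and the sketch further assumes that all folds share the same class ordering and dimensions.

```java
// Hypothetical helper, not part of the ace package.
final class ConfusionMatrixSketch {

    /** Sums the confusion matrices of all folds cell by cell; the first index is the fold. */
    static double[][] overall(double[][][] perFold) {
        if (perFold.length == 0) {
            throw new IllegalArgumentException("at least one fold is required");
        }
        int rows = perFold[0].length;
        int cols = perFold[0][0].length;
        double[][] total = new double[rows][cols];
        for (double[][] fold : perFold) {
            for (int r = 0; r < rows; r++) {
                for (int c = 0; c < cols; c++) {
                    total[r][c] += fold[r][c];
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double[][][] folds = {
            {{3, 1}, {0, 4}},  // made-up counts for fold 0
            {{2, 0}, {1, 5}}   // made-up counts for fold 1
        };
        System.out.println(java.util.Arrays.deepToString(overall(folds)));
    }
}
```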
public static java.lang.String[] getClassNames(weka.core.Instances instances)
instances - The Weka Instances in question.
public static java.lang.StringBuffer getClassifications(weka.core.Instances actual, weka.core.Instances predicted, weka.core.Instances training, java.lang.String[][] identifiers)
actual - The testing set of Weka Instances.
predicted - The classified Weka Instances that were returned by classification.
training - The training set of Weka Instances.
identifiers - 2D array containing the identifiers for the training and testing data. First index contains identifiers for training data. Second index contains identifiers for testing data.
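The layout of the identifiers array and the marking of incorrect classifications can be sketched as follows. To stay self-contained, the sketch takes class labels as plain strings instead of Weka Instances. The class ClassificationReportSketch, its method, and the tab-separated output format are all assumptions made for illustration; the actual report format is not specified here.

```java
// Hypothetical helper, not part of the ace package.
final class ClassificationReportSketch {

    /**
     * Builds one line per test instance: identifier, actual class, predicted class.
     * identifiers[0] holds training identifiers, identifiers[1] holds testing identifiers.
     */
    static StringBuffer report(String[] actual, String[] predicted, String[][] identifiers) {
        String[] testIds = identifiers[1];
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < actual.length; i++) {
            out.append(testIds[i]).append('\t')
               .append(actual[i]).append('\t')
               .append(predicted[i]);
            if (!actual[i].equals(predicted[i])) {
                out.append(" *"); // mark an incorrect classification
            }
            out.append('\n');
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] ids = {{"train_a", "train_b"}, {"test_a", "test_b"}}; // made-up identifiers
        String[] actual = {"jazz", "rock"};
        String[] predicted = {"jazz", "blues"};
        System.out.print(report(actual, predicted, ids));
    }
}
```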