bodhidharma.classifiers
Class ClassificationResultsInterpereter

java.lang.Object
  extended by bodhidharma.classifiers.ClassificationResultsInterpereter

public class ClassificationResultsInterpereter
extends java.lang.Object

Part of the Bodhidharma automatic music classification project. This class is used to interpret the results of classifications made by a ClassificationPanel object. How the results are calculated depends in part on the contents of the PreferencesPanel given to the constructor.

See Also:
ClassificationPanel, PreferencesPanel
Author:
Cory McKay

Constructor Summary
ClassificationResultsInterpereter(PreferencesPanel preferences)
          Basic constructor that sets the preferences used for calculation of results.
 
Method Summary
 double[][] getConfusionMatrix(double[][] classifier_scores, java.lang.String[][] model_classifications, java.lang.String[] possible_categories)
          Returns a confusion matrix showing how recordings were classified compared to how they should have been classified.
 double[] getSuccessRate(double[][] classifier_results, java.lang.String[][] model_classifications, java.lang.String[] classifier_categories)
          Returns the following three statistics about the success of the given classifier results based on the given model classifications:
 java.lang.String[] getWinnerLabels(double[][] classifier_scores, java.lang.String[] possible_categories, java.lang.String[][] model_classifications, boolean report_scores, boolean report_second_choices)
          Returns a formatted set of strings containing the categories which are judged to be winners by the classification system for each recording.
 boolean[][] getWinners(double[][] classifier_scores)
          Returns true for the winning categories for each recording and false otherwise.
 float[][] getWinningScores(double[][] classifier_scores)
          Determines which categories are winners, which are second choices and which are losers given a set of classifier scores for a set of recordings.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClassificationResultsInterpereter

public ClassificationResultsInterpereter(PreferencesPanel preferences)
Basic constructor that sets the preferences used for calculation of results.

Method Detail

getConfusionMatrix

public double[][] getConfusionMatrix(double[][] classifier_scores,
                                     java.lang.String[][] model_classifications,
                                     java.lang.String[] possible_categories)
                              throws java.lang.Exception
Returns a confusion matrix showing how recordings were classified compared to how they should have been classified. The criteria used to judge winners are given in the getWinningScores method's description.

The rows (first index) of the returned array correspond to how the recordings were classified, and the columns (second index) to how they should have been classified. The order of the labels of both the rows and the columns corresponds to that of the possible_categories parameter. One extra row (the last) is included for unknown classifications.

Classifications (and models) consisting of multiple categories are spread over the matrix fractionally for each category, so that every recording has the same weighting (1 in total).

An exception is thrown if model_classifications contains a category that is not in possible_categories.
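
The fractional spreading described above can be illustrated with a minimal sketch. It is not the Bodhidharma implementation: the class ConfusionMatrixSketch, its build method and the boolean winners input are hypothetical, and two assumptions are made that the description does not confirm. First, a recording with k winning categories and m model categories adds 1/(k*m) to each affected cell. Second, a recording with no winning category is counted in the extra "unknown" row.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the project's code.
class ConfusionMatrixSketch {
    // winners[r][c] is true if category c was chosen for recording r.
    // model[r] lists the correct category names for recording r.
    static double[][] build(boolean[][] winners, String[][] model, String[] categories) {
        List<String> names = Arrays.asList(categories);
        int n = categories.length;
        // Rows: how recordings were classified (last row = unknown).
        // Columns: how they should have been classified.
        double[][] matrix = new double[n + 1][n];
        for (int rec = 0; rec < winners.length; rec++) {
            int predicted = 0;
            for (boolean w : winners[rec]) {
                if (w) predicted++;
            }
            for (String label : model[rec]) {
                int col = names.indexOf(label);
                if (col < 0) {
                    throw new IllegalArgumentException("Unknown model category: " + label);
                }
                // Each model category receives an equal share of the recording's weight of 1.
                double share = 1.0 / model[rec].length;
                if (predicted == 0) {
                    matrix[n][col] += share; // assumption: no winner means "unknown"
                } else {
                    for (int row = 0; row < n; row++) {
                        if (winners[rec][row]) matrix[row][col] += share / predicted;
                    }
                }
            }
        }
        return matrix;
    }
}
```

Under these assumptions every recording that has at least one model category contributes exactly 1 to the sum of all cells, which matches the equal weighting stated above.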

Parameters:
classifier_scores - Classification scores in the form of relative scores for each of the possible categories for each recording (not normalized). Scores should fall in the range between 0.0 and 1.0, and higher scores correspond to a greater certainty that a recording has the corresponding label. The first index corresponds to the recording and the second corresponds to the category.
possible_categories - The names of the categories into which classifier_scores can be classified. The index of this array corresponds to the second index of classifier_scores.
model_classifications - The names of the correct categories for each recording. The method returns null if null is passed here. The first index indicates the recording. There is one entry for each correct category for each recording.
Throws:
java.lang.Exception

getWinnerLabels

public java.lang.String[] getWinnerLabels(double[][] classifier_scores,
                                          java.lang.String[] possible_categories,
                                          java.lang.String[][] model_classifications,
                                          boolean report_scores,
                                          boolean report_second_choices)
                                   throws java.lang.Exception
Returns a formatted set of strings containing the categories which are judged to be winners by the classification system for each recording. The criteria used to judge this are given in the getWinningScores method's description. The index of the returned array indicates the recording, and corresponds to the first index of the classifier_scores array.

Additional information that can be included in the report for each recording includes model classifications (null can be sent to the model_classifications parameter if these are not known), the classifier's secondary choices and the actual scores for each chosen category.

One, more than one or no categories may be selected for each recording.

An exception is thrown if the user preferences have inappropriate values.

Parameters:
classifier_scores - Classification scores in the form of relative scores for each of the possible categories for each recording (not normalized). Scores should fall in the range between 0.0 and 1.0, and higher scores correspond to a greater certainty that a recording has the corresponding label. The first index corresponds to the recording and the second corresponds to the category.
possible_categories - The names of the categories into which classifier_scores can be classified. The index of this array corresponds to the second index of classifier_scores.
model_classifications - The names of the correct categories for each recording. Ignored if null is passed here. The first index indicates the recording. There is one entry for each correct category for each recording.
report_scores - Whether or not to include reports of score values along with winning categories.
report_second_choices - Whether or not to report second choices along with winning categories.
Throws:
java.lang.Exception

getSuccessRate

public double[] getSuccessRate(double[][] classifier_results,
                               java.lang.String[][] model_classifications,
                               java.lang.String[] classifier_categories)
                        throws java.lang.Exception
Returns the following three statistics about the success of the given classifier results based on the given model classifications:

- The percentage of recordings where the correct categories were found among the first choices. The value for a single recording is the number of correct first choices divided by the number of model categories for that recording, and this value is averaged across recordings.

- The percentage of recordings where at least one category that was missed in the first choices was present in the second choices.

- The percentage of recordings where at least one extra first choice was present that was not in the model categories.

The returned array has these values in entries 0, 1 and 2 respectively.

A value of null is returned if model_classifications is null.

An exception is thrown if the user preferences have inappropriate values.
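
As a rough illustration of the three statistics listed above, the following sketch computes them from boolean membership arrays. The class SuccessRateSketch and its rates method are hypothetical and are not part of this API. The sketch assumes that first choices, second choices and model categories have already been resolved to one flag per recording and category, and that each percentage is expressed on a 0 to 100 scale.

```java
// Hypothetical sketch, not the project's code.
class SuccessRateSketch {
    // first[r][c], second[r][c] and model[r][c] are true if category c is a
    // first choice, a second choice, or a model category for recording r.
    static double[] rates(boolean[][] first, boolean[][] second, boolean[][] model) {
        int recordings = model.length;
        double correctSum = 0.0; // sum of per-recording fractions of model categories found
        int rescued = 0;         // recordings with a missed category among the second choices
        int extras = 0;          // recordings with a first choice outside the model categories
        for (int r = 0; r < recordings; r++) {
            int modelCount = 0;
            int hits = 0;
            boolean missedButSecond = false;
            boolean extraFirst = false;
            for (int c = 0; c < model[r].length; c++) {
                if (model[r][c]) {
                    modelCount++;
                    if (first[r][c]) {
                        hits++;
                    } else if (second[r][c]) {
                        missedButSecond = true;
                    }
                } else if (first[r][c]) {
                    extraFirst = true;
                }
            }
            if (modelCount > 0) correctSum += (double) hits / modelCount;
            if (missedButSecond) rescued++;
            if (extraFirst) extras++;
        }
        // Entries 0, 1 and 2, in the order in which the statistics are listed.
        return new double[] {
            100.0 * correctSum / recordings,
            100.0 * rescued / recordings,
            100.0 * extras / recordings
        };
    }
}
```

The per-recording fraction in entry 0 means that a recording with two model categories, only one of which is a first choice, counts as half correct.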

Parameters:
classifier_results - Classification scores in the form of relative scores for each of the possible categories for each recording (not normalized). Scores should fall in the range between 0.0 and 1.0, and higher scores correspond to a greater certainty that a recording has the corresponding label. The first index corresponds to the recording and the second corresponds to the category.
model_classifications - The correct category(ies) for each of the recordings. The index identifies the recording that the category(ies) correspond to. A value of null indicates that model results should not be printed.
classifier_categories - The categories into which the classifiers classified test recordings.
Throws:
java.lang.Exception

getWinners

public boolean[][] getWinners(double[][] classifier_scores)
                       throws java.lang.Exception
Returns true for the winning categories for each recording and false otherwise. See the getWinningScores method for details on how this is determined.

Parameters:
classifier_scores - Classification scores in the form of relative scores for each of the possible categories for each recording (not normalized). Scores should fall in the range between 0.0 and 1.0, and higher scores correspond to a greater certainty that a recording has the corresponding label. The first index corresponds to the recording and the second corresponds to the category.
Returns:
The winners. The first index indicates the recording and the second indicates the category.
Throws:
java.lang.Exception

getWinningScores

public float[][] getWinningScores(double[][] classifier_scores)
                           throws java.lang.Exception
Determines which categories are winners, which are second choices and which are losers given a set of classifier scores for a set of recordings. See below for a description of the returned data. A given recording can have no, one or many winners.

All categories that meet one of the two following conditions for a recording are considered winners (NOTE THAT BELOW SPECIFIC VALUES ARE DEFAULTS. ACTUAL VALUES ARE SET BY USER PREFERENCES):

- Have a score over 0.5
- Have a score within 20% of the highest score for the recording, and also a score of at least 0.25

All categories that meet both of the following conditions for a recording are considered second choices:

- Are not winners
- Have a score within 30% of the highest score for the recording, and also a score of at least 0.2
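
The rules above can be sketched for a single recording as follows. This is an illustration only, not the project's implementation: the class WinnerRuleSketch, the Rank enum and the constant names are hypothetical, the thresholds are hard-coded to the quoted defaults instead of being read from a PreferencesPanel, and "within 20% of the highest score" is assumed to mean at least 80% of the highest score (a relative margin, which the description does not confirm).

```java
import java.util.Arrays;

// Hypothetical sketch, not the project's code.
class WinnerRuleSketch {
    enum Rank { WINNER, SECOND_CHOICE, LOSER }

    // Defaults quoted in the description; the real values come from user preferences.
    static final double WIN_ABSOLUTE = 0.5;
    static final double WIN_MARGIN = 0.20;
    static final double WIN_MINIMUM = 0.25;
    static final double SECOND_MARGIN = 0.30;
    static final double SECOND_MINIMUM = 0.2;

    // Ranks every category of one recording given its classifier scores.
    static Rank[] rank(double[] scores) {
        double top = Arrays.stream(scores).max().orElse(0.0);
        Rank[] ranks = new Rank[scores.length];
        for (int i = 0; i < scores.length; i++) {
            double s = scores[i];
            // Winner: over the absolute threshold, or close enough to the top
            // score while also reaching the minimum winning score.
            boolean winner = s > WIN_ABSOLUTE
                    || (s >= top * (1.0 - WIN_MARGIN) && s >= WIN_MINIMUM);
            // Second choice: not a winner, but inside the wider margin and
            // above the lower minimum.
            boolean second = !winner
                    && s >= top * (1.0 - SECOND_MARGIN)
                    && s >= SECOND_MINIMUM;
            ranks[i] = winner ? Rank.WINNER : (second ? Rank.SECOND_CHOICE : Rank.LOSER);
        }
        return ranks;
    }
}
```

With these defaults, scores of 0.1 and 0.1 for a recording yield no winner at all, because neither score reaches the 0.25 minimum. This is consistent with the statement that a recording can have no winners.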

An exception is thrown if the user preferences have inappropriate values.

Parameters:
classifier_scores - Classification scores in the form of relative scores for each of the possible categories for each recording (not normalized). Scores should fall in the range between 0.0 and 1.0, and higher scores correspond to a greater certainty that a recording has the corresponding label. The first index corresponds to the recording and the second corresponds to the category.
Returns:
The classifier scores of winners and second choices. The first index indicates the recording and the second indicates the category. A score of -10 means that the recording is not classified as belonging to the category, a positive number means that the category is a first choice (the number is the classifier's score) and a negative number (other than -10) means that the category is a second choice for the recording.
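
A caller might decode the returned values along these lines. The helper class WinningScoresDecoder and its methods are hypothetical and not part of this API. The handling of an exact 0 is an assumption, since the description above does not cover that value. Deriving a boolean winner array from the sign, as toWinners does, is only a guess at how getWinners relates to this method.

```java
// Hypothetical helper, not the project's code.
class WinningScoresDecoder {
    // Sentinel documented for "not classified as belonging to the category".
    static final float NOT_CLASSIFIED = -10f;

    // Describes one entry of the array returned by getWinningScores.
    static String describe(float value) {
        if (value == NOT_CLASSIFIED) return "not classified";
        if (value > 0f) return "first choice";
        if (value < 0f) return "second choice";
        return "undocumented"; // assumption: 0 is not described in the documentation
    }

    // Marks first choices only; second choices and unclassified entries are false.
    static boolean[][] toWinners(float[][] winningScores) {
        boolean[][] winners = new boolean[winningScores.length][];
        for (int r = 0; r < winningScores.length; r++) {
            winners[r] = new boolean[winningScores[r].length];
            for (int c = 0; c < winningScores[r].length; c++) {
                winners[r][c] = winningScores[r][c] > 0f;
            }
        }
        return winners;
    }
}
```

The check against -10 comes first because -10 is itself negative and would otherwise be reported as a second choice.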
Throws:
java.lang.Exception