ace.datatypes
Class SegmentedClassification

java.lang.Object
  extended by ace.datatypes.SegmentedClassification
All Implemented Interfaces:
java.io.Serializable

public class SegmentedClassification
extends java.lang.Object
implements java.io.Serializable

Objects of this class each hold classifications for an instance. These classifications can be model classifications or the output of a classifier.

Each SegmentedClassification object may be divided into sections, each with its own class. If only overall classifications are needed for an instance, then there is no need for sub-sections, as each SegmentedClassification can have its own class(es). Each instance may be classified as belonging to one, multiple or no classes.

Meta-information can also be stored regarding each SegmentedClassification object.

Static methods are provided to extract the labels (both overall and for sub-sections) of DataSets based on SegmentedClassifications.

See Also:
Serialized Form

Field Summary
 java.lang.String[] classifications
          The class(es) that this top-level SegmentedClassification or section belongs to.
 java.lang.String identifier
          The name of the dataset referred to by a top-level SegmentedClassification.
 java.lang.String[] misc_info_info
          Can store various pieces of meta-data regarding an instance.
 java.lang.String[] misc_info_key
          Stores titles identifying the meta-data in the misc_info_info field.
 java.lang.String role
          Can be used internally by ACE to determine what a particular instance is for (e.g. training, testing, resulting classification of an unknown).
 double start
          Identifies the start of a sub-classification.
 double stop
          Identifies the end of a sub-classification.
 SegmentedClassification[] sub_classifications
          Classifications corresponding to sub-sections of an instance.
 
Constructor Summary
SegmentedClassification()
          Generate an empty SegmentedClassification with the name "Undefined Segmented Classification".
SegmentedClassification(weka.core.Instance instance, int i)
          Generates a SegmentedClassification from a Weka ARFF file.
SegmentedClassification(java.lang.String identifier, double start, double stop, java.lang.String[] classifications, java.lang.String[] misc_info, java.lang.String[] misc_info_key, SegmentedClassification[] sub_classifications)
          Explicitly creates a new SegmentedClassification.
 
Method Summary
static SegmentedClassification findMatchingClassification(DataSet instance, SegmentedClassification[] classes)
          Compares the given DataSet object to each of the SegmentedClassification objects in order to find a SegmentedClassification object with the same identifier as the given DataSet object.
 java.lang.String getClassificationDescription(int depth)
          Generate a formatted String detailing the contents of this SegmentedClassification.
static java.lang.String getClassificationDescriptions(SegmentedClassification[] seg_classes)
          Returns a formatted text description of the given SegmentedClassification objects.
static java.lang.String[] getLeafClasses(SegmentedClassification[] seg_classes)
          Returns an array containing the names of all classes that any instances or sub-sections of the given SegmentedClassifications belong to.
static java.lang.String[][][] getMergedSectionalClassifications(SegmentedClassification[] sub_classes, DataSet[] sub_set, double[][][] times)
          Finds the timeframes in which the sub-sets overlap with the sub-classifications.
static int getNumberOverallInstancesBelongingToClass(SegmentedClassification[] model_classifications, java.lang.String class_of_interest)
          Returns the number of given instances that belong to the given class.
static int getNumberSectionsInInstancesBelongingToClass(SegmentedClassification[] model_classifications, java.lang.String class_of_interest)
          Returns the number of given instances that have sections belonging to the given class.
static java.lang.String[][] getOverallLabelsOfDataSets(DataSet[] data_sets, SegmentedClassification[] set_classifications)
          Returns a 2-D array describing the top-level label(s) of the given DataSets, according to the given SegmentedClassifications.
static java.lang.String[][][] getSubSectionLabelsOfDataSets(DataSet[] data_sets, SegmentedClassification[] set_classifications)
          Returns a 3-D array describing the sub-section label(s) of the given DataSets, according to the given SegmentedClassifications.
static void mergeAdjacentSections(SegmentedClassification verbose_classifications)
          Merges any overlapping adjacent sub-sections of the given SegmentedClassification if they belong to the same class(es) and the end of the first matches the start of the second.
static SegmentedClassification[] parseClassificationsFile(java.lang.String model_classification_file_path)
          Parses a classifications_file XML file and returns an array of SegmentedClassification objects holding its contents.
static void saveClassifications(SegmentedClassification[] seg_classifications, java.io.File to_save_to, java.lang.String comments)
          Saves a classifications_file XML file with the contents specified in the given SegmentedClassification array and the comments specified in the comments parameter.
static boolean verifyUniquenessOfIdentifiers(SegmentedClassification[] seg_classes)
          Verifies that no two of the given SegmentedClassifications refer to data sets with the same identifier.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

identifier

public java.lang.String identifier
The name of the dataset referred to by a top-level SegmentedClassification. Should be set to null for SegmentedClassifications that correspond to sub-sections. Should never be null for top-level SegmentedClassifications.


classifications

public java.lang.String[] classifications
The class(es) that this top-level SegmentedClassification or section belongs to. It is possible to have zero, one or multiple classifications for a given instance. If no classifications are present, then this should be null.


misc_info_info

public java.lang.String[] misc_info_info
Can store various pieces of meta-data regarding an instance. Entries correspond to entries of the misc_info_key field. Should be set to null for SegmentedClassifications that correspond to sub-sections. Set to null if no meta-data is stored.


misc_info_key

public java.lang.String[] misc_info_key
Stores titles identifying the meta-data in the misc_info_info field. Entries correspond to entries of the misc_info_info field. Should be set to null for SegmentedClassifications that correspond to sub-sections. Set to null if no meta-data is stored.


role

public java.lang.String role
Can be used internally by ACE to determine what a particular instance is for (e.g. training, testing, resulting classification of an unknown). Set to null if not used.


sub_classifications

public SegmentedClassification[] sub_classifications
Classifications corresponding to sub-sections of an instance. Set to null if there are no sub-sections.


start

public double start
Identifies the start of a sub-classification. Set to NaN if this object is a top-level SegmentedClassification.


stop

public double stop
Identifies the end of a sub-classification. Set to NaN if this object is a top-level SegmentedClassification.

Constructor Detail

SegmentedClassification

public SegmentedClassification()
Generate an empty SegmentedClassification with the name "Undefined Segmented Classification".


SegmentedClassification

public SegmentedClassification(java.lang.String identifier,
                               double start,
                               double stop,
                               java.lang.String[] classifications,
                               java.lang.String[] misc_info,
                               java.lang.String[] misc_info_key,
                               SegmentedClassification[] sub_classifications)
Explicitly creates a new SegmentedClassification.

Parameters:
identifier - Unique identifier for this instance.
start - Time at which this section begins.
stop - Time at which this section ends.
misc_info - Miscellaneous information about this instance.
classifications - Array of classifications for this instance.
sub_classifications - The sub-classifications of this classification.
misc_info_key - Specifies what type of meta data is contained in misc_info.
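
The field conventions described above (a non-null identifier and NaN start and stop for a top-level object; a null identifier and null meta-data for sub-sections) can be illustrated with a minimal sketch. The class SegmentedClassificationSketch below is a hypothetical stand-in that only mirrors the documented public fields and the argument order of this constructor; it is not the ACE class, and the identifier, labels, times and meta-data are invented for illustration.

```java
// Hypothetical stand-in mirroring the documented public fields and the
// 7-argument constructor. It is NOT the ACE class itself.
class SegmentedClassificationSketch {
    public String identifier;
    public double start;
    public double stop;
    public String[] classifications;
    public String[] misc_info_info;
    public String[] misc_info_key;
    public SegmentedClassificationSketch[] sub_classifications;

    SegmentedClassificationSketch(String identifier, double start, double stop,
            String[] classifications, String[] misc_info, String[] misc_info_key,
            SegmentedClassificationSketch[] sub_classifications) {
        this.identifier = identifier;
        this.start = start;
        this.stop = stop;
        this.classifications = classifications;
        this.misc_info_info = misc_info;
        this.misc_info_key = misc_info_key;
        this.sub_classifications = sub_classifications;
    }

    // Builds an invented top-level instance with two sub-sections.
    static SegmentedClassificationSketch example() {
        // Sub-sections: null identifier, real start/stop, no meta-data.
        SegmentedClassificationSketch first = new SegmentedClassificationSketch(
                null, 0.0, 30.0, new String[] {"Speech"}, null, null, null);
        SegmentedClassificationSketch second = new SegmentedClassificationSketch(
                null, 30.0, 95.5, new String[] {"Music"}, null, null, null);

        // Top level: non-null identifier, NaN start/stop, parallel meta-data arrays.
        return new SegmentedClassificationSketch(
                "recording_01", Double.NaN, Double.NaN,
                new String[] {"Music", "Speech"},
                new String[] {"J. Smith"},   // misc_info (values)
                new String[] {"Annotator"},  // misc_info_key (titles)
                new SegmentedClassificationSketch[] {first, second});
    }
}
```

With the real class, the same argument order would presumably apply, with misc_info and misc_info_key kept the same length so that entries line up.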

SegmentedClassification

public SegmentedClassification(weka.core.Instance instance,
                               int i)
Generates a SegmentedClassification from a Weka ARFF file. Identifier is in the format #_feature values, where # is the number of the instance and feature values are separated by a comma, just as they are represented in the original Weka ARFF file.

Parameters:
instance - The classified Weka Instance.
i - A counter that increments each time a new SegmentedClassification object is created. Used when naming instances to guarantee uniqueness of identifiers.
Method Detail

getClassificationDescription

public java.lang.String getClassificationDescription(int depth)
Generate a formatted String detailing the contents of this SegmentedClassification.

Parameters:
depth - How deep this SegmentedClassification is in a hierarchy of SegmentedClassifications (i.e. through the sub_classifications field). This parameter should generally be 0 when called externally, as this method operates recursively.
Returns:
A formatted string describing this SegmentedClassification.

getClassificationDescriptions

public static java.lang.String getClassificationDescriptions(SegmentedClassification[] seg_classes)
Returns a formatted text description of the given SegmentedClassification objects.

Parameters:
seg_classes - The classifications to describe.
Returns:
The formatted description.

getNumberOverallInstancesBelongingToClass

public static int getNumberOverallInstancesBelongingToClass(SegmentedClassification[] model_classifications,
                                                            java.lang.String class_of_interest)
Returns the number of given instances that belong to the given class. Only top level class membership is considered (i.e. the classes that sections belong to are ignored).

Parameters:
model_classifications - The model classifications to search for the number of instances belonging to the given class.
class_of_interest - The class to search for.
Returns:
The number of instances in model_classifications that belong to class_of_interest.
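
The top-level-only counting described above might look roughly like the following sketch. The nested Seg class is a hypothetical stand-in carrying just the classifications and sub_classifications fields, and the method name countOverall is invented. Exact, case-sensitive string matching and counting each instance at most once are assumptions, since the documentation does not specify them.

```java
class OverallCountSketch {
    // Hypothetical stand-in holding only the two documented fields used here.
    static class Seg {
        String[] classifications;
        Seg[] sub_classifications;

        Seg(String[] classifications, Seg[] sub_classifications) {
            this.classifications = classifications;
            this.sub_classifications = sub_classifications;
        }
    }

    // Counts entries whose own (top-level) classifications contain the class.
    // sub_classifications are deliberately never inspected.
    static int countOverall(Seg[] model, String classOfInterest) {
        int count = 0;
        for (Seg seg : model) {
            if (seg.classifications == null) {
                continue; // null means no classifications are present
            }
            for (String label : seg.classifications) {
                if (label.equals(classOfInterest)) {
                    count++;
                    break; // assumption: each instance counts at most once
                }
            }
        }
        return count;
    }
}
```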

getNumberSectionsInInstancesBelongingToClass

public static int getNumberSectionsInInstancesBelongingToClass(SegmentedClassification[] model_classifications,
                                                               java.lang.String class_of_interest)
Returns the number of given instances that have sections belonging to the given class. Only section level class membership is considered (i.e. the classes that instances belong to as a whole are ignored).

Parameters:
model_classifications - The model classifications to search for the number of instance sections belonging to the given class.
class_of_interest - The class to search for.
Returns:
The number of sections of instances in model_classifications that belong to class_of_interest.

getLeafClasses

public static java.lang.String[] getLeafClasses(SegmentedClassification[] seg_classes)
Returns an array containing the names of all classes that any instances or sub-sections of the given SegmentedClassifications belong to. All duplicates are removed, so each class appears exactly once.

Parameters:
seg_classes - The classifications whose classes are to be returned.
Returns:
The classes, with duplicates removed.
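
One way to gather class names from both instances and their sub-sections while dropping duplicates is sketched below. The Seg class and the collectClasses method are hypothetical stand-ins. Recursing through every level of sub_classifications and returning names in first-seen order are assumptions, as the documentation does not state the traversal depth or the ordering of the result.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class LeafClassesSketch {
    // Hypothetical stand-in holding only the two documented fields used here.
    static class Seg {
        String[] classifications;
        Seg[] sub_classifications;

        Seg(String[] classifications, Seg[] sub_classifications) {
            this.classifications = classifications;
            this.sub_classifications = sub_classifications;
        }
    }

    // Returns every distinct class name found at any level.
    static String[] collectClasses(Seg[] segs) {
        Set<String> seen = new LinkedHashSet<String>(); // a set drops duplicates
        collect(segs, seen);
        return seen.toArray(new String[0]);
    }

    // Adds the labels of each entry, then descends into its sub-sections.
    private static void collect(Seg[] segs, Set<String> seen) {
        if (segs == null) {
            return; // null means there are no sub-sections
        }
        for (Seg seg : segs) {
            if (seg.classifications != null) {
                for (String label : seg.classifications) {
                    seen.add(label);
                }
            }
            collect(seg.sub_classifications, seen);
        }
    }
}
```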

verifyUniquenessOfIdentifiers

public static boolean verifyUniquenessOfIdentifiers(SegmentedClassification[] seg_classes)
Verifies that no two of the given SegmentedClassifications refer to data sets with the same identifier.

Parameters:
seg_classes - The SegmentedClassifications to verify the uniqueness of.
Returns:
True if the identifiers are all unique, false if they are not.
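
A uniqueness check of this kind can be sketched with a hash set. The method allUnique is hypothetical and, to stay self-contained, takes the identifier strings directly; with the real class they would presumably be read from the identifier field of each top-level object. How the actual method treats null identifiers is not documented, and this sketch simply treats null as one more value.

```java
import java.util.HashSet;
import java.util.Set;

class UniqueIdentifiersSketch {
    // Returns true only if no identifier occurs more than once.
    static boolean allUnique(String[] identifiers) {
        Set<String> seen = new HashSet<String>();
        for (String id : identifiers) {
            // Set.add returns false when the value is already present.
            if (!seen.add(id)) {
                return false;
            }
        }
        return true;
    }
}
```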

getOverallLabelsOfDataSets

public static java.lang.String[][] getOverallLabelsOfDataSets(DataSet[] data_sets,
                                                              SegmentedClassification[] set_classifications)
                                                       throws java.lang.Exception
Returns a 2-D array describing the top-level label(s) of the given DataSets, according to the given SegmentedClassifications. The first index of the returned array identifies the DataSet, and entries correspond in number and order to the data_sets parameter. The second index identifies the label(s) for the given DataSet. No order is enforced on the labels.

The returned array has a first dimension size equal to the number of data sets in the data_sets parameter. An entry in the first dimension will be null if the corresponding DataSet has no entry in the given SegmentedClassifications with the same identifier, or if the classifications in the matching SegmentedClassification are null.

Parameters:
data_sets - The DataSets to find top-level labels for.
set_classifications - Model classifications.
Returns:
The top-level labels for the data_sets parameter.
Throws:
java.lang.Exception - An exception is thrown if the given SegmentedClassifications contain multiple data sets with the same identifier.
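
The shape of the returned array, including its null rows, can be illustrated with the sketch below. The Data and Seg classes are hypothetical stand-ins that assume a DataSet exposes an identifier field, and overallLabels is an invented name. The documented method declares java.lang.Exception; the sketch throws an unchecked IllegalArgumentException on duplicate identifiers only to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

class OverallLabelsSketch {
    // Hypothetical stand-in for a DataSet; only the identifier is assumed.
    static class Data {
        String identifier;
        Data(String identifier) { this.identifier = identifier; }
    }

    // Hypothetical stand-in for a top-level SegmentedClassification.
    static class Seg {
        String identifier;
        String[] classifications;
        Seg(String identifier, String[] classifications) {
            this.identifier = identifier;
            this.classifications = classifications;
        }
    }

    static String[][] overallLabels(Data[] dataSets, Seg[] segs) {
        // Index the classifications by identifier, rejecting duplicates.
        Map<String, String[]> byId = new HashMap<String, String[]>();
        for (Seg seg : segs) {
            if (byId.containsKey(seg.identifier)) {
                throw new IllegalArgumentException(
                        "Duplicate identifier: " + seg.identifier);
            }
            byId.put(seg.identifier, seg.classifications);
        }

        // One row per data set, in the same order as the input.
        String[][] labels = new String[dataSets.length][];
        for (int i = 0; i < dataSets.length; i++) {
            // Stays null when nothing matches or the match has null classes.
            labels[i] = byId.get(dataSets[i].identifier);
        }
        return labels;
    }
}
```

Callers would need to check each row for null before reading its labels.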

getSubSectionLabelsOfDataSets

public static java.lang.String[][][] getSubSectionLabelsOfDataSets(DataSet[] data_sets,
                                                                   SegmentedClassification[] set_classifications)
                                                            throws java.lang.Exception
Returns a 3-D array describing the sub-section label(s) of the given DataSets, according to the given SegmentedClassifications. The first index of the returned array identifies the DataSet, and entries correspond in number and order to the data_sets parameter. The second index identifies the sub-section, and entries correspond in number and order to the sub-sections of the corresponding DataSet. The third index identifies the label(s) for the given sub-section. No order is enforced on the labels.

The returned array has a first dimension size equal to the number of data sets in the data_sets parameter. An entry in the first dimension of the returned array will be null if the corresponding DataSet has no entry in the given SegmentedClassifications with the same identifier. An entry in the second dimension will be null if no corresponding section is available in set_classifications for the given section. An entry in the third dimension will be null if no classifications are available in the corresponding SegmentedClassification for the given section.

Parameters:
data_sets - The DataSets to find sub-section labels for.
set_classifications - Model classifications.
Returns:
The labels of all sub-sections.
Throws:
java.lang.Exception - An exception is thrown if the given SegmentedClassifications contain multiple data sets with the same identifier.

parseClassificationsFile

public static SegmentedClassification[] parseClassificationsFile(java.lang.String model_classification_file_path)
                                                          throws java.lang.Exception
Parses a classifications_file XML file and returns an array of SegmentedClassification objects holding its contents. An exception is thrown if the file is invalid in some way.

Parameters:
model_classification_file_path - The path of the XML file to parse.
Returns:
Array of SegmentedClassification objects that were derived from the given ACE XML model classification file.
Throws:
java.lang.Exception - An informative exception is thrown if an invalid file or file path is specified.

saveClassifications

public static void saveClassifications(SegmentedClassification[] seg_classifications,
                                       java.io.File to_save_to,
                                       java.lang.String comments)
                                throws java.lang.Exception
Saves a classifications_file XML file with the contents specified in the given SegmentedClassification array and the comments specified in the comments parameter.

Parameters:
seg_classifications - The SegmentedClassifications to save.
to_save_to - The file to save to.
comments - Any comments to be saved inside the comments element of the XML file.
Throws:
java.lang.Exception - An informative exception is thrown if the file cannot be saved.

mergeAdjacentSections

public static void mergeAdjacentSections(SegmentedClassification verbose_classifications)
Merges any overlapping adjacent sub-sections of the given SegmentedClassification if they belong to the same class(es) and the end of the first matches the start of the second. No change is made if there are no such sections or if there are no sub-sections at all.

IMPORTANT: It is assumed that sections whose start and stop times overlap have adjacent indices, as these are all that are checked. It is also assumed that the start of a section is before the start of the next section in memory and that the stop of a section is before the stop of the next section in memory.

Parameters:
verbose_classifications - The classifications whose sections are to be merged if appropriate.
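
The merging rule described above (same classes, and the stop of one section equal to the start of the next) can be sketched as follows. The Seg class and the mergeAdjacent and sameClasses names are hypothetical. Comparing times with exact double equality and comparing class lists as unordered sets are assumptions, since the documentation does not say how either comparison is made.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

class MergeSectionsSketch {
    // Hypothetical stand-in holding only the documented fields used here.
    static class Seg {
        double start;
        double stop;
        String[] classifications;
        Seg[] sub_classifications;

        Seg(double start, double stop, String[] classifications) {
            this.start = start;
            this.stop = stop;
            this.classifications = classifications;
        }
    }

    // Assumption: two sections match if they carry the same set of labels.
    static boolean sameClasses(String[] a, String[] b) {
        if (a == null || b == null) {
            return a == b; // equal only when both are null
        }
        return new HashSet<String>(Arrays.asList(a))
                .equals(new HashSet<String>(Arrays.asList(b)));
    }

    // Merges neighbouring sub-sections in place; only adjacent indices are compared.
    static void mergeAdjacent(Seg parent) {
        Seg[] subs = parent.sub_classifications;
        if (subs == null || subs.length == 0) {
            return; // nothing to merge
        }
        List<Seg> merged = new ArrayList<Seg>();
        merged.add(subs[0]);
        for (int i = 1; i < subs.length; i++) {
            Seg last = merged.get(merged.size() - 1);
            Seg next = subs[i];
            if (last.stop == next.start
                    && sameClasses(last.classifications, next.classifications)) {
                last.stop = next.stop; // extend the earlier section
            } else {
                merged.add(next);
            }
        }
        parent.sub_classifications = merged.toArray(new Seg[0]);
    }
}
```

Because only neighbouring entries are compared, sections would need to be sorted by start time beforehand, which matches the ordering assumption stated above.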

getMergedSectionalClassifications

public static java.lang.String[][][] getMergedSectionalClassifications(SegmentedClassification[] sub_classes,
                                                                       DataSet[] sub_set,
                                                                       double[][][] times)
Finds the timeframes in which the sub-sets overlap with the sub-classifications. Used when displaying the DataSet objects and SegmentedClassification objects in the Instances Panel in the GUI. Assumes that sub_set and sub_classes are sections of the same overall instance.

Parameters:
sub_classes - The classifications of the subsections of this instance.
sub_set - The subsections of the overall instance.
times - Will be filled with the exact start and stop time that sub classifications overlap with the subsections. These start and stop times will be included in a formatted String that will be displayed in the "Classes" column.
Returns:
A 3-D array containing the classifications for each subsection. The first index indicates the subsection, and the second index indicates the subclassifications that overlap with that subsection. These subclassifications are arrays themselves, hence the third index.

findMatchingClassification

public static SegmentedClassification findMatchingClassification(DataSet instance,
                                                                 SegmentedClassification[] classes)
Compares the given DataSet object to each of the SegmentedClassification objects in order to find a SegmentedClassification object with the same identifier as the given DataSet object. This is used in the GUI when displaying DataSet and SegmentedClassification objects in the same Instances Panel.

Parameters:
instance - The overall section to match with a classification.
classes - An array of all possible classifications.
Returns:
The SegmentedClassification in the given array that corresponds to the given DataSet. Returns null if none of the classifications match.