Class FeatureManager


  • public class FeatureManager
    extends java.lang.Object
    • Constructor Summary

      Constructors 
      Constructor Description
      FeatureManager()  
    • Method Summary

      Modifier and Type Method Description
      static int[] getFeatureFromSampleVector​(java.util.List<RankList> samples)
      Obtain all features present in a sample set.
      static void main​(java.lang.String[] args)  
      static void prepareCV​(java.util.List<RankList> samples, int nFold, float tvs, java.util.List<java.util.List<RankList>> trainingData, java.util.List<java.util.List<RankList>> validationData, java.util.List<java.util.List<RankList>> testData)
      Split the input sample set into k chunks (folds) of roughly equal size and create train/test data for each fold.
      static void prepareCV​(java.util.List<RankList> samples, int nFold, java.util.List<java.util.List<RankList>> trainingData, java.util.List<java.util.List<RankList>> testData)
      Split the input sample set into k chunks (folds) of roughly equal size and create train/test data for each fold.
      static void prepareSplit​(java.util.List<RankList> samples, double percentTrain, java.util.List<RankList> trainingData, java.util.List<RankList> testData)
      Split the input sample set into two chunks: one for training and one for either validation or testing.
      static void printQueriesForSplit​(java.lang.String name, java.util.List<java.util.List<RankList>> split)  
      static int[] readFeature​(java.lang.String featureDefFile)
      Read features specified in an input feature file.
      static java.util.List<RankList> readInput​(java.lang.String inputFile)
      Read a set of rankings from a single file.
      static java.util.List<RankList> readInput​(java.lang.String inputFile, boolean mustHaveRelDoc, boolean useSparseRepresentation)
      Read a set of rankings from a single file.
      static java.util.List<RankList> readInput​(java.util.List<java.lang.String> inputFiles)
      Read sets of rankings from multiple files.
      static void save​(java.util.List<RankList> samples, java.lang.String outputFile)
      Save a sample set to a file.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • FeatureManager

        public FeatureManager()
    • Method Detail

      • main

        public static void main​(java.lang.String[] args)
        Parameters:
        args - Command-line arguments
      • readInput

        public static java.util.List<RankList> readInput​(java.lang.String inputFile)
        Read a set of rankings from a single file.
        Parameters:
        inputFile - Path of the file to read
        Returns:
        The list of rankings read from inputFile
      • readInput

        public static java.util.List<RankList> readInput​(java.lang.String inputFile,
                                                         boolean mustHaveRelDoc,
                                                         boolean useSparseRepresentation)
        Read a set of rankings from a single file.
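        Parameters:
        inputFile - Path of the file to read
        mustHaveRelDoc - Whether each ranking must contain at least one relevant document
        useSparseRepresentation - Whether to use a sparse representation for the feature vectors
        Returns:
        The list of rankings read from inputFile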
      • readInput

        public static java.util.List<RankList> readInput​(java.util.List<java.lang.String> inputFiles)
        Read sets of rankings from multiple files, then merge them all together into a single list of rankings.
        Parameters:
        inputFiles - Paths of the files to read
        Returns:
        The merged list of rankings from all input files
      • readFeature

        public static int[] readFeature​(java.lang.String featureDefFile)
        Read features specified in an input feature file. Expecting one feature per line.
        Parameters:
        featureDefFile - Path of the feature definition file
        Returns:
        The features read from featureDefFile
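        The one-feature-per-line layout can be illustrated with a minimal sketch. It is not the implementation behind readFeature: the class FeatureFileSketch and the helper parseFeatureIds are hypothetical, and skipping blank lines and lines starting with '#' is an assumption rather than documented behavior. The lines could be obtained with, for example, java.nio.file.Files.readAllLines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FeatureFileSketch {
    // Hypothetical helper: parses one integer feature ID per line.
    // Assumption: blank lines and lines starting with '#' are skipped.
    static int[] parseFeatureIds(List<String> lines) {
        List<Integer> ids = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            ids.add(Integer.parseInt(line));
        }
        // Copy into a primitive array, matching the int[] return type of readFeature.
        int[] result = new int[ids.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ids.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1", "3", "", "# unused: 5", "7");
        System.out.println(Arrays.toString(parseFeatureIds(lines))); // prints [1, 3, 7]
    }
}
```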
      • getFeatureFromSampleVector

        public static int[] getFeatureFromSampleVector​(java.util.List<RankList> samples)
        Obtain all features present in a sample set.
        Parameters:
        samples - The sample set to inspect
        Returns:
        The features present in samples
      • prepareCV

        public static void prepareCV​(java.util.List<RankList> samples,
                                     int nFold,
                                     java.util.List<java.util.List<RankList>> trainingData,
                                     java.util.List<java.util.List<RankList>> testData)
        Split the input sample set into k chunks (folds) of roughly equal size and create train/test data for each fold. Note that NO randomization is done. If you want to randomly split the data, make sure that you randomize the order in the input samples prior to calling this function.
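        Parameters:
        samples - The input sample set
        nFold - The number of folds (k)
        trainingData - Output list that receives the training data for each fold
        testData - Output list that receives the test data for each fold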
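        The unrandomized split described above can be sketched as follows. This is an illustration under stated assumptions, not the code behind prepareCV: the class KFoldSketch and the helper kFold are hypothetical, the element type is generic rather than RankList, and the exact fold boundaries (contiguous index ranges whose sizes differ by at most one) are an assumption. The main method shuffles with java.util.Collections.shuffle before splitting, which is one way to get a random split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class KFoldSketch {
    // Hypothetical helper: appends one training list and one test list per fold.
    // Fold f uses the contiguous index range [start, end) as its test data
    // and everything outside that range as its training data.
    static <T> void kFold(List<T> samples, int nFold,
                          List<List<T>> trainingData, List<List<T>> testData) {
        int n = samples.size();
        for (int f = 0; f < nFold; f++) {
            int start = (int) ((long) f * n / nFold);
            int end = (int) ((long) (f + 1) * n / nFold);
            List<T> train = new ArrayList<>(samples.subList(0, start));
            train.addAll(samples.subList(end, n));
            trainingData.add(train);
            testData.add(new ArrayList<>(samples.subList(start, end)));
        }
    }

    public static void main(String[] args) {
        List<Integer> samples = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            samples.add(i);
        }

        // kFold itself never reorders the input, so shuffle first for a random split.
        Collections.shuffle(samples, new Random(42L));

        List<List<Integer>> train = new ArrayList<>();
        List<List<Integer>> test = new ArrayList<>();
        kFold(samples, 3, train, test);
        for (int f = 0; f < 3; f++) {
            System.out.println("fold " + f + ": train=" + train.get(f) + " test=" + test.get(f));
        }
    }
}
```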
      • prepareCV

        public static void prepareCV​(java.util.List<RankList> samples,
                                     int nFold,
                                     float tvs,
                                     java.util.List<java.util.List<RankList>> trainingData,
                                     java.util.List<java.util.List<RankList>> validationData,
                                     java.util.List<java.util.List<RankList>> testData)
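        Split the input sample set into k chunks (folds) of roughly equal size and create train/test data for each fold. The training data in each fold is then further split into training and validation sets according to tvs. Note that NO randomization is done. If you want to randomly split the data, make sure that you randomize the order of the input samples prior to calling this function.
        Parameters:
        samples - The input sample set
        nFold - The number of folds (k)
        tvs - Train/validation split ratio
        trainingData - Output list that receives the training data for each fold
        validationData - Output list that receives the validation data for each fold
        testData - Output list that receives the test data for each fold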
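        The second stage, splitting each fold's training data into train and validation parts, might look like the sketch below. It is not the code behind prepareCV: the class TrainValidationSketch and the helper splitTrainValidation are hypothetical, and it assumes that tvs is a fraction between 0 and 1 and that the leading part of each fold's training list is kept for training.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TrainValidationSketch {
    // Hypothetical helper: for each fold, keeps the leading part of the training
    // list and moves the remainder into a new validation list.
    // Assumption: tvs is the fraction (0..1) of each fold's training data to keep.
    static <T> void splitTrainValidation(List<List<T>> trainingData, float tvs,
                                         List<List<T>> validationData) {
        for (int f = 0; f < trainingData.size(); f++) {
            List<T> full = trainingData.get(f);
            int trainSize = (int) (full.size() * tvs);
            validationData.add(new ArrayList<>(full.subList(trainSize, full.size())));
            trainingData.set(f, new ArrayList<>(full.subList(0, trainSize)));
        }
    }

    public static void main(String[] args) {
        List<List<String>> training = new ArrayList<>();
        training.add(new ArrayList<>(Arrays.asList("q1", "q2", "q3", "q4")));
        training.add(new ArrayList<>(Arrays.asList("q5", "q6", "q7", "q8")));
        List<List<String>> validation = new ArrayList<>();

        splitTrainValidation(training, 0.75f, validation);
        System.out.println(training);   // prints [[q1, q2, q3], [q5, q6, q7]]
        System.out.println(validation); // prints [[q4], [q8]]
    }
}
```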
      • printQueriesForSplit

        public static void printQueriesForSplit​(java.lang.String name,
                                                java.util.List<java.util.List<RankList>> split)
      • prepareSplit

        public static void prepareSplit​(java.util.List<RankList> samples,
                                        double percentTrain,
                                        java.util.List<RankList> trainingData,
                                        java.util.List<RankList> testData)
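        Split the input sample set into two chunks: one for training and one for either validation or testing.
        Parameters:
        samples - The input sample set
        percentTrain - The percentage of data used for training
        trainingData - Output list that receives the training chunk
        testData - Output list that receives the validation or test chunk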
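        The output-parameter calling style of prepareSplit is sketched below with a generic stand-in. The class TwoWaySplitSketch and the helper split are hypothetical and are not the library's code. The sketch assumes the training share is given as a fraction between 0 and 1; the description above does not say whether percentTrain is a fraction or a 0-100 value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TwoWaySplitSketch {
    // Hypothetical helper mirroring the output-parameter style of prepareSplit:
    // the caller supplies two empty lists and the helper fills them.
    // Assumption: fractionTrain is a value between 0 and 1.
    static <T> void split(List<T> samples, double fractionTrain,
                          List<T> trainingData, List<T> testData) {
        int trainSize = (int) (samples.size() * fractionTrain);
        trainingData.addAll(samples.subList(0, trainSize));
        testData.addAll(samples.subList(trainSize, samples.size()));
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        List<String> train = new ArrayList<>();
        List<String> test = new ArrayList<>();

        split(samples, 0.75, train, test);
        System.out.println(train); // prints [a, b, c, d, e, f]
        System.out.println(test);  // prints [g, h]
    }
}
```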
      • save

        public static void save​(java.util.List<RankList> samples,
                                java.lang.String outputFile)
        Save a sample set to a file.
        Parameters:
        samples - The sample set to save
        outputFile - Path of the file to write