Class TokenizerApplication


  • public class TokenizerApplication
    extends java.lang.Object
    • Method Summary

      Modifier and Type Method Description
      static de.julielab.jcore.ae.jtbd.TokenizerApplication.EvalResult doEvaluation​(java.util.ArrayList<java.lang.String> trainOrgSentences, java.util.ArrayList<java.lang.String> trainTokSentences, java.util.ArrayList<java.lang.String> predictOrgSentences, java.util.ArrayList<java.lang.String> predictTokSentences, java.util.ArrayList<java.lang.String> errors, java.util.ArrayList<java.lang.String> predictions)
      General evaluation function; called from doCrossEvaluation or do9010Evaluation.
      static void doPrediction​(java.io.File inDir, java.io.File outDir, java.lang.String modelFilename)
      Tokenizes documents.
      static void doTraining​(java.io.File orgSentencesFile, java.io.File tokSentencesFile, java.lang.String modelFilename)
      Trains a model.
      static void main​(java.lang.String[] args)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • TokenizerApplication

        public TokenizerApplication()
    • Method Detail

      • doEvaluation

        public static de.julielab.jcore.ae.jtbd.TokenizerApplication.EvalResult doEvaluation​(java.util.ArrayList<java.lang.String> trainOrgSentences,
                                                                                             java.util.ArrayList<java.lang.String> trainTokSentences,
                                                                                             java.util.ArrayList<java.lang.String> predictOrgSentences,
                                                                                             java.util.ArrayList<java.lang.String> predictTokSentences,
                                                                                             java.util.ArrayList<java.lang.String> errors,
                                                                                             java.util.ArrayList<java.lang.String> predictions)
        General evaluation function; called from doCrossEvaluation or do9010Evaluation.
        Parameters:
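        trainOrgSentences - the original (untokenized) sentences used for training
        trainTokSentences - the tokenized sentences used for training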
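        predictOrgSentences - the original (untokenized) sentences to tokenize during evaluation
        predictTokSentences - the tokenized versions of predictOrgSentences, used as the reference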
        errors - list for the errors found during evaluation
        predictions - list for the predictions made during evaluation
        Returns:
        the evaluation result, as an EvalResult
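The four sentence lists passed to doEvaluation can be derived from one parallel corpus. The following is a minimal sketch of a 90/10 split, assuming the leading 90% of the sentences are used for training and the remainder for prediction; how do9010Evaluation actually partitions the data is not stated on this page. `EvalSplitSketch` and `split` are hypothetical names, and the doEvaluation call is left as a comment so the sketch compiles without the library.

```java
import java.util.ArrayList;
import java.util.List;

class EvalSplitSketch {

    /**
     * Splits two parallel sentence lists at trainPercent.
     * Returns [trainOrg, trainTok, predictOrg, predictTok].
     */
    static List<ArrayList<String>> split(List<String> org, List<String> tok, int trainPercent) {
        if (org.size() != tok.size()) {
            throw new IllegalArgumentException("original and tokenized lists must be parallel");
        }
        if (trainPercent < 0 || trainPercent > 100) {
            throw new IllegalArgumentException("trainPercent must be in [0, 100]");
        }
        int cut = org.size() * trainPercent / 100; // integer arithmetic, rounds down
        List<ArrayList<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(org.subList(0, cut)));
        parts.add(new ArrayList<>(tok.subList(0, cut)));
        parts.add(new ArrayList<>(org.subList(cut, org.size())));
        parts.add(new ArrayList<>(tok.subList(cut, tok.size())));
        return parts;
    }

    public static void main(String[] args) {
        // Toy corpus; real data would be read from the sentence files.
        List<String> org = new ArrayList<>();
        List<String> tok = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            org.add("Sentence " + i + ".");
            tok.add("Sentence " + i + " .");
        }
        List<ArrayList<String>> p = split(org, tok, 90);

        // Assumption: these two lists are filled by doEvaluation.
        ArrayList<String> errors = new ArrayList<>();
        ArrayList<String> predictions = new ArrayList<>();

        // With the library on the classpath, the call would be:
        // TokenizerApplication.doEvaluation(p.get(0), p.get(1), p.get(2), p.get(3), errors, predictions);
        System.out.println(p.get(0).size() + " training / " + p.get(2).size() + " prediction sentences");
    }
}
```

The lists are copied into new ArrayList instances because the method signature requires java.util.ArrayList, not the views returned by subList.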
      • doPrediction

        public static void doPrediction​(java.io.File inDir,
                                        java.io.File outDir,
                                        java.lang.String modelFilename)
                                 throws java.io.IOException
        Tokenizes documents.
        Parameters:
        inDir - the directory with the documents to be tokenized
        outDir - the directory where the tokenized documents should be written to
        modelFilename - the file name of the model to use for tokenization
        Throws:
        java.io.IOException
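A call to doPrediction might be wrapped as in the sketch below. With the library on the classpath, the direct form `TokenizerApplication.doPrediction(inDir, outDir, modelFilename)` is simpler; reflection is used here only so the example compiles and runs without the library. `PredictionSketch` and `runPrediction` are hypothetical names, and creating the output directory beforehand is a precaution, not documented behavior of doPrediction.

```java
import java.io.File;
import java.lang.reflect.Method;

class PredictionSketch {

    /**
     * Checks the directories, then invokes doPrediction if the class is available.
     * Returns false if TokenizerApplication is not on the classpath.
     */
    static boolean runPrediction(File inDir, File outDir, String modelFilename) {
        if (!inDir.isDirectory()) {
            throw new IllegalArgumentException("not a directory: " + inDir);
        }
        // Precaution (assumption): make sure the output directory exists.
        if (!outDir.isDirectory() && !outDir.mkdirs()) {
            throw new IllegalStateException("cannot create output directory: " + outDir);
        }
        try {
            Class<?> app = Class.forName("de.julielab.jcore.ae.jtbd.TokenizerApplication");
            Method m = app.getMethod("doPrediction", File.class, File.class, String.class);
            m.invoke(null, inDir, outDir, modelFilename); // static method, so no receiver
            return true;
        } catch (ClassNotFoundException e) {
            return false; // library missing; nothing was tokenized
        } catch (ReflectiveOperationException e) {
            // Covers an IOException thrown by doPrediction (wrapped in InvocationTargetException).
            throw new IllegalStateException("doPrediction failed", e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: PredictionSketch <inDir> <outDir> <modelFilename>");
            return;
        }
        boolean ran = runPrediction(new File(args[0]), new File(args[1]), args[2]);
        System.out.println(ran ? "done" : "TokenizerApplication not found on the classpath");
    }
}
```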
      • doTraining

        public static void doTraining​(java.io.File orgSentencesFile,
                                      java.io.File tokSentencesFile,
                                      java.lang.String modelFilename)
        Trains a model.
        Parameters:
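        orgSentencesFile - the file with the original (untokenized) sentences
        tokSentencesFile - the file with the tokenized sentences
        modelFilename - the file name for the trained model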
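Before training, it can help to check that the two sentence files are parallel. The sketch below assumes one sentence per line in each file and that a tokenized sentence differs from its original only in whitespace; neither assumption is confirmed by this page. `TrainingDataCheck` and `firstMismatch` are hypothetical names, and the doTraining call is left as a comment so the sketch compiles without the library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class TrainingDataCheck {

    /** Returns the index of the first pair that is not parallel, or -1 if all pairs match. */
    static int firstMismatch(List<String> org, List<String> tok) {
        int n = Math.min(org.size(), tok.size());
        for (int i = 0; i < n; i++) {
            // Assumption: tokenization only inserts whitespace, so stripping it must give equal strings.
            String a = org.get(i).replaceAll("\\s+", "");
            String b = tok.get(i).replaceAll("\\s+", "");
            if (!a.equals(b)) {
                return i;
            }
        }
        // A length difference counts as a mismatch at the first unpaired line.
        return org.size() == tok.size() ? -1 : n;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: TrainingDataCheck <orgSentencesFile> <tokSentencesFile> <modelFilename>");
            return;
        }
        List<String> org = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        List<String> tok = Files.readAllLines(Paths.get(args[1]), StandardCharsets.UTF_8);
        int bad = firstMismatch(org, tok);
        if (bad >= 0) {
            System.err.println("files are not parallel at line " + (bad + 1));
            return;
        }
        // With the library on the classpath, training would then be started with:
        // TokenizerApplication.doTraining(new java.io.File(args[0]), new java.io.File(args[1]), args[2]);
        System.out.println(org.size() + " parallel sentence pairs");
    }
}
```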
      • main

        public static void main​(java.lang.String[] args)
                         throws java.io.IOException
        Throws:
        java.io.IOException