Package de.julielab.jcore.ae.jtbd
Class TokenizerApplication
java.lang.Object
    de.julielab.jcore.ae.jtbd.TokenizerApplication

public class TokenizerApplication extends java.lang.Object

Constructor Summary
Constructor               Description
TokenizerApplication()
-
Method Summary
Modifier and Type, Method, Description

static de.julielab.jcore.ae.jtbd.TokenizerApplication.EvalResult doEvaluation(java.util.ArrayList<java.lang.String> trainOrgSentences, java.util.ArrayList<java.lang.String> trainTokSentences, java.util.ArrayList<java.lang.String> predictOrgSentences, java.util.ArrayList<java.lang.String> predictTokSentences, java.util.ArrayList<java.lang.String> errors, java.util.ArrayList<java.lang.String> predictions)
    General evaluation function, called from doCrossEvaluation or do9010Evaluation.

static void doPrediction(java.io.File inDir, java.io.File outDir, java.lang.String modelFilename)
    Tokenize documents.

static void doTraining(java.io.File orgSentencesFile, java.io.File tokSentencesFile, java.lang.String modelFilename)
    Train a model.

static void main(java.lang.String[] args)
-
Method Detail
-
doEvaluation
public static de.julielab.jcore.ae.jtbd.TokenizerApplication.EvalResult doEvaluation(java.util.ArrayList<java.lang.String> trainOrgSentences, java.util.ArrayList<java.lang.String> trainTokSentences, java.util.ArrayList<java.lang.String> predictOrgSentences, java.util.ArrayList<java.lang.String> predictTokSentences, java.util.ArrayList<java.lang.String> errors, java.util.ArrayList<java.lang.String> predictions)

General evaluation function, called from doCrossEvaluation or do9010Evaluation.

Parameters:
    crf - the crf model (documented here, but not a parameter in the signature above)
    predictOrgSentences -
    predictTokSentences -
    errors -
    predictions -
Returns:
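The name do9010Evaluation suggests a 90/10 split of the available sentence pairs into training and prediction data. The following is a minimal sketch of how the list arguments for doEvaluation could be prepared under that assumption. The class EvalSplitSketch, its Split holder, and the method split9010 are hypothetical and not part of JTBD. The sketch does not shuffle, and the sentences in main are made-up placeholder data. The actual call is shown only as a comment because it needs JTBD on the classpath, and how the method uses the errors and predictions lists is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JTBD: prepares the list arguments for doEvaluation.
class EvalSplitSketch {

    // The four lists mirror the first four parameters of doEvaluation.
    static final class Split {
        final ArrayList<String> trainOrg = new ArrayList<>();
        final ArrayList<String> trainTok = new ArrayList<>();
        final ArrayList<String> predictOrg = new ArrayList<>();
        final ArrayList<String> predictTok = new ArrayList<>();
    }

    // Puts the first 90% of the sentence pairs into the training lists and the
    // rest into the prediction lists. No shuffling is done (a simplification).
    static Split split9010(List<String> org, List<String> tok) {
        if (org.size() != tok.size()) {
            throw new IllegalArgumentException(
                "original and tokenized sentence lists differ in length: "
                    + org.size() + " vs " + tok.size());
        }
        Split split = new Split();
        int trainSize = org.size() * 9 / 10; // integer arithmetic, rounds down
        for (int i = 0; i < org.size(); i++) {
            boolean train = i < trainSize;
            (train ? split.trainOrg : split.predictOrg).add(org.get(i));
            (train ? split.trainTok : split.predictTok).add(tok.get(i));
        }
        return split;
    }

    public static void main(String[] args) {
        // Placeholder data: original sentences and their tokenized counterparts.
        List<String> org = new ArrayList<>();
        List<String> tok = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            org.add("Sentence" + i + ".");
            tok.add("Sentence" + i + " .");
        }
        Split s = split9010(org, tok);
        System.out.println(s.trainOrg.size() + " training pairs, "
            + s.predictOrg.size() + " prediction pairs");

        // With JTBD on the classpath, the call would then look like this:
        // ArrayList<String> errors = new ArrayList<>();
        // ArrayList<String> predictions = new ArrayList<>();
        // TokenizerApplication.EvalResult result = TokenizerApplication.doEvaluation(
        //     s.trainOrg, s.trainTok, s.predictOrg, s.predictTok, errors, predictions);
    }
}
```

The split uses ArrayList fields because the doEvaluation signature asks for java.util.ArrayList rather than the List interface.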
-
doPrediction
public static void doPrediction(java.io.File inDir, java.io.File outDir, java.lang.String modelFilename) throws java.io.IOException

Tokenize documents.

Parameters:
    inDir - the directory with the documents to be tokenized
    outDir - the directory where the tokenized documents should be written to
    modelFilename - the model to use for tokenization
Throws:
    java.io.IOException
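Since doPrediction takes an input and an output directory, a caller may want to check both before invoking it. The following is a minimal sketch of such a pre-flight check. PredictionDirsSketch and prepare are hypothetical names, not part of JTBD, and whether doPrediction itself creates a missing output directory is not stated here, so the sketch creates it defensively. The model path in the commented call is a placeholder.

```java
import java.io.File;

// Hypothetical pre-flight check, not part of JTBD.
class PredictionDirsSketch {

    // Verifies that inDir is a directory, creates outDir if it does not exist,
    // and returns the number of regular files found directly in inDir.
    static int prepare(File inDir, File outDir) {
        if (!inDir.isDirectory()) {
            throw new IllegalArgumentException("not a directory: " + inDir);
        }
        if (!outDir.isDirectory() && !outDir.mkdirs()) {
            throw new IllegalStateException("could not create " + outDir);
        }
        File[] entries = inDir.listFiles(File::isFile);
        return entries == null ? 0 : entries.length;
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"),
            "jtbd-sketch-" + System.nanoTime());
        File inDir = new File(base, "in");
        File outDir = new File(base, "out");
        inDir.mkdirs();

        int documents = prepare(inDir, outDir);
        System.out.println(documents + " documents to tokenize, output directory ready: "
            + outDir.isDirectory());

        // With JTBD on the classpath (declares IOException, so handle or rethrow it):
        // TokenizerApplication.doPrediction(inDir, outDir, "path/to/model");
    }
}
```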
-
doTraining
public static void doTraining(java.io.File orgSentencesFile, java.io.File tokSentencesFile, java.lang.String modelFilename)

Train a model.

Parameters:
    orgSentencesFile -
    tokSentencesFile -
    modelFilename -
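doTraining receives two files, one with original sentences and one with tokenized sentences. The following is a minimal sketch of a sanity check that could be run on their contents first. It rests on two assumptions that are not confirmed by this page: that both files hold one sentence per line in the same order, and that tokenization only inserts whitespace. TrainingDataCheckSketch and requireAligned are hypothetical names, not part of JTBD, and the sentences and file names are placeholders.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sanity check, not part of JTBD.
class TrainingDataCheckSketch {

    // Returns the number of sentence pairs, or throws if the lists are not parallel.
    static int requireAligned(List<String> orgSentences, List<String> tokSentences) {
        if (orgSentences.size() != tokSentences.size()) {
            throw new IllegalArgumentException(
                "sentence counts differ: " + orgSentences.size()
                    + " vs " + tokSentences.size());
        }
        for (int i = 0; i < orgSentences.size(); i++) {
            // Assumption: tokenization only inserts whitespace, so both sides
            // must be identical once all whitespace is removed.
            String org = orgSentences.get(i).replaceAll("\\s+", "");
            String tok = tokSentences.get(i).replaceAll("\\s+", "");
            if (!org.equals(tok)) {
                throw new IllegalArgumentException(
                    "pair " + i + " differs in more than whitespace");
            }
        }
        return orgSentences.size();
    }

    public static void main(String[] args) {
        List<String> org = Arrays.asList("IL-2 binds (weakly).", "It works.");
        List<String> tok = Arrays.asList("IL-2 binds ( weakly ) .", "It works .");
        System.out.println(requireAligned(org, tok) + " aligned sentence pairs");

        // In practice the lists would be read from the two files, for example with
        // java.nio.file.Files.readAllLines, before training with JTBD on the classpath:
        // TokenizerApplication.doTraining(
        //     new java.io.File("org.txt"), new java.io.File("tok.txt"), "path/to/model");
    }
}
```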
-
main
public static void main(java.lang.String[] args) throws java.io.IOException

Throws:
    java.io.IOException