Class POSTagger

    • Constructor Detail

      • POSTagger

        public POSTagger()
        Default constructor.
      • POSTagger

        public POSTagger(File featureConfigFile)
        Constructor that takes a feature configuration file.
        Parameters:
        featureConfigFile - the feature configuration file
    • Method Detail

      • isTrained

        public boolean isTrained()
        Returns true when the model has been successfully trained.
        Returns:
        true if the model has been trained
      • train

        public void train(ArrayList<Sentence> sentences)
        Trains an NE model (based on CRF). Once trained, the model is stored internally; it can be saved to disk with the writeModel method.
        Parameters:
        sentences - training data, an ArrayList of Sentence objects
      • predictForUIMA

        public void predictForUIMA(Sentence sentence)
        Predicts the entity labels using the model. This method is required by UIMA-JNET.
        Parameters:
        sentence - a Sentence object containing all units (= tokens) of that sentence
      • predictForCLI

        public ArrayList<String> predictForCLI(ArrayList<Sentence> sentences)
        Predicts the entity labels using a previously learned model. This method is used by the stand-alone version of JNET (for UIMA-JNET, see predictForUIMA). The output is an ArrayList of strings in IOB format.
        Parameters:
        sentences - an ArrayList of Sentence objects
        Returns:
        IOB output for the sentences to be predicted. Each element of the ArrayList is a string holding one word and its label, separated by a tab ("token\tlabel").
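        The "token\tlabel" element layout described above can be taken apart with plain string handling. The following is a minimal sketch only: the class name IobOutputDemo, the helper splitTokenAndLabel, and the sample tokens and labels are all hypothetical, and the list built in main merely stands in for what predictForCLI would return.

```java
import java.util.ArrayList;
import java.util.List;

public class IobOutputDemo {

    /**
     * Splits one "token\tlabel" line into {token, label}.
     * The last tab is used as the separator, so the label is always
     * the final field (hypothetical helper, not part of POSTagger).
     */
    public static String[] splitTokenAndLabel(String line) {
        int tab = line.lastIndexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("expected token<TAB>label, got: " + line);
        }
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        // Stand-in for the ArrayList<String> returned by predictForCLI;
        // the tokens and labels below are made up for illustration.
        List<String> output = new ArrayList<>();
        output.add("IL-2\tB-gene");
        output.add("expression\tO");

        for (String line : output) {
            String[] pair = splitTokenAndLabel(line);
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

        Splitting on the last tab rather than the first is a design choice made for this sketch; it keeps the label intact even if a token were ever to contain a tab.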
      • writeModel

        public void writeModel(String filename)
        Saves the learned model to disk using Java's object serialization.
        Parameters:
        filename - where to write the model (full path)
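        As background on the mechanism named above, here is a minimal sketch of a Java object serialization round trip. It is not the code behind writeModel: ToyModel, SerializationDemo, write and read are hypothetical names, and the exact file layout that writeModel produces (for example, whether the stream is compressed) is not stated on this page.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationDemo {

    /** Hypothetical stand-in for a trained model; not a class from this library. */
    public static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String name;
        public final int iterations;

        public ToyModel(String name, int iterations) {
            this.name = name;
            this.iterations = iterations;
        }
    }

    /** Serializes any Serializable object to the given path. */
    public static void write(Object model, String filename) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(filename))) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the object stored at the given path. */
    public static Object read(String filename) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(filename))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class of the stored object is not on the classpath", e);
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("toy-model", ".ser");
        tmp.deleteOnExit();

        // A full path is passed, mirroring the note on the filename parameter.
        write(new ToyModel("demo", 30), tmp.getAbsolutePath());
        ToyModel restored = (ToyModel) read(tmp.getAbsolutePath());
        System.out.println(restored.name + " " + restored.iterations);
    }
}
```

        Serialized objects can generally only be read back by code that has compatible class definitions on its classpath, which is worth keeping in mind when moving model files between library versions.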
      • getModel

        public Object getModel()
        Returns the model.
      • setFeatureConfig

        public void setFeatureConfig(Properties featureConfig)
      • getFeatureConfig

        public Properties getFeatureConfig()
      • PPDtoUnits

        public Sentence PPDtoUnits(String sentence)
        Takes a sentence in piped format and returns the corresponding unit sentence as a Sentence object.
        Parameters:
        sentence - in piped format to be converted
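        The piped format itself is not specified on this page. Purely as an illustration, the sketch below assumes whitespace-separated fields of the form "token|label" and converts them into simple token/label pairs. PipedFormatDemo, TokenUnit and parsePiped are hypothetical names, and the real Sentence and unit classes of this library may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

public class PipedFormatDemo {

    /** Hypothetical stand-in for one unit (a token plus its label). */
    public static final class TokenUnit {
        public final String token;
        public final String label;

        public TokenUnit(String token, String label) {
            this.token = token;
            this.label = label;
        }
    }

    /**
     * Parses a sentence given as "token|label token|label ...".
     * ASSUMPTION: this layout is a guess made for illustration only.
     */
    public static List<TokenUnit> parsePiped(String sentence) {
        List<TokenUnit> units = new ArrayList<>();
        for (String field : sentence.trim().split("\\s+")) {
            if (field.isEmpty()) {
                continue; // blank input yields a single empty field
            }
            // Split on the last pipe so tokens containing '|' stay intact.
            int bar = field.lastIndexOf('|');
            if (bar < 0) {
                throw new IllegalArgumentException("expected token|label, got: " + field);
            }
            units.add(new TokenUnit(field.substring(0, bar), field.substring(bar + 1)));
        }
        return units;
    }

    public static void main(String[] args) {
        // Sample input with made-up labels.
        for (TokenUnit unit : parsePiped("The|DT cat|NN sleeps|VBZ")) {
            System.out.println(unit.token + "\t" + unit.label);
        }
    }
}
```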
      • getNumber_Iterations

        public int getNumber_Iterations()
      • set_Number_Iterations

        public void set_Number_Iterations(int number_iter)