Class ConsistencyPreservation


  • public class ConsistencyPreservation
    extends java.lang.Object
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String MODE_ACRO2FULL  
      static java.lang.String MODE_FULL2ACRO  
      static java.lang.String MODE_STRING
      String matches will be expanded to token boundaries
      static java.lang.String MODE_STRING_TOKEN_BOUNDARIES
      If set, only create new annotations if the matched string begins and ends exactly with token borders.
    • Constructor Summary

      Constructors 
      Constructor Description
      ConsistencyPreservation​(java.lang.String modesString)
      Builds the modes used during consistency preservation from a comma-separated list of modes.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void acroMatch​(org.apache.uima.jcas.JCas aJCas, java.util.Set<java.lang.String> entityMentionClassnames)  
      void stringMatch​(org.apache.uima.jcas.JCas aJCas, java.util.TreeSet<java.lang.String> entityMentionClassnames, double confidenceThresholdForConsistencyPreservation)
      Consistency preservation based on (exact) string matching.
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • MODE_STRING

        public static final java.lang.String MODE_STRING
        String matches will be expanded to token boundaries
        See Also:
        Constant Field Values
      • MODE_STRING_TOKEN_BOUNDARIES

        public static final java.lang.String MODE_STRING_TOKEN_BOUNDARIES
        If set, only create new annotations if the matched string begins and ends exactly with token borders. This avoids partial token matches which are then expanded to the whole token. Should be used for full texts.
        See Also:
        Constant Field Values
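The difference between the two string modes can be illustrated with a minimal sketch. It is not this class's implementation: the class `TokenBoundarySketch`, its method names, and the representation of tokens as `[begin, end)` offset pairs are all assumptions made for illustration. One method expands a partial match to the enclosing tokens (the behaviour described for MODE_STRING), the other accepts a match only if it starts and ends on token borders (the behaviour described for MODE_STRING_TOKEN_BOUNDARIES).

```java
import java.util.Arrays;
import java.util.List;

public class TokenBoundarySketch {

    // Hypothetical helper: true only if the match begins at some token start
    // and ends at some token end. Tokens are [begin, end) offset pairs.
    static boolean alignsWithTokens(int begin, int end, List<int[]> tokens) {
        boolean startsAtToken = false;
        boolean endsAtToken = false;
        for (int[] t : tokens) {
            if (t[0] == begin) startsAtToken = true;
            if (t[1] == end) endsAtToken = true;
        }
        return startsAtToken && endsAtToken;
    }

    // Hypothetical helper: widens the match so it covers every token it overlaps.
    static int[] expandToTokens(int begin, int end, List<int[]> tokens) {
        int newBegin = begin;
        int newEnd = end;
        for (int[] t : tokens) {
            boolean overlaps = t[0] < end && t[1] > begin;
            if (overlaps) {
                newBegin = Math.min(newBegin, t[0]);
                newEnd = Math.max(newEnd, t[1]);
            }
        }
        return new int[] {newBegin, newEnd};
    }

    public static void main(String[] args) {
        // Text: "IL2 and IL21 bind"; token offsets written out by hand.
        List<int[]> tokens = Arrays.asList(
                new int[] {0, 3}, new int[] {4, 7}, new int[] {8, 12}, new int[] {13, 17});

        // The string "IL2" also occurs inside the token "IL21" at offsets 8..11.
        System.out.println(alignsWithTokens(8, 11, tokens)); // false: ends inside a token
        int[] expanded = expandToTokens(8, 11, tokens);
        System.out.println(expanded[0] + "-" + expanded[1]); // 8-12: whole token "IL21"
    }
}
```

In this example the expanding strategy would label all of "IL21" because "IL2" was annotated elsewhere, while the strict strategy would skip the match. This is the kind of partial token match the field description says is avoided.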
    • Constructor Detail

      • ConsistencyPreservation

        public ConsistencyPreservation​(java.lang.String modesString)
                                throws org.apache.uima.resource.ResourceInitializationException
        Builds the modes used during consistency preservation from a comma-separated list of modes.
        Parameters:
        modesString - comma-separated list of modes to be used
        Throws:
        org.apache.uima.analysis_engine.AnalysisEngineProcessException
        org.apache.uima.resource.ResourceInitializationException
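As a rough illustration of what "a comma-separated list of modes" means, the following sketch splits such a string into individual mode names. It is a standalone assumption, not the constructor's actual code: the class `ModeParsingSketch`, the method `parseModes`, and the mode names in the example are placeholders. The real values of the MODE_* constants are listed under Constant Field Values, and how the constructor treats unknown modes is not documented here.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ModeParsingSketch {

    // Hypothetical helper: splits on commas, trims whitespace, drops empty entries.
    static Set<String> parseModes(String modesString) {
        Set<String> modes = new LinkedHashSet<>();
        if (modesString == null) {
            return modes;
        }
        for (String part : modesString.split(",")) {
            String mode = part.trim();
            if (!mode.isEmpty()) {
                modes.add(mode);
            }
        }
        return modes;
    }

    public static void main(String[] args) {
        // "string" and "acro2full" are placeholder mode names for this sketch.
        Set<String> modes = parseModes("string, acro2full,,");
        System.out.println(modes); // [string, acro2full]
    }
}
```

A `LinkedHashSet` is used here so that duplicates collapse while the order given in the string is preserved.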
    • Method Detail

      • acroMatch

        public void acroMatch​(org.apache.uima.jcas.JCas aJCas,
                              java.util.Set<java.lang.String> entityMentionClassnames)
                       throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
        Throws:
        org.apache.uima.analysis_engine.AnalysisEngineProcessException
      • stringMatch

        public void stringMatch​(org.apache.uima.jcas.JCas aJCas,
                                java.util.TreeSet<java.lang.String> entityMentionClassnames,
                                double confidenceThresholdForConsistencyPreservation)
                         throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
        Consistency preservation based on (exact) string matching. If a string was annotated once as an entity, all other occurrences of this string get the same label. Used for mode: _string_. TODO: a more intelligent (voting) mechanism is needed to avoid false positives.
        Parameters:
        aJCas -
        entityMentionClassnames -
        confidenceThresholdForConsistencyPreservation -
        Throws:
        org.apache.uima.analysis_engine.AnalysisEngineProcessException
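The idea described for stringMatch (a string annotated once as an entity passes its label to all other occurrences) can be sketched in plain Java without UIMA. Everything below is an assumption for illustration: the `Mention` class, the `propagate` method, and the rule of skipping spans that overlap an existing mention are not taken from this class, and the confidence threshold parameter is not modelled because its semantics are not documented here.

```java
import java.util.ArrayList;
import java.util.List;

public class StringMatchSketch {

    // Hypothetical stand-in for an entity annotation: [begin, end) offsets plus a label.
    static final class Mention {
        final int begin;
        final int end;
        final String label;

        Mention(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }
    }

    // True if [begin, end) overlaps any mention already in the list.
    static boolean overlapsAny(List<Mention> mentions, int begin, int end) {
        for (Mention m : mentions) {
            if (m.begin < end && m.end > begin) {
                return true;
            }
        }
        return false;
    }

    // For every known mention, label each other exact occurrence of its covered text.
    static List<Mention> propagate(String text, List<Mention> known) {
        List<Mention> result = new ArrayList<>(known);
        for (Mention m : known) {
            String surface = text.substring(m.begin, m.end);
            if (surface.isEmpty()) {
                continue; // nothing to search for
            }
            int from = 0;
            int idx;
            while ((idx = text.indexOf(surface, from)) >= 0) {
                int end = idx + surface.length();
                if (!overlapsAny(result, idx, end)) {
                    result.add(new Mention(idx, end, m.label));
                }
                from = end;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "BRCA1 is mutated. Loss of BRCA1 matters.";
        List<Mention> known = new ArrayList<>();
        known.add(new Mention(0, 5, "Gene")); // only the first occurrence is annotated

        for (Mention m : propagate(text, known)) {
            System.out.println(m.begin + "-" + m.end + " " + m.label);
        }
    }
}
```

Because every exact occurrence receives the label unconditionally, a single wrong annotation spreads to all matches. That is the false-positive risk the TODO above refers to.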
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object