Class ConsistencyPreservation


  • public class ConsistencyPreservation
    extends Object
    • Field Detail

      • MODE_STRING

        public static final String MODE_STRING
        In this mode, string matches are expanded to token boundaries.
        See Also:
        Constant Field Values
      • MODE_STRING_TOKEN_BOUNDARIES

        public static final String MODE_STRING_TOKEN_BOUNDARIES
        If set, new annotations are created only if the matched string begins and ends exactly at token boundaries. This avoids partial token matches that would otherwise be expanded to the whole token. Should be used for full texts.
        See Also:
        Constant Field Values
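
        The difference between the two modes can be illustrated with a minimal sketch. It is not the class's actual implementation: tokens are modelled as plain int[]{begin, end} offsets instead of UIMA annotations, and the class and method names (TokenBoundarySketch, alignsWithTokens, expandToTokens) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of ConsistencyPreservation.
class TokenBoundarySketch {

    // Strict check, as described for MODE_STRING_TOKEN_BOUNDARIES:
    // the match must start where some token starts and end where some token ends.
    static boolean alignsWithTokens(int begin, int end, List<int[]> tokens) {
        boolean startsAtToken = false;
        boolean endsAtToken = false;
        for (int[] t : tokens) {
            if (t[0] == begin) startsAtToken = true;
            if (t[1] == end) endsAtToken = true;
        }
        return startsAtToken && endsAtToken;
    }

    // Expansion, as described for MODE_STRING:
    // grow the match so that it covers every token it overlaps.
    static int[] expandToTokens(int begin, int end, List<int[]> tokens) {
        int newBegin = begin;
        int newEnd = end;
        for (int[] t : tokens) {
            boolean overlaps = t[0] < end && t[1] > begin;
            if (overlaps) {
                newBegin = Math.min(newBegin, t[0]);
                newEnd = Math.max(newEnd, t[1]);
            }
        }
        return new int[] {newBegin, newEnd};
    }

    public static void main(String[] args) {
        // Text "the catalog" with tokens "the" [0,3) and "catalog" [4,11).
        List<int[]> tokens = Arrays.asList(new int[] {0, 3}, new int[] {4, 11});
        // The string "cat" matches at [4,7), inside the token "catalog".
        System.out.println(alignsWithTokens(4, 7, tokens));
        System.out.println(Arrays.toString(expandToTokens(4, 7, tokens)));
    }
}
```

        In the sketch, the partial match "cat" is rejected by the strict check but would be widened to the whole token "catalog" by the expansion, which is the kind of false positive the token-boundary mode is meant to avoid.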
    • Constructor Detail

      • ConsistencyPreservation

        public ConsistencyPreservation(String modesString)
                                throws org.apache.uima.resource.ResourceInitializationException
        Builds the modes used during consistency preservation from a string containing a comma-separated list of modes.
        Parameters:
        modesString - comma-separated list of modes to be used
        Throws:
        org.apache.uima.analysis_engine.AnalysisEngineProcessException
        org.apache.uima.resource.ResourceInitializationException
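
        How a comma-separated mode list might be split into individual modes is sketched below. This is an assumption about the general approach, not the constructor's actual code: the helper parseModes and the class ModeParsingSketch are hypothetical, and the sketch performs no validation of mode names (the real constructor may reject unknown modes with a ResourceInitializationException).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of ConsistencyPreservation.
class ModeParsingSketch {

    // Splits on commas, trims whitespace, and drops empty entries.
    // A LinkedHashSet keeps the modes unique and in their original order.
    static Set<String> parseModes(String modesString) {
        if (modesString == null || modesString.trim().isEmpty()) {
            return new LinkedHashSet<>();
        }
        return Arrays.stream(modesString.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        // "otherMode" is a placeholder name, not a documented mode.
        System.out.println(parseModes("string, otherMode"));
    }
}
```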
    • Method Detail

      • acroMatch

        public void acroMatch(org.apache.uima.jcas.JCas aJCas,
                              Set<String> entityMentionClassnames)
                       throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
        Throws:
        org.apache.uima.analysis_engine.AnalysisEngineProcessException
      • stringMatch

        public void stringMatch(org.apache.uima.jcas.JCas aJCas,
                                TreeSet<String> entityMentionClassnames,
                                double confidenceThresholdForConsistencyPreservation)
                         throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
        Consistency preservation based on (exact) string matching. If a string was annotated once as an entity, all other occurrences of that string get the same label. Applies to mode: _string_. TODO: a more intelligent (voting) mechanism is needed to avoid false positives.
        Parameters:
        aJCas -
        entityMentionClassnames -
        confidenceThresholdForConsistencyPreservation -
        Throws:
        org.apache.uima.analysis_engine.AnalysisEngineProcessException
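
        The idea of exact string matching described for stringMatch can be sketched on plain strings as follows. This is a simplified illustration, not the method's implementation: it works on a String and a map of known mentions rather than on a JCas, the names StringMatchSketch, Span, and propagate are hypothetical, and the confidence threshold is left out because its exact semantics are not documented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of ConsistencyPreservation.
class StringMatchSketch {

    // A labelled character span [begin, end) in the text.
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }

        @Override
        public String toString() {
            return label + "[" + begin + "," + end + ")";
        }
    }

    // For every known mention string, label each exact occurrence in the text
    // with the label that mention already carries.
    static List<Span> propagate(String text, Map<String, String> knownMentions) {
        List<Span> spans = new ArrayList<>();
        for (Map.Entry<String, String> entry : knownMentions.entrySet()) {
            String mention = entry.getKey();
            if (mention.isEmpty()) {
                continue; // an empty string would match everywhere
            }
            int from = 0;
            int pos;
            while ((pos = text.indexOf(mention, from)) >= 0) {
                spans.add(new Span(pos, pos + mention.length(), entry.getValue()));
                from = pos + mention.length();
            }
        }
        return spans;
    }
}
```

        Because every occurrence receives the label regardless of context, a single wrong annotation is copied throughout the document, which is the false-positive risk the TODO above refers to.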