Class NoPunctuationTokenizer

java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.Tokenizer
org.apache.lucene.analysis.util.CharTokenizer
org.fryske_akademy.exist.lucene.NoPunctuationTokenizer
All Implemented Interfaces:
Closeable, AutoCloseable

public class NoPunctuationTokenizer extends org.apache.lucene.analysis.util.CharTokenizer
A tokenizer that treats every character as a token character (tokenChar) except whitespace and the punctuation characters . , ; ? ! : [ ] ( ) { } " '
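The rule above can be illustrated with a standalone sketch that does not depend on Lucene. It rests on two assumptions that this page does not confirm: that the excluded set is exactly the characters listed in the description, and that "whitespace" means `Character.isWhitespace`. The class name `TokenCharSketch` is hypothetical, and the actual contents of `PUNCTS` may differ.

```java
class TokenCharSketch {
    // Assumption: mirrors the characters listed in the class description;
    // the actual contents of NoPunctuationTokenizer.PUNCTS may differ.
    static final char[] PUNCTS = {
        '.', ',', ';', '?', '!', ':', '[', ']', '(', ')', '{', '}', '"', '\''
    };

    // Assumption: "whitespace" is interpreted as Character.isWhitespace.
    static boolean isTokenChar(int c) {
        if (Character.isWhitespace(c)) {
            return false;
        }
        for (char p : PUNCTS) {
            if (p == c) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A letter, a hyphen, an apostrophe, a comma, a space,
        // and a supplementary code point (U+1F600).
        int[] samples = {'a', '-', '\'', ',', ' ', 0x1F600};
        for (int cp : samples) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + " -> " + isTokenChar(cp));
        }
    }
}
```

Under these assumptions a hyphen stays inside a token, while an apostrophe ends one, so a contraction such as it's would be split in two.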
  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource

    org.apache.lucene.util.AttributeSource.State
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final char[]
    PUNCTS
     

    Fields inherited from class org.apache.lucene.analysis.Tokenizer

    input

    Fields inherited from class org.apache.lucene.analysis.TokenStream

    DEFAULT_TOKEN_ATTRIBUTE_FACTORY

    Fields inherited from class org.apache.lucene.util.AttributeSource

    DEFAULT_ATTRIBUTE_FACTORY
  • Constructor Summary

    Constructors
    Constructor
    Description
    NoPunctuationTokenizer(Reader input)
     
    NoPunctuationTokenizer(org.apache.lucene.util.AttributeFactory factory, Reader input)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected boolean
    isTokenChar(int c)
     

    Methods inherited from class org.apache.lucene.analysis.util.CharTokenizer

    end, incrementToken, normalize, reset

    Methods inherited from class org.apache.lucene.analysis.Tokenizer

    close, correctOffset, setReader

    Methods inherited from class org.apache.lucene.util.AttributeSource

    addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Field Details

    • PUNCTS

      public static final char[] PUNCTS
  • Constructor Details

    • NoPunctuationTokenizer

      public NoPunctuationTokenizer(Reader input)
    • NoPunctuationTokenizer

      public NoPunctuationTokenizer(org.apache.lucene.util.AttributeFactory factory, Reader input)
  • Method Details

    • isTokenChar

      protected boolean isTokenChar(int c)
      Specified by:
      isTokenChar in class org.apache.lucene.analysis.util.CharTokenizer
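
      A minimal sketch of the relationship described by "Specified by": the base class owns the scanning loop and asks the subclass, one code point at a time, whether that code point belongs to a token. `CharSplitterSketch` is a hypothetical stand-in for `CharTokenizer`, not the Lucene class, and the excluded character set is assumed from the class description rather than read from `PUNCTS`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for CharTokenizer: it owns the loop and
// defers the per-character decision to isTokenChar.
abstract class CharSplitterSketch {
    protected abstract boolean isTokenChar(int c);

    final List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isTokenChar(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // A non-token character closes the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}

class NoPunctuationSplitterSketch extends CharSplitterSketch {
    // Assumption: the characters listed in the class description.
    private static final String EXCLUDED = ".,;?!:[](){}\"'";

    @Override
    protected boolean isTokenChar(int c) {
        return !Character.isWhitespace(c) && EXCLUDED.indexOf(c) < 0;
    }

    public static void main(String[] args) {
        System.out.println(
            new NoPunctuationSplitterSketch().split("Hello, world! (well-known)"));
    }
}
```

      With this stand-in, the sample input yields the tokens Hello, world and well-known. The parameter is an `int` because the predicate receives a Unicode code point rather than a `char`.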