org.dspace.search
Class DSTokenizer

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by org.apache.lucene.analysis.TokenStream
          extended by org.apache.lucene.analysis.Tokenizer
              extended by org.apache.lucene.analysis.CharTokenizer
                  extended by org.dspace.search.DSTokenizer
All Implemented Interfaces:
Closeable

public final class DSTokenizer
extends org.apache.lucene.analysis.CharTokenizer

Customized Lucene tokenizer, used because the standard tokenizer excludes numbers from indexing and querying.


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
 
Field Summary
 
Fields inherited from class org.apache.lucene.analysis.Tokenizer
input
 
Constructor Summary
DSTokenizer(org.apache.lucene.util.Version version, Reader in)
          Construct a new DSTokenizer.
 
Method Summary
protected  boolean isTokenChar(int c)
          Collects only characters which do not satisfy Character.isWhitespace(char).
protected  int normalize(int c)
          Called on each token character to normalize it before it is added to the token.
 
Methods inherited from class org.apache.lucene.analysis.CharTokenizer
end, incrementToken, isTokenChar, normalize, reset
 
Methods inherited from class org.apache.lucene.analysis.Tokenizer
close, correctOffset
 
Methods inherited from class org.apache.lucene.analysis.TokenStream
reset
 
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DSTokenizer

public DSTokenizer(org.apache.lucene.util.Version version,
                   Reader in)
Construct a new DSTokenizer.

Parameters:
version - Lucene version number
in - Reader supplying the text to tokenize

Method Detail

normalize

protected int normalize(int c)
Called on each token character to normalize it before it is added to the token.

Overrides:
normalize in class org.apache.lucene.analysis.CharTokenizer

isTokenChar

protected boolean isTokenChar(int c)
Collects only characters which do not satisfy Character.isWhitespace(char).

Overrides:
isTokenChar in class org.apache.lucene.analysis.CharTokenizer
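
The way a CharTokenizer subclass combines isTokenChar and normalize can be illustrated with a minimal sketch that does not use Lucene at all. The class name TokenizerSketch and the method tokenize are hypothetical. The token rule follows the isTokenChar description above (any non-whitespace character belongs to a token), while the lowercasing in normalize is an assumption that this page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not the DSTokenizer source.
public class TokenizerSketch {

    // Rule taken from the isTokenChar description:
    // every non-whitespace code point is part of a token.
    static boolean isTokenChar(int c) {
        return !Character.isWhitespace(c);
    }

    // Assumption: normalization lowercases each character.
    static int normalize(int c) {
        return Character.toLowerCase(c);
    }

    // Splits the text into runs of token characters, normalizing each one.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            if (isTokenChar(c)) {
                current.appendCodePoint(normalize(c));
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [dspace, 1.8, item, 42]
        System.out.println(tokenize("DSpace 1.8 Item 42"));
    }
}
```

Under these assumptions, digits survive tokenization, which is the behavior the class description asks for.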


Copyright © 2012 DuraSpace. All Rights Reserved.