public final class XmlTextTokenStream extends TextOffsetTokenStream
| Modifier and Type | Field and Description |
|---|---|
| protected Reader | charStream |
| protected Iterator<net.sf.saxon.s9api.XdmNode> | contentIter |
| protected net.sf.saxon.s9api.XdmNode | curNode |
| protected static net.sf.saxon.s9api.XdmSequenceIterator | EMPTY |
| protected org.apache.lucene.analysis.tokenattributes.CharTermAttribute | termAtt |
| Constructor and Description |
|---|
| XmlTextTokenStream(String fieldName, org.apache.lucene.analysis.Analyzer analyzer, org.apache.lucene.analysis.TokenStream wrapped, net.sf.saxon.s9api.XdmNode doc, Offsets offsets) Creates a TokenStream returning tokens drawn from the text content of the document. |
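The idea of drawing tokens from a document's text content, with tokenization restarted at each node boundary, can be illustrated with a minimal JDK-only sketch. This is not the implementation of this class: `TextNodeTokenSketch` and its `tokens` method are hypothetical names, a DOM tree stands in for the Saxon `XdmNode`, and a simple word regex stands in for the Lucene `Analyzer`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical stand-in (not the documented class): tokenizes each text node separately. */
class TextNodeTokenSketch {
    // Assumption: a "token" is a run of word characters; a real analyzer would decide this.
    private static final Pattern WORD = Pattern.compile("\\w+");

    /** Returns lower-cased word tokens from every text node, in document order. */
    static List<String> tokens(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        collect(doc.getDocumentElement(), out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            // A fresh matcher per text node plays the role of resetting
            // the wrapped token stream at a node boundary.
            Matcher m = WORD.matcher(node.getNodeValue());
            while (m.find()) {
                out.add(m.group().toLowerCase(Locale.ROOT));
            }
            return;
        }
        // Recurse into child nodes in document order; attributes are not children.
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, out);
        }
    }
}
```

In this sketch, `TextNodeTokenSketch.tokens("<p>split<b>word</b></p>")` yields `split` and `word` as two separate tokens, because each text node is tokenized on its own and no token spans a node boundary.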
| Modifier and Type | Method and Description |
|---|---|
| org.apache.lucene.analysis.TokenStream | getWrappedTokenStream() |
| boolean | incrementToken() |
| protected boolean | incrementWrappedTokenStream() |
| void | reset() |
| void | reset(Reader reader) |
| protected void | setWrappedTokenStream(org.apache.lucene.analysis.TokenStream wrapped) |
Inherited methods: resetTokenizer
Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState
protected net.sf.saxon.s9api.XdmNode curNode
protected Iterator<net.sf.saxon.s9api.XdmNode> contentIter
protected org.apache.lucene.analysis.tokenattributes.CharTermAttribute termAtt
protected Reader charStream
protected static final net.sf.saxon.s9api.XdmSequenceIterator EMPTY
public XmlTextTokenStream(String fieldName, org.apache.lucene.analysis.Analyzer analyzer, org.apache.lucene.analysis.TokenStream wrapped, net.sf.saxon.s9api.XdmNode doc, Offsets offsets)
Parameters:
fieldName - nominally: the field to be analyzed; the analyzer receives this when the token stream is reset at node boundaries
analyzer - specifies what text processing to apply to node text
wrapped - a TokenStream generated by the analyzer
doc - tokens will be drawn from all of the text in this document
offsets - if provided, character offsets are captured in this object. In theory this can be used for faster highlighting, but until that is proven, this should always be null.
public void reset()
throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenStream
Throws: IOException
public void reset(Reader reader) throws IOException
Throws: IOException
public boolean incrementToken()
throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException
public org.apache.lucene.analysis.TokenStream getWrappedTokenStream()
protected void setWrappedTokenStream(org.apache.lucene.analysis.TokenStream wrapped)
protected boolean incrementWrappedTokenStream()
throws IOException
Throws: IOException
Copyright © 2013. All Rights Reserved.