public class SimpleTokenizer extends Object
| Constructor and Description |
|---|
| SimpleTokenizer() Initializes a new SimpleTokenizer object. |
| Modifier and Type | Method and Description |
|---|---|
| SDocumentGraph | getDocumentGraph() |
| void | setDocumentGraph(SDocumentGraph documentGraph) |
| List<SToken> | tokenize(STextualDS textualDSs, Character... separator) Sets the STextualDS to be tokenized. |
| List<SToken> | tokenize(STextualDS textualDS, Integer startPos, Integer endPos, Character... separator) Sets the STextualDS to be tokenized and the language of the text. |
public void setDocumentGraph(SDocumentGraph documentGraph)

public SDocumentGraph getDocumentGraph()

public List<SToken> tokenize(STextualDS textualDSs, Character... separator)
Sets the STextualDS to be tokenized. Its language will be detected automatically if possible.
textualDSs - STextualDS object containing the text to be tokenized

public List<SToken> tokenize(STextualDS textualDS, Integer startPos, Integer endPos, Character... separator)
Sets the STextualDS to be tokenized and the language of the text. If language is null, it will be detected automatically if possible.
textualDS - STextualDS object containing the text to be tokenized
startPos - start position, if the text to be tokenized is a subset (0 is assumed if set to null)
endPos - end position, if the text to be tokenized is a subset (length of the text is assumed if set to null)

Copyright © 2009–2016 Humboldt-Universität zu Berlin, INRIA. All rights reserved.
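The null defaults documented for startPos and endPos can be illustrated with a small standalone sketch. It is not the library's implementation and does not use the Salt types: the class name SeparatorTokenizerSketch is hypothetical, plain String text stands in for STextualDS, and int[] {start, end} offset pairs stand in for SToken. Splitting on the given separator characters, and falling back to a single blank when none are passed, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for illustration; not part of the documented API. */
final class SeparatorTokenizerSketch {

    /**
     * Returns {start, end} offsets (end exclusive) of every token found
     * between startPos and endPos of the given text.
     */
    static List<int[]> tokenize(String text, Integer startPos, Integer endPos,
                                Character... separator) {
        // Documented defaults: null start means 0, null end means the text length.
        int start = (startPos == null) ? 0 : startPos;
        int end = (endPos == null) ? text.length() : endPos;
        if (start < 0 || end > text.length() || start > end) {
            throw new IndexOutOfBoundsException("invalid range " + start + ".." + end);
        }

        // Assumption: when no separator is passed, split on a single blank.
        Set<Character> separators = new HashSet<>(Arrays.asList(separator));
        if (separators.isEmpty()) {
            separators.add(' ');
        }

        List<int[]> spans = new ArrayList<>();
        int tokenStart = -1; // -1 means "currently not inside a token"
        for (int i = start; i < end; i++) {
            boolean isSeparator = separators.contains(text.charAt(i));
            if (!isSeparator && tokenStart < 0) {
                tokenStart = i; // a new token begins here
            } else if (isSeparator && tokenStart >= 0) {
                spans.add(new int[] {tokenStart, i}); // the token ends before the separator
                tokenStart = -1;
            }
        }
        if (tokenStart >= 0) {
            spans.add(new int[] {tokenStart, end}); // token running up to the range end
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "This is a test.";
        // Both positions null: the whole text is tokenized.
        for (int[] span : tokenize(text, null, null, ' ')) {
            System.out.println(span[0] + "-" + span[1] + ": "
                    + text.substring(span[0], span[1]));
        }
    }
}
```

Returning offsets rather than substrings mirrors the idea that a token points back into its text. In this sketch, passing 5 and 9 instead of the two null values restricts tokenization to that character range of the same text.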