| Package | Description |
|---|---|
| eus.ixa.ixa.pipe.ml.tok | Package containing the Sentence Segmenter and Tokenizer classes. |
| eus.ixa.ixa.pipe.ml.utils | Utility classes. |


| Modifier and Type | Method and Description |
|---|---|
| `Token` | `TokenFactory.createToken(String tokenString, int offset, int length)` Constructs a Token from a String, together with the offset and length from which the start and end positions of the Token are calculated. |

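The relationship between offset, length, and the start and end positions can be illustrated with a minimal sketch. `SimpleToken`, `start()`, and `end()` are hypothetical stand-ins, not the library's `Token` API, and the sketch assumes an exclusive end position computed as offset plus length.

```java
// Hypothetical stand-in for the library's Token; all names here are assumptions.
final class TokenSpanSketch {

    static final class SimpleToken {
        private final String value;
        private final int offset;
        private final int length;

        SimpleToken(String value, int offset, int length) {
            this.value = value;
            this.offset = offset;
            this.length = length;
        }

        String value() { return value; }

        // Start position: index of the token's first character in the source text.
        int start() { return offset; }

        // End position (exclusive, by assumption): offset plus length.
        int end() { return offset + length; }
    }

    // Mirrors the shape of createToken(String, int, int); not the real factory.
    static SimpleToken createToken(String tokenString, int offset, int length) {
        return new SimpleToken(tokenString, offset, length);
    }

    public static void main(String[] args) {
        String text = "Hello world";
        SimpleToken token = createToken("world", 6, 5);
        // Recover the token's surface form from its span in the source text.
        System.out.println(text.substring(token.start(), token.end()));
    }
}
```

Storing the offset and length rather than the end position keeps the token self-describing; whether the real `Token` exposes its end position this way is not stated on this page.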

| Modifier and Type | Method and Description |
|---|---|
| `List<List<Token>>` | `Tokenizer.tokenize(String[] sentence)` |
| `List<List<Token>>` | `RuleBasedTokenizer.tokenize(String[] sentences)` |

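The nested return type suggests one inner list of tokens per input sentence. The sketch below shows that shape with a plain whitespace splitter; it is not the rule-based algorithm used by `RuleBasedTokenizer`, and `Tok` is a hypothetical token type introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a whitespace splitter, not the library's tokenizer.
final class TokenizeSketch {

    // Hypothetical token type: surface text plus its offset within the sentence.
    static final class Tok {
        final String text;
        final int offset;

        Tok(String text, int offset) {
            this.text = text;
            this.offset = offset;
        }
    }

    // Returns one inner list per input sentence, in input order.
    static List<List<Tok>> tokenize(String[] sentences) {
        List<List<Tok>> result = new ArrayList<>();
        for (String sentence : sentences) {
            List<Tok> tokens = new ArrayList<>();
            int i = 0;
            while (i < sentence.length()) {
                // Skip whitespace between tokens.
                if (Character.isWhitespace(sentence.charAt(i))) {
                    i++;
                    continue;
                }
                int start = i;
                // Consume characters up to the next whitespace.
                while (i < sentence.length() && !Character.isWhitespace(sentence.charAt(i))) {
                    i++;
                }
                tokens.add(new Tok(sentence.substring(start, i), start));
            }
            result.add(tokens);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Tok>> out = tokenize(new String[] {"Hello world .", "Bye"});
        for (List<Tok> sentence : out) {
            System.out.println(sentence.size() + " tokens");
        }
    }
}
```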

| Modifier and Type | Method and Description |
|---|---|
| `static void` | `Normalizer.convertNonCanonicalStrings(List<Token> sentence, String lang)` Converts non-Unicode and other strings into their Unicode counterparts. |
| `static void` | `Normalizer.normalizeDoubleQuotes(List<Token> sentence, String lang)` Normalizes double and ambiguous quotes according to language and corpus. |
| `static void` | `Normalizer.normalizeQuotes(List<Token> sentence, String lang)` Normalizes non-ambiguous quotes according to language and corpus. |
| `static void` | `RuleBasedTokenizer.normalizeTokens(List<List<Token>> tokens, String lang)` Sets the value of each token to its normalized counterpart. |

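Because these methods return `void`, they presumably update the tokens in the list they are given. The sketch below shows that in-place pattern for double quotes. The mapping (curly quotes to the ASCII quote) and the `Tok` type with its `getValue`/`setValue` accessors are assumptions for illustration; the library's actual rules depend on language and corpus and are not listed on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the real Normalizer rules are language- and corpus-dependent.
final class NormalizeSketch {

    // Hypothetical mutable token; accessor names are assumptions.
    static final class Tok {
        private String value;

        Tok(String value) { this.value = value; }

        String getValue() { return value; }

        void setValue(String value) { this.value = value; }
    }

    // Rewrites curly double quotes to the ASCII quote, mutating tokens in place.
    // The lang parameter is kept for signature shape but ignored in this sketch.
    static void normalizeDoubleQuotes(List<Tok> sentence, String lang) {
        for (Tok token : sentence) {
            String v = token.getValue();
            if (v.equals("\u201C") || v.equals("\u201D")) {
                token.setValue("\"");
            }
        }
    }

    public static void main(String[] args) {
        List<Tok> sentence = new ArrayList<>();
        sentence.add(new Tok("\u201C"));
        sentence.add(new Tok("hi"));
        sentence.add(new Tok("\u201D"));
        normalizeDoubleQuotes(sentence, "en");
        for (Tok t : sentence) {
            System.out.println(t.getValue());
        }
    }
}
```

Mutating the caller's list avoids copying every sentence, at the cost of losing the original token values once normalization has run.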

| Modifier and Type | Method and Description |
|---|---|
| `static String[]` | `StringUtils.convertListTokenToArrayStrings(List<Token> tokenizedSentence)` Convert a list of token objects (e.g. |

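Converting a list of tokens to a `String[]` amounts to copying each token's string value into an array of the same size. The following is a sketch of that idea only; `Tok` and `getValue()` are hypothetical, and the library's own implementation may differ.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: not the library's StringUtils implementation.
final class TokenArraySketch {

    // Hypothetical token type exposing its string value.
    static final class Tok {
        private final String value;

        Tok(String value) { this.value = value; }

        String getValue() { return value; }
    }

    // Copies each token's value into an array, preserving order.
    static String[] toStringArray(List<Tok> tokenizedSentence) {
        String[] out = new String[tokenizedSentence.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = tokenizedSentence.get(i).getValue();
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> sentence = Arrays.asList(new Tok("Hello"), new Tok("world"), new Tok("."));
        System.out.println(Arrays.toString(toStringArray(sentence)));
    }
}
```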
Copyright © 2017 IXA pipes. All rights reserved.