| Interface | Description |
|---|---|
| Tokenizer | |

| Class | Description |
|---|---|
| Annotate | Provides the annotation functions that output the tokenized text as: a list of WF elements inside a NAF document (default); running tokenized and segmented text; or CoNLL format, namely one token per line with two newlines ending each sentence. |
| CLI | ixa-pipe-tok provides several configuration parameters. lang: chooses the language used to create the lang attribute in the KAF header. |
| FMeasure | Evaluation results are the arithmetic mean of the precision scores and the arithmetic mean of the recall scores, each score being calculated per reference sample. |
| NonPeriodBreaker | Implements the exceptions to treating periods as sentence breakers and as tokens. |
| Normalizer | Converts punctuation, mostly following the conventions of corpora such as Penn TreeBank, Ancora, Tutpenn, Tiger and CTAG. |
| RuleBasedTokenizer | Provides a multilingual rule-based tokenizer. |
| StringUtils | Several string utilities. |
| Token | A Token object contains a single String, a startOffset and the length of the String. |
| TokenFactory | |
| TokenizerEvaluator | The TokenizerEvaluator measures the performance of a tokenizer with respect to some reference Tokens. |
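
The Token, TokenizerEvaluator and FMeasure descriptions above can be tied together in a small sketch. It is only an illustration under stated assumptions: `TokenEvalSketch`, `Tok`, `whitespaceTokens`, `countMatches` and `meanPrecisionRecall` are hypothetical names that do not come from ixa-pipe-tok, the whitespace splitter merely stands in for a real tokenizer, and treating two tokens as equal when their start offset and string agree is an assumed matching rule.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are taken from ixa-pipe-tok.
final class TokenEvalSketch {

  /** A token as described in the table: a string, its start offset and its length. */
  static final class Tok {
    final String value;
    final int startOffset;
    final int length;

    Tok(String value, int startOffset) {
      this.value = value;
      this.startOffset = startOffset;
      this.length = value.length();
    }

    /** Assumed matching rule: tokens are equal when offset and text agree. */
    String key() {
      return startOffset + ":" + value;
    }
  }

  /** Naive whitespace splitter, used only to produce tokens with offsets. */
  static List<Tok> whitespaceTokens(String text) {
    List<Tok> tokens = new ArrayList<>();
    int i = 0;
    while (i < text.length()) {
      if (Character.isWhitespace(text.charAt(i))) {
        i++;
        continue;
      }
      int start = i;
      while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
        i++;
      }
      tokens.add(new Tok(text.substring(start, i), start));
    }
    return tokens;
  }

  /** Counts the predicted tokens that also occur in the reference. */
  static int countMatches(List<Tok> reference, List<Tok> predicted) {
    Set<String> referenceKeys = new HashSet<>();
    for (Tok t : reference) {
      referenceKeys.add(t.key());
    }
    int hits = 0;
    for (Tok t : predicted) {
      if (referenceKeys.contains(t.key())) {
        hits++;
      }
    }
    return hits;
  }

  /** Returns {mean precision, mean recall}, each averaged over the samples. */
  static double[] meanPrecisionRecall(List<List<Tok>> references, List<List<Tok>> predictions) {
    int n = references.size();
    if (n == 0) {
      return new double[] {0.0, 0.0};
    }
    double precisionSum = 0.0;
    double recallSum = 0.0;
    for (int s = 0; s < n; s++) {
      List<Tok> ref = references.get(s);
      List<Tok> pred = predictions.get(s);
      int hits = countMatches(ref, pred);
      // Per-sample scores; an empty side contributes a score of zero.
      precisionSum += pred.isEmpty() ? 0.0 : (double) hits / pred.size();
      recallSum += ref.isEmpty() ? 0.0 : (double) hits / ref.size();
    }
    return new double[] {precisionSum / n, recallSum / n};
  }

  public static void main(String[] args) {
    List<List<Tok>> references = new ArrayList<>();
    references.add(List.of(new Tok("Hello", 0), new Tok("world", 6), new Tok(".", 12)));
    references.add(List.of(new Tok("It", 0), new Tok("works", 3), new Tok(".", 8)));

    List<List<Tok>> predictions = new ArrayList<>();
    predictions.add(whitespaceTokens("Hello world ."));
    predictions.add(whitespaceTokens("It works."));

    double[] scores = meanPrecisionRecall(references, predictions);
    System.out.println("mean precision = " + scores[0] + ", mean recall = " + scores[1]);
  }
}
```

In this example the first sample is tokenized perfectly, while the second yields only one correct token out of two predicted and three expected, so the mean precision is 0.75 and the mean recall is about 0.67. Averaging per-sample scores, as the FMeasure description states, weights every sample equally regardless of how many tokens it contains.
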
Copyright © 2015 IXA pipes. All rights reserved.