- ellipsis - Static variable in class eus.ixa.ixa.pipe.tok.Normalizer
-
- endInsideQuotesPara - Static variable in class eus.ixa.ixa.pipe.seg.RuleBasedSegmenter
-
End of sentence marker, maybe a space, punctuation (quotes, brackets),
space, maybe some more punctuation, maybe some space and uppercase.
- endInsideQuotesSpace - Static variable in class eus.ixa.ixa.pipe.seg.RuleBasedSegmenter
-
End of sentence marker, maybe a space, punctuation (quotes, brackets),
space, maybe some more punctuation, maybe some space and uppercase.
- endLink - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
- endOfSentenceApos - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Tokenize apostrophes occurring at the end of the string.
- endPunctLinkPara - Static variable in class eus.ixa.ixa.pipe.seg.RuleBasedSegmenter
-
End of sentence markers, paragraph mark and link.
- endPunctLinkSpace - Static variable in class eus.ixa.ixa.pipe.seg.RuleBasedSegmenter
-
End of sentence punctuation, maybe spaces and link.
- englishApos - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Split English apostrophes.
- eus.ixa.ixa.pipe.seg - package eus.ixa.ixa.pipe.seg
-
- eus.ixa.ixa.pipe.tok - package eus.ixa.ixa.pipe.tok
-
- evaluate(List<Token>, List<Token>) - Method in class eus.ixa.ixa.pipe.tok.TokenizerEvaluator
-
Evaluates the given reference Token list with respect to the predicted Token list.
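
The endInsideQuotesPara and endInsideQuotesSpace entries above describe a pattern in words: an end-of-sentence marker, an optional space, closing punctuation, whitespace, optional opening punctuation, and an uppercase letter. The following is a minimal sketch of how such a rule could be written as a Java regular expression. The class name, the pattern, and the markBoundaries helper are hypothetical illustrations and are not taken from RuleBasedSegmenter; the actual pattern in the library may differ.

```java
import java.util.regex.Pattern;

class EndInsideQuotesSketch {

    // Hypothetical pattern (an assumption, not the library's regex).
    // Group 1: end-of-sentence marker, optional space, closing quotes or brackets.
    // Then one or more whitespace characters.
    // Group 2: optional opening quotes or brackets, optional space, uppercase letter.
    static final Pattern END_INSIDE_QUOTES = Pattern.compile(
        "([.?!]\\s?[\"')\\]]+)\\s+([\"'(\\[]*\\s*\\p{Lu})");

    // Replaces the whitespace at each detected boundary with a newline.
    static String markBoundaries(String text) {
        return END_INSIDE_QUOTES.matcher(text).replaceAll("$1\n$2");
    }

    public static void main(String[] args) {
        System.out.println(markBoundaries("He said \"Stop.\" Then he left."));
    }
}
```

The sketch only breaks when an uppercase letter follows, so a lowercase continuation after the closing quote is left on the same line.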
- noAlphaAposNoAlpha - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Non-alphabetic character, apostrophe and non-alphabetic character.
- noAlphaDigitAposAlpha - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Non-alpha, digit, apostrophe and alpha.
- noDigitComma - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Non-digit followed by a comma.
- noDigitCommaDigit - Static variable in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Non-digit, comma and digit.
- NON_BREAKER_DIGITS - Static variable in class eus.ixa.ixa.pipe.tok.NonPeriodBreaker
-
Do not split the period after these words when a number follows.
- nonBreakerDigits - Static variable in class eus.ixa.ixa.pipe.tok.NonPeriodBreaker
-
Re-attach segmented periods after non-breaker digits.
- NonPeriodBreaker - Class in eus.ixa.ixa.pipe.tok
-
This class implements exceptions for periods as sentence breakers and tokens.
- NonPeriodBreaker(Properties) - Constructor for class eus.ixa.ixa.pipe.tok.NonPeriodBreaker
-
This constructor reads the non-breaking prefix files from the resources to
build exceptions for segmentation and tokenization.
- noPeriodSpaceEnd - Static variable in class eus.ixa.ixa.pipe.seg.RuleBasedSegmenter
-
Non-period end of sentence markers (?!), one or more spaces, sentence
starters.
- normalizeDoubleQuotes(List<Token>, String) - Static method in class eus.ixa.ixa.pipe.tok.Normalizer
-
Normalizes double and ambiguous quotes according to language
and corpus.
- normalizeQuotes(List<Token>, String) - Static method in class eus.ixa.ixa.pipe.tok.Normalizer
-
Normalizes non-ambiguous quotes according to language and corpus.
- Normalizer - Class in eus.ixa.ixa.pipe.tok
-
Normalizer class for converting punctuation, mostly following the conventions
of corpora such as Penn TreeBank, Ancora, Tutpenn, Tiger and CTAG.
- normalizeTokens(List<List<Token>>, String) - Static method in class eus.ixa.ixa.pipe.tok.RuleBasedTokenizer
-
Sets the value of each token to its normalized counterpart.
- numbers - Static variable in class eus.ixa.ixa.pipe.tok.NonPeriodBreaker
-
Do not segment numbers like 11.1.
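
As an illustration of the "do not segment numbers like 11.1" rule, here is a minimal sketch of a period splitter that leaves a period alone when it sits between two digits. The class name, the pattern, and the tokenize helper are assumptions made for this example; they are not the regular expressions or methods that NonPeriodBreaker actually uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class NumberPeriodSketch {

    // Hypothetical pattern (an assumption, not the library's regex):
    // matches a period that is NOT both preceded and followed by a digit.
    static final Pattern SPLITTABLE_PERIOD =
        Pattern.compile("(?<!\\d)\\.|\\.(?!\\d)");

    // Pads every splittable period with spaces, then splits on whitespace.
    static List<String> tokenize(String text) {
        String padded = SPLITTABLE_PERIOD.matcher(text).replaceAll(" . ");
        return Arrays.asList(padded.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Version 11.1 is out."));
    }
}
```

With this sketch the period inside 11.1 stays attached to the number, while the sentence-final period becomes its own token. Abbreviation handling, such as the word lists behind NON_BREAKER_DIGITS, is deliberately left out.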