- All Superinterfaces:
Serializable
- All Known Implementing Classes:
AnalysisWord, AnsjWord, HanLPWord, JcsegWord, JiebaWord, MmsegWord, MynlpWord, WordWord
Represents a single word or token extracted during Natural Language Processing (NLP) word segmentation. This
interface defines methods to access the textual content and positional information of the segmented word.
- Since:
- Java 17+
- Author:
- Kimi Liu
-
Method Summary
Modifier and Type | Method | Description
int | getEndOffset() | Retrieves the ending character offset of this word within the original text.
int | getStartOffset() | Retrieves the starting character offset of this word within the original text.
String | getText() | Retrieves the textual content of this word.
-
Method Details
-
getText
String getText()
Retrieves the textual content of this word.
- Returns:
- The text of the word as a String.
-
getStartOffset
int getStartOffset()
Retrieves the starting character offset of this word within the original text. The offset is 0-based.
- Returns:
- The starting position (inclusive) of the word.
-
getEndOffset
int getEndOffset()
Retrieves the ending character offset of this word within the original text. The offset is 0-based and exclusive.
- Returns:
- The ending position (exclusive) of the word.
-
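The inclusive start and exclusive end offsets described above form a half-open range, the same convention String.substring uses. The following is a minimal sketch of how the three methods relate, not the library's own code. The interface name Word, the record SimpleWord, and the whitespace-based segment helper are hypothetical stand-ins introduced only for illustration. It also assumes the segmenter returns the word text unmodified (no normalization), which a real implementation may not guarantee.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class WordOffsetsExample {

    // Hypothetical stand-in that mirrors the three documented methods.
    // The real interface's name and package are not given in this page.
    interface Word extends Serializable {
        String getText();
        int getStartOffset();
        int getEndOffset();
    }

    // Hypothetical immutable implementation used only for this sketch.
    record SimpleWord(String text, int startOffset, int endOffset) implements Word {
        @Override public String getText() { return text; }
        @Override public int getStartOffset() { return startOffset; }
        @Override public int getEndOffset() { return endOffset; }
    }

    // Naive whitespace segmenter (an assumption, not a real NLP engine).
    // It records a 0-based inclusive start and an exclusive end for each token.
    static List<Word> segment(String source) {
        List<Word> words = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            if (Character.isWhitespace(source.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < source.length() && !Character.isWhitespace(source.charAt(i))) {
                i++;
            }
            // i now points one past the last character of the token.
            words.add(new SimpleWord(source.substring(start, i), start, i));
        }
        return words;
    }

    // True when slicing the source with the word's offsets yields the word's text.
    static boolean offsetsMatch(String source, Word word) {
        return source.substring(word.getStartOffset(), word.getEndOffset())
                .equals(word.getText());
    }

    public static void main(String[] args) {
        String source = "hello nlp world";
        for (Word word : segment(source)) {
            System.out.printf("%s [%d, %d) matches=%b%n",
                    word.getText(), word.getStartOffset(), word.getEndOffset(),
                    offsetsMatch(source, word));
        }
    }
}
```

Because the end offset is exclusive, getEndOffset() - getStartOffset() gives the word's length in characters under the same no-normalization assumption, and adjacent words can share a boundary value without overlapping.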