Module bus.extra

Interface NLPWord

All Superinterfaces:
Serializable
All Known Implementing Classes:
AnalysisWord, AnsjWord, HanLPWord, JcsegWord, JiebaWord, MmsegWord, MynlpWord, WordWord

public interface NLPWord extends Serializable
Represents a single word or token extracted during Natural Language Processing (NLP) word segmentation. This interface defines methods to access the textual content and positional information of the segmented word.
Since:
Java 17+
Author:
Kimi Liu
  • Method Summary

    Modifier and Type    Method              Description
    int                  getEndOffset()      Retrieves the ending character offset of this word within the original text.
    int                  getStartOffset()    Retrieves the starting character offset of this word within the original text.
    String               getText()           Retrieves the textual content of this word.
  • Method Details

    • getText

      String getText()
      Retrieves the textual content of this word.
      Returns:
      The text of the word as a String.
    • getStartOffset

      int getStartOffset()
      Retrieves the starting character offset of this word within the original text. The offset is 0-based.
      Returns:
      The starting position (inclusive) of the word.
    • getEndOffset

      int getEndOffset()
      Retrieves the ending character offset of this word within the original text. The offset is 0-based and exclusive.
      Returns:
      The ending position (exclusive) of the word.
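
The offset contract above (0-based, inclusive start, exclusive end) can be illustrated with a minimal sketch. Everything in it beyond the three documented method signatures is an assumption made for illustration: the `NLPWord` interface is redeclared locally so the example compiles on its own, `SimpleWord` and `NLPWordOffsetsDemo` are hypothetical names and not part of the module, and the offsets are assumed to be Java `char` indices of the kind `String.substring` accepts.

```java
import java.io.Serializable;
import java.util.List;

// Local stand-in that mirrors the three documented methods, so the sketch
// compiles without the bus.extra module on the classpath.
interface NLPWord extends Serializable {
    String getText();
    int getStartOffset();
    int getEndOffset();
}

// Hypothetical implementation for illustration only; it is not one of the
// known implementing classes listed above.
record SimpleWord(String text, int startOffset, int endOffset) implements NLPWord {
    @Override public String getText() { return text; }
    @Override public int getStartOffset() { return startOffset; }
    @Override public int getEndOffset() { return endOffset; }
}

class NLPWordOffsetsDemo {
    // Recovers a word's text from the original input using its offsets.
    // Assumes the offsets are char indices into the original String.
    static String slice(String original, NLPWord word) {
        return original.substring(word.getStartOffset(), word.getEndOffset());
    }

    public static void main(String[] args) {
        String original = "hello world";
        List<NLPWord> words = List.of(
                new SimpleWord("hello", 0, 5),
                new SimpleWord("world", 6, 11));
        for (NLPWord word : words) {
            // With an exclusive end offset, end - start is the word length.
            int length = word.getEndOffset() - word.getStartOffset();
            boolean matches = slice(original, word).equals(word.getText());
            System.out.println(word.getText()
                    + " [" + word.getStartOffset() + ", " + word.getEndOffset() + ")"
                    + " length=" + length + " matches=" + matches);
        }
    }
}
```

Because the end offset is exclusive, the pair can be passed directly to `String.substring(start, end)`, and `getEndOffset() - getStartOffset()` gives the word's length under the assumption stated above. Whether a given segmenter's offsets always line up with `getText()` in this way depends on the implementing class.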