Module bus.extra

Class JiebaWord

java.lang.Object
org.miaixz.bus.extra.nlp.provider.jieba.JiebaWord
All Implemented Interfaces:
Serializable, NLPWord

public class JiebaWord extends Object implements NLPWord
Wrapper class for a single word (SegToken) from Jieba word segmentation. This class adapts the Jieba SegToken object to the common NLPWord interface, providing a unified way to access segmented word information.
Since:
Java 17+
Author:
Kimi Liu
  • Constructor Summary

    Constructors
    Constructor
    Description
    JiebaWord(com.huaban.analysis.jieba.SegToken segToken)
    Constructs a JiebaWord instance by wrapping a Jieba SegToken.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getEndOffset()
    Retrieves the ending character offset of this word within the original text.
    int
    getStartOffset()
    Retrieves the starting character offset of this word within the original text.
    String
    getText()
    Retrieves the text of the word from the wrapped Jieba SegToken.
    String
    toString()
    Returns the textual representation of this word, which is the same as getText().

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • JiebaWord

      public JiebaWord(com.huaban.analysis.jieba.SegToken segToken)
      Constructs a JiebaWord instance by wrapping a Jieba SegToken.
      Parameters:
      segToken - The SegToken object from Jieba word segmentation.
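The wrapping described above can be illustrated with a minimal, library-free sketch. It does not use the real classes: `SegTokenStandIn` is a hypothetical stand-in for `com.huaban.analysis.jieba.SegToken` (assumed here to expose `word`, `startOffset` and `endOffset` as fields), `WordStandIn` is a hypothetical stand-in for `NLPWord` limited to the three accessors documented on this page, and `WrappedWord` plays the role of `JiebaWord`. The actual implementation may differ in detail.

```java
// Hypothetical stand-in for com.huaban.analysis.jieba.SegToken.
// The field names are an assumption made for this sketch.
class SegTokenStandIn {
    final String word;
    final int startOffset;
    final int endOffset;

    SegTokenStandIn(String word, int startOffset, int endOffset) {
        this.word = word;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

// Hypothetical stand-in for NLPWord, reduced to the accessors listed on this page.
interface WordStandIn {
    String getText();
    int getStartOffset();
    int getEndOffset();
}

// Adapter in the style of JiebaWord: each accessor delegates to the wrapped token.
class WrappedWord implements WordStandIn {
    private final SegTokenStandIn token;

    WrappedWord(SegTokenStandIn token) {
        this.token = token;
    }

    @Override
    public String getText() {
        return token.word;
    }

    @Override
    public int getStartOffset() {
        return token.startOffset;
    }

    @Override
    public int getEndOffset() {
        return token.endOffset;
    }

    // Same as getText(), mirroring the documented toString() behaviour.
    @Override
    public String toString() {
        return getText();
    }
}

public class AdapterSketch {
    public static void main(String[] args) {
        WordStandIn word = new WrappedWord(new SegTokenStandIn("分词", 2, 4));
        System.out.println(word + " [" + word.getStartOffset() + ", " + word.getEndOffset() + ")");
    }
}
```

Callers that depend only on the common interface do not need to know which segmentation engine produced the token, which is the point of the adapter.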
  • Method Details

    • getText

      public String getText()
      Retrieves the text of the word from the wrapped Jieba SegToken.
      Specified by:
      getText in interface NLPWord
      Returns:
      The text of the word as a String.
    • getStartOffset

      public int getStartOffset()
      Retrieves the starting character offset of this word within the original text. This delegates to the startOffset field of the Jieba SegToken.
      Specified by:
      getStartOffset in interface NLPWord
      Returns:
      The starting position (inclusive) of the word.
    • getEndOffset

      public int getEndOffset()
      Retrieves the ending character offset of this word within the original text. This delegates to the endOffset field of the Jieba SegToken.
      Specified by:
      getEndOffset in interface NLPWord
      Returns:
      The ending position (exclusive) of the word.
    • toString

      public String toString()
      Returns the textual representation of this word, which is the same as getText().
      Overrides:
      toString in class Object
      Returns:
      The text of the word.
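The offset convention documented above (inclusive start, exclusive end) lines up with `String.substring(start, end)`. The sketch below shows how the word can be recovered from the original text under that convention. It assumes the offsets index into the Java `String` as `char` positions, which this page does not state explicitly; `Span` and `slice` are hypothetical names introduced for illustration and hold the values that `getText()`, `getStartOffset()` and `getEndOffset()` would return.

```java
import java.util.List;

public class OffsetSketch {
    // Hypothetical value type: text plus [start, end) offsets into the original string.
    record Span(String text, int start, int end) {}

    // With an inclusive start and an exclusive end, substring(start, end) yields the word.
    static String slice(String original, Span span) {
        return original.substring(span.start(), span.end());
    }

    public static void main(String[] args) {
        String original = "hello world";
        List<Span> spans = List.of(new Span("hello", 0, 5), new Span("world", 6, 11));
        for (Span s : spans) {
            // end - start equals the word length under the half-open convention.
            System.out.println(s.text() + " -> " + slice(original, s)
                    + " (length " + (s.end() - s.start()) + ")");
        }
    }
}
```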