Module bus.extra

Class HanLPWord

java.lang.Object
org.miaixz.bus.extra.nlp.provider.hanlp.HanLPWord
All Implemented Interfaces:
Serializable, NLPWord

public class HanLPWord extends Object implements NLPWord
Wrapper class for a single word (Term) from HanLP word segmentation. This class adapts the HanLP Term object to the common NLPWord interface, providing a unified way to access segmented word information.
Since:
Java 17+
Author:
Kimi Liu
  • Constructor Summary

    Constructors
    Constructor
    Description
    HanLPWord(com.hankcs.hanlp.seg.common.Term term)
    Constructs a HanLPWord instance by wrapping a HanLP Term.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getEndOffset()
    Retrieves the ending character offset of this word within the original text.
    int
    getStartOffset()
    Retrieves the starting character offset of this word within the original text.
    String
    getText()
    Retrieves the text of the word from the wrapped HanLP Term.
    String
    toString()
    Returns the textual representation of this word, which is the same as getText().

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • HanLPWord

      public HanLPWord(com.hankcs.hanlp.seg.common.Term term)
      Constructs a HanLPWord instance by wrapping a HanLP Term.
      Parameters:
      term - The Term object from HanLP word segmentation.
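The wrapping described above follows the adapter pattern. The sketch below illustrates that idea without depending on HanLP or bus.extra. SketchTerm, SketchWord, SketchHanLPWord, and WrapDemo are hypothetical stand-ins, not the library's classes; the only detail borrowed from HanLP is that a Term carries its text and offset as fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for com.hankcs.hanlp.seg.common.Term,
// reduced to the two fields this sketch needs.
class SketchTerm {
    final String word;
    final int offset;

    SketchTerm(String word, int offset) {
        this.word = word;
        this.offset = offset;
    }
}

// Hypothetical stand-in for the NLPWord interface, reduced to one method.
interface SketchWord {
    String getText();
}

// Adapter in the spirit of HanLPWord: holds the term and delegates to it.
class SketchHanLPWord implements SketchWord {
    private final SketchTerm term;

    SketchHanLPWord(SketchTerm term) {
        this.term = term;
    }

    @Override
    public String getText() {
        return term.word;
    }

    @Override
    public String toString() {
        // Same value as getText(), mirroring the documented toString() contract.
        return getText();
    }
}

class WrapDemo {
    // Wraps every term so callers only see the common interface.
    static List<SketchWord> wrapAll(List<SketchTerm> terms) {
        List<SketchWord> words = new ArrayList<>();
        for (SketchTerm term : terms) {
            words.add(new SketchHanLPWord(term));
        }
        return words;
    }

    public static void main(String[] args) {
        List<SketchTerm> terms = List.of(
                new SketchTerm("natural", 0),
                new SketchTerm("language", 8));
        for (SketchWord word : wrapAll(terms)) {
            System.out.println(word.getText());
        }
    }
}
```

With the real classes, the equivalent step would be passing each Term produced by a HanLP segmenter to the HanLPWord constructor.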
  • Method Details

    • getText

      public String getText()
      Retrieves the text of the word from the wrapped HanLP Term.
      Specified by:
      getText in interface NLPWord
      Returns:
      The text of the word as a String.
    • getStartOffset

      public int getStartOffset()
      Retrieves the starting character offset of this word within the original text. This delegates to the offset field of the HanLP Term.
      Specified by:
      getStartOffset in interface NLPWord
      Returns:
      The starting position (inclusive) of the word.
    • getEndOffset

      public int getEndOffset()
      Retrieves the ending character offset of this word within the original text. This is calculated as the starting offset plus the length of the word.
      Specified by:
      getEndOffset in interface NLPWord
      Returns:
      The ending position (exclusive) of the word.
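As a rough illustration of the offset convention, the sketch below treats the range as half-open, [start, end), and assumes that "length of the word" means String.length(), that is, UTF-16 code units. OffsetSketch and its methods are hypothetical helpers written for this example; they are not part of HanLPWord.

```java
class OffsetSketch {
    // End offset (exclusive) = start offset (inclusive) + word length.
    // Assumption: length is measured with String.length().
    static int endOffset(int startOffset, String word) {
        return startOffset + word.length();
    }

    // Recovers the word from the original text using the half-open range.
    static String slice(String original, int startOffset, String word) {
        return original.substring(startOffset, endOffset(startOffset, word));
    }

    public static void main(String[] args) {
        String original = "natural language";
        // "language" starts at index 8 and has 8 characters.
        System.out.println(endOffset(8, "language"));       // prints 16
        System.out.println(slice(original, 8, "language")); // prints language
    }
}
```

Because the end is exclusive, the pair returned by getStartOffset() and getEndOffset() can be passed directly to String.substring, provided the underlying Term's offset was populated by the segmenter.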
    • toString

      public String toString()
      Returns the textual representation of this word, which is the same as getText().
      Overrides:
      toString in class Object
      Returns:
      The text of the word.