Module bus.extra

Class MmsegProvider

java.lang.Object
org.miaixz.bus.extra.nlp.provider.mmseg.MmsegProvider
All Implemented Interfaces:
Serializable, org.miaixz.bus.core.Provider, NLPProvider

public class MmsegProvider extends Object implements NLPProvider
Word segmentation engine implementation backed by mmseg4j. This class is the concrete NLPProvider for the mmseg4j library. Because MMSeg is not thread-safe, a new MMSeg instance is created for each segmentation request.
Project homepage: https://github.com/chenlb/mmseg4j-core
Since:
Java 17+
Author:
Kimi Liu
  • Constructor Summary

    Constructors
    Constructor
    Description
    MmsegProvider()
    Constructs a new MmsegProvider instance with the default segmentation algorithm, which is ComplexSeg using the default singleton dictionary.
    MmsegProvider(com.chenlb.mmseg4j.Seg seg)
    Constructs a new MmsegProvider instance with a specified segmentation algorithm.
  • Method Summary

    Modifier and Type
    Method
    Description
    NLPResult
    parse(CharSequence text)
    Performs word segmentation on the given text using the configured mmseg4j Seg instance.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.miaixz.bus.extra.nlp.NLPProvider

    type
  • Constructor Details

    • MmsegProvider

      public MmsegProvider()
      Constructs a new MmsegProvider instance with the default segmentation algorithm, which is ComplexSeg using the default singleton dictionary.
    • MmsegProvider

      public MmsegProvider(com.chenlb.mmseg4j.Seg seg)
      Constructs a new MmsegProvider instance with a specified segmentation algorithm.
      Parameters:
      seg - The Seg algorithm to use for word segmentation (e.g., ComplexSeg, SimpleSeg).
  • Method Details

    • parse

      public NLPResult parse(CharSequence text)
      Performs word segmentation on the given text using the configured mmseg4j Seg instance. A new MMSeg instance is created for each call to ensure thread safety. The result is wrapped in an MmsegResult to conform to the NLPResult interface.
      Specified by:
      parse in interface NLPProvider
      Parameters:
      text - The input text to be segmented.
      Returns:
      An NLPResult object containing the segmented words from mmseg4j.
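
The per-call instantiation described for parse can be illustrated with a minimal, JDK-only sketch. It does not use the mmseg4j or bus.extra APIs: StatefulTokenizer and PerCallProvider are hypothetical stand-ins for MMSeg and MmsegProvider, and the whitespace splitting is only a placeholder for a real segmentation algorithm. The point is the design: the provider keeps no tokenizer state of its own, so a single provider can be shared across threads even though each tokenizer cannot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerCallTokenizerSketch {

    /** Hypothetical stand-in for a stateful, non-thread-safe tokenizer such as MMSeg. */
    static final class StatefulTokenizer {
        private final String input;
        private int position = 0; // mutable cursor: the reason an instance must not be shared

        StatefulTokenizer(CharSequence input) {
            this.input = input.toString();
        }

        /** Returns the next whitespace-delimited token, or null when the input is exhausted. */
        String next() {
            while (position < input.length() && Character.isWhitespace(input.charAt(position))) {
                position++;
            }
            if (position >= input.length()) {
                return null;
            }
            int start = position;
            while (position < input.length() && !Character.isWhitespace(input.charAt(position))) {
                position++;
            }
            return input.substring(start, position);
        }
    }

    /** Hypothetical provider: holds no tokenizer state, so one instance can be shared. */
    static final class PerCallProvider {
        List<String> parse(CharSequence text) {
            // A fresh tokenizer per call keeps the mutable cursor local to this invocation.
            StatefulTokenizer tokenizer = new StatefulTokenizer(text);
            List<String> words = new ArrayList<>();
            for (String word = tokenizer.next(); word != null; word = tokenizer.next()) {
                words.add(word);
            }
            return words;
        }
    }

    public static void main(String[] args) throws Exception {
        PerCallProvider provider = new PerCallProvider();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Submit several concurrent requests against the same provider instance.
            List<Future<List<String>>> results = new ArrayList<>();
            for (int i = 0; i < 8; i++) {
                final String text = "request " + i + " of eight";
                Callable<List<String>> task = () -> provider.parse(text);
                results.add(pool.submit(task));
            }
            for (Future<List<String>> result : results) {
                System.out.println(result.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off in this pattern is one short-lived object allocation per request in exchange for not needing locks or thread-local state around the tokenizer.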