public class TokenSizeTextSplitter extends TextSplitter

| Constructor and Description |
|---|
| `TokenSizeTextSplitter()` |
| `TokenSizeTextSplitter(int chunkSize)` |
| `TokenSizeTextSplitter(int chunkSize, int minChunkSizeChars)` |
| `TokenSizeTextSplitter(int chunkSize, int minChunkSizeChars, int minChunkLengthToEmbed, int maxChunkCount, boolean keepSeparator)` |
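
The constructor parameters are not described on this page, so the following is only a minimal sketch of how token-size chunking is commonly parameterized. It assumes that `chunkSize` is a token count per chunk, that `minChunkLengthToEmbed` is a minimum character length below which a chunk is discarded, and that `maxChunkCount` caps the number of chunks returned. These meanings are assumptions, not confirmed behavior of `TokenSizeTextSplitter`. The class `ChunkParamsSketch` is hypothetical, whitespace-separated words stand in for real tokens, and `minChunkSizeChars` and `keepSeparator` are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkParamsSketch {

    /**
     * Illustrative only: whitespace-separated words stand in for tokens.
     * The parameter meanings used here are assumptions, not documented behavior.
     */
    static List<String> split(String text, int chunkSize,
                              int minChunkLengthToEmbed, int maxChunkCount) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // "Tokenize": collect the non-empty whitespace-separated words.
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                tokens.add(word);
            }
        }
        // Walk the token list in windows of chunkSize, stopping at maxChunkCount.
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < tokens.size() && chunks.size() < maxChunkCount; start += chunkSize) {
            int end = Math.min(start + chunkSize, tokens.size());
            String chunk = String.join(" ", tokens.subList(start, end));
            // Assumed rule: keep a chunk only if it has enough characters.
            if (chunk.length() >= minChunkLengthToEmbed) {
                chunks.add(chunk);
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "one two three four five six seven";
        System.out.println(split(text, 3, 1, 10)); // [one two three, four five six, seven]
        System.out.println(split(text, 3, 6, 10)); // [one two three, four five six]
        System.out.println(split(text, 3, 1, 2));  // [one two three, four five six]
    }
}
```

In the second call the trailing chunk `seven` is dropped because it is shorter than six characters. In the third call the loop stops once two chunks have been produced.
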

| Modifier and Type | Method and Description |
|---|---|
| `protected String` | `decodeTokens(com.knuddels.jtokkit.api.Encoding encoding, List<Integer> tokens)` Decodes tokens. |
| `protected List<Integer>` | `encodeTokens(com.knuddels.jtokkit.api.Encoding encoding, String text)` Encodes tokens. |
| `void` | `setEncodingRegistry(com.knuddels.jtokkit.api.EncodingRegistry encodingRegistry)` Sets the encoding registry. |
| `void` | `setEncodingType(com.knuddels.jtokkit.api.EncodingType encodingType)` Sets the encoding type. |
| `protected List<String>` | `splitText(String text)` |

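Methods inherited from class TextSplitter: split, splitDocument

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from an implemented interface: split

public TokenSizeTextSplitter()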
public TokenSizeTextSplitter(int chunkSize)
public TokenSizeTextSplitter(int chunkSize,
int minChunkSizeChars)
public TokenSizeTextSplitter(int chunkSize,
int minChunkSizeChars,
int minChunkLengthToEmbed,
int maxChunkCount,
boolean keepSeparator)
public void setEncodingRegistry(com.knuddels.jtokkit.api.EncodingRegistry encodingRegistry)
public void setEncodingType(com.knuddels.jtokkit.api.EncodingType encodingType)
protected List<String> splitText(String text)
splitText in class TextSplitter

protected List<Integer> encodeTokens(com.knuddels.jtokkit.api.Encoding encoding, String text)
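
The three protected methods (`encodeTokens`, `splitText`, `decodeTokens`) suggest an encode, window, decode pipeline. The following is a minimal sketch of that idea and not this class's actual implementation. The real methods take a `com.knuddels.jtokkit.api.Encoding`; to stay self-contained, the hypothetical `TokenRoundTripSketch` below substitutes Unicode code points for tokens, which is an assumption made purely for illustration and is not a BPE encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class TokenRoundTripSketch {

    // Stand-in encoder: one "token" per Unicode code point (NOT a real BPE encoding).
    static List<Integer> encodeTokens(String text) {
        return text.codePoints().boxed().collect(Collectors.toList());
    }

    // Stand-in decoder: rebuilds the string from the code point list.
    static String decodeTokens(List<Integer> tokens) {
        StringBuilder sb = new StringBuilder();
        for (int codePoint : tokens) {
            sb.appendCodePoint(codePoint);
        }
        return sb.toString();
    }

    // Encode once, slice the token list into windows of chunkSize, decode each window.
    static List<String> splitByTokens(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Integer> tokens = encodeTokens(text);
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, tokens.size());
            chunks.add(decodeTokens(tokens.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(splitByTokens("abcdefg", 3)); // [abc, def, g]
    }
}
```

Splitting in token space rather than character space keeps each chunk within a token budget, which is presumably why the class exposes the encoding type and registry as settable properties.
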
Copyright © 2025. All rights reserved.