Encoding Config
class EncodingConfig(val pattern: Regex, val mergeableRanks: Map<ByteString, Int>, val specialTokens: Map<ByteString, Int>, val explicitNVocab: Int? = null)
Holds the configuration for token encoding: the settings and mappings needed to perform byte pair encoding (BPE) and to handle special tokens.
Constructors
Properties
explicitNVocab
The expected number of tokens in the vocabulary. If provided, the combined number of mergeable tokens and special tokens is checked against this value.
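The size check described above can be sketched as follows. This is only an illustration: `vocabSizeMatches` is a hypothetical helper, not part of the library, and plain `String` keys stand in for `ByteString` so the snippet has no dependencies. It assumes the check compares the sum of both map sizes with `explicitNVocab`; how the real class reports a mismatch is not stated here.

```kotlin
// Hypothetical helper, not part of the library: mirrors the documented size check.
// String keys stand in for ByteString to keep the sketch dependency-free.
fun vocabSizeMatches(
    mergeableRanks: Map<String, Int>,
    specialTokens: Map<String, Int>,
    explicitNVocab: Int?,
): Boolean {
    // No explicit size was given, so there is nothing to verify.
    if (explicitNVocab == null) return true
    // Assumption: the two maps together must account for the whole vocabulary.
    return mergeableRanks.size + specialTokens.size == explicitNVocab
}

fun main() {
    val ranks = mapOf("a" to 0, "b" to 1, "ab" to 2)
    val special = mapOf("<|endoftext|>" to 3)
    println(vocabSizeMatches(ranks, special, 4))    // true
    println(vocabSizeMatches(ranks, special, 5))    // false
    println(vocabSizeMatches(ranks, special, null)) // true
}
```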
mergeableRanks
A map from mergeable token bytes to their ranks. The ranks must correspond to merge priority.
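To show why the ranks have to follow merge priority, here is a minimal BPE merge loop. It is a sketch under stated assumptions, not the library's implementation: `bpeMerge` is a hypothetical function, `String` keys stand in for `ByteString`, the input is split into characters rather than bytes, and a lower rank is assumed to mean an earlier merge.

```kotlin
// Hypothetical sketch, not the library's encoder.
// String keys stand in for ByteString; lower rank is assumed to merge first.
fun bpeMerge(piece: String, ranks: Map<String, Int>): List<String> {
    // Start from single characters (a real encoder would start from single bytes).
    val parts = piece.map { it.toString() }.toMutableList()
    while (parts.size > 1) {
        // Find the adjacent pair whose concatenation has the lowest rank.
        var bestIndex = -1
        var bestRank = Int.MAX_VALUE
        for (i in 0 until parts.size - 1) {
            val rank = ranks[parts[i] + parts[i + 1]] ?: continue
            if (rank < bestRank) {
                bestRank = rank
                bestIndex = i
            }
        }
        if (bestIndex == -1) break // no adjacent pair is in the rank table
        // Merge the winning pair in place.
        parts[bestIndex] = parts[bestIndex] + parts[bestIndex + 1]
        parts.removeAt(bestIndex + 1)
    }
    return parts
}

fun main() {
    val ranks = mapOf("a" to 0, "b" to 1, "c" to 2, "ab" to 3, "abc" to 4)
    println(bpeMerge("abcab", ranks)) // [abc, ab]
}
```

Because the pair with the lowest rank is always merged first, both occurrences of "ab" are formed before "abc" is built from the first one. Reordering the ranks would change the resulting tokens.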
specialTokens
A map from special tokens to their token values.