Interface ChatModelConfig
public interface ChatModelConfig
Nested Class Summary

Nested Classes
- ChatModelConfig.LengthPenaltyConfig (see lengthPenalty())
Method Summary

- decodingMethod(): Represents the strategy used for picking the tokens during generation of the output text.
- includeStopSequence(): Pass false to omit matched stop sequences from the end of the output text.
- lengthPenalty(): Can be used to exponentially increase the likelihood of the text generation terminating once a specified number of tokens have been generated.
- logRequests(): Whether chat model requests should be logged.
- logResponses(): Whether chat model responses should be logged.
- maxNewTokens(): The maximum number of new tokens to be generated.
- minNewTokens(): If stop sequences are given, they are ignored until minimum tokens are generated.
- modelId(): Model id to use.
- randomSeed(): Random number generator seed to use in sampling mode for experimental repeatability.
- repetitionPenalty(): Represents the penalty for tokens that have already been generated or belong to the context.
- stopSequences(): Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output.
- temperature(): A value used to modify the next-token probabilities in sampling mode.
- topK(): The number of highest probability vocabulary tokens to keep for top-k filtering.
- topP(): Similar to top_k, except the candidates to generate the next token are the most likely tokens with probabilities that add up to at least top_p.
- truncateInputTokens(): Represents the maximum number of input tokens accepted.
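As a rough sketch of how these options are typically set in a Quarkus application, the fragment below uses the `quarkus.langchain4j.watsonx.chat-model.*` property names; only the `quarkus.langchain4j.watsonx.` prefix is confirmed by this page (see logRequests()), so verify the exact keys against the extension's configuration reference:

```properties
# Hypothetical application.properties fragment; key names below the confirmed
# quarkus.langchain4j.watsonx. prefix are assumptions and may differ per version.
quarkus.langchain4j.watsonx.chat-model.decoding-method=sample
quarkus.langchain4j.watsonx.chat-model.temperature=0.7
quarkus.langchain4j.watsonx.chat-model.top-k=50
quarkus.langchain4j.watsonx.chat-model.top-p=0.9
quarkus.langchain4j.watsonx.chat-model.max-new-tokens=512
quarkus.langchain4j.watsonx.log-requests=true
quarkus.langchain4j.watsonx.log-responses=true
```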
Method Details
modelId

Model id to use. To view the complete model list, refer to the watsonx.ai documentation.
decodingMethod

Represents the strategy used for picking the tokens during generation of the output text. During text generation, when the parameter value is set to greedy, each successive token corresponds to the highest probability token given the text that has already been generated. This strategy can lead to repetitive results, especially for longer output sequences. The alternative sample strategy generates text by picking subsequent tokens based on the probability distribution of possible next tokens defined by (i.e., conditioned on) the already-generated text and the top_k and top_p parameters.

Allowable values: [sample, greedy]
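To illustrate the difference described above, here is a minimal sketch (not part of this interface) of the greedy strategy, which always picks the argmax token; the sample strategy would instead draw from the (top_k/top_p filtered) distribution:

```java
// Illustrative sketch only: greedy decoding selects the index of the
// highest-probability next token at every step, which is deterministic
// and can become repetitive on long outputs.
public class GreedyDemo {
    static int greedyPick(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        // Token 1 has the highest probability, so greedy decoding picks it.
        System.out.println(greedyPick(new double[]{0.1, 0.7, 0.2})); // prints 1
    }
}
```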
lengthPenalty

Optional<ChatModelConfig.LengthPenaltyConfig> lengthPenalty()

Can be used to exponentially increase the likelihood of the text generation terminating once a specified number of tokens have been generated.
maxNewTokens

The maximum number of new tokens to be generated. The maximum supported value for this field depends on the model being used. How a "token" is defined depends on the tokenizer and vocabulary size, which in turn depends on the model. Often the tokens are a mix of full words and sub-words. Depending on the user's plan and on the model being used, there may be an enforced maximum number of new tokens.

Possible values: ≥ 0
minNewTokens

The minimum number of new tokens to be generated. If stop sequences are given, they are ignored until the minimum number of tokens has been generated.

Possible values: ≥ 0
randomSeed

Random number generator seed to use in sampling mode, for experimental repeatability.

Possible values: ≥ 1
stopSequences

Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.

Possible values: 0 ≤ number of items ≤ 6; items must be unique
temperature

A value used to modify the next-token probabilities in sampling mode. Values less than 1.0 sharpen the probability distribution, resulting in "less random" output. Values greater than 1.0 flatten the probability distribution, resulting in "more random" output. A value of 1.0 has no effect.

Possible values: 0 ≤ value ≤ 2
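The sharpening/flattening effect can be sketched as follows. This is an illustration of the standard temperature-scaled softmax, not code from this library: the raw scores (logits) are divided by the temperature before normalization:

```java
import java.util.Arrays;

// Illustrative sketch only: temperature scaling divides logits by the
// temperature before the softmax. Temperatures below 1.0 sharpen the
// resulting distribution; temperatures above 1.0 flatten it.
public class TemperatureDemo {
    static double[] softmaxWithTemperature(double[] logits, double temperature) {
        double[] scaled = Arrays.stream(logits).map(l -> l / temperature).toArray();
        double max = Arrays.stream(scaled).max().orElse(0.0); // numeric stability
        double[] exps = Arrays.stream(scaled).map(s -> Math.exp(s - max)).toArray();
        double sum = Arrays.stream(exps).sum();
        return Arrays.stream(exps).map(e -> e / sum).toArray();
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5};
        // Lower temperature: the most likely token gets even more probability mass.
        System.out.println(Arrays.toString(softmaxWithTemperature(logits, 0.5)));
        // Higher temperature: probabilities move closer together.
        System.out.println(Arrays.toString(softmaxWithTemperature(logits, 2.0)));
    }
}
```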
topK

The number of highest probability vocabulary tokens to keep for top-k filtering. Only applies in sampling mode. When decoding_method is set to sample, only the top_k most likely tokens are considered as candidates for the next generated token.

Possible values: 1 ≤ value ≤ 100
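Top-k filtering as described above can be sketched like this (an illustration of the general technique, not code from this library):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only: top-k filtering keeps the k most likely candidate
// tokens and discards the rest before sampling the next token.
public class TopKDemo {
    static Map<String, Double> topK(Map<String, Double> probs, int k) {
        return probs.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(k)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, Double> probs = Map.of("the", 0.5, "a", 0.3, "cat", 0.15, "dog", 0.05);
        // With top_k = 2, only the two most likely tokens remain candidates.
        System.out.println(topK(probs, 2).keySet()); // [the, a]
    }
}
```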
topP

Similar to top_k, except the candidates to generate the next token are the most likely tokens with probabilities that add up to at least top_p. Also known as nucleus sampling. A value of 1.0 is equivalent to disabled.

Possible values: 0 < value ≤ 1
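Nucleus sampling as described above keeps the smallest set of most likely tokens whose cumulative probability reaches top_p. A minimal sketch of the candidate-selection step (not code from this library):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: collect tokens (ordered from most to least likely)
// until their cumulative probability reaches top_p; those are the candidates.
public class TopPDemo {
    static List<String> nucleus(LinkedHashMap<String, Double> sortedProbs, double topP) {
        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> e : sortedProbs.entrySet()) {
            kept.add(e.getKey());
            cumulative += e.getValue();
            if (cumulative >= topP) break; // nucleus reached
        }
        return kept;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Double> probs = new LinkedHashMap<>();
        probs.put("the", 0.5);
        probs.put("a", 0.3);
        probs.put("cat", 0.15);
        probs.put("dog", 0.05);
        System.out.println(nucleus(probs, 0.8)); // [the, a]
        System.out.println(nucleus(probs, 1.0)); // all tokens: equivalent to disabled
    }
}
```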
repetitionPenalty

Represents the penalty applied to tokens that have already been generated or belong to the context. A value of 1.0 means that there is no penalty.

Possible values: 1 ≤ value ≤ 2
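One common formulation of this penalty (from the CTRL paper; whether watsonx uses exactly this formula is an assumption) divides the positive logits of already-seen tokens by the penalty and multiplies negative ones, so a value of 1.0 leaves scores unchanged:

```java
// Illustrative sketch only: a repetition penalty above 1.0 lowers the score
// of tokens that were already generated, making repeats less likely.
public class RepetitionPenaltyDemo {
    static double penalize(double logit, boolean alreadySeen, double penalty) {
        if (!alreadySeen) return logit;           // unseen tokens are unaffected
        return logit > 0 ? logit / penalty        // shrink positive scores
                         : logit * penalty;       // push negative scores lower
    }

    public static void main(String[] args) {
        System.out.println(penalize(3.0, true, 1.5));  // 2.0: repeated token discouraged
        System.out.println(penalize(3.0, false, 1.5)); // 3.0: unseen token unchanged
    }
}
```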
truncateInputTokens

Represents the maximum number of input tokens accepted. This can be used to avoid requests failing due to input being longer than configured limits. If the text is truncated, the start of the input is dropped (on the left), so the end of the input remains the same. If this value exceeds the model's maximum sequence length (refer to the model documentation for this value), the call will fail when the total number of tokens exceeds the maximum sequence length. Zero means don't truncate.

Possible values: ≥ 0
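The left-truncation behavior described above can be sketched as follows (an illustration only, operating on a pre-tokenized input):

```java
import java.util.List;

// Illustrative sketch only: keep at most maxTokens tokens, dropping from the
// start (left) so the end of the input is preserved; zero means don't truncate.
public class TruncateDemo {
    static List<String> truncateLeft(List<String> tokens, int maxTokens) {
        if (maxTokens == 0 || tokens.size() <= maxTokens) {
            return tokens; // nothing to drop
        }
        return tokens.subList(tokens.size() - maxTokens, tokens.size());
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("a", "long", "prompt", "tail");
        System.out.println(truncateLeft(tokens, 2)); // [prompt, tail]
        System.out.println(truncateLeft(tokens, 0)); // unchanged: zero disables truncation
    }
}
```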
includeStopSequence

Pass false to omit matched stop sequences from the end of the output text. The default is true, meaning that the output will end with the stop sequence text when matched.
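A minimal sketch of the trimming behavior described above (an illustration only, not the library's implementation):

```java
// Illustrative sketch only: when includeStopSequence is false, a matched stop
// sequence is stripped from the end of the output; when true (the default),
// the output ends with the stop sequence text.
public class StopSequenceDemo {
    static String applyStop(String output, String stop, boolean includeStopSequence) {
        int idx = output.indexOf(stop);
        if (idx < 0) return output; // stop sequence never produced
        return includeStopSequence
                ? output.substring(0, idx + stop.length())
                : output.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(applyStop("Answer: 42 END extra", " END", true));  // Answer: 42 END
        System.out.println(applyStop("Answer: 42 END extra", " END", false)); // Answer: 42
    }
}
```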
logRequests

@ConfigDocDefault("false")
@WithDefault("${quarkus.langchain4j.watsonx.log-requests}")
Optional<Boolean> logRequests()

Whether chat model requests should be logged.
logResponses

@ConfigDocDefault("false")
@WithDefault("${quarkus.langchain4j.watsonx.log-responses}")
Optional<Boolean> logResponses()

Whether chat model responses should be logged.