Interface ChatModelConfig


public interface ChatModelConfig
  • Method Summary

    String modelId()
      Model to use.
    String decodingMethod()
      Represents the strategy used for picking the tokens during the generation of the output text.
    Double temperature()
      A value used to modify the next-token probabilities in sampling mode.
    Integer minNewTokens()
      If stop sequences are given, they are ignored until the minimum number of tokens has been generated.
    Integer maxNewTokens()
      The maximum number of new tokens to be generated.
    Optional<Integer> randomSeed()
      Random number generator seed to use in sampling mode for experimental repeatability.
    Optional<List<String>> stopSequences()
      Stop sequences are one or more strings that cause text generation to stop when they are produced as part of the output.
    Optional<Integer> topK()
      The number of highest-probability vocabulary tokens to keep for top-k filtering.
    Optional<Double> topP()
      Similar to top_k, except the candidates for the next token are the most likely tokens whose probabilities add up to at least top_p.
    Optional<Double> repetitionPenalty()
      Represents the penalty applied to tokens that have already been generated or belong to the context.
  • Method Details

    • modelId

      @WithDefault("meta-llama/llama-2-70b-chat") String modelId()
      Model to use.
    • decodingMethod

      @WithDefault("greedy") String decodingMethod()
      Represents the strategy used for picking the tokens during the generation of the output text. Options are greedy and sample. Defaults to greedy if not specified.

      During text generation, when the parameter value is set to greedy, each successive token corresponds to the highest-probability token given the text that has already been generated. This strategy can lead to repetitive results, especially for longer output sequences. The alternative sample strategy generates text by picking subsequent tokens based on the probability distribution of possible next tokens defined by (i.e., conditioned on) the already-generated text and the top_k and top_p parameters described below.
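To make the contrast concrete, here is a minimal Java sketch of the two strategies. This is hypothetical helper code, not the library's internals: greedy always takes the highest-probability token, while sample draws from the distribution.

```java
import java.util.Random;

// Sketch of the two decoding strategies; hypothetical, not the watsonx implementation.
class DecodingSketch {
    // greedy: always take the highest-probability token
    static int greedyNextToken(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) best = i;
        }
        return best;
    }

    // sample: draw the next token from the (possibly top_k/top_p filtered) distribution
    static int sampleNextToken(double[] probs, Random rng) {
        double r = rng.nextDouble(), cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (r < cumulative) return i;
        }
        return probs.length - 1; // guard against floating-point rounding
    }
}
```

With the same distribution, greedy is deterministic while sample can return any token with nonzero probability, which is why only sample is affected by temperature, top_k, and top_p.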

    • temperature

      @WithDefault("1.0") Double temperature()
      A value used to modify the next-token probabilities in sampling mode. Values less than 1.0 sharpen the probability distribution, resulting in "less random" output. Values greater than 1.0 flatten the probability distribution, resulting in "more random" output. A value of 1.0 has no effect and is the default. The allowed range is 0.0 to 2.0.
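As a hedged illustration of the effect (not the library's implementation), temperature scaling is conventionally applied by dividing the logits by the temperature before the softmax, which is why values below 1.0 sharpen the distribution and values above 1.0 flatten it:

```java
import java.util.Arrays;

// Sketch of temperature scaling; hypothetical helper, not the watsonx implementation.
class TemperatureSketch {
    // Convert raw logits into probabilities, dividing by the temperature first.
    static double[] softmaxWithTemperature(double[] logits, double temperature) {
        double[] scaled = Arrays.stream(logits).map(l -> l / temperature).toArray();
        double max = Arrays.stream(scaled).max().orElse(0.0); // for numerical stability
        double[] exps = Arrays.stream(scaled).map(s -> Math.exp(s - max)).toArray();
        double sum = Arrays.stream(exps).sum();
        return Arrays.stream(exps).map(e -> e / sum).toArray();
    }
}
```

For logits {2.0, 1.0, 0.5}, a temperature of 0.5 gives the top token a noticeably larger share of the probability mass than a temperature of 2.0 does.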
    • minNewTokens

      @WithDefault("0") Integer minNewTokens()
      If stop sequences are given, they are ignored until the minimum number of tokens has been generated. Defaults to 0.
    • maxNewTokens

      @WithDefault("200") Integer maxNewTokens()
      The maximum number of new tokens to be generated. The range is 0 to 1024.
    • randomSeed

      Optional<Integer> randomSeed()
      Random number generator seed to use in sampling mode for experimental repeatability. Must be >= 1.
    • stopSequences

      Optional<List<String>> stopSequences()
      Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored. The list may contain up to 6 strings.
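The interaction between stopSequences and minNewTokens described above can be sketched as follows. This is a hypothetical helper for illustration, not the watsonx implementation:

```java
import java.util.List;

// Sketch of the stop-sequence rule: a stop string only ends generation once
// minNewTokens tokens have been produced. Hypothetical, not the watsonx code.
class StopSequenceSketch {
    static boolean shouldStop(String generated, List<String> stopSequences,
                              int tokensGenerated, int minNewTokens) {
        if (tokensGenerated < minNewTokens) {
            return false; // stop sequences are ignored until the minimum is reached
        }
        return stopSequences.stream().anyMatch(generated::contains);
    }
}
```

So with minNewTokens set to 5, an early "\n\n" in the output would not end generation, but the same sequence appearing at or after the fifth token would.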
    • topK

      Optional<Integer> topK()
      The number of highest-probability vocabulary tokens to keep for top-k filtering. Applies only in sampling mode, with a range from 1 to 100. When the decoding method is set to sample, only the top_k most likely tokens are considered as candidates for the next generated token.
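A minimal sketch of top-k filtering (hypothetical helper code, not the library's internals): the candidate set is simply the k highest-probability token indices.

```java
import java.util.Comparator;
import java.util.stream.IntStream;

// Sketch of top-k filtering; hypothetical, not the watsonx implementation.
class TopKSketch {
    // Keep only the k highest-probability token indices as candidates.
    static int[] topKIndices(double[] probs, int k) {
        return IntStream.range(0, probs.length)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> probs[i]).reversed())
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }
}
```

Sampling then proceeds over this reduced candidate set (after renormalizing the kept probabilities).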
    • topP

      Optional<Double> topP()
      Similar to top_k, except the candidates for the next token are the most likely tokens whose probabilities add up to at least top_p. The valid range is 0.0 to 1.0, where 1.0 is equivalent to disabled and is the default. Also known as nucleus sampling.
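The nucleus rule described above can be sketched like this (a hypothetical helper, not the watsonx code): walk the tokens in descending probability order and keep them until the cumulative probability reaches top_p.

```java
import java.util.Comparator;
import java.util.stream.IntStream;

// Sketch of nucleus (top-p) candidate selection; hypothetical, not the watsonx code.
class TopPSketch {
    // Return the smallest set of most-likely token indices whose probabilities sum to >= p.
    static int[] nucleus(double[] probs, double p) {
        Integer[] order = IntStream.range(0, probs.length).boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> probs[i]).reversed())
                .toArray(Integer[]::new);
        double cumulative = 0.0;
        int count = 0;
        for (Integer i : order) {
            cumulative += probs[i];
            count++;
            if (cumulative >= p) break;
        }
        int[] result = new int[count];
        for (int j = 0; j < count; j++) result[j] = order[j];
        return result;
    }
}
```

Unlike top_k, the candidate set adapts to the shape of the distribution: a peaked distribution yields few candidates, a flat one yields many, and top_p of 1.0 keeps every token (i.e., disabled).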
    • repetitionPenalty

      Optional<Double> repetitionPenalty()
      Represents the penalty applied to tokens that have already been generated or belong to the context. The range is 1.0 to 2.0 and defaults to 1.0 (no penalty).
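One common formulation of a repetition penalty (a sketch under that assumption, not the watsonx implementation) divides positive logits of already-seen tokens by the penalty and multiplies negative ones, making repeats less likely either way:

```java
import java.util.Set;

// Sketch of a common repetition-penalty formulation; hypothetical, not the watsonx code.
class RepetitionPenaltySketch {
    static double[] applyPenalty(double[] logits, Set<Integer> seenTokens, double penalty) {
        double[] out = logits.clone();
        for (int t : seenTokens) {
            // Shrink positive logits and push negative ones further down.
            out[t] = out[t] > 0 ? out[t] / penalty : out[t] * penalty;
        }
        return out;
    }
}
```

A penalty of 1.0 leaves the logits unchanged, matching the documented default of no penalty.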