Class LlamaContextParams

java.lang.Object
chat.octet.model.beans.LlamaContextParams

public class LlamaContextParams extends Object
Llama context parameters entity.
Author:
William
  • Field Details

    • seed

      public int seed
      RNG seed, -1 for random.
    • ctx

      public int ctx
      text context size, in tokens.
    • batch

      public int batch
      prompt processing batch size.
    • threads

      public int threads
      number of threads used for generation.
    • threadsBatch

      public int threadsBatch
      number of threads used for prompt and batch processing.
    • ropeScalingType

      public int ropeScalingType
      RoPE scaling type, from `enum llama_rope_scaling_type`.
    • yarnExtFactor

      public float yarnExtFactor
      YaRN extrapolation mix factor, NaN = from model.
    • yarnAttnFactor

      public float yarnAttnFactor
      YaRN magnitude scaling factor.
    • yarnBetaFast

      public float yarnBetaFast
      YaRN low correction dim.
    • yarnBetaSlow

      public float yarnBetaSlow
      YaRN high correction dim.
    • yarnOrigCtx

      public int yarnOrigCtx
      YaRN original context size.
    • ropeFreqBase

      public float ropeFreqBase
      RoPE base frequency.
    • ropeFreqScale

      public float ropeFreqScale
      RoPE frequency scaling factor.
    • dataTypeK

      public int dataTypeK
      data type for K cache.
    • dataTypeV

      public int dataTypeV
      data type for V cache.
    • mulMatQ

      public boolean mulMatQ
      if true, use experimental mul_mat_q kernels.
    • logitsAll

      public boolean logitsAll
      if true, the llama_eval() call computes all logits, not just the last one.
    • embedding

      public boolean embedding
      if true, run in embedding mode only.
    • offloadKqv

      public boolean offloadKqv
      whether to offload the KQV ops (including the KV cache) to GPU.
  • Constructor Details

    • LlamaContextParams

      public LlamaContextParams()
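
Since the fields above are public and the class has only a no-argument constructor, a context configuration can be built by direct field assignment. The sketch below uses a local stand-in class that mirrors a subset of the documented fields (in a real project you would use chat.octet.model.beans.LlamaContextParams itself); the values shown are illustrative assumptions, not library defaults.

```java
// Hypothetical stand-in mirroring some of the public fields documented above.
// In real code, replace ContextParams with chat.octet.model.beans.LlamaContextParams.
class ContextParams {
    public int seed;
    public int ctx;
    public int batch;
    public int threads;
    public float ropeFreqBase;
    public boolean offloadKqv;
}

public class Main {
    // Build an illustrative configuration; values are assumptions, not defaults.
    static ContextParams exampleParams() {
        ContextParams p = new ContextParams();
        p.seed = -1;                 // RNG seed, -1 for random
        p.ctx = 4096;                // text context size, in tokens
        p.batch = 512;               // prompt processing batch size
        p.threads = Runtime.getRuntime().availableProcessors();
        p.ropeFreqBase = 10000.0f;   // RoPE base frequency (assumed value)
        p.offloadKqv = true;         // offload KQV ops (incl. KV cache) to GPU
        return p;
    }

    public static void main(String[] args) {
        ContextParams p = exampleParams();
        System.out.println("ctx=" + p.ctx + " batch=" + p.batch + " seed=" + p.seed);
    }
}
```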