Package chat.octet.model.beans
Class LlamaModelParams
java.lang.Object
chat.octet.model.beans.LlamaModelParams
Llama model params entity
Author: William
Field Summary
Fields
Modifier and Type   Field          Description
boolean             checkTensors   Validate model tensor data.
int                 gpuLayers      Number of layers to store in VRAM.
int                 mainGpu        The GPU that is used for scratch and small tensors.
boolean             mlock          Force the system to keep the model in RAM.
boolean             mmap           Use mmap if possible.
int                 splitMode      How to split the model across multiple GPUs.
float[]             tensorSplit    How to split layers across multiple GPUs (size: LLAMA_MAX_DEVICES).
boolean             vocabOnly      Only load the vocabulary, no weights.
Constructor Summary
Constructors
LlamaModelParams()
Method Summary
Field Details
gpuLayers
public int gpuLayers
Number of layers to store in VRAM.
splitMode
public int splitMode
How to split the model across multiple GPUs:
LLAMA_SPLIT_NONE = 0 (single GPU)
LLAMA_SPLIT_LAYER = 1 (split layers and KV across GPUs)
LLAMA_SPLIT_ROW = 2 (split rows across GPUs)
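The three values listed for splitMode can be given names in calling code instead of being written as bare integers. The sketch below is only illustrative: the `SplitModes` class and its `describe` helper are hypothetical and are not part of this package. Only the numeric values and their meanings are taken from the splitMode description.

```java
/**
 * Hypothetical helper, not part of chat.octet.model.beans.
 * Only the numeric values and their meanings come from the splitMode description.
 */
public final class SplitModes {
    public static final int LLAMA_SPLIT_NONE = 0;   // single GPU
    public static final int LLAMA_SPLIT_LAYER = 1;  // split layers and KV across GPUs
    public static final int LLAMA_SPLIT_ROW = 2;    // split rows across GPUs

    private SplitModes() {
        // constants holder, not meant to be instantiated
    }

    /** Returns the documented meaning of a splitMode value, or throws for an unknown one. */
    public static String describe(int splitMode) {
        switch (splitMode) {
            case LLAMA_SPLIT_NONE:
                return "single GPU";
            case LLAMA_SPLIT_LAYER:
                return "split layers and KV across GPUs";
            case LLAMA_SPLIT_ROW:
                return "split rows across GPUs";
            default:
                throw new IllegalArgumentException("Unknown splitMode: " + splitMode);
        }
    }
}
```

With such constants, an assignment like `params.splitMode = SplitModes.LLAMA_SPLIT_LAYER;` states its intent more clearly than `params.splitMode = 1;`.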
mainGpu
public int mainGpu
The GPU that is used for scratch and small tensors.
tensorSplit
public float[] tensorSplit
How to split layers across multiple GPUs (size: LLAMA_MAX_DEVICES).
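One way to fill tensorSplit is to start from a short list of per-GPU weights and pad it to the required length. This is a sketch under stated assumptions: `TensorSplits.pad` is a hypothetical helper, and its `maxDevices` argument stands in for LLAMA_MAX_DEVICES, whose value is not given on this page. Treating the entries as relative proportions follows the llama.cpp convention and is not confirmed by this page.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of chat.octet.model.beans. */
public final class TensorSplits {

    private TensorSplits() {
        // static helper only
    }

    /**
     * Pads per-GPU weights with zeros up to maxDevices entries.
     * maxDevices is an assumption standing in for LLAMA_MAX_DEVICES.
     */
    public static float[] pad(float[] weights, int maxDevices) {
        if (weights.length > maxDevices) {
            throw new IllegalArgumentException(
                    "Got " + weights.length + " weights but only " + maxDevices + " devices");
        }
        for (float w : weights) {
            if (Float.isNaN(w) || w < 0f) {
                throw new IllegalArgumentException("Weights must be non-negative numbers");
            }
        }
        // Arrays.copyOf fills the extra slots with 0.0f
        return Arrays.copyOf(weights, maxDevices);
    }
}
```

For example, `TensorSplits.pad(new float[] {3f, 1f}, 4)` yields a four-element array whose first two entries are 3 and 1 and whose remaining entries are zero, which could then be assigned to tensorSplit.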
vocabOnly
public boolean vocabOnly
Only load the vocabulary, no weights.
mmap
public boolean mmap
Use mmap if possible.
mlock
public boolean mlock
Force the system to keep the model in RAM.
checkTensors
public boolean checkTensors
Validate model tensor data.
Constructor Details
LlamaModelParams
public LlamaModelParams()
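Because the constructor takes no arguments and every field is public, a typical use is to create an instance and assign fields directly. The sketch below assumes nothing beyond the fields documented on this page. To stay self-contained it declares a stand-in `LlamaModelParams` with the same public fields; real code would import chat.octet.model.beans.LlamaModelParams instead. The `singleGpuParams` helper and the chosen values are hypothetical, and the real class may use different defaults than the stand-in.

```java
public class LlamaModelParamsExample {

    /**
     * Stand-in that mirrors the documented public fields so this sketch compiles alone.
     * In real code, import chat.octet.model.beans.LlamaModelParams instead.
     */
    static class LlamaModelParams {
        public int gpuLayers;
        public int splitMode;
        public int mainGpu;
        public float[] tensorSplit;
        public boolean vocabOnly;
        public boolean mmap;
        public boolean mlock;
        public boolean checkTensors;

        public LlamaModelParams() {
        }
    }

    /** Hypothetical helper: parameters for loading a model on a single GPU. */
    static LlamaModelParams singleGpuParams(int gpuLayers) {
        LlamaModelParams params = new LlamaModelParams();
        params.gpuLayers = gpuLayers;   // number of layers to store in VRAM
        params.splitMode = 0;           // LLAMA_SPLIT_NONE (single GPU)
        params.mainGpu = 0;             // GPU used for scratch and small tensors
        params.mmap = true;             // use mmap if possible
        params.mlock = false;           // do not force the model to stay in RAM
        params.vocabOnly = false;       // load weights as well as the vocabulary
        params.checkTensors = false;    // skip tensor data validation
        return params;
    }

    public static void main(String[] args) {
        LlamaModelParams params = singleGpuParams(32);
        System.out.println("gpuLayers=" + params.gpuLayers + ", splitMode=" + params.splitMode);
    }
}
```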