Index


A

add(StoppingCriteria) - Method in class chat.octet.model.components.criteria.StoppingCriteriaList
 
add(LogitsProcessor) - Method in class chat.octet.model.components.processor.LogitsProcessorList
 
addPastTokensSize(int) - Method in class chat.octet.model.beans.Status
 
AIX - Static variable in class chat.octet.model.utils.Platform
 
allowRequantize - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
allow quantizing non-f32/f16 tensors
ANDROID - Static variable in class chat.octet.model.utils.Platform
 
appendNextToken(Token) - Method in class chat.octet.model.beans.Status
 
appendTokens(int[]) - Method in class chat.octet.model.beans.Status
 
ARCH - Static variable in class chat.octet.model.utils.Platform
 
ASSISTANT - Enum constant in enum class chat.octet.model.beans.ChatMessage.ChatRole
Assistant role

B

batch - Variable in class chat.octet.model.beans.LlamaContextParams
prompt processing batch size.
batchDecode(int, int[], int, int) - Static method in class chat.octet.model.LlamaService
Batch decoding.
black(String) - Static method in class chat.octet.model.utils.ColorConsole
 
BLACK - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
blue(String) - Static method in class chat.octet.model.utils.ColorConsole
 
BLUE - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
BOLD - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
 

C

chat(GenerateParameter, String) - Method in class chat.octet.model.Model
Start a conversation and chat in streaming format.
chat(GenerateParameter, String, String) - Method in class chat.octet.model.Model
Start a conversation and chat in streaming format.
chat(String) - Method in class chat.octet.model.Model
Start a conversation and chat in streaming format.
chat(String, String) - Method in class chat.octet.model.Model
Start a conversation and chat in streaming format.
chat.octet.model - package chat.octet.model
 
chat.octet.model.beans - package chat.octet.model.beans
 
chat.octet.model.components.criteria - package chat.octet.model.components.criteria
 
chat.octet.model.components.criteria.impl - package chat.octet.model.components.criteria.impl
 
chat.octet.model.components.processor - package chat.octet.model.components.processor
 
chat.octet.model.components.processor.impl - package chat.octet.model.components.processor.impl
 
chat.octet.model.enums - package chat.octet.model.enums
 
chat.octet.model.exceptions - package chat.octet.model.exceptions
 
chat.octet.model.parameters - package chat.octet.model.parameters
 
chat.octet.model.utils - package chat.octet.model.utils
 
chatCompletions(GenerateParameter, String) - Method in class chat.octet.model.Model
Start a conversation and chat.
chatCompletions(GenerateParameter, String, String) - Method in class chat.octet.model.Model
Start a conversation and chat.
chatCompletions(String) - Method in class chat.octet.model.Model
Start a conversation and chat.
ChatFormatter - Class in chat.octet.model.utils
Chat formatter.
ChatFormatter() - Constructor for class chat.octet.model.utils.ChatFormatter
 
ChatFormatter(String) - Constructor for class chat.octet.model.utils.ChatFormatter
 
ChatFormatter(String, String, String) - Constructor for class chat.octet.model.utils.ChatFormatter
 
ChatMessage - Class in chat.octet.model.beans
Chat message entity
ChatMessage() - Constructor for class chat.octet.model.beans.ChatMessage
 
ChatMessage(ChatMessage.ChatRole, String) - Constructor for class chat.octet.model.beans.ChatMessage
 
ChatMessage.ChatRole - Enum Class in chat.octet.model.beans
Chat role definition
CHATML_BOS_TOKEN - Static variable in class chat.octet.model.utils.ChatFormatter
 
CHATML_CHAT_TEMPLATE - Static variable in class chat.octet.model.utils.ChatFormatter
 
CHATML_EOS_TOKEN - Static variable in class chat.octet.model.utils.ChatFormatter
 
checkTensors - Variable in class chat.octet.model.beans.LlamaModelParams
validate model tensor data.
clearCache(int) - Static method in class chat.octet.model.LlamaService
Clear cache in K-V sequences.
clearCache(int, int, int) - Static method in class chat.octet.model.LlamaService
Clear cache in K-V sequences.
close() - Method in class chat.octet.model.Generator
Close inference generator.
close() - Method in class chat.octet.model.Model
Close the model and release resources.
ColorConsole - Class in chat.octet.model.utils
 
ColorConsole.ColorStyle - Enum Class in chat.octet.model.utils
 
ColorConsole.FontStyle - Enum Class in chat.octet.model.utils
 
CompletionResult - Class in chat.octet.model.beans
Completion result entity
CompletionResult() - Constructor for class chat.octet.model.beans.CompletionResult
 
completions(GenerateParameter, String) - Method in class chat.octet.model.Model
Generate complete text.
completions(String) - Method in class chat.octet.model.Model
Generate complete text.
copyToStatus(Status) - Method in class chat.octet.model.beans.Status
 
createNewContextWithModel(LlamaContextParams) - Static method in class chat.octet.model.LlamaService
Create new context with model.
criteria(int[], float[], Object...) - Method in class chat.octet.model.components.criteria.impl.MaxTimeCriteria
 
criteria(int[], float[], Object...) - Method in class chat.octet.model.components.criteria.impl.StoppingWordCriteria
 
criteria(int[], float[], Object...) - Method in interface chat.octet.model.components.criteria.StoppingCriteria
Stopping criteria
criteria(int[], float[], Object...) - Method in class chat.octet.model.components.criteria.StoppingCriteriaList
 
ctx - Variable in class chat.octet.model.beans.LlamaContextParams
text context size.
CustomBiasLogitsProcessor - Class in chat.octet.model.components.processor.impl
 
CustomBiasLogitsProcessor(LogitBias, int) - Constructor for class chat.octet.model.components.processor.impl.CustomBiasLogitsProcessor
 
cyan(String) - Static method in class chat.octet.model.utils.ColorConsole
 
CYAN - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 

D

dataTypeK - Variable in class chat.octet.model.beans.LlamaContextParams
data type for K cache.
dataTypeV - Variable in class chat.octet.model.beans.LlamaContextParams
data type for V cache.
DecodeException - Exception Class in chat.octet.model.exceptions
Batch decode exception
DecodeException(String) - Constructor for exception class chat.octet.model.exceptions.DecodeException
 
DecodeException(String, Throwable) - Constructor for exception class chat.octet.model.exceptions.DecodeException
 
decodeToken(boolean, int...) - Static method in class chat.octet.model.TokenDecoder
 
decodeToken(int...) - Static method in class chat.octet.model.TokenDecoder
 
DEFAULT - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
 
DEFAULT_COMMON_SYSTEM - Static variable in class chat.octet.model.utils.ChatFormatter
Default system prompt.
defragThold - Variable in class chat.octet.model.beans.LlamaContextParams
defragment the KV cache if holes/size > thold, < 0 disabled (default).
DISABLED - Enum constant in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
 

E

embedding - Variable in class chat.octet.model.beans.LlamaContextParams
embedding mode only.
equals(Object) - Method in class chat.octet.model.beans.Token
 

F

FINISHED - Enum constant in enum class chat.octet.model.enums.FinishReason
Completed generation.
FinishReason - Enum Class in chat.octet.model.enums
Token generation status
flashAttn - Variable in class chat.octet.model.beans.LlamaContextParams
whether to use flash attention [EXPERIMENTAL].
format(boolean, ChatMessage...) - Method in class chat.octet.model.utils.ChatFormatter
Formats a prompt text based on the provided chat messages.
format(ChatMessage...) - Method in class chat.octet.model.utils.ChatFormatter
Formats a prompt text based on the provided chat messages.
format(String) - Method in class chat.octet.model.utils.ChatFormatter
Formats a prompt text based on the provided user question.
format(String, ColorConsole.ColorStyle) - Static method in class chat.octet.model.utils.ColorConsole
 
format(String, ColorConsole.ColorStyle, ColorConsole.FontStyle) - Static method in class chat.octet.model.utils.ColorConsole
 
format(String, String) - Method in class chat.octet.model.utils.ChatFormatter
Formats a prompt text based on the provided system prompt and user question.
FREEBSD - Static variable in class chat.octet.model.utils.Platform
 

G

generate(GenerateParameter, String) - Method in class chat.octet.model.Model
Generate text in stream format.
generate(String) - Method in class chat.octet.model.Model
Generate text in stream format.
GenerateParameter - Class in chat.octet.model.parameters
Generate parameter
GenerateParameter() - Constructor for class chat.octet.model.parameters.GenerateParameter
 
GenerateParameter.MirostatMode - Enum Class in chat.octet.model.parameters
Mirostat sampling mode definition
GenerationException - Exception Class in chat.octet.model.exceptions
Generation exception
GenerationException(String) - Constructor for exception class chat.octet.model.exceptions.GenerationException
 
GenerationException(String, Throwable) - Constructor for exception class chat.octet.model.exceptions.GenerationException
 
Generator - Class in chat.octet.model
Model inference generator; supports streaming output text and generating complete text.
Generator(GenerateParameter, String) - Constructor for class chat.octet.model.Generator
Create inference generator.
Generator(GenerateParameter, String, Status) - Constructor for class chat.octet.model.Generator
Create inference generator.
getByteLength(byte[], int) - Static method in class chat.octet.model.TokenDecoder
 
getContextSize() - Static method in class chat.octet.model.LlamaService
Get model context size.
getEmbedding() - Static method in class chat.octet.model.LlamaService
Get embedding
getFinishReason() - Method in class chat.octet.model.beans.Status
 
getLlamaContextDefaultParams() - Static method in class chat.octet.model.LlamaService
Get llama context default params.
getLlamaModelDefaultParams() - Static method in class chat.octet.model.LlamaService
Get llama model default params.
getLlamaModelQuantizeDefaultParams() - Static method in class chat.octet.model.LlamaService
Get llama model quantize default params.
getLlamaTokenAttr(int) - Static method in class chat.octet.model.LlamaService
Get token type definition.
getLogits(int) - Static method in class chat.octet.model.LlamaService
Get Logits based on index, and the default index must be 0.
getLogitsIndex() - Method in class chat.octet.model.beans.Status
 
getOSType() - Static method in class chat.octet.model.utils.Platform
 
getSamplingMetrics(boolean) - Static method in class chat.octet.model.LlamaService
Get sampling metrics
getSystemInfo() - Static method in class chat.octet.model.LlamaService
Get system parameter information.
getTokenAttr(int) - Static method in class chat.octet.model.LlamaService
Get token type code.
getTokenBOS() - Static method in class chat.octet.model.LlamaService
Get special BOS token.
getTokenEOS() - Static method in class chat.octet.model.LlamaService
Get special EOS token.
getUtf8ByteLength(byte) - Static method in class chat.octet.model.TokenDecoder
 
getVocabSize() - Static method in class chat.octet.model.LlamaService
Get model vocab size.
GNU - Static variable in class chat.octet.model.utils.Platform
 
gpuLayers - Variable in class chat.octet.model.beans.LlamaModelParams
number of layers to store in VRAM.
green(String) - Static method in class chat.octet.model.utils.ColorConsole
 
GREEN - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
grey(String) - Static method in class chat.octet.model.utils.ColorConsole
 
GREY - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 

H

hashCode() - Method in class chat.octet.model.beans.Token
 

I

initNative() - Static method in class chat.octet.model.LlamaService
Initialize JNI context.
isEmpty() - Method in class chat.octet.model.components.criteria.StoppingCriteriaList
 
isEmpty() - Method in class chat.octet.model.components.processor.LogitsProcessorList
 
isFinished() - Method in enum class chat.octet.model.enums.FinishReason
Returns true if token generation has completed, otherwise false.
isGpuOffloadSupported() - Static method in class chat.octet.model.LlamaService
Check whether gpu_offload is supported.
isLinux() - Static method in class chat.octet.model.utils.Platform
 
isMac() - Static method in class chat.octet.model.utils.Platform
 
isMlockSupported() - Static method in class chat.octet.model.LlamaService
Check whether mlock is supported.
isMmapSupported() - Static method in class chat.octet.model.LlamaService
Check whether mmap is supported.
isWindows() - Static method in class chat.octet.model.utils.Platform
 
ITALIC - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
 
iterator() - Method in class chat.octet.model.Generator
Return inference iterator.

K

keepSplit - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
quantize to the same number of shards.
KFREEBSD - Static variable in class chat.octet.model.utils.Platform
 

L

LENGTH - Enum constant in enum class chat.octet.model.enums.FinishReason
Generation has exceeded the maximum token limit and has been truncated.
LIB_RESOURCE_PATH - Static variable in class chat.octet.model.utils.Platform
 
LINUX - Static variable in class chat.octet.model.utils.Platform
 
LLAMA_FTYPE_ALL_F32 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_GUESSED - Enum constant in enum class chat.octet.model.enums.ModelFileType
not specified in the model file
LLAMA_FTYPE_MOSTLY_BF16 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_F16 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ1_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ1_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ2_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ2_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ2_XS - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ2_XXS - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ3_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ3_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ3_XXS - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ4_NL - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_IQ4_XS - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q2_K - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q2_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q3_K_L - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q3_K_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q3_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q3_K_XS - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q4_0 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q4_1 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16 - Enum constant in enum class chat.octet.model.enums.ModelFileType
tok_embeddings.weight and output.weight are F16
LLAMA_FTYPE_MOSTLY_Q4_K_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q4_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q5_0 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q5_1 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q5_K_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q5_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q6_K - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_FTYPE_MOSTLY_Q8_0 - Enum constant in enum class chat.octet.model.enums.ModelFileType
 
LLAMA_POOLING_TYPE_CLS - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
pooling cls type.
LLAMA_POOLING_TYPE_MEAN - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
pooling mean type.
LLAMA_POOLING_TYPE_NONE - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
pooling none type.
LLAMA_POOLING_TYPE_UNSPECIFIED - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
unspecified type.
LLAMA_ROPE_SCALING_LINEAR - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
Scaling linear type.
LLAMA_ROPE_SCALING_NONE - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
Scaling none type.
LLAMA_ROPE_SCALING_UNSPECIFIED - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
unspecified type.
LLAMA_ROPE_SCALING_YARN - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
Scaling YaRN type.
LLAMA_SPLIT_MODE_LAYER - Enum constant in enum class chat.octet.model.enums.LlamaSplitMode
split layers and KV across GPUs.
LLAMA_SPLIT_MODE_NONE - Enum constant in enum class chat.octet.model.enums.LlamaSplitMode
single GPU.
LLAMA_SPLIT_MODE_ROW - Enum constant in enum class chat.octet.model.enums.LlamaSplitMode
split rows across GPUs.
LLAMA_TOKEN_ATTR_BYTE - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Byte type.
LLAMA_TOKEN_ATTR_CONTROL - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Control type.
LLAMA_TOKEN_ATTR_LSTRIP - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Left strip type.
LLAMA_TOKEN_ATTR_NORMAL - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Normal type.
LLAMA_TOKEN_ATTR_NORMALIZED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Normalized type.
LLAMA_TOKEN_ATTR_RSTRIP - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Right strip type.
LLAMA_TOKEN_ATTR_SINGLE_WORD - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Single word type.
LLAMA_TOKEN_ATTR_UNDEFINED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Undefined type.
LLAMA_TOKEN_ATTR_UNKNOWN - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Unknown type.
LLAMA_TOKEN_ATTR_UNUSED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
Unused type.
LLAMA_TOKEN_ATTR_USER_DEFINED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
User defined type.
llamaBackendFree() - Static method in class chat.octet.model.LlamaService
Call once at the end of the program.
llamaBackendInit() - Static method in class chat.octet.model.LlamaService
Initialize the llama + ggml backend.
LlamaContextParams - Class in chat.octet.model.beans
Llama context params entity
LlamaContextParams() - Constructor for class chat.octet.model.beans.LlamaContextParams
 
llamaModelMeta(String) - Static method in class chat.octet.model.LlamaService
Retrieves the metadata information of the llama model based on the given key.
LlamaModelParams - Class in chat.octet.model.beans
Llama model params entity
LlamaModelParams() - Constructor for class chat.octet.model.beans.LlamaModelParams
 
llamaModelQuantize(String, String, LlamaModelQuantizeParams) - Static method in class chat.octet.model.LlamaService
Quantize the model.
llamaModelQuantize(String, String, ModelFileType) - Static method in class chat.octet.model.LlamaService
Quantize the model.
LlamaModelQuantizeParams - Class in chat.octet.model.beans
Llama model quantize params entity
LlamaModelQuantizeParams() - Constructor for class chat.octet.model.beans.LlamaModelQuantizeParams
 
llamaNumaInit(int) - Static method in class chat.octet.model.LlamaService
Initialize NUMA optimizations.
LlamaNumaStrategy - Enum Class in chat.octet.model.enums
Llama NUMA strategy definition
LlamaPoolingType - Enum Class in chat.octet.model.enums
Llama pooling type definition
LlamaRoPEScalingType - Enum Class in chat.octet.model.enums
Llama RoPE scaling type definition
LlamaService - Class in chat.octet.model
Llama.cpp API
LlamaService() - Constructor for class chat.octet.model.LlamaService
 
LlamaSplitMode - Enum Class in chat.octet.model.enums
Llama split mode definition
LlamaTokenAttr - Enum Class in chat.octet.model.enums
Llama token type definition
loadLibraryResource() - Static method in class chat.octet.model.utils.Platform
 
loadLlamaGrammar(String) - Static method in class chat.octet.model.LlamaService
Load llama grammar by rules.
loadLlamaModelFromFile(String, LlamaModelParams) - Static method in class chat.octet.model.LlamaService
Load Llama model from file.
loadLoraModelFromFile(String, float, String, int) - Static method in class chat.octet.model.LlamaService
Apply a LoRA adapter to a loaded model. path_base_model is the path to a higher quality model to use as a base for the layers modified by the adapter.
LogitBias - Class in chat.octet.model.beans
 
LogitBias() - Constructor for class chat.octet.model.beans.LogitBias
 
logitsAll - Variable in class chat.octet.model.beans.LlamaContextParams
the llama_eval() call computes all logits, not just the last one.
LogitsProcessor - Interface in chat.octet.model.components.processor
Customize a processor to adjust the probability distribution of words and control the generation of model inference results.
LogitsProcessorList - Class in chat.octet.model.components.processor
Logits processor list
LogitsProcessorList() - Constructor for class chat.octet.model.components.processor.LogitsProcessorList
 

M

MAC - Static variable in class chat.octet.model.utils.Platform
 
magenta(String) - Static method in class chat.octet.model.utils.ColorConsole
 
MAGENTA - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
mainGpu - Variable in class chat.octet.model.beans.LlamaModelParams
the GPU that is used for scratch and small tensors.
MaxTimeCriteria - Class in chat.octet.model.components.criteria.impl
 
MaxTimeCriteria(long) - Constructor for class chat.octet.model.components.criteria.impl.MaxTimeCriteria
 
MaxTimeCriteria(long, long) - Constructor for class chat.octet.model.components.criteria.impl.MaxTimeCriteria
 
metrics() - Method in class chat.octet.model.Model
Print generation metrics.
Metrics - Class in chat.octet.model.beans
Generation metrics
Metrics() - Constructor for class chat.octet.model.beans.Metrics
 
METRICS_TEMPLATE - Static variable in class chat.octet.model.beans.Metrics
 
mlock - Variable in class chat.octet.model.beans.LlamaModelParams
force system to keep model in RAM.
mmap - Variable in class chat.octet.model.beans.LlamaModelParams
use mmap if possible.
Model - Class in chat.octet.model
Llama model, which provides functions for text generation and chat conversations.
Model(ModelParameter) - Constructor for class chat.octet.model.Model
 
Model(String) - Constructor for class chat.octet.model.Model
 
ModelException - Exception Class in chat.octet.model.exceptions
Model exception
ModelException(String) - Constructor for exception class chat.octet.model.exceptions.ModelException
 
ModelException(String, Throwable) - Constructor for exception class chat.octet.model.exceptions.ModelException
 
modelFileType - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
quantize to this llama_ftype
ModelFileType - Enum Class in chat.octet.model.enums
Model file type definition
ModelParameter - Class in chat.octet.model.parameters
Llama model parameters
ModelParameter() - Constructor for class chat.octet.model.parameters.ModelParameter
 

N

NETBSD - Static variable in class chat.octet.model.utils.Platform
 
NoBadWordsLogitsProcessor - Class in chat.octet.model.components.processor.impl
 
NoBadWordsLogitsProcessor(int[]) - Constructor for class chat.octet.model.components.processor.impl.NoBadWordsLogitsProcessor
 
NONE - Enum constant in enum class chat.octet.model.enums.FinishReason
Default type.
NUMA_STRATEGY_COUNT - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
Count strategy.
NUMA_STRATEGY_DISABLED - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
Disabled strategy.
NUMA_STRATEGY_DISTRIBUTE - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
Distribute strategy.
NUMA_STRATEGY_ISOLATE - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
Isolate strategy.
NUMA_STRATEGY_MIRROR - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
Mirror strategy.
NUMA_STRATEGY_NUMACTL - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
Numa control strategy.

O

offloadKqv - Variable in class chat.octet.model.beans.LlamaContextParams
whether to offload the KQV ops (including the KV cache) to GPU.
onlyCopy - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
only copy tensors - ftype, allow_requantize and quantize_output_tensor are ignored
OPENBSD - Static variable in class chat.octet.model.utils.Platform
 
orange(String) - Static method in class chat.octet.model.utils.ColorConsole
 
ORANGE - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
output() - Method in class chat.octet.model.Generator
Stream outputs the generated text.
outputTensorType - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
output tensor type.

P

Platform - Class in chat.octet.model.utils
 
poolingType - Variable in class chat.octet.model.beans.LlamaContextParams
whether to pool (sum) embedding results by sequence id (ignored if no pooling layer).
processor(int[], float[], Object...) - Method in class chat.octet.model.components.processor.impl.CustomBiasLogitsProcessor
 
processor(int[], float[], Object...) - Method in class chat.octet.model.components.processor.impl.NoBadWordsLogitsProcessor
 
processor(int[], float[], Object...) - Method in interface chat.octet.model.components.processor.LogitsProcessor
Logits processor
processor(int[], float[], Object...) - Method in class chat.octet.model.components.processor.LogitsProcessorList
 
pure - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
disable k-quant mixtures and quantize all tensors to the same type

Q

quantizeOutputTensor - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
quantize output.weight

R

red(String) - Static method in class chat.octet.model.utils.ColorConsole
 
RED - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
release() - Static method in class chat.octet.model.LlamaService
Close model and release all resources.
removeAllChatStatus() - Method in class chat.octet.model.Model
Delete all user session states.
removeChatStatus(String) - Method in class chat.octet.model.Model
Delete the session state of the specified user.
reset() - Method in class chat.octet.model.beans.Status
 
result() - Method in class chat.octet.model.Generator
Return the generated complete text.
ropeFreqBase - Variable in class chat.octet.model.beans.LlamaContextParams
RoPE base frequency.
ropeFreqScale - Variable in class chat.octet.model.beans.LlamaContextParams
RoPE frequency scaling factor.
ropeScalingType - Variable in class chat.octet.model.beans.LlamaContextParams
RoPE scaling type, from `enum llama_rope_scaling_type`.

S

sampling(float[], int[], int, float, float, float, boolean, int, float, float, float, int, float, float, float, float, float, float, int, int) - Static method in class chat.octet.model.LlamaService
Inference sampling the next token.
sampling(GenerateParameter, float[], int[], int, int) - Static method in class chat.octet.model.LlamaService
Inference sampling the next token.
seed - Variable in class chat.octet.model.beans.LlamaContextParams
RNG seed, -1 for random.
seqMax - Variable in class chat.octet.model.beans.LlamaContextParams
max number of sequences (i.e.
SOLARIS - Static variable in class chat.octet.model.utils.Platform
 
splitMode - Variable in class chat.octet.model.beans.LlamaModelParams
how to split the model across multiple GPUs.
Status - Class in chat.octet.model.beans
 
Status() - Constructor for class chat.octet.model.beans.Status
 
Status(Status) - Constructor for class chat.octet.model.beans.Status
 
STOP - Enum constant in enum class chat.octet.model.enums.FinishReason
Generation stopped by StoppingCriteria.
StoppingCriteria - Interface in chat.octet.model.components.criteria
Customize a controller to implement stop rule control for model inference.
StoppingCriteriaList - Class in chat.octet.model.components.criteria
Stopping criteria list
StoppingCriteriaList() - Constructor for class chat.octet.model.components.criteria.StoppingCriteriaList
 
StoppingWordCriteria - Class in chat.octet.model.components.criteria.impl
 
StoppingWordCriteria(String...) - Constructor for class chat.octet.model.components.criteria.impl.StoppingWordCriteria
 
subInputIds(int) - Method in class chat.octet.model.beans.Status
 
subInputIds(int, int) - Method in class chat.octet.model.beans.Status
 
subTokensBetween(List<Token>, String) - Static method in class chat.octet.model.TokenDecoder
 
subTokensBetween(List<Token>, String, String) - Static method in class chat.octet.model.TokenDecoder
 
SYSTEM - Enum constant in enum class chat.octet.model.beans.ChatMessage.ChatRole
System prompt

T

tensorSplit - Variable in class chat.octet.model.beans.LlamaModelParams
how to split layers across multiple GPUs (size: LLAMA_MAX_DEVICES).
thread - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
number of threads to use for quantizing, if <= 0 will use std::thread::hardware_concurrency()
threads - Variable in class chat.octet.model.beans.LlamaContextParams
number of threads used for generation.
threadsBatch - Variable in class chat.octet.model.beans.LlamaContextParams
number of threads used for prompt and batch processing.
toAssistant(String) - Static method in class chat.octet.model.beans.ChatMessage
 
Token - Class in chat.octet.model.beans
Token
Token(int, LlamaTokenAttr, String) - Constructor for class chat.octet.model.beans.Token
 
TokenDecoder - Class in chat.octet.model
Token decoder
tokenEmbeddingType - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
token embeddings tensor type.
tokenize(byte[], int, int[], int, boolean, boolean) - Static method in class chat.octet.model.LlamaService
Convert the provided text into tokens.
tokenize(String, boolean, boolean) - Static method in class chat.octet.model.LlamaService
Convert the provided text into tokens.
tokens() - Method in class chat.octet.model.Generator
Return the generated tokens.
tokenToPiece(int, byte[], int, boolean) - Static method in class chat.octet.model.LlamaService
Convert the token id to text piece.
toString() - Method in enum class chat.octet.model.beans.ChatMessage.ChatRole
 
toString() - Method in class chat.octet.model.beans.Metrics
 
toString() - Method in enum class chat.octet.model.enums.LlamaNumaStrategy
 
toString() - Method in enum class chat.octet.model.enums.LlamaPoolingType
 
toString() - Method in enum class chat.octet.model.enums.LlamaRoPEScalingType
 
toString() - Method in enum class chat.octet.model.enums.LlamaSplitMode
 
toString() - Method in enum class chat.octet.model.enums.LlamaTokenAttr
 
toString() - Method in enum class chat.octet.model.enums.ModelFileType
 
toString() - Method in class chat.octet.model.Model
 
toSystem(String) - Static method in class chat.octet.model.beans.ChatMessage
 
toUser(String) - Static method in class chat.octet.model.beans.ChatMessage
 
TRUNCATED - Enum constant in enum class chat.octet.model.enums.FinishReason
Generation has exceeded the maximum context limit and has been truncated.

U

ubatch - Variable in class chat.octet.model.beans.LlamaContextParams
physical maximum batch size.
UNDERLINE - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
 
UNKNOWN - Enum constant in enum class chat.octet.model.enums.FinishReason
Unknown type, no available token state.
UNSPECIFIED - Static variable in class chat.octet.model.utils.Platform
 
updateFinishReason(FinishReason) - Method in class chat.octet.model.beans.Token
 
USER - Enum constant in enum class chat.octet.model.beans.ChatMessage.ChatRole
User role

V

V1 - Enum constant in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
 
V2 - Enum constant in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
 
valueOf(String) - Static method in enum class chat.octet.model.beans.ChatMessage.ChatRole
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.FinishReason
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaNumaStrategy
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaPoolingType
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaRoPEScalingType
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaSplitMode
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaTokenAttr
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.enums.ModelFileType
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.utils.ColorConsole.ColorStyle
Returns the enum constant of this class with the specified name.
valueOf(String) - Static method in enum class chat.octet.model.utils.ColorConsole.FontStyle
Returns the enum constant of this class with the specified name.
valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaNumaStrategy
 
valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaPoolingType
 
valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaRoPEScalingType
 
valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaSplitMode
 
valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaTokenAttr
 
valueOfType(int) - Static method in enum class chat.octet.model.enums.ModelFileType
 
values() - Static method in enum class chat.octet.model.beans.ChatMessage.ChatRole
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.FinishReason
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.LlamaNumaStrategy
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.LlamaPoolingType
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.LlamaRoPEScalingType
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.LlamaSplitMode
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.LlamaTokenAttr
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.enums.ModelFileType
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.utils.ColorConsole.ColorStyle
Returns an array containing the constants of this enum class, in the order they are declared.
values() - Static method in enum class chat.octet.model.utils.ColorConsole.FontStyle
Returns an array containing the constants of this enum class, in the order they are declared.
vocabOnly - Variable in class chat.octet.model.beans.LlamaModelParams
only load the vocabulary, no weights.
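
The enum lookup entries in this section (valueOf(String), valueOfType(int), values()) can be illustrated with a minimal sketch. The enum SplitMode, its constants, and its integer codes below are hypothetical and are not taken from the library. The index gives no description for valueOfType(int), so its behavior here (mapping an integer code back to a constant and rejecting unknown codes) is an assumption. Only valueOf(String) and values() are standard Java enum methods.

```java
import java.util.Arrays;

public class EnumLookupSketch {

    // Hypothetical enum for illustration only; not the library's LlamaSplitMode.
    public enum SplitMode {
        NONE(0), LAYER(1), ROW(2);

        private final int type;

        SplitMode(int type) {
            this.type = type;
        }

        public int getType() {
            return type;
        }

        // Assumed behavior: map an integer code back to its constant,
        // throwing IllegalArgumentException for an unknown code.
        public static SplitMode valueOfType(int type) {
            return Arrays.stream(values())
                    .filter(mode -> mode.type == type)
                    .findFirst()
                    .orElseThrow(() -> new IllegalArgumentException("Unknown type: " + type));
        }
    }

    public static void main(String[] args) {
        // valueOf(String) needs the exact constant name (case-sensitive).
        SplitMode byName = SplitMode.valueOf("ROW");

        // valueOfType(int) looks a constant up by its integer code (assumed semantics).
        SplitMode byCode = SplitMode.valueOfType(1);

        // values() returns the constants in declaration order.
        System.out.println(byName + " " + byCode + " " + Arrays.toString(SplitMode.values()));
    }
}
```

Note that valueOf(String) throws IllegalArgumentException when no constant has the given name, so callers handling user-supplied names may want to catch it.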

W

white(String) - Static method in class chat.octet.model.utils.ColorConsole
 
WHITE - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 
WINDOWS - Static variable in class chat.octet.model.utils.Platform
 
WINDOWSCE - Static variable in class chat.octet.model.utils.Platform
 

Y

yarnAttnFactor - Variable in class chat.octet.model.beans.LlamaContextParams
YaRN magnitude scaling factor.
yarnBetaFast - Variable in class chat.octet.model.beans.LlamaContextParams
YaRN low correction dim.
yarnBetaSlow - Variable in class chat.octet.model.beans.LlamaContextParams
YaRN high correction dim.
yarnExtFactor - Variable in class chat.octet.model.beans.LlamaContextParams
YaRN extrapolation mix factor, NaN = from model.
yarnOrigCtx - Variable in class chat.octet.model.beans.LlamaContextParams
YaRN original context size.
yellow(String) - Static method in class chat.octet.model.utils.ColorConsole
 
YELLOW - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
 