Index
A
- add(StoppingCriteria) - Method in class chat.octet.model.components.criteria.StoppingCriteriaList
- add(LogitsProcessor) - Method in class chat.octet.model.components.processor.LogitsProcessorList
- addPastTokensSize(int) - Method in class chat.octet.model.beans.Status
- AIX - Static variable in class chat.octet.model.utils.Platform
- allowRequantize - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
allow quantizing non-f32/f16 tensors
- ANDROID - Static variable in class chat.octet.model.utils.Platform
- appendNextToken(Token) - Method in class chat.octet.model.beans.Status
- appendTokens(int[]) - Method in class chat.octet.model.beans.Status
- ARCH - Static variable in class chat.octet.model.utils.Platform
- ASSISTANT - Enum constant in enum class chat.octet.model.beans.ChatMessage.ChatRole
-
Assistant role
B
- batch - Variable in class chat.octet.model.beans.LlamaContextParams
-
prompt processing batch size.
- batchDecode(int, int[], int, int) - Static method in class chat.octet.model.LlamaService
-
Batch decoding.
- black(String) - Static method in class chat.octet.model.utils.ColorConsole
- BLACK - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- blue(String) - Static method in class chat.octet.model.utils.ColorConsole
- BLUE - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- BOLD - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
C
- chat(GenerateParameter, String) - Method in class chat.octet.model.Model
-
Start a conversation and chat in streaming format.
- chat(GenerateParameter, String, String) - Method in class chat.octet.model.Model
-
Start a conversation and chat in streaming format.
- chat(String) - Method in class chat.octet.model.Model
-
Start a conversation and chat in streaming format.
- chat(String, String) - Method in class chat.octet.model.Model
-
Start a conversation and chat in streaming format.
- chat.octet.model - package chat.octet.model
- chat.octet.model.beans - package chat.octet.model.beans
- chat.octet.model.components.criteria - package chat.octet.model.components.criteria
- chat.octet.model.components.criteria.impl - package chat.octet.model.components.criteria.impl
- chat.octet.model.components.processor - package chat.octet.model.components.processor
- chat.octet.model.components.processor.impl - package chat.octet.model.components.processor.impl
- chat.octet.model.enums - package chat.octet.model.enums
- chat.octet.model.exceptions - package chat.octet.model.exceptions
- chat.octet.model.parameters - package chat.octet.model.parameters
- chat.octet.model.utils - package chat.octet.model.utils
- chatCompletions(GenerateParameter, String) - Method in class chat.octet.model.Model
-
Start a conversation and chat.
- chatCompletions(GenerateParameter, String, String) - Method in class chat.octet.model.Model
-
Start a conversation and chat.
- chatCompletions(String) - Method in class chat.octet.model.Model
-
Start a conversation and chat.
- ChatFormatter - Class in chat.octet.model.utils
-
Chat formatter.
- ChatFormatter() - Constructor for class chat.octet.model.utils.ChatFormatter
- ChatFormatter(String) - Constructor for class chat.octet.model.utils.ChatFormatter
- ChatFormatter(String, String, String) - Constructor for class chat.octet.model.utils.ChatFormatter
- ChatMessage - Class in chat.octet.model.beans
-
Chat message entity
- ChatMessage() - Constructor for class chat.octet.model.beans.ChatMessage
- ChatMessage(ChatMessage.ChatRole, String) - Constructor for class chat.octet.model.beans.ChatMessage
- ChatMessage.ChatRole - Enum Class in chat.octet.model.beans
-
Chat role definition
- CHATML_BOS_TOKEN - Static variable in class chat.octet.model.utils.ChatFormatter
- CHATML_CHAT_TEMPLATE - Static variable in class chat.octet.model.utils.ChatFormatter
- CHATML_EOS_TOKEN - Static variable in class chat.octet.model.utils.ChatFormatter
- checkTensors - Variable in class chat.octet.model.beans.LlamaModelParams
-
validate model tensor data.
- clearCache(int) - Static method in class chat.octet.model.LlamaService
-
Clear cache in K-V sequences.
- clearCache(int, int, int) - Static method in class chat.octet.model.LlamaService
-
Clear cache in K-V sequences.
- close() - Method in class chat.octet.model.Generator
-
Close inference generator.
- close() - Method in class chat.octet.model.Model
-
Close the model and release resources.
- ColorConsole - Class in chat.octet.model.utils
- ColorConsole.ColorStyle - Enum Class in chat.octet.model.utils
- ColorConsole.FontStyle - Enum Class in chat.octet.model.utils
- CompletionResult - Class in chat.octet.model.beans
-
Completion result entity
- CompletionResult() - Constructor for class chat.octet.model.beans.CompletionResult
- completions(GenerateParameter, String) - Method in class chat.octet.model.Model
-
Generate complete text.
- completions(String) - Method in class chat.octet.model.Model
-
Generate complete text.
- copyToStatus(Status) - Method in class chat.octet.model.beans.Status
- createNewContextWithModel(LlamaContextParams) - Static method in class chat.octet.model.LlamaService
-
Create new context with model.
- criteria(int[], float[], Object...) - Method in class chat.octet.model.components.criteria.impl.MaxTimeCriteria
- criteria(int[], float[], Object...) - Method in class chat.octet.model.components.criteria.impl.StoppingWordCriteria
- criteria(int[], float[], Object...) - Method in interface chat.octet.model.components.criteria.StoppingCriteria
-
Stopping criteria
- criteria(int[], float[], Object...) - Method in class chat.octet.model.components.criteria.StoppingCriteriaList
- ctx - Variable in class chat.octet.model.beans.LlamaContextParams
-
text context size.
- CustomBiasLogitsProcessor - Class in chat.octet.model.components.processor.impl
- CustomBiasLogitsProcessor(LogitBias, int) - Constructor for class chat.octet.model.components.processor.impl.CustomBiasLogitsProcessor
- cyan(String) - Static method in class chat.octet.model.utils.ColorConsole
- CYAN - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
D
- dataTypeK - Variable in class chat.octet.model.beans.LlamaContextParams
-
data type for K cache.
- dataTypeV - Variable in class chat.octet.model.beans.LlamaContextParams
-
data type for V cache.
- DecodeException - Exception Class in chat.octet.model.exceptions
-
Batch decode exception
- DecodeException(String) - Constructor for exception class chat.octet.model.exceptions.DecodeException
- DecodeException(String, Throwable) - Constructor for exception class chat.octet.model.exceptions.DecodeException
- decodeToken(boolean, int...) - Static method in class chat.octet.model.TokenDecoder
- decodeToken(int...) - Static method in class chat.octet.model.TokenDecoder
- DEFAULT - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
- DEFAULT_COMMON_SYSTEM - Static variable in class chat.octet.model.utils.ChatFormatter
-
Default system prompt.
- defragThold - Variable in class chat.octet.model.beans.LlamaContextParams
-
defragment the KV cache if holes/size > thold, < 0 disabled (default).
- DISABLED - Enum constant in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
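The `defragThold` entry describes a ratio check: defragment when holes/size exceeds the threshold, with a negative threshold disabling the check. The following is a minimal sketch of that reading, not the library's or llama.cpp's implementation. The class `DefragThresholdSketch` and the method `shouldDefragment` are hypothetical names, and returning false for an empty cache is an assumption.

```java
public class DefragThresholdSketch {

    /**
     * Hypothetical helper: true when the fraction of holes in the cache
     * exceeds the threshold. A negative threshold disables the check.
     */
    public static boolean shouldDefragment(int holes, int size, float thold) {
        // Assumption: an empty cache never needs defragmentation.
        if (thold < 0f || size <= 0) {
            return false;
        }
        return (float) holes / size > thold;
    }

    public static void main(String[] args) {
        System.out.println(shouldDefragment(30, 100, 0.1f));  // ratio 0.3 is above 0.1
        System.out.println(shouldDefragment(5, 100, 0.1f));   // ratio 0.05 is below 0.1
        System.out.println(shouldDefragment(30, 100, -1.0f)); // negative threshold: disabled
    }
}
```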
E
- embedding - Variable in class chat.octet.model.beans.LlamaContextParams
-
embedding mode only.
- equals(Object) - Method in class chat.octet.model.beans.Token
F
- FINISHED - Enum constant in enum class chat.octet.model.enums.FinishReason
-
Completed generation.
- FinishReason - Enum Class in chat.octet.model.enums
-
Token generation status
- flashAttn - Variable in class chat.octet.model.beans.LlamaContextParams
-
whether to use flash attention [EXPERIMENTAL].
- format(boolean, ChatMessage...) - Method in class chat.octet.model.utils.ChatFormatter
-
Formats a prompt text based on the provided chat messages.
- format(ChatMessage...) - Method in class chat.octet.model.utils.ChatFormatter
-
Formats a prompt text based on the provided chat messages.
- format(String) - Method in class chat.octet.model.utils.ChatFormatter
-
Formats a prompt text based on the provided user question.
- format(String, ColorConsole.ColorStyle) - Static method in class chat.octet.model.utils.ColorConsole
- format(String, ColorConsole.ColorStyle, ColorConsole.FontStyle) - Static method in class chat.octet.model.utils.ColorConsole
- format(String, String) - Method in class chat.octet.model.utils.ChatFormatter
-
Formats a prompt text based on the provided system prompt and user question.
- FREEBSD - Static variable in class chat.octet.model.utils.Platform
G
- generate(GenerateParameter, String) - Method in class chat.octet.model.Model
-
Generate text in stream format.
- generate(String) - Method in class chat.octet.model.Model
-
Generate text in stream format.
- GenerateParameter - Class in chat.octet.model.parameters
-
Generate parameter
- GenerateParameter() - Constructor for class chat.octet.model.parameters.GenerateParameter
- GenerateParameter.MirostatMode - Enum Class in chat.octet.model.parameters
-
Mirostat sampling mode definition
- GenerationException - Exception Class in chat.octet.model.exceptions
-
Generation exception
- GenerationException(String) - Constructor for exception class chat.octet.model.exceptions.GenerationException
- GenerationException(String, Throwable) - Constructor for exception class chat.octet.model.exceptions.GenerationException
- Generator - Class in chat.octet.model
-
Model inference generator. Supports streaming output text and generating complete text.
- Generator(GenerateParameter, String) - Constructor for class chat.octet.model.Generator
-
Create inference generator.
- Generator(GenerateParameter, String, Status) - Constructor for class chat.octet.model.Generator
-
Create inference generator.
- getByteLength(byte[], int) - Static method in class chat.octet.model.TokenDecoder
- getContextSize() - Static method in class chat.octet.model.LlamaService
-
Get model context size.
- getEmbedding() - Static method in class chat.octet.model.LlamaService
-
Get embedding
- getFinishReason() - Method in class chat.octet.model.beans.Status
- getLlamaContextDefaultParams() - Static method in class chat.octet.model.LlamaService
-
Get llama context default params.
- getLlamaModelDefaultParams() - Static method in class chat.octet.model.LlamaService
-
Get llama model default params.
- getLlamaModelQuantizeDefaultParams() - Static method in class chat.octet.model.LlamaService
-
Get llama model quantize default params.
- getLlamaTokenAttr(int) - Static method in class chat.octet.model.LlamaService
-
Get token type definition.
- getLogits(int) - Static method in class chat.octet.model.LlamaService
-
Get logits by index; the default index must be 0.
- getLogitsIndex() - Method in class chat.octet.model.beans.Status
- getOSType() - Static method in class chat.octet.model.utils.Platform
- getSamplingMetrics(boolean) - Static method in class chat.octet.model.LlamaService
-
Get sampling metrics
- getSystemInfo() - Static method in class chat.octet.model.LlamaService
-
Get system parameter information.
- getTokenAttr(int) - Static method in class chat.octet.model.LlamaService
-
Get token type code.
- getTokenBOS() - Static method in class chat.octet.model.LlamaService
-
Get special BOS token.
- getTokenEOS() - Static method in class chat.octet.model.LlamaService
-
Get special EOS token.
- getUtf8ByteLength(byte) - Static method in class chat.octet.model.TokenDecoder
- getVocabSize() - Static method in class chat.octet.model.LlamaService
-
Get model vocab size.
- GNU - Static variable in class chat.octet.model.utils.Platform
- gpuLayers - Variable in class chat.octet.model.beans.LlamaModelParams
-
number of layers to store in VRAM.
- green(String) - Static method in class chat.octet.model.utils.ColorConsole
- GREEN - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- grey(String) - Static method in class chat.octet.model.utils.ColorConsole
- GREY - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
H
- hashCode() - Method in class chat.octet.model.beans.Token
I
- initNative() - Static method in class chat.octet.model.LlamaService
-
Initialize JNI context.
- isEmpty() - Method in class chat.octet.model.components.criteria.StoppingCriteriaList
- isEmpty() - Method in class chat.octet.model.components.processor.LogitsProcessorList
- isFinished() - Method in enum class chat.octet.model.enums.FinishReason
-
Returns true if token generation has completed, otherwise false.
- isGpuOffloadSupported() - Static method in class chat.octet.model.LlamaService
-
Check whether gpu_offload is supported.
- isLinux() - Static method in class chat.octet.model.utils.Platform
- isMac() - Static method in class chat.octet.model.utils.Platform
- isMlockSupported() - Static method in class chat.octet.model.LlamaService
-
Check whether mlock is supported.
- isMmapSupported() - Static method in class chat.octet.model.LlamaService
-
Check whether mmap is supported.
- isWindows() - Static method in class chat.octet.model.utils.Platform
- ITALIC - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
- iterator() - Method in class chat.octet.model.Generator
-
Return inference iterator.
K
- keepSplit - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
quantize to the same number of shards.
- KFREEBSD - Static variable in class chat.octet.model.utils.Platform
L
- LENGTH - Enum constant in enum class chat.octet.model.enums.FinishReason
-
Generation has exceeded the maximum token limit and has been truncated.
- LIB_RESOURCE_PATH - Static variable in class chat.octet.model.utils.Platform
- LINUX - Static variable in class chat.octet.model.utils.Platform
- LLAMA_FTYPE_ALL_F32 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_GUESSED - Enum constant in enum class chat.octet.model.enums.ModelFileType
-
not specified in the model file
- LLAMA_FTYPE_MOSTLY_BF16 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_F16 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ1_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ1_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ2_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ2_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ2_XS - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ2_XXS - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ3_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ3_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ3_XXS - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ4_NL - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_IQ4_XS - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q2_K - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q2_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q3_K_L - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q3_K_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q3_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q3_K_XS - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q4_0 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q4_1 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16 - Enum constant in enum class chat.octet.model.enums.ModelFileType
-
tok_embeddings.weight and output.weight are F16
- LLAMA_FTYPE_MOSTLY_Q4_K_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q4_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q5_0 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q5_1 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q5_K_M - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q5_K_S - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q6_K - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_FTYPE_MOSTLY_Q8_0 - Enum constant in enum class chat.octet.model.enums.ModelFileType
- LLAMA_POOLING_TYPE_CLS - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
-
pooling cls type.
- LLAMA_POOLING_TYPE_MEAN - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
-
pooling mean type.
- LLAMA_POOLING_TYPE_NONE - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
-
pooling none type.
- LLAMA_POOLING_TYPE_UNSPECIFIED - Enum constant in enum class chat.octet.model.enums.LlamaPoolingType
-
unspecified type.
- LLAMA_ROPE_SCALING_LINEAR - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
-
Scaling linear type.
- LLAMA_ROPE_SCALING_NONE - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
-
Scaling none type.
- LLAMA_ROPE_SCALING_UNSPECIFIED - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
-
unspecified type.
- LLAMA_ROPE_SCALING_YARN - Enum constant in enum class chat.octet.model.enums.LlamaRoPEScalingType
-
Scaling YaRN type.
- LLAMA_SPLIT_MODE_LAYER - Enum constant in enum class chat.octet.model.enums.LlamaSplitMode
-
split layers and KV across GPUs.
- LLAMA_SPLIT_MODE_NONE - Enum constant in enum class chat.octet.model.enums.LlamaSplitMode
-
single GPU.
- LLAMA_SPLIT_MODE_ROW - Enum constant in enum class chat.octet.model.enums.LlamaSplitMode
-
split rows across GPUs.
- LLAMA_TOKEN_ATTR_BYTE - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Byte type.
- LLAMA_TOKEN_ATTR_CONTROL - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Control type.
- LLAMA_TOKEN_ATTR_LSTRIP - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Left strip type.
- LLAMA_TOKEN_ATTR_NORMAL - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Normal type.
- LLAMA_TOKEN_ATTR_NORMALIZED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Normalized type.
- LLAMA_TOKEN_ATTR_RSTRIP - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Right strip type.
- LLAMA_TOKEN_ATTR_SINGLE_WORD - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Single word type.
- LLAMA_TOKEN_ATTR_UNDEFINED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Undefined type.
- LLAMA_TOKEN_ATTR_UNKNOWN - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Unknown type.
- LLAMA_TOKEN_ATTR_UNUSED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
Unused type.
- LLAMA_TOKEN_ATTR_USER_DEFINED - Enum constant in enum class chat.octet.model.enums.LlamaTokenAttr
-
User defined type.
- llamaBackendFree() - Static method in class chat.octet.model.LlamaService
-
Call once at the end of the program.
- llamaBackendInit() - Static method in class chat.octet.model.LlamaService
-
Initialize the llama + ggml backend.
- LlamaContextParams - Class in chat.octet.model.beans
-
Llama context params entity
- LlamaContextParams() - Constructor for class chat.octet.model.beans.LlamaContextParams
- llamaModelMeta(String) - Static method in class chat.octet.model.LlamaService
-
Retrieves the metadata information of the llama model based on the given key.
- LlamaModelParams - Class in chat.octet.model.beans
-
Llama model params entity
- LlamaModelParams() - Constructor for class chat.octet.model.beans.LlamaModelParams
- llamaModelQuantize(String, String, LlamaModelQuantizeParams) - Static method in class chat.octet.model.LlamaService
-
Quantize the model.
- llamaModelQuantize(String, String, ModelFileType) - Static method in class chat.octet.model.LlamaService
-
Quantize the model.
- LlamaModelQuantizeParams - Class in chat.octet.model.beans
-
Llama model quantize params entity
- LlamaModelQuantizeParams() - Constructor for class chat.octet.model.beans.LlamaModelQuantizeParams
- llamaNumaInit(int) - Static method in class chat.octet.model.LlamaService
-
Initialize NUMA optimizations.
- LlamaNumaStrategy - Enum Class in chat.octet.model.enums
-
Llama NUMA strategy definition
- LlamaPoolingType - Enum Class in chat.octet.model.enums
-
Llama pooling type definition
- LlamaRoPEScalingType - Enum Class in chat.octet.model.enums
-
Llama RoPE scaling type definition
- LlamaService - Class in chat.octet.model
-
Llama.cpp API
- LlamaService() - Constructor for class chat.octet.model.LlamaService
- LlamaSplitMode - Enum Class in chat.octet.model.enums
-
Llama split mode definition
- LlamaTokenAttr - Enum Class in chat.octet.model.enums
-
Llama token type definition
- loadLibraryResource() - Static method in class chat.octet.model.utils.Platform
- loadLlamaGrammar(String) - Static method in class chat.octet.model.LlamaService
-
Load llama grammar by rules.
- loadLlamaModelFromFile(String, LlamaModelParams) - Static method in class chat.octet.model.LlamaService
-
Load Llama model from file.
- loadLoraModelFromFile(String, float, String, int) - Static method in class chat.octet.model.LlamaService
-
Apply a LoRA adapter to a loaded model. path_base_model is the path to a higher quality model to use as a base for the layers modified by the adapter.
- LogitBias - Class in chat.octet.model.beans
- LogitBias() - Constructor for class chat.octet.model.beans.LogitBias
- logitsAll - Variable in class chat.octet.model.beans.LlamaContextParams
-
the llama_eval() call computes all logits, not just the last one.
- LogitsProcessor - Interface in chat.octet.model.components.processor
-
Customize a processor to adjust the probability distribution of words and control the generation of model inference results.
- LogitsProcessorList - Class in chat.octet.model.components.processor
-
Logits processor list
- LogitsProcessorList() - Constructor for class chat.octet.model.components.processor.LogitsProcessorList
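The `LogitsProcessor` entry describes adjusting the probability distribution of words before sampling. As a rough illustration of that idea, the sketch below adds a fixed bias to one token's logit. It is self-contained and does not use the library: `SketchLogitsProcessor` is a hypothetical stand-in that mirrors the `processor(int[], float[], Object...)` parameter list from this index, and the `float[]` return type, the parameter names, and the `biasToken` helper are all assumptions.

```java
import java.util.Arrays;

public class LogitBiasSketch {

    /** Stand-in for illustration only; NOT the library's LogitsProcessor interface. */
    public interface SketchLogitsProcessor {
        // Assumption: the processor returns the adjusted scores.
        float[] processor(int[] inputTokenIds, float[] scores, Object... args);
    }

    /** Hypothetical helper: builds a processor that adds `bias` to the logit of `tokenId`. */
    public static SketchLogitsProcessor biasToken(int tokenId, float bias) {
        return (inputTokenIds, scores, args) -> {
            // Work on a copy so the caller's array is left untouched.
            float[] adjusted = Arrays.copyOf(scores, scores.length);
            if (tokenId >= 0 && tokenId < adjusted.length) {
                adjusted[tokenId] += bias;
            }
            return adjusted;
        };
    }

    public static void main(String[] args) {
        float[] logits = {0.0f, 1.0f, 2.0f};
        float[] adjusted = biasToken(1, 2.5f).processor(new int[0], logits);
        System.out.println(Arrays.toString(adjusted));
    }
}
```

A large negative bias makes a token very unlikely to be sampled, and a positive bias makes it more likely; out-of-range token ids are ignored in this sketch.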
M
- MAC - Static variable in class chat.octet.model.utils.Platform
- magenta(String) - Static method in class chat.octet.model.utils.ColorConsole
- MAGENTA - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- mainGpu - Variable in class chat.octet.model.beans.LlamaModelParams
-
the GPU that is used for scratch and small tensors.
- MaxTimeCriteria - Class in chat.octet.model.components.criteria.impl
- MaxTimeCriteria(long) - Constructor for class chat.octet.model.components.criteria.impl.MaxTimeCriteria
- MaxTimeCriteria(long, long) - Constructor for class chat.octet.model.components.criteria.impl.MaxTimeCriteria
- metrics() - Method in class chat.octet.model.Model
-
Print generation metrics.
- Metrics - Class in chat.octet.model.beans
-
Generation metrics
- Metrics() - Constructor for class chat.octet.model.beans.Metrics
- METRICS_TEMPLATE - Static variable in class chat.octet.model.beans.Metrics
- mlock - Variable in class chat.octet.model.beans.LlamaModelParams
-
force system to keep model in RAM.
- mmap - Variable in class chat.octet.model.beans.LlamaModelParams
-
use mmap if possible.
- Model - Class in chat.octet.model
-
Llama model, which provides functions for text generation and chat conversations.
- Model(ModelParameter) - Constructor for class chat.octet.model.Model
- Model(String) - Constructor for class chat.octet.model.Model
- ModelException - Exception Class in chat.octet.model.exceptions
-
Model exception
- ModelException(String) - Constructor for exception class chat.octet.model.exceptions.ModelException
- ModelException(String, Throwable) - Constructor for exception class chat.octet.model.exceptions.ModelException
- modelFileType - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
quantize to this llama_ftype
- ModelFileType - Enum Class in chat.octet.model.enums
-
Model file type definition
- ModelParameter - Class in chat.octet.model.parameters
-
Llama model parameters
- ModelParameter() - Constructor for class chat.octet.model.parameters.ModelParameter
N
- NETBSD - Static variable in class chat.octet.model.utils.Platform
- NoBadWordsLogitsProcessor - Class in chat.octet.model.components.processor.impl
- NoBadWordsLogitsProcessor(int[]) - Constructor for class chat.octet.model.components.processor.impl.NoBadWordsLogitsProcessor
- NONE - Enum constant in enum class chat.octet.model.enums.FinishReason
-
Default type.
- NUMA_STRATEGY_COUNT - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Count strategy.
- NUMA_STRATEGY_DISABLED - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Disabled strategy.
- NUMA_STRATEGY_DISTRIBUTE - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Distribute strategy.
- NUMA_STRATEGY_ISOLATE - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Isolate strategy.
- NUMA_STRATEGY_MIRROR - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Mirror strategy.
- NUMA_STRATEGY_NUMACTL - Enum constant in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Numa control strategy.
O
- offloadKqv - Variable in class chat.octet.model.beans.LlamaContextParams
-
whether to offload the KQV ops (including the KV cache) to GPU.
- onlyCopy - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
only copy tensors - ftype, allow_requantize and quantize_output_tensor are ignored
- OPENBSD - Static variable in class chat.octet.model.utils.Platform
- orange(String) - Static method in class chat.octet.model.utils.ColorConsole
- ORANGE - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- output() - Method in class chat.octet.model.Generator
-
Stream the generated text as output.
- outputTensorType - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
output tensor type.
P
- Platform - Class in chat.octet.model.utils
- poolingType - Variable in class chat.octet.model.beans.LlamaContextParams
-
whether to pool (sum) embedding results by sequence id (ignored if no pooling layer).
- processor(int[], float[], Object...) - Method in class chat.octet.model.components.processor.impl.CustomBiasLogitsProcessor
- processor(int[], float[], Object...) - Method in class chat.octet.model.components.processor.impl.NoBadWordsLogitsProcessor
- processor(int[], float[], Object...) - Method in interface chat.octet.model.components.processor.LogitsProcessor
-
Logits processor
- processor(int[], float[], Object...) - Method in class chat.octet.model.components.processor.LogitsProcessorList
- pure - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
disable k-quant mixtures and quantize all tensors to the same type
Q
- quantizeOutputTensor - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
quantize output.weight
R
- red(String) - Static method in class chat.octet.model.utils.ColorConsole
- RED - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- release() - Static method in class chat.octet.model.LlamaService
-
Close model and release all resources.
- removeAllChatStatus() - Method in class chat.octet.model.Model
-
Delete all user session states.
- removeChatStatus(String) - Method in class chat.octet.model.Model
-
Delete the session state of the specified user.
- reset() - Method in class chat.octet.model.beans.Status
- result() - Method in class chat.octet.model.Generator
-
Return the generated complete text.
- ropeFreqBase - Variable in class chat.octet.model.beans.LlamaContextParams
-
RoPE base frequency.
- ropeFreqScale - Variable in class chat.octet.model.beans.LlamaContextParams
-
RoPE frequency scaling factor.
- ropeScalingType - Variable in class chat.octet.model.beans.LlamaContextParams
-
RoPE scaling type, from `enum llama_rope_scaling_type`.
S
- sampling(float[], int[], int, float, float, float, boolean, int, float, float, float, int, float, float, float, float, float, float, int, int) - Static method in class chat.octet.model.LlamaService
-
Inference sampling the next token.
- sampling(GenerateParameter, float[], int[], int, int) - Static method in class chat.octet.model.LlamaService
-
Inference sampling the next token.
- seed - Variable in class chat.octet.model.beans.LlamaContextParams
-
RNG seed, -1 for random.
- seqMax - Variable in class chat.octet.model.beans.LlamaContextParams
-
max number of sequences (i.e.
- SOLARIS - Static variable in class chat.octet.model.utils.Platform
- splitMode - Variable in class chat.octet.model.beans.LlamaModelParams
-
how to split the model across multiple GPUs.
- Status - Class in chat.octet.model.beans
- Status() - Constructor for class chat.octet.model.beans.Status
- Status(Status) - Constructor for class chat.octet.model.beans.Status
- STOP - Enum constant in enum class chat.octet.model.enums.FinishReason
-
Generation stopped by StoppingCriteria.
- StoppingCriteria - Interface in chat.octet.model.components.criteria
-
Customize a controller to implement stop rule control for model inference.
- StoppingCriteriaList - Class in chat.octet.model.components.criteria
-
Stopping criteria list
- StoppingCriteriaList() - Constructor for class chat.octet.model.components.criteria.StoppingCriteriaList
- StoppingWordCriteria - Class in chat.octet.model.components.criteria.impl
- StoppingWordCriteria(String...) - Constructor for class chat.octet.model.components.criteria.impl.StoppingWordCriteria
- subInputIds(int) - Method in class chat.octet.model.beans.Status
- subInputIds(int, int) - Method in class chat.octet.model.beans.Status
- subTokensBetween(List<Token>, String) - Static method in class chat.octet.model.TokenDecoder
- subTokensBetween(List<Token>, String, String) - Static method in class chat.octet.model.TokenDecoder
- SYSTEM - Enum constant in enum class chat.octet.model.beans.ChatMessage.ChatRole
-
System prompt
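The `StoppingCriteria` entry describes a custom controller that decides when inference should stop. The sketch below illustrates one way such rules could be expressed and combined. It is self-contained and does not use the library: `SketchStoppingCriteria` is a hypothetical stand-in that mirrors the `criteria(int[], float[], Object...)` parameter list from this index, and the `boolean` return type, the `stopOnToken` and `maxTokens` helpers, and the "stop if any rule fires" combination are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class StoppingCriteriaSketch {

    /** Stand-in for illustration only; NOT the library's StoppingCriteria interface. */
    public interface SketchStoppingCriteria {
        // Assumption: returning true means "stop generating".
        boolean criteria(int[] inputTokenIds, float[] scores, Object... args);
    }

    /** Hypothetical rule: stop when the most recent token equals `stopTokenId`. */
    public static SketchStoppingCriteria stopOnToken(int stopTokenId) {
        return (ids, scores, args) ->
                ids.length > 0 && ids[ids.length - 1] == stopTokenId;
    }

    /** Hypothetical rule: stop once at least `limit` tokens have been produced. */
    public static SketchStoppingCriteria maxTokens(int limit) {
        return (ids, scores, args) -> ids.length >= limit;
    }

    /** Assumption: a list of rules stops generation as soon as any single rule fires. */
    public static boolean anyMatch(List<SketchStoppingCriteria> rules, int[] ids, float[] scores) {
        for (SketchStoppingCriteria rule : rules) {
            if (rule.criteria(ids, scores)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<SketchStoppingCriteria> rules = new ArrayList<>();
        rules.add(stopOnToken(2));
        rules.add(maxTokens(5));
        System.out.println(anyMatch(rules, new int[]{7, 8, 2}, new float[0]));
        System.out.println(anyMatch(rules, new int[]{7, 8, 9}, new float[0]));
    }
}
```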
T
- tensorSplit - Variable in class chat.octet.model.beans.LlamaModelParams
-
how to split layers across multiple GPUs (size: LLAMA_MAX_DEVICES).
- thread - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
number of threads to use for quantizing, if <=0 will use std::thread::hardware_concurrency()
- threads - Variable in class chat.octet.model.beans.LlamaContextParams
-
number of threads used for generation.
- threadsBatch - Variable in class chat.octet.model.beans.LlamaContextParams
-
number of threads used for prompt and batch processing.
- toAssistant(String) - Static method in class chat.octet.model.beans.ChatMessage
- Token - Class in chat.octet.model.beans
-
Token
- Token(int, LlamaTokenAttr, String) - Constructor for class chat.octet.model.beans.Token
- TokenDecoder - Class in chat.octet.model
-
Token decoder
- tokenEmbeddingType - Variable in class chat.octet.model.beans.LlamaModelQuantizeParams
-
token embeddings tensor type.
- tokenize(byte[], int, int[], int, boolean, boolean) - Static method in class chat.octet.model.LlamaService
-
Convert the provided text into tokens.
- tokenize(String, boolean, boolean) - Static method in class chat.octet.model.LlamaService
-
Convert the provided text into tokens.
- tokens() - Method in class chat.octet.model.Generator
-
Return the generated tokens.
- tokenToPiece(int, byte[], int, boolean) - Static method in class chat.octet.model.LlamaService
-
Convert the token id to text piece.
- toString() - Method in enum class chat.octet.model.beans.ChatMessage.ChatRole
- toString() - Method in class chat.octet.model.beans.Metrics
- toString() - Method in enum class chat.octet.model.enums.LlamaNumaStrategy
- toString() - Method in enum class chat.octet.model.enums.LlamaPoolingType
- toString() - Method in enum class chat.octet.model.enums.LlamaRoPEScalingType
- toString() - Method in enum class chat.octet.model.enums.LlamaSplitMode
- toString() - Method in enum class chat.octet.model.enums.LlamaTokenAttr
- toString() - Method in enum class chat.octet.model.enums.ModelFileType
- toString() - Method in class chat.octet.model.Model
- toSystem(String) - Static method in class chat.octet.model.beans.ChatMessage
- toUser(String) - Static method in class chat.octet.model.beans.ChatMessage
- TRUNCATED - Enum constant in enum class chat.octet.model.enums.FinishReason
-
Generation has exceeded the maximum context limit and has been truncated.
U
- ubatch - Variable in class chat.octet.model.beans.LlamaContextParams
-
physical maximum batch size.
- UNDERLINE - Enum constant in enum class chat.octet.model.utils.ColorConsole.FontStyle
- UNKNOWN - Enum constant in enum class chat.octet.model.enums.FinishReason
-
Unknown type, no available token state.
- UNSPECIFIED - Static variable in class chat.octet.model.utils.Platform
- updateFinishReason(FinishReason) - Method in class chat.octet.model.beans.Token
- USER - Enum constant in enum class chat.octet.model.beans.ChatMessage.ChatRole
-
User role
V
- V1 - Enum constant in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
- V2 - Enum constant in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
- valueOf(String) - Static method in enum class chat.octet.model.beans.ChatMessage.ChatRole
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.FinishReason
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaPoolingType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaRoPEScalingType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaSplitMode
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.LlamaTokenAttr
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.enums.ModelFileType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.utils.ColorConsole.ColorStyle
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class chat.octet.model.utils.ColorConsole.FontStyle
-
Returns the enum constant of this class with the specified name.
- valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaNumaStrategy
- valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaPoolingType
- valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaRoPEScalingType
- valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaSplitMode
- valueOfType(int) - Static method in enum class chat.octet.model.enums.LlamaTokenAttr
- valueOfType(int) - Static method in enum class chat.octet.model.enums.ModelFileType
- values() - Static method in enum class chat.octet.model.beans.ChatMessage.ChatRole
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.FinishReason
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.LlamaNumaStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.LlamaPoolingType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.LlamaRoPEScalingType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.LlamaSplitMode
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.LlamaTokenAttr
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.enums.ModelFileType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.parameters.GenerateParameter.MirostatMode
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.utils.ColorConsole.ColorStyle
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class chat.octet.model.utils.ColorConsole.FontStyle
-
Returns an array containing the constants of this enum class, in the order they are declared.
- vocabOnly - Variable in class chat.octet.model.beans.LlamaModelParams
-
only load the vocabulary, no weights.
W
- white(String) - Static method in class chat.octet.model.utils.ColorConsole
- WHITE - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle
- WINDOWS - Static variable in class chat.octet.model.utils.Platform
- WINDOWSCE - Static variable in class chat.octet.model.utils.Platform
Y
- yarnAttnFactor - Variable in class chat.octet.model.beans.LlamaContextParams
-
YaRN magnitude scaling factor.
- yarnBetaFast - Variable in class chat.octet.model.beans.LlamaContextParams
-
YaRN low correction dim.
- yarnBetaSlow - Variable in class chat.octet.model.beans.LlamaContextParams
-
YaRN high correction dim.
- yarnExtFactor - Variable in class chat.octet.model.beans.LlamaContextParams
-
YaRN extrapolation mix factor, NaN = from model.
- yarnOrigCtx - Variable in class chat.octet.model.beans.LlamaContextParams
-
YaRN original context size.
- yellow(String) - Static method in class chat.octet.model.utils.ColorConsole
- YELLOW - Enum constant in enum class chat.octet.model.utils.ColorConsole.ColorStyle