models.config¶
Module Path¶
Source file: src/models/config.zig
Public Types¶
ActivationType¶
NormalizationType¶
PositionEncodingType¶
pub const PositionEncodingType = enum {
    Sinusoidal,
    Rotary, // RoPE -- used by LLaMA
    ALiBi, // Attention with Linear Biases
    Learned, // Learned absolute positions
    None,
};
ModelSize¶
pub const ModelSize = enum {
    LLaMA_7B,
    LLaMA_13B,
    LLaMA_30B,
    LLaMA_65B,
    LLaMA2_7B,
    LLaMA2_13B,
    LLaMA2_70B,
    CodeLlama_7B,
    CodeLlama_13B,
    CodeLlama_34B,
    TinyLlama,
};
ModelConfig¶
pub const ModelConfig = struct {
    // Architecture
    d_model: usize,
    num_layers: usize,
    num_heads: usize,
    num_kv_heads: ?usize, // null = same as num_heads (MHA)
    d_ff: usize,
    vocab_size: usize,
    max_seq_len: usize,
    // Normalization
    norm_type: NormalizationType,
    norm_eps: f32,
    // Activation
    activation: ActivationType,
    // Position encoding
    position_encoding: PositionEncodingType,
    rope_theta: f32,
    // Regularization
    dropout: f32,
    attention_dropout: f32,
    // Quantization
    weight_quant: ?QuantType,
};
Unified configuration that describes any supported model architecture.
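The optional num_kv_heads field is what distinguishes multi-head attention from grouped-query attention. The sketch below shows one way the derived quantities could be computed from the documented fields. The helper names (effectiveKvHeads, queriesPerKvHead, headDim) are hypothetical and are not part of the documented API, and the code assumes the configuration has already passed validation, so no divisor is zero.

```zig
const std = @import("std");

// Hypothetical helpers for illustration; not part of the documented API.

/// Number of key/value heads after resolving the optional field.
/// null falls back to num_heads, i.e. plain multi-head attention.
fn effectiveKvHeads(num_heads: usize, num_kv_heads: ?usize) usize {
    return num_kv_heads orelse num_heads;
}

/// Query heads that share one key/value head (1 for MHA, more for GQA).
fn queriesPerKvHead(num_heads: usize, num_kv_heads: ?usize) usize {
    return num_heads / effectiveKvHeads(num_heads, num_kv_heads);
}

/// Width of a single attention head.
fn headDim(d_model: usize, num_heads: usize) usize {
    return d_model / num_heads;
}
```

With d_model = 8192, num_heads = 64 and num_kv_heads = 8, each head is 128 wide and eight query heads share every key/value head.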
Public Functions¶
ModelConfig.llama¶
Return a ModelConfig pre-filled for the given LLaMA variant. Sets norm_type = .RMSNorm, activation = .SwiGLU, position_encoding = .Rotary, and variant-specific dimensions.
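A preset function of this kind is commonly a switch over the variant enum. The following is a minimal sketch of that pattern, not the module's actual source: Variant and Dims are hypothetical, reduced stand-ins for ModelSize and ModelConfig, only three variants are shown, and the dimensions are the published ones for those models.

```zig
const std = @import("std");

// Hypothetical, reduced stand-ins for ModelSize and ModelConfig.
const Variant = enum { LLaMA_7B, LLaMA_13B, LLaMA2_70B };

const Dims = struct {
    d_model: usize,
    num_layers: usize,
    num_heads: usize,
    num_kv_heads: ?usize, // null = same as num_heads (MHA)
    d_ff: usize,
};

/// Return the variant-specific dimensions for a preset.
fn presetDims(variant: Variant) Dims {
    return switch (variant) {
        .LLaMA_7B => Dims{
            .d_model = 4096,
            .num_layers = 32,
            .num_heads = 32,
            .num_kv_heads = null,
            .d_ff = 11008,
        },
        .LLaMA_13B => Dims{
            .d_model = 5120,
            .num_layers = 40,
            .num_heads = 40,
            .num_kv_heads = null,
            .d_ff = 13824,
        },
        // LLaMA 2 70B uses grouped-query attention with 8 KV heads.
        .LLaMA2_70B => Dims{
            .d_model = 8192,
            .num_layers = 80,
            .num_heads = 64,
            .num_kv_heads = 8,
            .d_ff = 28672,
        },
    };
}
```

Because the switch is exhaustive, adding a new enum member without a matching prong is a compile error, which keeps the preset table in step with the enum.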
ModelConfig.validate¶
Check internal consistency:
d_model must be divisible by num_heads.
num_kv_heads (if set) must divide num_heads evenly.
d_ff must be positive.
max_seq_len must be positive.
Returns error{InvalidConfig} on failure.
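The checks listed above can be sketched as a standalone function. This is an illustration under stated assumptions rather than the module's source: Dims is a hypothetical subset of ModelConfig holding only the fields the checks read, and the explicit zero guards on num_heads and num_kv_heads are an addition here so that the modulo operations cannot divide by zero.

```zig
const std = @import("std");

// Hypothetical subset of ModelConfig: only the fields the checks read.
const Dims = struct {
    d_model: usize,
    num_heads: usize,
    num_kv_heads: ?usize = null,
    d_ff: usize,
    max_seq_len: usize,
};

fn validateDims(d: Dims) error{InvalidConfig}!void {
    // Assumption: a zero head count is rejected before any modulo.
    if (d.num_heads == 0) return error.InvalidConfig;
    // d_model must split evenly across the attention heads.
    if (d.d_model % d.num_heads != 0) return error.InvalidConfig;
    // If set, num_kv_heads must divide num_heads evenly.
    if (d.num_kv_heads) |kv| {
        if (kv == 0 or d.num_heads % kv != 0) return error.InvalidConfig;
    }
    // usize is unsigned, so "positive" means non-zero.
    if (d.d_ff == 0) return error.InvalidConfig;
    if (d.max_seq_len == 0) return error.InvalidConfig;
}
```

Returning a single error value keeps the caller simple, at the cost of not saying which check failed.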
ModelConfig.parameterCount¶
Estimate the total number of trainable parameters:
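One plausible form of this estimate for a LLaMA-style decoder is sketched below. It is not taken from the module's source. It assumes full multi-head attention (no num_kv_heads reduction), a SwiGLU feed-forward block with three weight matrices, two RMSNorm weight vectors per layer plus a final one, no bias terms, and an output projection that is not tied to the embedding table. The function name estimateLlamaParams is hypothetical, and u64 is used instead of usize so the arithmetic does not overflow on 32-bit targets.

```zig
const std = @import("std");

/// Hypothetical estimate of trainable parameters for a LLaMA-style model.
fn estimateLlamaParams(d_model: u64, num_layers: u64, d_ff: u64, vocab_size: u64) u64 {
    const embedding = vocab_size * d_model; // token embedding table
    const attention = 4 * d_model * d_model; // Q, K, V and output projections
    const feed_forward = 3 * d_model * d_ff; // SwiGLU: gate, up, down
    const norms = 2 * d_model; // two RMSNorm weight vectors per layer
    const per_layer = attention + feed_forward + norms;
    const final_norm = d_model; // RMSNorm before the output head
    const output_head = vocab_size * d_model; // assumed untied from embedding
    return embedding + num_layers * per_layer + final_norm + output_head;
}
```

With the LLaMA-7B dimensions (d_model = 4096, num_layers = 32, d_ff = 11008, vocab_size = 32000) this evaluates to 6,738,415,616, in line with the roughly 6.7B figure quoted in the usage example.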
ModelConfig.memoryRequirements¶
Estimate peak memory usage in bytes assuming f32 weights:
For quantized models, divide by the compression ratio of the target format.
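The arithmetic can be sketched as follows. This covers weight storage only. Whether the documented function also counts activations or a KV cache is not stated here, so treat the sketch as a lower bound. Both function names are hypothetical, and the float conversion builtins @floatFromInt and @intFromFloat assume Zig 0.11 or later.

```zig
const std = @import("std");

/// Bytes needed to hold `param_count` weights at 4 bytes each (f32).
fn weightBytesF32(param_count: u64) u64 {
    return param_count * 4;
}

/// Scale an f32 footprint by a format's compression ratio,
/// e.g. 4.0 for 8-bit weights. Truncates toward zero.
fn quantizedBytes(f32_bytes: u64, compression_ratio: f64) u64 {
    const scaled = @as(f64, @floatFromInt(f32_bytes)) / compression_ratio;
    return @intFromFloat(scaled);
}
```

For 6,738,415,616 parameters the f32 footprint is 26,953,662,464 bytes, and a compression ratio of 4.0 brings it down to 6,738,415,616 bytes.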
Error Types¶
error{InvalidConfig} -- returned by validate.
Usage Example¶
const std = @import("std");
const cfg = @import("zigllama").models.config;

pub fn main() !void {
    // Get a pre-built LLaMA-7B config
    const config = cfg.ModelConfig.llama(.LLaMA_7B);
    // Propagates error.InvalidConfig if the preset is inconsistent
    try config.validate();
    const params = config.parameterCount();
    const mem = config.memoryRequirements();
    std.debug.print("LLaMA-7B: {} parameters, {} bytes\n", .{ params, mem });
    // ~6.7B parameters, ~26.8 GB in f32
}
Related Modules¶
models.llama -- Uses LLaMAConfig (a subset of ModelConfig).
linear_algebra.quantization -- Quantization types referenced by weight_quant.