models.gguf
Module Path
Source file: src/models/gguf.zig
Public Types
GGUFHeader
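The first 24 bytes of every GGUF file. magic must equal GGUF_MAGIC (0x46554747, the ASCII bytes "GGUF" read as a little-endian u32) and version must be 3; any other value is rejected with InvalidMagic or UnsupportedVersion.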
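The header struct itself is not reproduced on this page, so the following is only a sketch of how a 24-byte header could be decoded and validated. The field names tensor_count and metadata_kv_count follow the GGUF specification and are assumptions about this module; HeaderSketch and parseHeader are hypothetical names. It also assumes Zig 0.12 or newer for the std.mem.readInt signature.

```zig
const std = @import("std");

// Value quoted in the text above: the bytes "GGUF" as a little-endian u32.
const GGUF_MAGIC: u32 = 0x46554747;

// Hypothetical layout; the real GGUFHeader in src/models/gguf.zig may
// use different field names.
const HeaderSketch = extern struct {
    magic: u32,
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
};

comptime {
    // 4 + 4 + 8 + 8 bytes, with no padding between the fields.
    std.debug.assert(@sizeOf(HeaderSketch) == 24);
}

// Decode the four little-endian fields, then apply the two checks
// described above.
fn parseHeader(bytes: *const [24]u8) error{ InvalidMagic, UnsupportedVersion }!HeaderSketch {
    const header = HeaderSketch{
        .magic = std.mem.readInt(u32, bytes[0..4], .little),
        .version = std.mem.readInt(u32, bytes[4..8], .little),
        .tensor_count = std.mem.readInt(u64, bytes[8..16], .little),
        .metadata_kv_count = std.mem.readInt(u64, bytes[16..24], .little),
    };
    if (header.magic != GGUF_MAGIC) return error.InvalidMagic;
    if (header.version != 3) return error.UnsupportedVersion;
    return header;
}
```

Decoding field by field keeps the result independent of host endianness, which a direct pointer cast onto the buffer would not.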
GGUFTensorInfo
pub const GGUFTensorInfo = struct {
    name: []const u8,
    n_dims: u32,
    dimensions: [4]u64,
    ggml_type: GGMLType,
    offset: u64,
};
Metadata for a single tensor stored in the GGUF file. The offset is relative to the start of the tensor data section.
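Because offset is relative to the tensor data section, a reader has to add the section's starting position (data_offset in GGUFReader) before seeking. A minimal sketch of that arithmetic, using a trimmed copy of the struct; TensorInfoSketch, elementCount and absoluteOffset are hypothetical helpers, not part of the module:

```zig
const std = @import("std");

// Trimmed copy of the GGUFTensorInfo fields needed for this sketch.
const TensorInfoSketch = struct {
    n_dims: u32,
    dimensions: [4]u64,
    offset: u64,
};

// Number of elements: the product of the first n_dims dimensions.
fn elementCount(info: TensorInfoSketch) u64 {
    var count: u64 = 1;
    for (info.dimensions[0..info.n_dims]) |dim| count *= dim;
    return count;
}

// Position of the tensor's first byte, measured from the start of the file.
fn absoluteOffset(data_offset: u64, info: TensorInfoSketch) u64 {
    return data_offset + info.offset;
}
```

Only the first n_dims entries of dimensions are multiplied, so the sketch does not depend on how unused entries are filled.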
GGUFValue
pub const GGUFValue = union(GGUFValueType) {
    UINT8: u8,
    INT8: i8,
    UINT32: u32,
    INT32: i32,
    FLOAT32: f32,
    STRING: []const u8,
    ARRAY: []GGUFValue,
    BOOL: bool,
    // ...
};
Tagged union representing a single metadata value in the GGUF key-value store.
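Callers typically unwrap a tagged union like this with a switch on the active tag. The sketch below uses a reduced stand-in union (ValueSketch, with only four of the variants) so that it compiles on its own; asU32 and asString are hypothetical convenience helpers, not functions of this module.

```zig
const std = @import("std");

// Reduced stand-in for GGUFValueType / GGUFValue with four variants.
const ValueTypeSketch = enum { UINT32, FLOAT32, STRING, BOOL };

const ValueSketch = union(ValueTypeSketch) {
    UINT32: u32,
    FLOAT32: f32,
    STRING: []const u8,
    BOOL: bool,
};

// Return the payload if the value holds a UINT32, otherwise null.
fn asU32(value: ValueSketch) ?u32 {
    return switch (value) {
        .UINT32 => |v| v,
        else => null,
    };
}

// Return the payload if the value holds a STRING, otherwise null.
fn asString(value: ValueSketch) ?[]const u8 {
    return switch (value) {
        .STRING => |s| s,
        else => null,
    };
}
```

Returning an optional lets the caller decide how to treat a key whose stored type differs from the expected one.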
GGUFReader
pub const GGUFReader = struct {
    file: std.fs.File,
    header: GGUFHeader,
    metadata: std.StringHashMap(GGUFValue),
    tensor_infos: std.StringHashMap(GGUFTensorInfo),
    data_offset: u64,
    allocator: std.mem.Allocator,
};
Stateful reader that opens a GGUF file, parses the header and metadata, and provides random access to individual tensors.
Public Functions
GGUFReader.open
Open a GGUF file and parse the header. Does not read tensor data immediately -- tensors are loaded on demand.
GGUFReader.close
Close the underlying file and free metadata.
GGUFReader.readHeader
Parse and validate the file header. Called automatically by open.
GGUFReader.readMetadata
Read all key-value pairs from the metadata section into self.metadata.
GGUFReader.readTensorInfo
Read tensor descriptors into self.tensor_infos. After this call, tensor names and shapes are available without loading the actual data.
GGUFReader.loadTensor
pub fn loadTensor(
    self: *GGUFReader,
    name: []const u8,
    allocator: std.mem.Allocator,
) !Tensor(f32)
Read and dequantize a named tensor from the file. The returned Tensor(f32) is always in full precision regardless of the on-disk quantization format.
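To illustrate what dequantizing to f32 involves, here is a sketch for one block of ggml's Q4_0 format, the format suggested by the file name in the usage example. It assumes the usual Q4_0 layout: a little-endian f16 scale followed by 16 bytes holding 32 four-bit values, with low nibbles mapping to the first half of the block and high nibbles to the second half. dequantizeQ4_0Block is a hypothetical name, and the module's actual implementation may be organized differently. Zig 0.12 or newer is assumed.

```zig
const std = @import("std");

const QK4_0 = 32; // values per Q4_0 block
const BLOCK_BYTES = 2 + QK4_0 / 2; // f16 scale + 16 bytes of packed nibbles

// Expand one Q4_0 block into 32 f32 values: value = (nibble - 8) * scale.
fn dequantizeQ4_0Block(block: *const [BLOCK_BYTES]u8, out: *[QK4_0]f32) void {
    const scale_bits = std.mem.readInt(u16, block[0..2], .little);
    const scale: f32 = @as(f16, @bitCast(scale_bits));
    const packed_values: []const u8 = block[2..];
    for (packed_values, 0..) |byte, j| {
        const lo: i32 = @as(i32, byte & 0x0F) - 8;
        const hi: i32 = @as(i32, byte >> 4) - 8;
        // Low nibbles fill the first half of the block, high nibbles the second.
        out[j] = @as(f32, @floatFromInt(lo)) * scale;
        out[j + QK4_0 / 2] = @as(f32, @floatFromInt(hi)) * scale;
    }
}
```

A full tensor would be decoded by applying this to each consecutive 18-byte block of the tensor's data.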
GGUFReader.findTensor
Look up tensor metadata by name without loading data. Returns null if the tensor is not present in the file.
Error Types
error{InvalidMagic} -- file does not start with the GGUF magic number.
error{UnsupportedVersion} -- file version is not 3.
error{TensorNotFound} -- requested tensor name is not in the file.
error{UnsupportedQuantType} -- tensor uses a quantization format not yet implemented.
std.fs.File.OpenError
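One way a caller might react to the reader-specific errors is an exhaustive switch, sketched here. ReaderError, describe and loadOrFail are hypothetical names introduced only for this example; the module may not expose a combined error set, and std.fs.File.OpenError is left out for brevity.

```zig
const std = @import("std");

// The reader-specific errors listed above, gathered into one set for this sketch.
const ReaderError = error{
    InvalidMagic,
    UnsupportedVersion,
    TensorNotFound,
    UnsupportedQuantType,
};

// Map each error to a message a command-line tool might print.
fn describe(err: ReaderError) []const u8 {
    return switch (err) {
        error.InvalidMagic => "not a GGUF file",
        error.UnsupportedVersion => "GGUF version other than 3",
        error.TensorNotFound => "tensor name not present in the file",
        error.UnsupportedQuantType => "quantization format not implemented",
    };
}

// Hypothetical stand-in for a loader call, used only to exercise the switch.
fn loadOrFail(found: bool) ReaderError!void {
    if (!found) return error.TensorNotFound;
}

pub fn main() void {
    loadOrFail(false) catch |err| {
        std.debug.print("load failed: {s}\n", .{describe(err)});
    };
}
```

Because the switch has no else branch, the compiler flags any error added to the set that is not yet handled.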
Usage Example
const std = @import("std");
const gguf = @import("zigllama").models.gguf;

// `allocator` is any std.mem.Allocator supplied by the caller.
var reader = try gguf.GGUFReader.open("llama-7b.Q4_0.gguf", allocator);
defer reader.close();

// Parse the key-value metadata and the tensor descriptors
try reader.readMetadata();
try reader.readTensorInfo();

// Inspect a tensor's shape and type without loading its data
if (reader.findTensor("blk.0.attn_q.weight")) |info| {
    std.debug.print("Q weight: {}x{}, type={}\n", .{
        info.dimensions[0], info.dimensions[1], info.ggml_type,
    });
}

// Load and dequantize a specific weight tensor
var q_weight = try reader.loadTensor("blk.0.attn_q.weight", allocator);
defer q_weight.deinit();
Related Modules
foundation.gguf_format -- Constants (GGUF_MAGIC, GGMLType) used by the reader.
foundation.memory_mapping -- Optional memory-mapped access for large files.
models.llama -- Load GGUF weights into a LLaMAModel.