foundation.threading
Module Path
Source file: src/foundation/threading.zig
Internal module. This API may change between releases.
Public Types
ThreadPoolConfig
pub const ThreadPoolConfig = struct {
    num_threads: ?usize = null, // null = auto-detect CPU count
    stack_size: usize = 8 * 1024 * 1024,
    enable_work_stealing: bool = true,
    numa_aware: bool = false,
    pin_threads: bool = false,
};
Configuration for the thread pool. When num_threads is null, the pool sizes itself to the number of available hardware threads.
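The null-means-auto-detect rule can be illustrated with a minimal sketch. The helper `resolveThreadCount` is hypothetical and not part of the documented API, and falling back to one thread when detection fails is an assumption made here; the actual pool may handle that case differently.

```zig
const std = @import("std");

// Local copy of the documented struct so the sketch compiles on its own.
const ThreadPoolConfig = struct {
    num_threads: ?usize = null,
    stack_size: usize = 8 * 1024 * 1024,
    enable_work_stealing: bool = true,
    numa_aware: bool = false,
    pin_threads: bool = false,
};

// Hypothetical helper (not part of the documented API).
// Assumption: if CPU detection fails, fall back to a single thread.
fn resolveThreadCount(config: ThreadPoolConfig) usize {
    return config.num_threads orelse (std.Thread.getCpuCount() catch 1);
}

test "explicit thread count overrides auto-detection" {
    try std.testing.expectEqual(@as(usize, 4), resolveThreadCount(.{ .num_threads = 4 }));
    try std.testing.expect(resolveThreadCount(.{}) >= 1);
}
```

An explicit `num_threads` is useful when the pool shares the machine with other thread-heavy components, such as a BLAS backend that spawns its own workers.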
ThreadPool
pub const ThreadPool = struct {
    workers: []Worker,
    config: ThreadPoolConfig,
    running: std.atomic.Value(bool),
};
Fixed-size pool of worker threads with optional work-stealing scheduling.
Worker
Internal per-thread state. Each worker owns a local work queue and can steal from siblings.
WorkStealingQueue
Lock-free deque used by each Worker for local task storage. Other workers can steal from the tail when their own queues are empty.
WorkItem
pub const WorkItem = struct {
    func: *const fn (*anyopaque) void,
    context: *anyopaque,
    done: std.atomic.Value(bool),
};
A unit of work submitted to the pool.
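Because `func` takes a type-erased `*anyopaque`, callers typically wrap their typed state in a small struct and cast it back inside the callback. The sketch below shows one way this could look. `SumJob` and `execute` are hypothetical names introduced for illustration, and the release/acquire ordering on `done` is an assumption about how the pool publishes results.

```zig
const std = @import("std");

// Local copy of the documented struct so the sketch compiles on its own.
const WorkItem = struct {
    func: *const fn (*anyopaque) void,
    context: *anyopaque,
    done: std.atomic.Value(bool),
};

// Hypothetical job type: sums a slice and stores the result in itself.
const SumJob = struct {
    values: []const f32,
    total: f32 = 0,

    fn run(ctx: *anyopaque) void {
        // Recover the typed pointer from the type-erased context.
        const self: *SumJob = @ptrCast(@alignCast(ctx));
        var acc: f32 = 0;
        for (self.values) |v| acc += v;
        self.total = acc;
    }
};

// Hypothetical stand-in for what a worker might do with an item:
// run the callback, then publish completion with release ordering.
fn execute(item: *WorkItem) void {
    item.func(item.context);
    item.done.store(true, .release);
}

test "work item runs and flags completion" {
    const data = [_]f32{ 1, 2, 3 };
    var job = SumJob{ .values = &data };
    var item = WorkItem{
        .func = &SumJob.run,
        .context = &job,
        .done = std.atomic.Value(bool).init(false),
    };
    execute(&item);
    // An acquire load pairs with the release store above.
    try std.testing.expect(item.done.load(.acquire));
    try std.testing.expectEqual(@as(f32, 6.0), job.total);
}
```

The context must outlive the work item, since the pool only holds a raw pointer to it.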
ParallelOps
High-level parallel primitives that partition tensor operations across the pool.
NumaAllocator
Allocator that binds allocations to a specific NUMA node. Falls back to the system allocator on non-NUMA hardware.
NumaPolicy
Memory placement policy for NUMA-aware allocations.
CpuTopology
pub const CpuTopology = struct {
    num_cores: usize,
    num_threads: usize,
    num_numa_nodes: usize,
    cache_line_size: usize,
};
Hardware topology detected at runtime via /sys/devices or cpuid.
Public Functions
ThreadPool.init
Create and start the thread pool. Workers begin in an idle spin-wait state.
ThreadPool.deinit
Signal all workers to exit and join their threads.
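The signal-then-join shutdown described above can be sketched with an atomic flag and `std.Thread`. `MiniPool` is a hypothetical, heavily simplified stand-in for `ThreadPool`: it has a fixed worker count and no work queues, and its workers only spin on the flag. It illustrates the shutdown pattern, not the module's actual implementation.

```zig
const std = @import("std");

// Hypothetical two-worker pool used only to illustrate shutdown.
const MiniPool = struct {
    threads: [2]std.Thread = undefined,
    running: std.atomic.Value(bool) = std.atomic.Value(bool).init(false),

    fn start(self: *MiniPool) !void {
        self.running.store(true, .release);
        var spawned: usize = 0;
        // If a later spawn fails, stop and join the workers already started.
        errdefer {
            self.running.store(false, .release);
            for (self.threads[0..spawned]) |t| t.join();
        }
        for (&self.threads) |*t| {
            t.* = try std.Thread.spawn(.{}, workerLoop, .{self});
            spawned += 1;
        }
    }

    fn workerLoop(self: *MiniPool) void {
        // Idle spin-wait until the pool is told to stop.
        while (self.running.load(.acquire)) {
            std.atomic.spinLoopHint();
        }
    }

    fn stop(self: *MiniPool) void {
        // Signal first, then join, so no worker is left spinning.
        self.running.store(false, .release);
        for (self.threads) |t| t.join();
    }
};

test "workers exit once the running flag is cleared" {
    var pool = MiniPool{};
    try pool.start();
    try std.testing.expect(pool.running.load(.acquire));
    pool.stop();
    try std.testing.expect(!pool.running.load(.acquire));
}
```

The workers hold a pointer to the pool, so the pool value must not be moved or copied between `start` and `stop`.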
ThreadPool.submit
Enqueue a work item. Returns a handle whose done field can be polled or awaited.
ParallelOps.matmul
Parallel matrix multiplication that partitions the rows of the left operand a into tiles and distributes them across workers.
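Row tiling works because each output row depends on only one row of the left operand, so workers never write to the same memory. The sketch below illustrates this with plain `std.Thread` rather than the pool. `Mat`, `matmulRows`, and `parallelMatmul` are hypothetical names, and the row-major layout and ceiling-division chunking are assumptions, not the module's documented behaviour.

```zig
const std = @import("std");

// Hypothetical row-major matrix view.
const Mat = struct {
    data: []f32,
    rows: usize,
    cols: usize,
};

// Computes rows [row_start, row_end) of c = a * b.
fn matmulRows(a: Mat, b: Mat, c: Mat, row_start: usize, row_end: usize) void {
    var i = row_start;
    while (i < row_end) : (i += 1) {
        var j: usize = 0;
        while (j < b.cols) : (j += 1) {
            var acc: f32 = 0;
            var k: usize = 0;
            while (k < a.cols) : (k += 1) {
                acc += a.data[i * a.cols + k] * b.data[k * b.cols + j];
            }
            c.data[i * c.cols + j] = acc;
        }
    }
}

// Splits the rows of `a` into contiguous chunks, one per worker thread.
// Assumes num_workers >= 1.
fn parallelMatmul(comptime num_workers: usize, a: Mat, b: Mat, c: Mat) !void {
    std.debug.assert(a.cols == b.rows and c.rows == a.rows and c.cols == b.cols);
    var threads: [num_workers]std.Thread = undefined;
    var spawned: usize = 0;
    // Join every spawned thread before returning, even on error.
    defer {
        for (threads[0..spawned]) |t| t.join();
    }
    const chunk = (a.rows + num_workers - 1) / num_workers; // ceiling division
    for (0..num_workers) |w| {
        const start = @min(w * chunk, a.rows);
        const end = @min(start + chunk, a.rows);
        threads[w] = try std.Thread.spawn(.{}, matmulRows, .{ a, b, c, start, end });
        spawned += 1;
    }
}

test "row-tiled matmul matches the expected product" {
    var a_data = [_]f32{ 1, 2, 3, 4, 5, 6 }; // 3x2
    var b_data = [_]f32{ 1, 2, 3, 4 }; // 2x2
    var c_data = [_]f32{0} ** 6; // 3x2
    const a = Mat{ .data = &a_data, .rows = 3, .cols = 2 };
    const b = Mat{ .data = &b_data, .rows = 2, .cols = 2 };
    const c = Mat{ .data = &c_data, .rows = 3, .cols = 2 };
    try parallelMatmul(2, a, b, c);
    const expected = [_]f32{ 7, 10, 15, 22, 23, 34 };
    try std.testing.expectEqualSlices(f32, &expected, &c_data);
}
```

When there are more workers than rows, the trailing workers receive an empty range and return immediately.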
ParallelOps.softmax
Row-parallel softmax. Each worker processes a contiguous slice of rows.
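One possible way to assign contiguous row slices is sketched below, again using plain `std.Thread` in place of the pool. `RowRange`, `rowRange`, and `softmaxRows` are hypothetical names. Giving the first few workers one extra row when the count does not divide evenly is an assumption, as is the max-subtraction used for numerical stability.

```zig
const std = @import("std");

const RowRange = struct { start: usize, end: usize };

// Hypothetical helper: near-equal contiguous ranges; the first
// `num_rows % num_workers` workers each take one extra row.
fn rowRange(num_rows: usize, num_workers: usize, worker: usize) RowRange {
    const base = num_rows / num_workers;
    const rem = num_rows % num_workers;
    const start = worker * base + @min(worker, rem);
    const len = if (worker < rem) base + 1 else base;
    return .{ .start = start, .end = start + len };
}

// In-place softmax over rows [range.start, range.end) of a row-major
// matrix. Assumes cols > 0. Subtracting the row maximum avoids overflow.
fn softmaxRows(data: []f32, cols: usize, range: RowRange) void {
    var r = range.start;
    while (r < range.end) : (r += 1) {
        const row = data[r * cols ..][0..cols];
        var max = row[0];
        for (row) |v| max = @max(max, v);
        var sum: f32 = 0;
        for (row) |*v| {
            v.* = @exp(v.* - max);
            sum += v.*;
        }
        for (row) |*v| v.* /= sum;
    }
}

test "each row sums to one after row-parallel softmax" {
    const rows: usize = 5;
    const cols: usize = 3;
    const workers = 2;
    var data = [_]f32{ 1, 2, 3, 0, 0, 0, -1, 0, 1, 5, 5, 5, 2, 4, 8 };
    const matrix: []f32 = &data;

    var threads: [workers]std.Thread = undefined;
    for (&threads, 0..) |*t, w| {
        t.* = try std.Thread.spawn(.{}, softmaxRows, .{ matrix, cols, rowRange(rows, workers, w) });
    }
    for (threads) |t| t.join();

    var r: usize = 0;
    while (r < rows) : (r += 1) {
        var sum: f32 = 0;
        for (matrix[r * cols ..][0..cols]) |v| sum += v;
        try std.testing.expectApproxEqAbs(@as(f32, 1.0), sum, 1e-5);
    }
}
```

Contiguous slices keep each worker's writes on separate rows, so no synchronization is needed beyond the final join.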
Error Types
ThreadPool.init can return error{SystemResources, OutOfMemory}. submit returns error{QueueFull} when the work-stealing queue is at capacity.
Usage Example
const threading = @import("zigllama").foundation.threading;

// `weights`, `input`, and `allocator` are assumed to be defined by the caller.
var pool = try threading.ThreadPool.init(.{
    .num_threads = 4,
    .enable_work_stealing = true,
});
defer pool.deinit(); // signals workers to exit and joins them

const ops = threading.ParallelOps{ .pool = &pool };
var result = try ops.matmul(weights, input, allocator);
defer result.deinit();
Related Modules
foundation.blas_integration -- BLAS backends that may use their own threading.
linear_algebra.matrix_ops -- SIMD matrix ops that can leverage ParallelOps.