Installation¶
This page covers every step required to build and test ZigLlama on a fresh machine. By the end you will have a working checkout with all 285+ tests passing.
Prerequisites¶
Zig 0.13+¶
ZigLlama targets Zig 0.13.0 or later. Zig is distributed as a single static binary -- no installer, no system packages, no dependency chains.
Verify the installation by running zig version, which should report 0.13.0 or later.
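A slightly stricter check compares the string printed by `zig version` against the 0.13 minimum. The sketch below does that with a helper named `zig_version_ok`; the helper is hypothetical and not part of ZigLlama.

```shell
# Hypothetical helper: succeeds when a Zig version string is 0.13.0 or newer.
zig_version_ok() {
    ver="${1%%-*}"        # drop any "-dev..." suffix, e.g. 0.14.0-dev.1 -> 0.14.0
    major="${ver%%.*}"    # text before the first dot
    rest="${ver#*.}"      # text after the first dot
    minor="${rest%%.*}"   # text between the first and second dots
    [ "$major" -gt 0 ] || [ "$minor" -ge 13 ]
}

if ! command -v zig >/dev/null 2>&1; then
    echo "zig not found on PATH"
elif zig_version_ok "$(zig version)"; then
    echo "Zig $(zig version) is new enough"
else
    echo "Zig $(zig version) is too old; ZigLlama needs 0.13.0 or later"
fi
```

Development builds such as 0.14.0-dev are compared on their numeric prefix only.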
Version compatibility
ZigLlama uses language features stabilised in 0.13. Earlier releases (0.11, 0.12) will fail to compile. If your distribution ships an older Zig, use the upstream tarball instead of the package-manager version.
Hardware Requirements¶
| Component | Minimum | Recommended | Notes |
|---|---|---|---|
| CPU | Any x86-64 or AArch64 | AVX2-capable x86-64 | SIMD kernels auto-detect; scalar fallback always available |
| RAM | 4 GB | 16 GB | Running tests requires < 1 GB; loading a 7B model in Q4 needs ~4 GB |
| Disk | 100 MB (source) | 10 GB+ | Space for downloaded GGUF model files |
Optional Tools¶
| Tool | Purpose | Install |
|---|---|---|
| git | Clone the repository | System package manager |
| python 3.9+ | Build MkDocs documentation locally | apt install python3 / brew install python |
| perf (Linux) | CPU profiling of benchmarks | apt install linux-tools-common linux-tools-generic |
Clone the Repository¶
The checkout is self-contained. There are no Git submodules, no vendored C libraries, and no code-generation steps.
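A minimal sketch of the clone step follows. `REPO_URL` is a placeholder, not the project's real address; substitute the clone URL from the project page. The directory name `zigllama` matches the one used in the troubleshooting notes on this page.

```shell
# REPO_URL is a placeholder; export it with the real clone URL before running.
REPO_URL="${REPO_URL:-}"

if [ -z "$REPO_URL" ]; then
    echo "Set REPO_URL to the ZigLlama repository URL, then re-run"
else
    # Clone into ./zigllama and enter the checkout.
    git clone "$REPO_URL" zigllama && cd zigllama
fi
```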
Verify the Installation¶
Run the full test suite with zig build test to confirm everything works.
Expected output
On a clean checkout the command should exit with status 0 and report 285+ tests passing. The first build compiles the entire project from scratch; subsequent runs use the Zig build cache and finish in seconds.
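Since exit status 0 is the success signal, it can be worth checking it explicitly. The sketch below wraps `zig build test` in a hypothetical `run_step` helper that reports the status; the helper name is ours, not part of the project.

```shell
# Hypothetical helper: run a command and report whether it exited with status 0.
run_step() {
    if "$@"; then
        echo "PASS: $*"
    else
        status=$?
        echo "FAIL (exit $status): $*"
        return "$status"
    fi
}

# Only invoke the build when zig is installed and build.zig is in this directory.
if command -v zig >/dev/null 2>&1 && [ -f build.zig ]; then
    run_step zig build test
fi
```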
You can also run tests for individual layers to isolate any problems:
# Foundation layer only (tensors, memory management)
zig build test-foundation
# Linear algebra layer only (SIMD ops, quantisation)
zig build test-linear-algebra
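When the full suite fails, running the per-layer steps from the lowest layer upward narrows down where the problem starts. The sketch below uses only the two step names shown above (other layers may have their own steps) and a hypothetical `first_failing` helper.

```shell
# Hypothetical helper: run "zig build <step>" for each step in order and print
# the first step that fails, or "none". Set ZIG to override the zig binary.
first_failing() {
    for step in "$@"; do
        if ! "${ZIG:-zig}" build "$step" >/dev/null 2>&1; then
            echo "$step"
            return 1
        fi
    done
    echo "none"
}

# Only invoke the build when zig is installed and build.zig is in this directory.
if command -v zig >/dev/null 2>&1 && [ -f build.zig ]; then
    first_failing test-foundation test-linear-algebra
fi
```

The helper discards build output to keep the report to one line; re-run the failing step directly to see the compiler and test messages.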
If all tests pass, your installation is complete. Jump to the Quick Start to run your first inference demo.
Optional: BLAS Libraries¶
ZigLlama includes a pure-Zig matrix multiplication kernel that works everywhere. For maximum throughput on large models, you can optionally link against an optimised BLAS library.
When do you need BLAS?
For educational exploration and running the test suite, the built-in kernels are more than sufficient. BLAS integration matters when you are benchmarking against llama.cpp or loading full-size models (7B+) and need production-grade matrix throughput.
OpenBLAS¶
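OpenBLAS is packaged by most distributions and by Homebrew. The sketch below maps a package manager to the usual development-package name. The `openblas_install_cmd` helper is hypothetical and only prints the command; how ZigLlama is then told to link against the library depends on its build configuration and is not assumed here.

```shell
# Hypothetical helper: print the usual OpenBLAS install command for a package manager.
openblas_install_cmd() {
    case "$1" in
        apt)    echo "sudo apt install libopenblas-dev" ;;
        dnf)    echo "sudo dnf install openblas-devel" ;;
        pacman) echo "sudo pacman -S openblas" ;;
        brew)   echo "brew install openblas" ;;
        *)      echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}

# Print the command for the first package manager found on this machine.
for pm in apt dnf pacman brew; do
    if command -v "$pm" >/dev/null 2>&1; then
        openblas_install_cmd "$pm"
        break
    fi
done
```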
Intel MKL¶
Intel oneAPI Math Kernel Library (MKL) is typically the fastest BLAS implementation on Intel CPUs.
# Install via the Intel package repository (Debian/Ubuntu)
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] \
https://apt.repos.intel.com/oneapi all main" \
| sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update && sudo apt install intel-oneapi-mkl-devel
source /opt/intel/oneapi/setvars.sh
Apple Accelerate¶
On macOS, the Accelerate framework is pre-installed. No additional configuration is needed; ZigLlama's BLAS integration layer detects it automatically at build time.
# Verify that the macOS SDK, which ships Accelerate, is installed
xcrun --show-sdk-path
# Should print something like /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
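`xcrun --show-sdk-path` confirms that an SDK is installed. To check for the framework itself, look for its directory inside that SDK. A sketch, using a hypothetical `has_accelerate` helper:

```shell
# Hypothetical helper: succeeds when the given SDK root contains Accelerate.framework.
has_accelerate() {
    [ -d "$1/System/Library/Frameworks/Accelerate.framework" ]
}

if command -v xcrun >/dev/null 2>&1; then
    sdk="$(xcrun --show-sdk-path)"
    if has_accelerate "$sdk"; then
        echo "Accelerate found in $sdk"
    else
        echo "Accelerate not found in $sdk"
    fi
else
    echo "xcrun not available (not macOS, or developer tools missing)"
fi
```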
Platform Support Matrix¶
| Platform | Architecture | Status | SIMD | BLAS |
|---|---|---|---|---|
| Linux | x86-64 | Fully supported | AVX, AVX2 | OpenBLAS, MKL |
| Linux | AArch64 | Fully supported | NEON | OpenBLAS |
| macOS | Apple Silicon | Fully supported | NEON | Accelerate |
| macOS | x86-64 (Intel) | Fully supported | AVX, AVX2 | OpenBLAS, MKL, Accelerate |
| Windows | x86-64 | Supported (community) | AVX, AVX2 | OpenBLAS |
| FreeBSD | x86-64 | Expected to work | AVX, AVX2 | OpenBLAS |
Cross-compilation
Zig's cross-compilation support means you can build a Linux AArch64 binary on your x86-64 workstation:
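Assuming ZigLlama's `build.zig` exposes Zig's standard target option (the usual `-Dtarget` flag; this is an assumption about the build script, not something stated on this page), the invocation would look like the sketch below. `aarch64-linux-gnu` is Zig's target triple for 64-bit ARM Linux with glibc.

```shell
# Target triple for 64-bit ARM Linux with glibc.
TARGET="aarch64-linux-gnu"

# Build only when zig is installed and build.zig is in this directory;
# otherwise just show the command that would be run.
if command -v zig >/dev/null 2>&1 && [ -f build.zig ]; then
    zig build -Dtarget="$TARGET"
else
    echo "would run: zig build -Dtarget=$TARGET"
fi
```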
See Building from Source for the full cross-compilation guide.
Troubleshooting¶
error: expected expression or parser failures¶
Cause: Zig version too old.
Fix: Upgrade to Zig 0.13.0 or later. Check with zig version.
error: FileNotFound when running tests¶
Cause: Working directory is not the repository root.
Fix: Make sure you cd zigllama before running zig build test.
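A quick way to confirm the working directory is to look for `build.zig`, which sits at the root of any project driven by `zig build`. A sketch, with a hypothetical `in_repo_root` helper:

```shell
# Hypothetical helper: succeeds when the given directory (default: the current
# one) contains build.zig.
in_repo_root() {
    [ -f "${1:-.}/build.zig" ]
}

if in_repo_root; then
    echo "OK: build.zig found in $(pwd)"
else
    echo "No build.zig in $(pwd); cd into the zigllama checkout first"
fi
```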
BLAS not detected at build time¶
Cause: The BLAS shared library is not on the default library search path.
Fix: Set LIBRARY_PATH (compile time) and LD_LIBRARY_PATH (run time) to the directory containing libopenblas.so or libmkl_rt.so.
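# Prepend the directory without leaving an empty entry when the variable is unset
export LIBRARY_PATH="/opt/openblas/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
export LD_LIBRARY_PATH="/opt/openblas/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"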
Out-of-memory during large model loading¶
Cause: The model file exceeds available RAM.
Fix: Use a quantised model (Q4_K or smaller), and check that memory mapping is in effect. ZigLlama's GGUF loader uses mmap by default on Linux and macOS, so the operating system pages in only the portions of the model file that are actively needed.
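A rough size estimate helps decide whether a model will fit in RAM: size in bytes is approximately the parameter count times the bits per weight, divided by 8. The sketch below uses 4.5 bits per weight, the figure for the GGUF Q4_0 format (other Q4 variants differ slightly). `model_mb` is a hypothetical helper.

```shell
# Hypothetical helper: estimate model size in MB (10^6 bytes).
#   $1 = parameter count in millions
#   $2 = bits per weight, times ten (45 means 4.5 bits)
model_mb() {
    echo $(( $1 * $2 / 80 ))
}

model_mb 7000 45    # 7B parameters at 4.5 bits per weight -> prints 3937
```

That is roughly 3.9 GB, consistent with the ~4 GB figure given for a 7B Q4 model in the hardware table.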
Slow SIMD performance¶
Cause: The CPU does not support AVX2, or the build was compiled in Debug mode.
Fix: Verify CPU support with:
# Linux
grep -o 'avx2' /proc/cpuinfo | head -1
# macOS (Intel only; AVX2 is reported under leaf7_features, not features)
sysctl -n machdep.cpu.leaf7_features | grep -o AVX2
If AVX2 is available, rebuild in ReleaseFast mode:
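Assuming the build script exposes Zig's standard optimisation option (`-Doptimize`; this is an assumption about ZigLlama's `build.zig`), the rebuild command is the one printed by the sketch below. The hypothetical `cpu_has_avx2` helper repeats the Linux check against a cpuinfo-style file.

```shell
# Hypothetical helper: succeeds when a cpuinfo-style file lists the avx2 flag.
cpu_has_avx2() {
    grep -qw avx2 "${1:-/proc/cpuinfo}" 2>/dev/null
}

if cpu_has_avx2; then
    echo "AVX2 detected; rebuild with: zig build -Doptimize=ReleaseFast"
else
    echo "AVX2 not detected via /proc/cpuinfo"
fi
```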
See Building from Source for guidance on build modes.
Tests pass but examples fail to run¶
Cause: Examples are standalone Zig files that import from src/. They must be run from the repository root.
Fix: Change to the repository root (cd zigllama) and launch the example from there.
Next Steps¶
- Quick Start Guide -- run your first inference.
- Building from Source -- customise the build for your platform.
- Project Structure -- understand the codebase layout.