Installation

This guide covers installing Llamafu and setting up your first model.

Prerequisites

Flutter SDK 3.16 or later
Dart SDK 3.2 or later
A GGUF model file

Adding the Dependency

Add Llamafu to your pubspec.yaml:

dependencies:
  llamafu: ^1.0.0

Then run:

flutter pub get

Platform-Specific Setup

Android

Add the following to your android/app/build.gradle:

android {
    defaultConfig {
        minSdkVersion 21  // Minimum API level
        ndk {
            abiFilters 'arm64-v8a', 'armeabi-v7a', 'x86_64'
        }
    }
}

Memory Considerations

Quantized models (Q4, Q8) are recommended for Android to reduce memory usage.
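
To make the memory trade-off concrete, the sketch below estimates the size of the quantized weights from a parameter count. It assumes the plain llama.cpp Q4_0 and Q8_0 layouts (18 and 34 bytes per block of 32 weights); K-quant variants differ somewhat, and runtime memory also includes the context cache and scratch buffers, so treat the result as a lower bound. The function and constant names are illustrative and are not part of Llamafu.

```dart
/// Approximate bits per weight, assuming the llama.cpp Q4_0 and Q8_0 layouts:
/// Q4_0 packs 32 weights into 18 bytes, Q8_0 packs 32 weights into 34 bytes.
const double q4BitsPerWeight = 4.5;
const double q8BitsPerWeight = 8.5;

/// Rough size in bytes of the quantized weights alone (no cache or buffers).
double estimateWeightBytes(int parameterCount, double bitsPerWeight) {
  return parameterCount * bitsPerWeight / 8;
}

/// Formats a byte count as whole decimal megabytes.
String formatMegabytes(double bytes) =>
    '${(bytes / 1e6).toStringAsFixed(0)} MB';
```

For a 135M-parameter model, `formatMegabytes(estimateWeightBytes(135000000, q8BitsPerWeight))` gives `143 MB`, while the Q4 figure is `76 MB`, roughly half.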

iOS

Metal GPU acceleration is enabled automatically on supported devices (A7 chip or later), so no additional configuration is normally required.

If your ios/Podfile does not already set a minimum deployment target, add:

platform :ios, '12.0'

macOS

Add the following entitlement to both macos/Runner/DebugProfile.entitlements and macos/Runner/Release.entitlements to allow read access to files the user selects:

<key>com.apple.security.files.user-selected.read-only</key>
<true/>

Linux

Install build dependencies:

# Ubuntu/Debian
sudo apt-get install build-essential cmake

# Fedora
sudo dnf install gcc-c++ cmake

Windows

Ensure you have Visual Studio 2019 or later with C++ build tools installed.

Obtaining Models

Recommended Sources

Hugging Face - Largest collection of GGUF models
TheBloke's Models
ggml-org Official

Model Recommendations by Use Case:

Use Case	Recommended Model	Approx. Size
Mobile Chat	SmolLM-135M-Q8	~150 MB
Desktop Chat	Llama-3.2-1B-Q4	~800 MB
Vision (Mobile)	nanoLLaVA	~2 GB
Vision (Desktop)	LLaVA-1.5-7B-Q4	~4 GB

Downloading a Model

# Example: Download SmolLM for testing
wget https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct-GGUF/resolve/main/smollm-135m-instruct-q8_0.gguf

Verifying Installation

Create a simple test to verify everything works:

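import 'package:llamafu/llamafu.dart';

Future<void> main() async {
  try {
    // Load the model; replace the path with the location of your GGUF file.
    final llamafu = await Llamafu.init(
      modelPath: 'path/to/your/model.gguf',
      contextSize: 512, // A small context keeps memory usage low for this check
    );

    print('Model loaded successfully!');
    print('Vocab size: ${llamafu.vocabSize}');
    print('Context size: ${llamafu.contextSize}');

    // Release the resources held by the model once the check is done.
    llamafu.dispose();
  } catch (e) {
    // Initialization failures are reported here; see Troubleshooting.
    print('Error: $e');
  }
}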

Troubleshooting

"Model file not found"

Ensure the model path is correct and the file exists. On mobile, models should be in the app's documents directory or bundled as assets.
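
A quick way to rule out path problems is to inspect the file before handing it to Llamafu. The sketch below uses only `dart:io`; the helper name `checkModelFile` is hypothetical and not part of the Llamafu API. It reports the resolved absolute path, which often reveals a wrong working directory.

```dart
import 'dart:io';

/// Returns null when [modelPath] points to an existing, non-empty file,
/// otherwise a short description of what is wrong.
String? checkModelFile(String modelPath) {
  final file = File(modelPath);
  if (!file.existsSync()) {
    // existsSync() is also false when the path refers to a directory.
    return 'No file at ${file.absolute.path}';
  }
  if (file.lengthSync() == 0) {
    return 'File is empty (possibly an interrupted download): '
        '${file.absolute.path}';
  }
  return null;
}
```

Calling `checkModelFile(path)` before initialization and printing any non-null result makes it easier to tell a missing file apart from a load failure.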

"Out of memory"

Try a smaller quantization (Q4 instead of Q8) or reduce contextSize.
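
Reducing contextSize can be automated by retrying with progressively smaller values. The following is a minimal sketch, assuming the failure surfaces as a catchable Dart exception; a native out-of-memory condition may instead terminate the process, in which case this approach does not help. `loadWithFallback` is a hypothetical helper, and the loader is passed in as a callback so the sketch makes no assumptions about Llamafu beyond the `Llamafu.init` call shown in Verifying Installation.

```dart
/// Tries [load] with each context size in order and returns the first
/// result that succeeds. If every attempt fails, the last error is rethrown
/// with its original stack trace.
Future<T> loadWithFallback<T>(
  Future<T> Function(int contextSize) load,
  List<int> contextSizes,
) async {
  if (contextSizes.isEmpty) {
    throw ArgumentError.value(
        contextSizes, 'contextSizes', 'must not be empty');
  }
  Object? lastError;
  StackTrace? lastStack;
  for (final size in contextSizes) {
    try {
      return await load(size);
    } catch (error, stack) {
      // Remember the failure and move on to the next, smaller size.
      lastError = error;
      lastStack = stack;
    }
  }
  Error.throwWithStackTrace(lastError!, lastStack!);
}
```

A call might look like `loadWithFallback((size) => Llamafu.init(modelPath: path, contextSize: size), [2048, 1024, 512])`, listing sizes from largest to smallest.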

"Unsupported model format"

Llamafu only supports GGUF format. Convert older GGML models using llama.cpp's conversion tools.
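
GGUF files begin with the four ASCII bytes `GGUF`, so a file's format can be checked before loading it. The sketch below uses only `dart:io`; `hasGgufMagic` is a hypothetical helper name, and the check confirms the header only, not that the rest of the file is intact or that Llamafu supports the model architecture.

```dart
import 'dart:io';

/// Returns true when the file at [path] starts with the ASCII bytes "GGUF".
/// Returns false for missing files and for files shorter than four bytes.
bool hasGgufMagic(String path) {
  final file = File(path);
  if (!file.existsSync()) return false;
  final handle = file.openSync(); // Opens read-only by default.
  try {
    final header = handle.readSync(4);
    return header.length == 4 && String.fromCharCodes(header) == 'GGUF';
  } finally {
    handle.closeSync();
  }
}
```

If this returns false for a file that exists, it is likely an older GGML file or an incomplete download rather than a GGUF model.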

Next Steps

Quick Start Guide - Your first completion
Basic Usage - Core concepts and patterns
Text Generation - Detailed generation options