Installation¶
This guide covers installing Llamafu and setting up your first model.
Prerequisites¶
- Flutter SDK 3.16 or later
- Dart SDK 3.2 or later
- A GGUF model file
Adding the Dependency¶
Add Llamafu to your pubspec.yaml:
Then run:
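In a standard Flutter project, dependencies declared in pubspec.yaml are fetched with `flutter pub get`. The sketch below first checks that a `llamafu:` entry is present and then fetches packages. The helper name `check_llamafu_dependency` is illustrative and not part of Llamafu, and the check assumes the package is declared under the key `llamafu`.

```shell
#!/bin/sh
# Run from the Flutter project root.

# Hypothetical helper: succeeds when the given pubspec lists an indented "llamafu:" entry.
check_llamafu_dependency() {
  grep -Eq '^[[:space:]]+llamafu:' "${1:-pubspec.yaml}" 2>/dev/null
}

if check_llamafu_dependency pubspec.yaml; then
  flutter pub get   # resolve and download the declared packages
else
  echo "llamafu is not listed in pubspec.yaml yet; add it before fetching" >&2
fi
```

If the check fails even though the entry exists, confirm that the command is run from the directory containing pubspec.yaml.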
Platform-Specific Setup¶
Android¶
Add the following to your android/app/build.gradle:
android {
    defaultConfig {
        minSdkVersion 21 // Minimum API level
        ndk {
            abiFilters 'arm64-v8a', 'armeabi-v7a', 'x86_64'
        }
    }
}
Memory Considerations
Quantized models (Q4, Q8) are recommended for Android to reduce memory usage.
iOS¶
No additional configuration is normally required: Metal GPU acceleration is enabled automatically on supported devices (A7 chip or later).
Add to your ios/Podfile if not present:
macOS¶
Add the following entitlements to enable file access:
In both macos/Runner/DebugProfile.entitlements and macos/Runner/Release.entitlements:
Linux¶
Install build dependencies:
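A quick way to see which build tools are absent is sketched below. It assumes Llamafu's native code builds with Flutter's usual Linux desktop toolchain (clang, cmake, ninja, pkg-config); the tool list, the package names in the comment, and the `missing_tools` helper are assumptions rather than something taken from Llamafu's documentation.

```shell
#!/bin/sh
# Print each named tool that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool list: Flutter's standard Linux desktop toolchain.
missing="$(missing_tools clang cmake ninja pkg-config)"
if [ -n "$missing" ]; then
  echo "Missing build tools:" $missing >&2
  # On Debian/Ubuntu these usually come from (package names are an assumption):
  #   sudo apt-get install clang cmake ninja-build pkg-config libgtk-3-dev
else
  echo "All assumed build tools were found."
fi
```

Other distributions ship the same tools under their own package names, so adapt the install command to your package manager.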
Windows¶
Ensure you have Visual Studio 2019 or later with C++ build tools installed.
Obtaining Models¶
Recommended Sources¶
- Hugging Face - Largest collection of GGUF models
- TheBloke's Models
Model Recommendations by Use Case:
| Use Case | Recommended Model | Size |
|---|---|---|
| Mobile Chat | SmolLM-135M-Q8 | ~150MB |
| Desktop Chat | Llama-3.2-1B-Q4 | ~800MB |
| Vision (Mobile) | nanoLLaVA | ~2GB |
| Vision (Desktop) | LLaVA-1.5-7B-Q4 | ~4GB |
Downloading a Model¶
# Example: Download SmolLM for testing
wget https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct-GGUF/resolve/main/smollm-135m-instruct-q8_0.gguf
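An interrupted download or an HTML error page saved under a .gguf name will fail to load later. GGUF files begin with the four ASCII bytes `GGUF`, so a minimal sanity check can look at the start of the file. The helper name `is_gguf` is illustrative, and the check only inspects the magic bytes; it does not validate the rest of the file.

```shell
#!/bin/sh
# Succeeds when the file starts with the 4-byte GGUF magic ("GGUF").
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

model="smollm-135m-instruct-q8_0.gguf"   # file name from the wget example above
if is_gguf "$model"; then
  echo "$model looks like a GGUF file"
else
  echo "$model is missing or does not start with the GGUF magic" >&2
fi
```

The same check is a quick first step when a model is rejected as an unsupported format.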
Verifying Installation¶
Create a simple test to verify everything works:
import 'package:llamafu/llamafu.dart';

void main() async {
  try {
    // Load the model with a small context window to keep memory use low.
    final llamafu = await Llamafu.init(
      modelPath: 'path/to/your/model.gguf',
      contextSize: 512,
    );
    print('Model loaded successfully!');
    print('Vocab size: ${llamafu.vocabSize}');
    print('Context size: ${llamafu.contextSize}');
    // Free the model once finished.
    llamafu.dispose();
  } catch (e) {
    print('Error: $e');
  }
}
Troubleshooting¶
"Model file not found"¶
Ensure the model path is correct and the file exists. On mobile, models should be in the app's documents directory or bundled as assets.
"Out of memory"¶
Try a smaller quantization (Q4 instead of Q8) or reduce contextSize.
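As a rough guide when choosing between quantizations, the size of the model file gives a lower bound on the memory the weights will need. This is an approximation rather than a Llamafu guarantee, and the context buffers come on top of it. The sketch below prints a file's size in MiB; the helper name `file_size_mib` is illustrative.

```shell
#!/bin/sh
# Print a file's size in whole mebibytes, rounded up.
file_size_mib() {
  bytes="$(wc -c < "$1")" || return 1
  echo $(( (bytes + 1048575) / 1048576 ))
}

model="path/to/your/model.gguf"   # placeholder path, as in the test program
if [ -f "$model" ]; then
  echo "Model file is about $(file_size_mib "$model") MiB"
fi
```

Comparing this figure for the Q4 and Q8 files of the same model shows roughly how much memory the smaller quantization saves.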
"Unsupported model format"¶
Llamafu only supports GGUF format. Convert older GGML models using llama.cpp's conversion tools.
Next Steps¶
- Quick Start Guide - Your first completion
- Basic Usage - Core concepts and patterns
- Text Generation - Detailed generation options