Grammar Constraints¶
Use GBNF (GGML BNF) grammars to constrain model output to specific formats and structures. A grammar ensures the model produces syntactically valid output for any format you can define.
What is GBNF?¶
GBNF is the grammar format used by llama.cpp to constrain token generation. At each step, the model can only select tokens that keep the output valid according to the grammar, so the generated text follows the defined structure. Output can still stop early if generation reaches the token limit before the grammar is complete.
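The filtering step can be pictured with a toy example. The sketch below is not how llama.cpp or mullama implement it: it assumes a tiny made-up vocabulary and a hand-written `is_valid_prefix` check standing in for the grammar `root ::= "-"? [0-9]+`.

```python
import re

# Toy stand-in for the grammar: root ::= "-"? [0-9]+
# A string is a valid *prefix* if it could still be extended into a match.
PREFIX_PATTERN = re.compile(r"-?[0-9]*")


def is_valid_prefix(text: str) -> bool:
    """Return True if `text` can still grow into a string the toy grammar accepts."""
    return PREFIX_PATTERN.fullmatch(text) is not None


def allowed_tokens(generated: str, vocabulary: list[str]) -> list[str]:
    """Keep only the tokens that leave the output a valid prefix."""
    return [tok for tok in vocabulary if is_valid_prefix(generated + tok)]


# Hypothetical vocabulary, chosen only for illustration.
vocabulary = ["-", "4", "2", "42", "abc", " ", "-7"]

print(allowed_tokens("", vocabulary))   # tokens allowed at the start
print(allowed_tokens("4", vocabulary))  # "-" and "-7" drop out after a digit
```

A real implementation tracks grammar state incrementally instead of re-checking the whole string, but the effect on the candidate list is the same.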
Grammar Syntax¶
GBNF grammars consist of rules that define the structure of valid output:
# Rules are defined as: name ::= expression
root ::= value
value ::= string | number | "true" | "false" | "null"
string ::= "\"" [^"\\]* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
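One way to sanity-check a small grammar like this before involving a model is to translate it by hand into a regular expression and try a few strings. The sketch below does that with Python's standard `re` module. The translation is manual and only works for non-recursive grammars; it is an illustration, not something mullama does for you.

```python
import re

# Hand translation of each rule into a regex fragment.
STRING = r'"[^"\\]*"'              # string ::= "\"" [^"\\]* "\""
NUMBER = r"-?[0-9]+(?:\.[0-9]+)?"  # number ::= "-"? [0-9]+ ("." [0-9]+)?
VALUE = re.compile(rf"(?:{STRING}|{NUMBER}|true|false|null)")


def matches_value(text: str) -> bool:
    """Return True if `text` is a complete match for the `value` rule."""
    return VALUE.fullmatch(text) is not None


for sample in ['"hello"', "-12.5", "true", "12.", "'hi'", '"a\\"b"']:
    print(f"{sample!r}: {matches_value(sample)}")
```

The last sample is rejected because the `string` rule excludes backslashes, so escaped quotes are not part of this grammar.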
Syntax Elements¶
| Element | Description | Example |
|---|---|---|
| `"text"` | Terminal (literal string) | `"hello"` |
| `[chars]` | Character class | `[a-zA-Z0-9]` |
| `[^chars]` | Negated character class | `[^"\\]` |
| `A B` | Sequence | `"[" value "]"` |
| `A \| B` | Alternation (OR) | `"true" \| "false"` |
| `A*` | Zero or more | `[0-9]*` |
| `A+` | One or more | `[0-9]+` |
| `A?` | Optional (zero or one) | `"-"?` |
| `(A B)` | Grouping | `("," value)*` |
Using Grammars¶
// JavaScript
import { Model, Context } from 'mullama';
const model = await Model.load('./model.gguf');
const context = new Context(model);
// Define a grammar for a simple list format
const grammar = `
root ::= item ("\\n" item)*
item ::= "- " content
content ::= [^\\n]+
`;
const response = await context.generate(
"List 5 programming languages:",
200,
{ grammar }
);
console.log(response);
// - Python
// - Rust
// - JavaScript
// - Go
// - TypeScript
# Python
from mullama import Model, Context
model = Model.load("./model.gguf")
context = Context(model)
# Define a grammar for a simple list format
grammar = r"""
root ::= item ("\n" item)*
item ::= "- " content
content ::= [^\n]+
"""
response = context.generate(
"List 5 programming languages:",
max_tokens=200,
grammar=grammar,
)
print(response)
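Because every line of the response is `- ` followed by text, turning it into a list needs no heuristics. The helper below is a minimal sketch (the name `parse_bullets` is made up for this example). A failure here usually means generation stopped at the token limit partway through an item.

```python
def parse_bullets(response: str) -> list[str]:
    """Split output shaped like `- item` lines into a list of items."""
    items = []
    for line in response.split("\n"):
        # Each line must be "- " plus at least one character of content.
        if not line.startswith("- ") or len(line) == 2:
            raise ValueError(f"line does not match the item rule: {line!r}")
        items.append(line[2:])
    return items


response = "- Python\n- Rust\n- JavaScript\n- Go\n- TypeScript"
print(parse_bullets(response))
```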
Common Grammar Patterns¶
JSON Object¶
root ::= "{" ws members ws "}"
members ::= pair ("," ws pair)*
pair ::= string ws ":" ws value
value ::= string | number | "true" | "false" | "null" | array | object
string ::= "\"" [^"\\]* "\""
number ::= "-"? ("0" | [1-9] [0-9]*) ("." [0-9]+)?
array ::= "[" ws (value ("," ws value)*)? ws "]"
object ::= "{" ws (pair ("," ws pair)*)? ws "}"
ws ::= [ \t\n]*
CSV Format¶
root ::= header "\n" rows
header ::= cell ("," cell)*
rows ::= row ("\n" row)*
row ::= cell ("," cell)*
cell ::= [^,\n]*
Email Address¶
root ::= local "@" domain
local ::= [a-zA-Z0-9._%+-]+
domain ::= label ("." label)+
label ::= [a-zA-Z0-9-]+
ISO Date¶
root ::= year "-" month "-" day
year ::= [0-9][0-9][0-9][0-9]
month ::= "0" [1-9] | "1" [0-2]
day ::= "0" [1-9] | [1-2] [0-9] | "3" [0-1]
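This grammar checks the shape of a date, not the calendar: `2023-02-31` satisfies every rule. A short post-check, sketched here with Python's standard `datetime` module, catches such values. The helper name `is_real_date` is made up for this example.

```python
from datetime import date


def is_real_date(text: str) -> bool:
    """Return True if `text` (YYYY-MM-DD) names a day that exists on the calendar."""
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


print(is_real_date("2024-02-29"))  # leap day
print(is_real_date("2023-02-31"))  # matches the grammar, but February has no 31st
```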
Key-Value Pairs¶
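No grammar is shown for this pattern here, so the following is only one possible shape: a sketch that assumes one `key: value` pair per line, with identifier-like keys. The grammar text, the `PAIR` regex, and the `parse_pairs` helper are all assumptions made for this example.

```python
import re

# One possible grammar for "key: value" lines; an illustrative sketch only.
KEY_VALUE_GRAMMAR = r"""
root  ::= pair ("\n" pair)*
pair  ::= key ": " value
key   ::= [a-zA-Z_] [a-zA-Z0-9_]*
value ::= [^\n]+
"""

# Hand-written regex mirror of the `pair` rule, used to parse the output.
PAIR = re.compile(r"([a-zA-Z_][a-zA-Z0-9_]*): ([^\n]+)")


def parse_pairs(text: str) -> dict[str, str]:
    """Turn output shaped by the grammar above into a dictionary."""
    result = {}
    for line in text.split("\n"):
        match = PAIR.fullmatch(line)
        if match is None:
            raise ValueError(f"line does not match the pair rule: {line!r}")
        result[match.group(1)] = match.group(2)
    return result


print(parse_pairs("name: Ada Lovelace\nborn: 1815"))
```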
Creating Custom Grammars¶
Build grammars tailored to your application's output format:
// Grammar for a structured review
const reviewGrammar = `
root ::= rating "\\n" summary "\\n" pros "\\n" cons
rating ::= "Rating: " [1-5] "/5"
summary ::= "Summary: " sentence
pros ::= "Pros: " items
cons ::= "Cons: " items
items ::= item (", " item)*
item ::= [a-zA-Z ]+
sentence ::= [A-Z] [a-zA-Z ,.]+ "."
`;
const response = await context.generate(
"Review this restaurant: Great food but slow service.",
300,
{ grammar: reviewGrammar }
);
# Grammar for a structured review
review_grammar = r"""
root ::= rating "\n" summary "\n" pros "\n" cons
rating ::= "Rating: " [1-5] "/5"
summary ::= "Summary: " sentence
pros ::= "Pros: " items
cons ::= "Cons: " items
items ::= item (", " item)*
item ::= [a-zA-Z ]+
sentence ::= [A-Z] [a-zA-Z ,.]+ "."
"""
response = context.generate(
"Review this restaurant: Great food but slow service.",
max_tokens=300,
grammar=review_grammar,
)
// Rust
let review_grammar = r#"
root ::= rating "\n" summary "\n" pros "\n" cons
rating ::= "Rating: " [1-5] "/5"
summary ::= "Summary: " sentence
pros ::= "Pros: " items
cons ::= "Cons: " items
items ::= item (", " item)*
item ::= [a-zA-Z ]+
sentence ::= [A-Z] [a-zA-Z ,.]+ "."
"#;
let response = context.generate_with_grammar(
"Review this restaurant: Great food but slow service.",
300,
review_grammar
)?;
Grammar Validation¶
Validate grammars before use to catch syntax errors:
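The library call for validation is not shown here. As a stand-in, the sketch below is a shallow, standalone pre-check written in plain Python: it looks for a missing `root` rule, rule names that are referenced but never defined, and duplicate definitions. It assumes rule names consist of letters, digits, and dashes, and it does not replace the real grammar parser. The function name `check_grammar` is made up for this example.

```python
import re

# Parts of a grammar that may contain arbitrary characters:
# string literals, character classes, and comments.
_OPAQUE = re.compile(r'"(?:\\.|[^"\\])*"|\[(?:\\.|[^\]\\])*\]|#[^\n]*')
_NAME = r"[A-Za-z][A-Za-z0-9-]*"


def check_grammar(grammar: str) -> list[str]:
    """Return human-readable problems found by a shallow structural check."""
    # Blank out literals, classes, and comments so only rule names remain.
    skeleton = _OPAQUE.sub(" ", grammar)
    defined = re.findall(rf"^\s*({_NAME})\s*::=", skeleton, flags=re.MULTILINE)
    referenced = set(re.findall(_NAME, skeleton))

    problems = []
    if "root" not in defined:
        problems.append("no `root` rule defined")
    for name in sorted(referenced - set(defined)):
        problems.append(f"rule `{name}` is referenced but never defined")
    for name in sorted({n for n in defined if defined.count(n) > 1}):
        problems.append(f"rule `{name}` is defined more than once")
    return problems


typo_grammar = r"""
root ::= itme ("\n" item)*
item ::= "- " [^\n]+
"""
for problem in check_grammar(typo_grammar):
    print(problem)
```

An empty list means only that these three checks passed. Syntax errors inside a rule body are still left to the parser.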
Performance Impact¶
Grammar constraints add per-token overhead because the grammar state must be tracked during generation. The figures below are approximate:
| Scenario | Overhead | Notes |
|---|---|---|
| No grammar | Baseline | Fastest generation |
| Simple grammar | ~5% slower | Few rules, simple patterns |
| Complex grammar | ~10-15% slower | Many rules, deep nesting |
| Very complex | ~20% slower | Extensive character classes, recursion |
Performance Tips
- Keep grammars as simple as possible
- Avoid deeply recursive rules when a flat structure works
- Use character classes (`[a-z]+`) instead of repeated alternatives (`"a" | "b" | ...`)
- For JSON output, prefer the built-in `--schema` option over manual JSON grammars
Combining with Sampling Parameters¶
Grammar constraints work alongside sampling parameters. The grammar filters invalid tokens, then sampling selects among the remaining valid options:
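The two stages can be illustrated with a toy sampler. This is a conceptual sketch using made-up scores and a hypothetical `sample_constrained` helper, not mullama's sampling code: tokens the grammar rejects are dropped first, then temperature sampling runs over what remains.

```python
import math
import random


def sample_constrained(logits: dict[str, float], allowed: set[str],
                       temperature: float, rng: random.Random) -> str:
    """Sample one token after removing those the grammar rejects."""
    # Step 1: the grammar filter drops invalid tokens.
    candidates = {tok: score for tok, score in logits.items() if tok in allowed}
    if not candidates:
        raise ValueError("grammar allows none of the candidate tokens")

    # Step 2: ordinary temperature sampling over the survivors.
    peak = max(candidates.values())  # subtracted for numerical stability
    weights = [math.exp((score - peak) / temperature) for score in candidates.values()]
    return rng.choices(list(candidates), weights=weights, k=1)[0]


# Hypothetical scores: "maybe" is the model's favourite, but the grammar forbids it.
logits = {"maybe": 5.0, "true": 2.0, "false": 1.5}
allowed = {"true", "false"}

rng = random.Random(0)
picks = [sample_constrained(logits, allowed, temperature=0.8, rng=rng) for _ in range(1000)]
print({tok: picks.count(tok) for tok in sorted(allowed)})
```

Lowering the temperature concentrates the choice on the highest-scoring valid token. It never brings back a token the grammar has removed.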
Loading Grammars from Files¶
Load grammar definitions from .gbnf files:
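Since a grammar is passed as a plain string in the examples above, one straightforward approach is to read the file yourself. The sketch below uses only the standard library. The `load_grammar` helper and the file name are hypothetical, and the library may also offer its own file-loading call that is not shown here.

```python
import tempfile
from pathlib import Path


def load_grammar(path: Path) -> str:
    """Read a .gbnf file and return its text, rejecting files with no rules."""
    text = Path(path).read_text(encoding="utf-8")
    if "::=" not in text:
        raise ValueError(f"{path} does not look like a GBNF grammar (no '::=' found)")
    return text


# Demonstration with a throwaway file; in practice the path points at your own .gbnf file.
with tempfile.TemporaryDirectory() as tmp:
    grammar_path = Path(tmp) / "list.gbnf"
    grammar_path.write_text(
        'root ::= item ("\\n" item)*\nitem ::= "- " [^\\n]+\n', encoding="utf-8"
    )
    grammar = load_grammar(grammar_path)

print(grammar)
# The string can then be passed as in the earlier Python example:
# context.generate("List 5 programming languages:", max_tokens=200, grammar=grammar)
```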
Examples¶
Regex-Like Pattern: Phone Number¶
root ::= "(" area ") " prefix "-" line
area ::= [0-9][0-9][0-9]
prefix ::= [0-9][0-9][0-9]
line ::= [0-9][0-9][0-9][0-9]
Structured Data: SQL Query¶
root ::= select from where? orderby? ";"
select ::= "SELECT " columns
from ::= " FROM " table
where ::= " WHERE " condition
orderby ::= " ORDER BY " column " " direction
columns ::= column (", " column)*
column ::= [a-z_]+
table ::= [a-z_]+
condition ::= column " " operator " " value
operator ::= "=" | "!=" | ">" | "<" | ">=" | "<="
value ::= "'" [^']* "'" | [0-9]+
direction ::= "ASC" | "DESC"
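A query that matches this grammar is well-formed, but it can still name a table or column that does not exist. One way to catch that, sketched here with Python's standard `sqlite3` module and a made-up `users` schema, is to ask SQLite to compile the query against an empty in-memory copy of the schema.

```python
import sqlite3


def query_compiles(query: str, schema: str) -> bool:
    """Return True if SQLite can prepare `query` against an empty copy of `schema`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement without running it against real data.
        conn.execute(f"EXPLAIN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


schema = "CREATE TABLE users (name TEXT, age INTEGER);"  # hypothetical schema
print(query_compiles("SELECT name FROM users WHERE age >= 18 ORDER BY name ASC;", schema))
print(query_compiles("SELECT email FROM users;", schema))  # matches the grammar, unknown column
```

To rule out unknown names at generation time instead, the `column` and `table` rules could list the allowed identifiers as literal alternatives.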
Markdown List with Categories¶
root ::= category+
category ::= "## " title "\n" items "\n"
title ::= [A-Z][a-zA-Z ]+
items ::= item ("\n" item)*
item ::= "- " description
description ::= [^\n]+
See Also¶
- Structured Output -- JSON Schema-based output (uses grammars internally)
- Sampling Strategies -- Combining grammars with sampling
- Text Generation -- Core generation parameters
- API Reference: Configuration -- Grammar configuration options