Samplers API

API reference for sampling configuration.

SamplerChain

A chain of samplers applied in sequence.

class SamplerChain {
  void add(Sampler sampler);
  void remove(int index);
  void clear();
  int get length;
  void dispose();
}

Creating a Chain

final chain = llamafu.createSamplerChain();
chain.add(llamafu.createTopKSampler(40));
chain.add(llamafu.createTempSampler(0.8));

Using a Chain

final response = await llamafu.completeWithSampler(
  prompt,
  sampler: chain,
  maxTokens: 100,
);

Disposing

chain.dispose();  // Free native resources
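
Because a chain holds native resources, it is worth releasing it even when generation throws. The following is a minimal sketch of that pattern using `try`/`finally`. `withResource` and `FakeChain` are hypothetical names introduced only for illustration and are not part of Llamafu; in real code the resource would be the chain returned by `createSamplerChain()`.

```dart
/// Hypothetical helper (not part of Llamafu): creates a resource, runs [body],
/// and always calls [dispose], even if [body] throws.
Future<R> withResource<C, R>(
  C Function() create,
  Future<R> Function(C resource) body,
  void Function(C resource) dispose,
) async {
  final resource = create();
  try {
    return await body(resource);
  } finally {
    dispose(resource);
  }
}

/// Stand-in for a native-backed chain so the sketch runs without a model.
class FakeChain {
  bool disposed = false;
  void dispose() => disposed = true;
}

Future<void> main() async {
  final chain = FakeChain();
  try {
    await withResource<FakeChain, String>(
      () => chain,
      (c) async => throw StateError('generation failed'),
      (c) => c.dispose(),
    );
  } on StateError {
    // The error still propagates to the caller...
  }
  print(chain.disposed); // ...but the chain was released first: prints true
}
```

With a real chain, the three callbacks would wrap `llamafu.createSamplerChain()`, `llamafu.completeWithSampler(...)`, and `chain.dispose()` respectively.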

Sampler Factory Methods

All samplers are created through Llamafu instance methods.

Temperature Sampler

Divides logits by the temperature before softmax. Lower values sharpen the distribution; higher values flatten it.

Sampler createTempSampler(double temperature)

Parameters:
- temperature: 0.0 (greedy) to 2.0+ (random)
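
To make the effect of the temperature parameter concrete, here is a small standalone sketch of temperature scaling followed by softmax. It does not call Llamafu; `softmaxWithTemperature` is a hypothetical helper, and treating a temperature of 0 as greedy selection is an assumption made for the example.

```dart
import 'dart:math' as math;

/// Converts raw logits to probabilities after dividing by [temperature].
/// Assumption: a temperature of 0 (or below) puts all mass on the argmax.
List<double> softmaxWithTemperature(List<double> logits, double temperature) {
  if (logits.isEmpty) return const [];
  final maxLogit = logits.reduce(math.max);
  if (temperature <= 0) {
    final best = logits.indexOf(maxLogit);
    return [for (var i = 0; i < logits.length; i++) i == best ? 1.0 : 0.0];
  }
  // Subtract the max before exponentiating for numerical stability.
  final exps = [
    for (final l in logits) math.exp((l - maxLogit) / temperature),
  ];
  final sum = exps.reduce((a, b) => a + b);
  return [for (final e in exps) e / sum];
}

void main() {
  final logits = [2.0, 1.0, 0.0];
  print(softmaxWithTemperature(logits, 0.5)); // sharper than T = 1.0
  print(softmaxWithTemperature(logits, 1.0));
  print(softmaxWithTemperature(logits, 2.0)); // flatter than T = 1.0
}
```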

Top-K Sampler

Keeps only the K most probable tokens.

Sampler createTopKSampler(int k)

Parameters:
- k: Number of tokens to keep (1 to vocab size)
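
As an illustration of the filtering step, the sketch below selects the indices of the k largest logits. `topKIndices` is a hypothetical standalone helper, not a Llamafu API, and clamping k into the valid range is an assumption of the example.

```dart
import 'dart:math' as math;

/// Returns the indices of the [k] largest logits, highest first.
/// [k] is clamped to the range 1..logits.length.
List<int> topKIndices(List<double> logits, int k) {
  final keep = math.min(math.max(k, 1), logits.length);
  final indices = List<int>.generate(logits.length, (i) => i)
    ..sort((a, b) => logits[b].compareTo(logits[a]));
  return indices.take(keep).toList();
}

void main() {
  print(topKIndices([1.0, 3.5, 0.2, 2.8, -1.0], 2)); // [1, 3]
}
```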

Top-P Sampler

Nucleus sampling: keeps the most probable tokens until their cumulative probability exceeds p.

Sampler createTopPSampler(double p, {int minKeep = 1})

Parameters:
- p: Probability threshold (0.0 to 1.0)
- minKeep: Minimum tokens to keep
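
The nucleus cutoff can be sketched as follows. `topPIndices` is a hypothetical standalone helper operating on already-normalized probabilities; stopping as soon as the cumulative probability reaches p (rather than strictly exceeds it) is an assumption that mirrors common implementations.

```dart
/// Returns the indices of the most probable tokens, highest first, stopping
/// once their cumulative probability reaches [p] and at least [minKeep]
/// tokens have been kept.
List<int> topPIndices(List<double> probs, double p, {int minKeep = 1}) {
  final order = List<int>.generate(probs.length, (i) => i)
    ..sort((a, b) => probs[b].compareTo(probs[a]));
  final kept = <int>[];
  var cumulative = 0.0;
  for (final i in order) {
    kept.add(i);
    cumulative += probs[i];
    if (cumulative >= p && kept.length >= minKeep) break;
  }
  return kept;
}

void main() {
  print(topPIndices([0.5, 0.3, 0.15, 0.05], 0.9)); // [0, 1, 2]
}
```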

Min-P Sampler

Removes tokens whose probability falls below a threshold set relative to the most likely token.

Sampler createMinPSampler(double p, {int minKeep = 1})

Parameters:
- p: Minimum probability relative to top token
- minKeep: Minimum tokens to keep
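
Since the threshold is relative to the top token, the cutoff adapts to how confident the model is. A minimal sketch, assuming normalized probabilities and a hypothetical helper `minPIndices` that is not part of Llamafu:

```dart
import 'dart:math' as math;

/// Keeps tokens whose probability is at least [p] times the top probability.
/// Falls back to the [minKeep] most probable tokens if too few pass.
List<int> minPIndices(List<double> probs, double p, {int minKeep = 1}) {
  if (probs.isEmpty) return const [];
  final threshold = probs.reduce(math.max) * p;
  final order = List<int>.generate(probs.length, (i) => i)
    ..sort((a, b) => probs[b].compareTo(probs[a]));
  final kept = [
    for (final i in order)
      if (probs[i] >= threshold) i,
  ];
  return kept.length >= minKeep ? kept : order.take(minKeep).toList();
}

void main() {
  // Threshold is 0.6 * 0.2 = 0.12, so only the first two tokens survive.
  print(minPIndices([0.6, 0.25, 0.1, 0.05], 0.2)); // [0, 1]
}
```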

Typical Sampler

Locally typical sampling: keeps tokens whose information content (negative log probability) is close to the entropy of the distribution.

Sampler createTypicalSampler(double p, {int minKeep = 1})

Parameters:
- p: Typical probability mass
- minKeep: Minimum tokens to keep
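
The sketch below illustrates the general idea of locally typical sampling: rank tokens by how far their surprise is from the distribution's entropy, then keep the closest ones until mass p is covered. `typicalIndices` is a hypothetical standalone helper; the native implementation behind `createTypicalSampler` may differ in detail.

```dart
import 'dart:math' as math;

/// Returns token indices ordered by typicality, stopping once their
/// cumulative probability reaches [p] and [minKeep] tokens are kept.
List<int> typicalIndices(List<double> probs, double p, {int minKeep = 1}) {
  // Shannon entropy in nats; zero-probability tokens contribute nothing.
  var entropy = 0.0;
  for (final q in probs) {
    if (q > 0) entropy -= q * math.log(q);
  }
  // Distance between a token's surprise (-log q) and the entropy.
  double distance(int i) => probs[i] > 0
      ? (-math.log(probs[i]) - entropy).abs()
      : double.infinity;
  final order = List<int>.generate(probs.length, (i) => i)
    ..sort((a, b) => distance(a).compareTo(distance(b)));
  final kept = <int>[];
  var cumulative = 0.0;
  for (final i in order) {
    kept.add(i);
    cumulative += probs[i];
    if (cumulative >= p && kept.length >= minKeep) break;
  }
  return kept;
}

void main() {
  // The most probable token (index 0) is less typical than tokens 1 and 2.
  print(typicalIndices([0.4, 0.3, 0.2, 0.1], 0.45)); // [1, 2]
}
```

Note that, unlike Top-P, the most probable token is not guaranteed to survive this filter.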

Mirostat Sampler

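Adaptively adjusts the sampling cutoff during generation to keep the output's surprise (and therefore its perplexity) near a target value.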

Sampler createMirostatSampler({
  required double tau,
  required double eta,
  int m = 100,
})

Parameters:
- tau: Target entropy (typically 3.0-5.0)
- eta: Learning rate (typically 0.1)
- m: Number of most probable tokens used to estimate the distribution's statistics (default 100)

Mirostat V2 Sampler

Simplified revision of the Mirostat algorithm. It takes no m parameter.

Sampler createMirostatV2Sampler({
  required double tau,
  required double eta,
})

Parameters:
- tau: Target entropy
- eta: Learning rate
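
To show how tau and eta interact, here is a rough sketch of the Mirostat 2.0 feedback loop as it is commonly formulated: tokens whose surprise exceeds a running threshold mu are dropped, one token is sampled from the rest, and mu is nudged toward the target. `MirostatV2State` is a hypothetical class, and initializing mu to 2 * tau is an assumption borrowed from common implementations; none of this is confirmed to match Llamafu's native code.

```dart
import 'dart:math' as math;

/// Hypothetical state holder for a Mirostat 2.0 style sampler.
class MirostatV2State {
  MirostatV2State({required this.tau, required this.eta}) : mu = 2 * tau;

  final double tau; // target surprise, in bits
  final double eta; // learning rate for the threshold update
  double mu; // current surprise threshold, in bits

  static double _surprise(double q) => -math.log(q) / math.ln2;

  /// Samples a token index from non-empty, normalized [probs] and updates [mu].
  int sample(List<double> probs, math.Random rng) {
    // Keep tokens whose surprise does not exceed the current threshold.
    var kept = [
      for (var i = 0; i < probs.length; i++)
        if (probs[i] > 0 && _surprise(probs[i]) <= mu) i,
    ];
    if (kept.isEmpty) {
      // Always keep at least the most probable token.
      var best = 0;
      for (var i = 1; i < probs.length; i++) {
        if (probs[i] > probs[best]) best = i;
      }
      kept = [best];
    }
    // Draw from the kept tokens in proportion to their probabilities.
    final total = kept.fold<double>(0.0, (s, i) => s + probs[i]);
    var r = rng.nextDouble() * total;
    var chosen = kept.last;
    for (final i in kept) {
      r -= probs[i];
      if (r <= 0) {
        chosen = i;
        break;
      }
    }
    // Feedback: move mu against the error between observed and target surprise.
    mu -= eta * (_surprise(probs[chosen] / total) - tau);
    return chosen;
  }
}

void main() {
  final state = MirostatV2State(tau: 3.0, eta: 0.1);
  final token = state.sample([0.5, 0.25, 0.25], math.Random(42));
  print('token=$token mu=${state.mu}');
}
```

In this example every token has at most 2 bits of surprise, which is below the 3-bit target, so mu rises after the step and later steps would admit less likely tokens.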

Penalties Sampler

Applies repetition penalties.

Sampler createPenaltySampler({
  int penaltyLastN = 64,
  double penaltyRepeat = 1.1,
  double penaltyFreq = 0.0,
  double penaltyPresent = 0.0,
})

Parameters:
- penaltyLastN: Tokens to look back
- penaltyRepeat: Repeat penalty multiplier
- penaltyFreq: Frequency-based penalty
- penaltyPresent: Presence-based penalty
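
The sketch below shows one common way these four parameters are combined, following the formulation used by llama.cpp-style samplers. `applyPenalties` is a hypothetical standalone function that reuses the parameter names above for readability; whether Llamafu applies exactly this formula is an assumption.

```dart
import 'dart:math' as math;

/// Returns a penalized copy of [logits] based on the last [penaltyLastN]
/// token ids in [history].
List<double> applyPenalties(
  List<double> logits,
  List<int> history, {
  int penaltyLastN = 64,
  double penaltyRepeat = 1.1,
  double penaltyFreq = 0.0,
  double penaltyPresent = 0.0,
}) {
  // Count occurrences of each token inside the look-back window.
  final start = math.max(0, history.length - penaltyLastN);
  final counts = <int, int>{};
  for (final token in history.sublist(start)) {
    counts[token] = (counts[token] ?? 0) + 1;
  }
  final out = List<double>.of(logits);
  counts.forEach((token, count) {
    if (token < 0 || token >= out.length) return;
    // Repeat penalty: shrink positive logits, push negative ones further down.
    out[token] =
        out[token] > 0 ? out[token] / penaltyRepeat : out[token] * penaltyRepeat;
    // Frequency penalty scales with the count; presence penalty is flat.
    out[token] -= count * penaltyFreq + penaltyPresent;
  });
  return out;
}

void main() {
  // Tokens 0 and 1 were seen recently; token 2 was not and is left unchanged.
  print(applyPenalties([2.2, -1.0, 0.5], [0, 0, 1]));
}
```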

Greedy Sampler

Always selects the most likely token.

Sampler createGreedySampler()
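
Greedy selection amounts to an argmax over the logits. A minimal sketch, with `greedyPick` as a hypothetical standalone helper and first-index-wins tie-breaking as an assumption:

```dart
/// Returns the index of the largest logit (the first one wins on ties).
int greedyPick(List<double> logits) {
  if (logits.isEmpty) throw ArgumentError('logits must not be empty');
  var best = 0;
  for (var i = 1; i < logits.length; i++) {
    if (logits[i] > logits[best]) best = i;
  }
  return best;
}

void main() {
  print(greedyPick([0.1, 2.4, -0.3, 2.4])); // 1
}
```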

Sampler Order

Samplers are applied in the order added to the chain:

final chain = llamafu.createSamplerChain();

// Order matters!
chain.add(llamafu.createTopKSampler(50));     // 1. Filter to top 50
chain.add(llamafu.createTopPSampler(0.9));    // 2. Apply nucleus
chain.add(llamafu.createTempSampler(0.8));    // 3. Apply temperature
chain.add(llamafu.createPenaltySampler());    // 4. Apply penalties

Recommended order:
1. Top-K (filter by count)
2. Top-P (filter by probability)
3. Min-P (filter by threshold)
4. Temperature (scale)
5. Penalties (adjust for repetition)
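
To see why the order matters, the standalone sketch below compares how many tokens a nucleus cutoff of 0.9 keeps when it runs before versus after a temperature of 0.8. `softmax` and `nucleusSize` are hypothetical helpers written for this example and do not call Llamafu.

```dart
import 'dart:math' as math;

/// Softmax over [logits] after dividing by [temperature] (must be > 0).
List<double> softmax(List<double> logits, double temperature) {
  final maxLogit = logits.reduce(math.max);
  final exps = [
    for (final l in logits) math.exp((l - maxLogit) / temperature),
  ];
  final sum = exps.reduce((a, b) => a + b);
  return [for (final e in exps) e / sum];
}

/// Number of most probable tokens needed for cumulative probability >= [p].
int nucleusSize(List<double> probs, double p) {
  final sorted = [...probs]..sort((a, b) => b.compareTo(a));
  var cumulative = 0.0;
  for (var i = 0; i < sorted.length; i++) {
    cumulative += sorted[i];
    if (cumulative >= p) return i + 1;
  }
  return sorted.length;
}

void main() {
  final logits = [3.0, 2.0, 1.0, 0.0];
  // Top-P first: the cutoff sees the unscaled (T = 1.0) distribution.
  print(nucleusSize(softmax(logits, 1.0), 0.9)); // 3
  // Temperature first: T = 0.8 sharpens the distribution before the cutoff.
  print(nucleusSize(softmax(logits, 0.8), 0.9)); // 2
}
```

The same two samplers with the same settings keep a different candidate set depending on which one runs first.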

Preset Chains

Balanced

SamplerChain createBalancedChain() {
  final chain = llamafu.createSamplerChain();
  chain.add(llamafu.createTopKSampler(40));
  chain.add(llamafu.createTopPSampler(0.9));
  chain.add(llamafu.createTempSampler(0.7));
  chain.add(llamafu.createPenaltySampler(penaltyRepeat: 1.1));
  return chain;
}

Creative

SamplerChain createCreativeChain() {
  final chain = llamafu.createSamplerChain();
  chain.add(llamafu.createTopPSampler(0.95));
  chain.add(llamafu.createTempSampler(1.0));
  chain.add(llamafu.createPenaltySampler(penaltyRepeat: 1.2));
  return chain;
}

Deterministic

SamplerChain createDeterministicChain() {
  final chain = llamafu.createSamplerChain();
  chain.add(llamafu.createGreedySampler());
  return chain;
}

Mirostat (Consistent Quality)

SamplerChain createMirostatChain() {
  final chain = llamafu.createSamplerChain();
  chain.add(llamafu.createMirostatV2Sampler(tau: 5.0, eta: 0.1));
  return chain;
}

Individual Sampler Usage

For simple cases, use built-in parameters:

// Instead of creating a chain:
await llamafu.complete(
  prompt,
  temperature: 0.8,
  topK: 40,
  topP: 0.9,
  repeatPenalty: 1.1,
);

Use sampler chains for:
- Custom sampler combinations
- Mirostat sampling
- Advanced penalty configurations
- Reusable sampling configurations

Performance

Sampler overhead is generally minimal:

| Sampler | Relative Cost |
|---|---|
| Greedy | 1x (baseline) |
| Temperature | 1.1x |
| Top-K | 1.2x |
| Top-P | 1.3x |
| Mirostat | 1.4x |
| Penalties | 1.5x |

For a chain with multiple samplers, the individual costs are additive.