Module inference

Inference configuration types.

Defines shared types for configuring LLM inference parameters (temperature, top_p, top_k, max_tokens, repeat_penalty).

This module provides the core InferenceConfig type that is reused across:

  • Per-model defaults (Model.inference_defaults)
  • Global settings (Settings.inference_defaults)
  • Request-level overrides (flattened in ChatProxyRequest)
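A minimal sketch of the shape this implies, assuming optional fields and an illustrative merge helper (the field names beyond those listed above and the `merged_over` method are assumptions, not the crate's actual API):

```rust
use serde::{Deserialize, Serialize};

/// Inference parameters for LLM sampling. Every field is optional so that
/// request-level values can override per-model or global defaults.
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct InferenceConfig {
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub top_k: Option<u32>,
    pub max_tokens: Option<u32>,
    pub repeat_penalty: Option<f32>,
}

impl InferenceConfig {
    /// Layer `self` (higher priority) over `fallback`, field by field,
    /// keeping the fallback value wherever `self` leaves a field unset.
    pub fn merged_over(&self, fallback: &InferenceConfig) -> InferenceConfig {
        InferenceConfig {
            temperature: self.temperature.or(fallback.temperature),
            top_p: self.top_p.or(fallback.top_p),
            top_k: self.top_k.or(fallback.top_k),
            max_tokens: self.max_tokens.or(fallback.max_tokens),
            repeat_penalty: self.repeat_penalty.or(fallback.repeat_penalty),
        }
    }
}
```

With this shape, a request-level config can be layered over the per-model defaults, which are in turn layered over the global settings; using `Option` for every field is what makes a flattened, partially specified override (as in `ChatProxyRequest`) compose cleanly.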

Structs

InferenceConfig
Inference parameters for LLM sampling.