Inference configuration types.
Defines shared types for configuring LLM inference parameters
(temperature, top_p, top_k, max_tokens, repeat_penalty).
This module provides the core InferenceConfig type that is reused across (see the sketch after this list):

- Per-model defaults (Model.inference_defaults)
- Global settings (Settings.inference_defaults)
- Request-level overrides (flattened in ChatProxyRequest)
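As an illustration of how the flattening might look, here is a minimal sketch. The field names come from the parameter list above; the Option-wrapped fields, the serde derives, and the model/messages fields on ChatProxyRequest are assumptions for illustration, not the module's confirmed layout.

```rust
use serde::{Deserialize, Serialize};

/// Sketch of the shared sampling parameters. All fields are optional so
/// that unset values can fall through to the next configuration layer.
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct InferenceConfig {
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub top_k: Option<u32>,
    pub max_tokens: Option<u32>,
    pub repeat_penalty: Option<f32>,
}

/// Hypothetical request type: with #[serde(flatten)], clients can pass
/// temperature, top_p, etc. at the top level of the JSON request body.
#[derive(Debug, Deserialize)]
pub struct ChatProxyRequest {
    pub model: String,
    pub messages: Vec<serde_json::Value>,
    #[serde(flatten)]
    pub inference: InferenceConfig,
}
```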
Structs
- InferenceConfig - Inference parameters for LLM sampling.
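Because the same type appears at three layers, a natural resolution strategy is field-wise fallback from most specific (request) to least specific (global settings). The helper below is a hypothetical sketch of that merge, not the module's documented API; only the three layers themselves come from the description above.

```rust
impl InferenceConfig {
    /// Hypothetical merge helper: values set on `self` (the more specific
    /// layer) win; unset values fall back to `other` (the broader layer).
    pub fn or(self, other: &InferenceConfig) -> InferenceConfig {
        InferenceConfig {
            temperature: self.temperature.or(other.temperature),
            top_p: self.top_p.or(other.top_p),
            top_k: self.top_k.or(other.top_k),
            max_tokens: self.max_tokens.or(other.max_tokens),
            repeat_penalty: self.repeat_penalty.or(other.repeat_penalty),
        }
    }
}

// Assumed resolution order, chaining from request overrides down to globals:
// let effective = request.inference
//     .or(&model.inference_defaults)
//     .or(&settings.inference_defaults);
```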