InferenceConfig

Struct InferenceConfig 

pub struct InferenceConfig {
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub top_k: Option<i32>,
    pub max_tokens: Option<u32>,
    pub repeat_penalty: Option<f32>,
}

Inference parameters for LLM sampling.

All fields are optional to support partial configuration and fallback chains. The same type is intended to be shared across model defaults, global settings, and request overrides.

§Hierarchy Resolution

When making an inference request, parameters are resolved in this order:

  1. Request-level override (user specified for this request)
  2. Per-model defaults (stored in Model.inference_defaults)
  3. Global settings (stored in Settings.inference_defaults)
  4. Hardcoded fallback (e.g., temperature = 0.7)
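
The resolution order above can be sketched as a chain of fallbacks. This is a minimal illustration under stated assumptions, not the crate's implementation: `PartialConfig` is a local stand-in with only two of the fields, and `fill_from` and `resolve` are hypothetical helpers that keep values already set and fill the gaps from the next level.

```rust
/// Local stand-in for `InferenceConfig` (two fields only, for brevity).
#[derive(Clone, Debug, Default, PartialEq)]
struct PartialConfig {
    temperature: Option<f32>,
    top_p: Option<f32>,
}

impl PartialConfig {
    /// Hypothetical helper: keep values already set, fill `None` fields from `fallback`.
    fn fill_from(&mut self, fallback: &PartialConfig) {
        self.temperature = self.temperature.or(fallback.temperature);
        self.top_p = self.top_p.or(fallback.top_p);
    }
}

/// Resolve in priority order: request, then model, then global, then hardcoded.
fn resolve(
    request: &PartialConfig,
    model: &PartialConfig,
    global: &PartialConfig,
    hardcoded: &PartialConfig,
) -> PartialConfig {
    let mut resolved = request.clone();
    // Each later level only fills fields that are still unset.
    for level in [model, global, hardcoded] {
        resolved.fill_from(level);
    }
    resolved
}
```

With this shape, a field set at the request level is never overwritten, and a field left as `None` everywhere ends up with the hardcoded value.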

§Examples

use gglib_core::domain::InferenceConfig;

// Conservative settings for code generation
let code_gen = InferenceConfig {
    temperature: Some(0.2),
    top_p: Some(0.9),
    top_k: Some(40),
    max_tokens: Some(2048),
    repeat_penalty: Some(1.1),
};

// Creative writing settings
let creative = InferenceConfig {
    temperature: Some(1.2),
    top_p: Some(0.95),
    ..Default::default()
};

Fields§

§temperature: Option<f32>

Sampling temperature (0.0 - 2.0).

Controls randomness in token selection:

  • Lower values (0.1-0.5): More deterministic, focused
  • Medium values (0.7-1.0): Balanced creativity
  • Higher values (1.1-2.0): More random, creative
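
To illustrate why lower temperatures are more deterministic, the following sketch divides the logits by the temperature before a softmax. It shows the general technique only; `softmax_with_temperature` is a hypothetical helper, not part of this crate, and it assumes a temperature strictly greater than 0.0 (samplers typically treat 0.0 as greedy selection instead of dividing by it).

```rust
/// Softmax over `logits` after dividing each by `temperature` (assumed > 0.0).
fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    let scaled: Vec<f32> = logits.iter().map(|l| l / temperature).collect();
    // Subtract the maximum before exponentiating for numerical stability.
    let max = scaled.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scaled.iter().map(|s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

A low temperature widens the gaps between scaled logits, so the most likely token takes a larger share of the probability mass; a high temperature flattens the distribution.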
§top_p: Option<f32>

Nucleus sampling threshold (0.0 - 1.0).

Considers only the smallest set of most likely tokens whose cumulative probability reaches this threshold. Common values: 0.9 (default), 0.95 (more diverse)
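
As a rough sketch of nucleus filtering, the function below sorts tokens by probability and keeps them until the running total reaches the threshold. `nucleus` is a hypothetical helper for illustration, not this crate's or llama.cpp's implementation, and it assumes `probs` is already a normalized distribution.

```rust
/// Indices of the smallest set of most likely tokens whose cumulative
/// probability reaches `top_p`. Assumes `probs` sums to roughly 1.0.
fn nucleus(probs: &[f32], top_p: f32) -> Vec<usize> {
    let mut order: Vec<usize> = (0..probs.len()).collect();
    // Sort token indices by descending probability.
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));
    let mut kept = Vec::new();
    let mut cumulative = 0.0_f32;
    for idx in order {
        kept.push(idx);
        cumulative += probs[idx];
        if cumulative >= top_p {
            break;
        }
    }
    kept
}
```

A sampler would then renormalize the kept probabilities and draw from that reduced set.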

§top_k: Option<i32>

Top-K sampling limit.

Considers only the K most likely next tokens. Common values: 40 (default), 10 (focused), 100 (diverse)
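
A minimal sketch of top-K filtering follows. `top_k_indices` is a hypothetical helper, and treating a non-positive K as "no limit" is an assumption made here for illustration; this page does not say how the crate handles such values.

```rust
/// Indices of the `k` highest-scoring tokens, best first.
/// Assumption: `k <= 0` disables the limit and keeps every token.
fn top_k_indices(logits: &[f32], k: i32) -> Vec<usize> {
    let mut order: Vec<usize> = (0..logits.len()).collect();
    // Sort token indices by descending logit.
    order.sort_by(|&a, &b| logits[b].total_cmp(&logits[a]));
    if k > 0 {
        order.truncate(k as usize);
    }
    order
}
```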

§max_tokens: Option<u32>

Maximum tokens to generate in response.

Hard limit on response length. Does not include input tokens.

§repeat_penalty: Option<f32>

Repetition penalty (> 0.0, typically 1.0 - 1.3).

Penalizes repeated tokens to reduce repetitive output.

  • 1.0: No penalty (default)
  • 1.1-1.3: Moderate penalty
  • Above 1.3: Strong penalty (may hurt coherence)
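
One common formulation of a repetition penalty is sketched below: logits of tokens that already appeared are divided by the penalty when positive and multiplied by it when negative, so they always become less likely. `apply_repeat_penalty` is a hypothetical helper; the exact rule applied by the backend is not stated on this page.

```rust
use std::collections::HashSet;

/// Lower the logits of tokens that already appeared in `previous_tokens`.
/// With `penalty == 1.0` the logits are left unchanged.
fn apply_repeat_penalty(logits: &mut [f32], previous_tokens: &[usize], penalty: f32) {
    let mut seen = HashSet::new();
    for &token in previous_tokens {
        // Penalize each distinct token once; ignore out-of-range ids.
        if token >= logits.len() || !seen.insert(token) {
            continue;
        }
        let logit = &mut logits[token];
        if *logit > 0.0 {
            *logit /= penalty;
        } else {
            *logit *= penalty;
        }
    }
}
```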

Implementations§

impl InferenceConfig

pub const fn merge_with(&mut self, other: &Self)

Merge another config into this one, keeping values already set on self.

For each field, if self has Some(value), keep it; otherwise take the value from other. This is useful for applying fallback chains.

§Example
use gglib_core::domain::InferenceConfig;

let mut request = InferenceConfig {
    temperature: Some(0.8),
    ..Default::default()
};

let model_defaults = InferenceConfig {
    temperature: Some(0.5),
    top_p: Some(0.9),
    ..Default::default()
};

request.merge_with(&model_defaults);
assert_eq!(request.temperature, Some(0.8)); // Request value wins
assert_eq!(request.top_p, Some(0.9));      // Fallback to model default

pub const fn with_hardcoded_defaults() -> Self

Create a new config with all fields set to sensible defaults.

These are the hardcoded fallback values used when no other defaults are configured.

pub fn to_cli_args(&self) -> Vec<String>

Convert inference config to llama CLI arguments.

Returns a vector of argument strings suitable for passing to llama-cli or llama-server. Uses the same flag names as llama.cpp: --temp, --top-p, --top-k, -n, --repeat-penalty.

This is the single source of truth for CLI flag conversion, used by:

  • LlamaCommandBuilder (for CLI commands)
  • GUI server startup (via ServerConfig.extra_args)
§Example
use gglib_core::domain::InferenceConfig;

let config = InferenceConfig {
    temperature: Some(0.8),
    top_p: Some(0.9),
    top_k: None,
    max_tokens: Some(1024),
    repeat_penalty: None,
};

let args = config.to_cli_args();
assert_eq!(args, vec!["--temp", "0.8", "--top-p", "0.9", "-n", "1024"]);
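
For readers who want a feel for how such a conversion might be written, here is a minimal sketch. It is not the crate's source: `SamplingConfig` is a local stand-in mirroring the fields above, `to_args` is a hypothetical function, and the flag names and order are taken from the description and example on this page.

```rust
/// Local stand-in mirroring the fields of `InferenceConfig`.
#[derive(Default)]
struct SamplingConfig {
    temperature: Option<f32>,
    top_p: Option<f32>,
    top_k: Option<i32>,
    max_tokens: Option<u32>,
    repeat_penalty: Option<f32>,
}

/// Hypothetical conversion: emit a flag/value pair for every field that is `Some`.
fn to_args(config: &SamplingConfig) -> Vec<String> {
    let mut args: Vec<String> = Vec::new();
    let mut push = |flag: &str, value: Option<String>| {
        // Unset fields produce no arguments at all.
        if let Some(v) = value {
            args.push(flag.to_string());
            args.push(v);
        }
    };
    push("--temp", config.temperature.map(|v| v.to_string()));
    push("--top-p", config.top_p.map(|v| v.to_string()));
    push("--top-k", config.top_k.map(|v| v.to_string()));
    push("-n", config.max_tokens.map(|v| v.to_string()));
    push("--repeat-penalty", config.repeat_penalty.map(|v| v.to_string()));
    args
}
```

Keeping the conversion in one place means every caller that launches the backend passes the same flags for the same settings.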

Trait Implementations§

impl Clone for InferenceConfig

fn clone(&self) -> InferenceConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since 1.0.0.

impl Debug for InferenceConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for InferenceConfig

fn default() -> InferenceConfig

Returns the “default value” for a type.

impl<'de> Deserialize<'de> for InferenceConfig

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl PartialEq for InferenceConfig

fn eq(&self, other: &InferenceConfig) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason. Stable since 1.0.0.

impl Serialize for InferenceConfig

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

impl StructuralPartialEq for InferenceConfig

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,