capabilities_from_architecture

Function capabilities_from_architecture 

pub fn capabilities_from_architecture(arch: Option<&str>) -> ModelCapabilities

Map a GGUF general.architecture value to its inherent ModelCapabilities.

This is the single source of truth for architecture-level behavioural constraints that apply to the request side (message preprocessing). It is consulted during model registration alongside infer_from_chat_template — the two results are OR-ed together so that either signal is sufficient.
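The OR-merge described above can be sketched as follows. This is a minimal illustration, not the crate's actual code: `Caps` is a hand-rolled stand-in for `ModelCapabilities`, `REQUIRES_STRICT_TURNS` is a hypothetical flag name, the bit positions are assumed, and `merge_signals` is an invented helper.

```rust
use std::ops::BitOr;

/// Stand-in for the real `ModelCapabilities` (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Caps(u32);

impl Caps {
    /// Hypothetical flag: the model needs strictly alternating turns.
    pub const REQUIRES_STRICT_TURNS: Caps = Caps(1 << 0);
    /// Flag name taken from the docs on this page; the bit position is assumed.
    pub const SUPPORTS_TOOL_CALLS: Caps = Caps(1 << 1);

    /// No flags set: the model gets pass-through treatment.
    pub const fn empty() -> Caps {
        Caps(0)
    }

    /// True when every bit of `other` is also set in `self`.
    pub const fn contains(self, other: Caps) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Caps {
    type Output = Caps;

    fn bitor(self, rhs: Caps) -> Caps {
        Caps(self.0 | rhs.0)
    }
}

/// Merge the chat-template signal with the architecture signal.
/// A flag set by either layer ends up in the result.
pub fn merge_signals(from_template: Caps, from_arch: Caps) -> Caps {
    from_template | from_arch
}
```

With this shape, a model whose template layer yields `Caps::empty()` still picks up whatever the architecture layer reports, and a flag set by both layers is simply set once.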

§Scope: request preprocessing only

This registry governs ModelCapabilities flags (strict-turn coalescing, system-role conversion, etc.). It does not handle response-stream dialect normalization; that is a separate concern handled by the GgufCapabilities.extensions → format:* tag → get_parser() pipeline in gglib-core::normalize::registry.

For example:

  • Qwen tool-call XML normalization already flows through detect_tool_support() → extensions.insert("format:qwen-xml") → to_tags() → get_parser() → QwenXmlParser. Qwen’s chat template always contains <tool_call> patterns, so infer_from_chat_template (Layer 1) sets SUPPORTS_TOOL_CALLS reliably. No architecture entry is needed here for Qwen.
  • Mistral does need an entry: its templates enforce strict alternation, but many quantised builds ship with the tokenizer section stripped, so the template layer produces no signal. general.architecture = "mistral" is always present and provides the necessary backstop.

§Rationale

Some models ship without a parseable tokenizer.chat_template in the GGUF (stripped quantisation builds, partial uploads). The chat-template layer then returns ModelCapabilities::empty(), silently leaving constraints unapplied. Reading general.architecture from the GGUF gives us a ground-truth signal that is present even in those stripped builds and never varies by quantisation.

§Adding a new architecture

  1. Add a new "arch_name" => { … } arm below.
  2. Add a corresponding unit test in the #[cfg(test)] block.
  3. No other file needs touching — all call sites use this function.
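The steps above might look roughly like this in practice. This is a sketch under assumptions, not the function's real body: `ArchCaps` stands in for `ModelCapabilities`, `REQUIRES_STRICT_TURNS` is a hypothetical flag name, `caps_for_arch` is an invented mirror of the registry, and `"example_arch"` is a made-up architecture name.

```rust
/// Stand-in for the real `ModelCapabilities`; layout and flag name are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ArchCaps(u32);

impl ArchCaps {
    /// Hypothetical flag: the model needs strictly alternating turns.
    pub const REQUIRES_STRICT_TURNS: ArchCaps = ArchCaps(1 << 0);

    /// No flags set: pass-through treatment.
    pub const fn empty() -> ArchCaps {
        ArchCaps(0)
    }
}

/// Hypothetical mirror of the registry's shape, not the real implementation.
pub fn caps_for_arch(arch: Option<&str>) -> ArchCaps {
    match arch {
        // Key absent from the GGUF: nothing to apply.
        None => ArchCaps::empty(),
        Some(name) => match name {
            // Existing arm: Mistral templates enforce strict alternation.
            "mistral" => ArchCaps::REQUIRES_STRICT_TURNS,
            // Step 1: the new arm ("example_arch" is a made-up name).
            "example_arch" => ArchCaps::REQUIRES_STRICT_TURNS,
            // Architectures without an entry get no extra constraints.
            _ => ArchCaps::empty(),
        },
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Step 2: one unit test covering the new arm.
    #[test]
    fn example_arch_requires_strict_turns() {
        assert_eq!(
            caps_for_arch(Some("example_arch")),
            ArchCaps::REQUIRES_STRICT_TURNS
        );
    }
}
```

Because every caller goes through the one function, the new arm and its test are the whole change, which is what step 3 relies on.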

§Arguments

  • arch — value of the general.architecture GGUF key (e.g. "mistral", "llama", "qwen2"). None means the key was absent; returns empty() so the model gets pass-through treatment.