pub fn capabilities_from_architecture(arch: Option<&str>) -> ModelCapabilities
Map a GGUF general.architecture value to its inherent ModelCapabilities.
This is the single source of truth for architecture-level behavioural
constraints that apply to the request side (message preprocessing).
It is consulted during model registration alongside
infer_from_chat_template — the two results are OR-ed together so that
either signal is sufficient.
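The OR-combination described above can be sketched as follows. This is a minimal illustration, not the crate's code: `Caps` is a hypothetical stand-in for `ModelCapabilities`, the flag names `STRICT_TURNS` and `TOOL_CALLS` are invented for the example, and `merge_layers` is an assumed helper name.

```rust
use std::ops::BitOr;

/// Hypothetical stand-in for the real `ModelCapabilities` flag set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Caps(u8);

impl Caps {
    // Illustrative flag names; the crate's actual constants may differ.
    pub const STRICT_TURNS: Caps = Caps(0b01);
    pub const TOOL_CALLS: Caps = Caps(0b10);

    /// No constraints: the model gets pass-through treatment.
    pub const fn empty() -> Caps {
        Caps(0)
    }

    /// True when every bit set in `other` is also set in `self`.
    pub const fn contains(self, other: Caps) -> bool {
        (self.0 & other.0) == other.0
    }
}

impl BitOr for Caps {
    type Output = Caps;

    fn bitor(self, rhs: Caps) -> Caps {
        Caps(self.0 | rhs.0)
    }
}

/// Combine the two detection layers. A flag set by either layer survives,
/// so a stripped chat template (empty result) cannot erase an
/// architecture-level constraint, and vice versa.
pub fn merge_layers(from_template: Caps, from_arch: Caps) -> Caps {
    from_template | from_arch
}
```

With this shape, `merge_layers(Caps::empty(), Caps::STRICT_TURNS)` still carries the strict-turn flag, which is the case where the template layer produced no signal.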
§Scope: request preprocessing only
This registry governs ModelCapabilities flags (strict-turn coalescing,
system-role conversion, etc.). It does not handle response-stream
dialect normalization — that is a separate concern handled by the
GgufCapabilities.extensions → format:* tag → get_parser() pipeline
in gglib-core::normalize::registry.
For example:
- Qwen tool-call XML normalization already flows through
detect_tool_support() → extensions.insert("format:qwen-xml") → to_tags() → get_parser() → QwenXmlParser.
Qwen’s chat template always contains <tool_call> patterns, so
infer_from_chat_template (Layer 1) sets SUPPORTS_TOOL_CALLS reliably.
No architecture entry is needed here for Qwen.
- Mistral does need an entry: its templates enforce strict alternation,
but many quantised builds ship with the tokenizer section stripped, so
the template layer produces no signal.
general.architecture = "mistral" is always present and provides the necessary backstop.
§Rationale
Some models ship without a parseable tokenizer.chat_template in the GGUF
(stripped quantisation builds, partial uploads). The chat-template layer
then returns ModelCapabilities::empty(), silently leaving constraints
unapplied. Reading general.architecture from the GGUF gives us a
ground-truth signal that is always present and never varies by quantisation.
§Adding a new architecture
- Add a new "arch_name" => { … } arm below.
- Add a corresponding unit test in the #[cfg(test)] block.
- No other file needs touching: all call sites use this function.
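The steps above might look roughly like this. The sketch makes several assumptions: `ModelCapabilities` is reduced to a hypothetical one-flag stand-in, `REQUIRES_STRICT_TURNS` is an invented flag name, `"example_arch"` is a made-up architecture, and returning `empty()` for unrecognised architectures is assumed rather than confirmed by the text.

```rust
/// Hypothetical stand-in for the real `ModelCapabilities` flag set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ModelCapabilities(u8);

impl ModelCapabilities {
    // Illustrative flag name; the crate's real constant may differ.
    pub const REQUIRES_STRICT_TURNS: Self = Self(0b01);

    pub const fn empty() -> Self {
        Self(0)
    }
}

pub fn capabilities_from_architecture(arch: Option<&str>) -> ModelCapabilities {
    // A missing `general.architecture` key means pass-through treatment.
    let Some(name) = arch else {
        return ModelCapabilities::empty();
    };
    match name {
        // Entry described in the docs: Mistral templates enforce strict alternation.
        "mistral" => { ModelCapabilities::REQUIRES_STRICT_TURNS }
        // Step 1: the new arm. "example_arch" is a made-up name.
        "example_arch" => { ModelCapabilities::REQUIRES_STRICT_TURNS }
        // Assumption: unrecognised architectures carry no inherent constraints.
        _ => ModelCapabilities::empty(),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Step 2: the unit test that accompanies the new arm.
    #[test]
    fn example_arch_requires_strict_turns() {
        assert_eq!(
            capabilities_from_architecture(Some("example_arch")),
            ModelCapabilities::REQUIRES_STRICT_TURNS
        );
    }
}
```

Keeping the mapping in one `match` is what makes the third step hold: callers depend only on the function signature, so a new arm changes behaviour without touching them.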
§Arguments
arch: value of the general.architecture GGUF key (e.g. "mistral", "llama", "qwen2"). None means the key was absent; the function returns empty() so the model gets pass-through treatment.
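Because the parameter is an Option<&str>, callers that hold owned strings need to borrow before calling. The sketch below shows one way to do that, assuming the GGUF metadata has already been parsed into a `HashMap<String, String>`; that map shape and the helper name `architecture_of` are hypothetical and not taken from the crate.

```rust
use std::collections::HashMap;

/// Look up `general.architecture` in an already-parsed metadata map.
/// Returns `None` when the key is absent, which is exactly the value the
/// function documents as "pass-through treatment".
fn architecture_of(metadata: &HashMap<String, String>) -> Option<&str> {
    metadata.get("general.architecture").map(String::as_str)
}

// A caller would then pass the result straight through, for example:
//
//     let caps = capabilities_from_architecture(architecture_of(&metadata));
//
// If the value is stored as an owned `Option<String>` instead, the standard
// `Option::as_deref` method yields the same `Option<&str>`:
//
//     let caps = capabilities_from_architecture(stored_arch.as_deref());
```

Both forms avoid cloning the string and keep the absent-key case as a plain `None` rather than an empty string.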