Cross-turn “thinking debt” removal for chat history.
Small reasoning models (e.g. Qwen3.5-4B) have a strong tendency to
pattern-match their own previous <think> traces in the conversation
history and produce an unbounded thinking stream that never closes —
the model sees prior reasoning trails and tries to extend them. The
reference fix, mirroring OpenAI’s native behavior, is to drop reasoning
artifacts from prior assistant turns before the model sees them.
This module is the single source of truth for that scrub. Every
surface that builds a chat-completion request body must pipe its
messages array through strip_thinking_debt so the proxy, the
in-process agent loop (CLI / Tauri), and any future direct-mode
consumer all benefit equally.
The transform is:
- Unconditional — there is no per-model gate. Non-reasoning messages simply have nothing to strip and pass through untouched.
- Defensive — only assistant messages are touched; user, system, tool, and developer messages are never modified.
- Conservative on shape — the reasoning_content key is removed outright. String content has every <think>...</think> block excised. Non-string content (multi-part array form) is left alone.
- Forward-safe on unclosed tags — an unclosed <think> from the most recent turn is preserved verbatim; the upstream is responsible for closing it.
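The rules above can be illustrated with a minimal, std-only sketch. It is not the module's actual implementation: the Message struct and Content enum are hypothetical stand-ins for the JSON message objects, and the function signatures are assumptions chosen for illustration. Only the names strip_think_blocks and strip_thinking_debt come from this page. The sketch also does not trim whitespace left behind after a block is removed, which the real code may or may not do.

```rust
/// Hypothetical stand-in for a message's `content` field.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    /// Plain string content.
    Text(String),
    /// Multi-part array form, modelled here as a list of strings.
    Parts(Vec<String>),
}

/// Hypothetical stand-in for one chat message.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: Content,
    reasoning_content: Option<String>,
}

/// Remove every closed `<think>...</think>` block from `s`.
/// An unclosed `<think>` and everything after it is kept verbatim.
fn strip_think_blocks(s: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    let mut out = String::with_capacity(s.len());
    let mut rest = s;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Keep the text before the block, skip the block itself.
                out.push_str(&rest[..start]);
                rest = &after_open[end + CLOSE.len()..];
            }
            // Unclosed tag: stop scanning and keep the remainder as-is.
            None => break,
        }
    }
    out.push_str(rest);
    out
}

/// Strip reasoning artifacts from assistant messages only.
/// No per-model gate: other roles and multi-part content pass through.
fn strip_thinking_debt(messages: &mut [Message]) {
    for msg in messages.iter_mut().filter(|m| m.role == "assistant") {
        // The reasoning_content key is removed outright.
        msg.reasoning_content = None;
        // Only string content is scrubbed; array content is left alone.
        if let Content::Text(text) = &mut msg.content {
            let cleaned = strip_think_blocks(text.as_str());
            *text = cleaned;
        }
    }
}
```

Under these assumptions, a caller would run strip_thinking_debt over the messages array immediately before serializing the chat-completion request body, so every surface applies the same scrub.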
Functions
- strip_think_blocks 🔒: Remove every <think>...</think> block from s.
- strip_thinking_debt: Strip reasoning artifacts from prior assistant messages in messages.