Module history


Cross-turn “thinking debt” removal for chat history.

Small reasoning models (e.g. Qwen3.5-4B) tend to pattern-match their own previous <think> traces in the conversation history: the model sees prior reasoning trails, tries to extend them, and produces an unbounded thinking stream that never closes. The reference fix, mirroring OpenAI’s native behavior, is to drop reasoning artifacts from prior assistant turns before the model sees them.

This module is the single source of truth for that scrub. Every surface that builds a chat-completion request body must pipe its messages array through strip_thinking_debt so the proxy, the in-process agent loop (CLI / Tauri), and any future direct-mode consumer all benefit equally.

The transform is:

  • Unconditional — there is no per-model gate. Non-reasoning messages simply have nothing to strip and pass through untouched.
  • Defensive — only assistant messages are touched; user, system, tool, and developer messages are never modified.
  • Conservative on shape — the reasoning_content key is removed outright. String content has every <think>...</think> block excised. Non-string content (multi-part array form) is left alone.
  • Forward-safe on unclosed tags — an unclosed <think> from the most recent turn is preserved verbatim; the upstream is responsible for closing it.
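The four properties above can be illustrated with a minimal sketch. The `Message` and `Content` types below are hypothetical stand-ins invented for this example, and the function signatures are assumptions; the real module may well operate on JSON values and differ in detail (for instance in how it treats whitespace left behind by a removed block, which this sketch does not trim).

```rust
/// Hypothetical message content: a plain string, or a stand-in for the
/// multi-part array form. Not the module's real types.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    Text(String),
    Parts(Vec<String>),
}

/// Hypothetical chat message shape used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: Content,
    reasoning_content: Option<String>,
}

/// Remove every closed `<think>...</think>` block from `s`.
/// An unclosed `<think>` and everything after it is kept verbatim.
fn strip_think_blocks(s: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    let mut out = String::with_capacity(s.len());
    let mut rest = s;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Keep the text before the block, skip the block itself.
                out.push_str(&rest[..start]);
                rest = &after_open[end + CLOSE.len()..];
            }
            // Unclosed tag: stop scanning and keep the remainder as-is.
            None => break,
        }
    }
    out.push_str(rest);
    out
}

/// Strip reasoning artifacts from assistant messages, in place.
/// No per-model gate: messages without artifacts pass through unchanged.
fn strip_thinking_debt(messages: &mut [Message]) {
    // Only assistant messages are touched; all other roles are skipped.
    for msg in messages.iter_mut().filter(|m| m.role == "assistant") {
        // The reasoning_content key is removed outright.
        msg.reasoning_content = None;
        // String content is scrubbed; multi-part content is left alone.
        if let Content::Text(text) = &mut msg.content {
            let cleaned = strip_think_blocks(text.as_str());
            *text = cleaned;
        }
    }
}
```

Under these assumptions, an assistant message with content `<think>plan</think>Hello` comes out as `Hello` with no `reasoning_content`, while a user message containing the same tags is left unchanged.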

Functions

strip_think_blocks 🔒
Remove every <think>...</think> block from s.
strip_thinking_debt
Strip reasoning artifacts from prior assistant messages in messages.