Problem
Modern reasoning models (Qwen3.5, DeepSeek R1, Claude with extended thinking) emit <think>...</think> tags containing chain-of-thought reasoning. We need to:
- Parse and display reasoning in the chat widget — collapsible `<think>` blocks showing the model's thought process
- Strip `<think>` from the final response but preserve it in metadata for training capture
- Change "typing" → "thinking" everywhere — AIs don't have keyboards. The typing indicator should say "thinking" with a brain/thought animation instead of bouncing dots
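The parse-and-strip step above can be sketched with a regex. This is a minimal illustration, not code from the repo; the function name and return shape are assumptions:

```python
import re

# Hypothetical helper -- names are illustrative, not from the codebase.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw: str) -> tuple[str, list[str]]:
    """Return (visible_text, reasoning_blocks) from raw model output."""
    reasoning = THINK_RE.findall(raw)          # capture every reasoning block
    visible = THINK_RE.sub("", raw).strip()    # strip tags from displayed text
    return visible, reasoning

visible, reasoning = split_reasoning(
    "<think>User asked 2+2; trivial arithmetic.</think>The answer is 4."
)
# visible   -> "The answer is 4."
# reasoning -> ["User asked 2+2; trivial arithmetic."]
```

`re.DOTALL` matters here: reasoning blocks are usually multi-line, so `.` must match newlines.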
Current State
- Chat widget shows "typing..." with keyboard-style dots
- `<think>` tags in model output are displayed as raw text or cause parsing errors
- The multimodal processor in Qwen3.5 tries to interpret `<think>` content as image URLs (broke inference test on 5090)
- Persona telemetry shows "typing_start" / "typing_stop" events
What Needs to Change
Chat Widget
- Parse `<think>` blocks from message content
- Render as collapsible/expandable section (dimmed, smaller font, brain icon)
- Default collapsed — click to expand reasoning
- Keep final response prominent
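For the widget to render a collapsed reasoning block exactly where it occurred in the message, the content needs to be split into ordered segments rather than just stripped. A sketch of that parsing logic (segment keys and structure are assumptions, not the widget's actual model):

```python
import re

TOKEN_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def to_segments(content: str) -> list[dict]:
    """Split message content into ordered 'reasoning'/'text' segments,
    preserving position so collapsed blocks render inline."""
    segments, pos = [], 0
    for m in TOKEN_RE.finditer(content):
        if m.start() > pos:  # plain text before this reasoning block
            segments.append({"kind": "text", "body": content[pos:m.start()]})
        segments.append({"kind": "reasoning", "body": m.group(1), "collapsed": True})
        pos = m.end()
    if pos < len(content):   # trailing text after the last block
        segments.append({"kind": "text", "body": content[pos:]})
    return segments
```

The renderer can then map `"reasoning"` segments to the dimmed, collapsed-by-default section and `"text"` segments to the prominent final response.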
Typing → Thinking
- All UI: "typing..." → "thinking..."
- Animation: keyboard dots → brain pulse or thought bubble
- Events: TYPING_START → THINKING_START (or alias both)
- PersonaTile: thinking indicator with brain icon
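The "alias both" option can be a small normalization table so old emitters keep working while new code listens for the thinking names. Event constant values here are guesses based on the telemetry strings above:

```python
# Hypothetical event names -- the real constants live in persona telemetry.
TYPING_START, TYPING_STOP = "typing_start", "typing_stop"
THINKING_START, THINKING_STOP = "thinking_start", "thinking_stop"

# Alias table: legacy typing events map onto the new thinking events.
EVENT_ALIASES = {TYPING_START: THINKING_START, TYPING_STOP: THINKING_STOP}

def normalize_event(name: str) -> str:
    """Translate legacy event names; pass unknown events through unchanged."""
    return EVENT_ALIASES.get(name, name)
```

This lets the migration land incrementally: emitters switch to THINKING_* over time while consumers only ever see the new names.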
Model Output Processing
- Strip `<think>` from displayed message text
- Store full output (with `<think>`) in message metadata for training
- Training capture gets the reasoning traces — gold for distillation
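A minimal sketch of the strip-but-preserve split, assuming a dict-shaped message (field names are illustrative, not the actual schema):

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def build_message(raw_output: str) -> dict:
    """Displayed text is stripped of <think>; the full raw output stays in
    metadata so training capture can recover the reasoning trace."""
    return {
        "text": THINK_RE.sub("", raw_output).strip(),
        "metadata": {
            "raw_output": raw_output,                 # untouched, for training
            "reasoning": THINK_RE.findall(raw_output) # convenience extraction
        },
    }
```

Keeping `raw_output` verbatim (rather than only the extracted blocks) means the training pipeline can re-parse with whatever format the distillation tooling expects.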
Inference Pipeline
- Text-only inputs must bypass vision processor (Qwen3.5 bug)
- `<think>` content must not be fed to VisionDescriptionService
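The bypass can be a guard ahead of processor selection. This is a sketch under assumed shapes (messages as dicts with an optional "images" list; the real pipeline's field names and processor interfaces may differ):

```python
def needs_vision(messages: list[dict]) -> bool:
    """True only if some message actually carries image input."""
    return any(m.get("images") for m in messages)

def preprocess(messages: list[dict], vision_processor, text_processor):
    # Text-only turns, including ones containing <think> text, skip the
    # vision path entirely, so reasoning is never misread as image URLs.
    proc = vision_processor if needs_vision(messages) else text_processor
    return proc(messages)
```

Routing on actual image presence (not on string contents) is the point: scanning text for URL-like substrings is exactly what misfired on `<think>` content.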
Why This Matters
- Reasoning models are the future — every new model supports `<think>`
- Training on reasoning traces is how you distill intelligence (5090 tower: install Unsloth + verify MoE LoRA training works #430; MoE surgery: extract individual experts for targeted training + tiny deployment #439)
- "Thinking" is more honest and engaging than "typing" for AI
- Users want to see HOW the AI arrived at its answer
Related
- Evaluate Qwen3.5-35B-A3B as local inference model — Opus reasoning distilled, 3B active #417 (Qwen3.5 evaluation — hit this during inference test)
- 5090 tower: install Unsloth + verify MoE LoRA training works #430 (training — reasoning traces are training gold)
- MoE surgery: extract individual experts for targeted training + tiny deployment #439 (MoE surgery — code expert reasoning traces)