Conversation replay loop after high-volume tool operations (F025 sweep) #179
Description
Problem
After running the F025 retrieval sweep (350 API calls via run_python), Nous re-executes the entire sweep from scratch when asked simple follow-up questions like "push raw data to git." It replays completed steps, re-checks APIs, re-builds scripts, and re-runs queries instead of recognizing the work is done.
Observed Behavior
- Tim asks Nous to push results to git
- Instead of running `git add && git push`, Nous starts: "Let me check the repo state and then run the full sweep"
- It re-verifies the API, re-examines the SQL, and rebuilds the sweep script
- Runs another 350 queries
- Reports results again
Root Cause Analysis
The conversation history contains hundreds of near-identical tool call/response pairs from the sweep. The model pattern-matches on this dominant context and continues the "sweep" behavior instead of responding to the new message.
Why compaction doesn't save us:
- Compaction threshold: 60% of context window (~120K tokens for 200K window)
- Tool pruning tiers: soft-trim at age 3, metadata-degrade at age 8, hard-clear at age 12
- `run_python` profile: "standard" (ages 3/8/12)
The problem: 350 tool calls happen within a SINGLE run_python execution. From the pruning system's perspective, that's ONE tool result, not 350. The entire sweep output (1.8MB of JSON) sits in one tool result block. Tool pruning operates on individual tool_result messages, not on the size of individual results.
Even after soft-trimming (keeping first 1500 + last 1500 chars), the conversation still has:
- The full sweep script in a `run_python` tool_use block
- A trimmed but still-present result showing it was a sweep
- All the assistant reasoning around it ("Let me run 350 queries...")
The assistant messages describing the sweep plan and methodology are never pruned (pruning only touches tool results). So the model sees "here's how to run a sweep" instructions in its own prior messages and follows them again.
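As a concrete illustration of why the soft-trim leaves the sweep recognizable, here is a minimal sketch of the first-1500/last-1500 trim described above. The helper name and the trim marker are hypothetical, not taken from the actual pruning code:

```python
def soft_trim(text: str, keep: int = 1500) -> str:
    """Soft-trim a tool result: keep the first and last `keep` characters.

    A 1.8MB sweep result passed through this still begins with the sweep
    script's output header and ends with its summary, so the model can
    still pattern-match it as "a sweep".
    """
    if len(text) <= 2 * keep:
        return text  # small results pass through untouched
    omitted = len(text) - 2 * keep
    return text[:keep] + f"\n...[{omitted} chars trimmed]...\n" + text[-keep:]
```

Note that the trim is per tool_result block, so one giant result shrinks but never disappears, and nothing here touches the assistant messages around it.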
Proposed Fixes
1. Repetitive operation detection (new)
Detect when the model is re-executing a pattern that already exists in conversation history:
- Track tool call signatures (name + key args hash)
- If the same signature was used >N times in recent history, inject a system hint: "This operation was already completed. Results are at [location]. Proceed with the user's current request."
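A minimal sketch of the signature tracking described above, assuming a hash over the tool name plus a stable subset of its arguments. All names here (`call_signature`, `RepetitionDetector`, the `key_fields` tuple) are hypothetical, not existing APIs:

```python
import hashlib
import json
from collections import Counter

def call_signature(name, args, key_fields=("query", "endpoint")):
    """Hash the tool name plus a stable subset of its key arguments."""
    key_args = {k: args[k] for k in key_fields if k in args}
    payload = json.dumps([name, key_args], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:16]

class RepetitionDetector:
    """Count repeated tool-call signatures and emit a hint past a threshold."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()

    def observe(self, name, args):
        """Record a call; return a system hint once it repeats too often."""
        sig = call_signature(name, args)
        self.counts[sig] += 1
        if self.counts[sig] > self.threshold:
            return ("This operation was already completed. "
                    "Proceed with the user's current request.")
        return None
```

In practice the counter would be scoped to "recent history" (a sliding window) rather than the whole session, and the hint would include the actual result location.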
2. Aggressive pruning for bulk operations
- Add a `bulk` or `sweep` decay profile: (1, 2, 4), to aggressively clear repetitive operations
- Auto-detect bulk patterns: same tool called >10 times with similar args
- Summarize the entire sequence into one line: "[Ran 350 search queries across 7 weight ratios. Results saved to docs/F025-sweep-raw-results.json]"
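A sketch of the auto-detect-and-collapse step, under the assumption that history is available as a flat list of (tool name, result) pairs; the function name, the history shape, and the summary wording are all illustrative:

```python
from collections import defaultdict

BULK_THRESHOLD = 10  # cutoff from the proposal: same tool called >10 times

def collapse_bulk_calls(history):
    """Collapse runs of >BULK_THRESHOLD calls to the same tool into one line.

    `history` is a list of (tool_name, result) pairs (assumed shape).
    A real version would also compare argument similarity, not just names.
    """
    counts = defaultdict(int)
    for name, _ in history:
        counts[name] += 1

    out, summarized = [], set()
    for name, result in history:
        if counts[name] > BULK_THRESHOLD:
            if name not in summarized:  # emit one summary per bulk tool
                summarized.add(name)
                out.append((name, f"[Ran {counts[name]} {name} calls; "
                                  f"results saved]"))
        else:
            out.append((name, result))
    return out
```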
3. Task completion markers
- After large multi-tool operations, explicitly inject a completion marker into the conversation: "TASK COMPLETE: F025 sweep finished. 350 queries, results saved."
- The model can use this as a boundary to avoid replaying
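The marker itself can be as simple as one injected system-role message; the message schema below is an assumption, not the actual conversation format:

```python
def completion_marker(task: str, stats: str) -> dict:
    """Build a system-role message marking a bulk task as done (assumed schema).

    Injected once after the final tool result of a large operation, it gives
    the model an explicit boundary: everything before it is finished work.
    """
    return {"role": "system",
            "content": f"TASK COMPLETE: {task}. {stats}"}
```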
4. Assistant message pruning (careful)
- Currently only tool results are pruned. Old assistant messages describing completed plans persist forever.
- Consider summarizing old assistant planning messages alongside their tool results
- Risk: losing important context. Needs careful design.
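One conservative shape this could take: stub out only assistant messages older than the hard-clear age, keeping their first line as a breadcrumb. Everything here (the message schema, the age metric, the stub format) is a hypothetical sketch, not a design decision:

```python
def summarize_old_plans(messages, age_cutoff=12):
    """Replace old assistant messages with one-line stubs (sketch).

    `messages` is a list of {"role": ..., "content": ...} dicts (assumed
    schema); "age" is simply distance from the end of the conversation,
    mirroring the age-based tiers used for tool-result pruning.
    """
    out = []
    for i, msg in enumerate(messages):
        age = len(messages) - 1 - i
        if msg["role"] == "assistant" and age > age_cutoff:
            first_line = msg["content"].splitlines()[0][:80]
            out.append({"role": "assistant",
                        "content": f"[plan summarized: {first_line}...]"})
        else:
            out.append(msg)
    return out
```

Keeping the first line preserves a trace of what was planned while removing the step-by-step methodology the model would otherwise replay.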
5. Compaction awareness of repetitive content
- When `should_compact` triggers, the summarizer should detect repetitive patterns and compress them aggressively
- "350 similar API calls" should become one sentence in the summary, not 350 entries
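A sketch of repetition-aware compression at summarization time: normalize away the parts that vary (numbers) so near-identical entries group together, then emit one line per group. The function name and the repetition cutoff are assumptions:

```python
import re
from collections import Counter

def compress_repetitive(entries, min_repeats=5):
    """Collapse near-identical summary entries before compaction.

    Numbers are normalized to 'N' so "query 17 returned 200" and
    "query 18 returned 200" count as the same pattern; each pattern seen
    at least `min_repeats` times becomes a single summary line.
    """
    normalized = [re.sub(r"\d+", "N", e) for e in entries]
    counts = Counter(normalized)

    out, emitted = [], set()
    for entry, norm in zip(entries, normalized):
        if counts[norm] >= min_repeats:
            if norm not in emitted:  # one line per repeated pattern
                emitted.add(norm)
                out.append(f"{counts[norm]} similar entries: {entry}")
        else:
            out.append(entry)
    return out
```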
Immediate Workaround
Starting a fresh session with `/new` clears the issue.
Priority
P1 — this makes Nous unusable after any bulk operation until the session is reset.
— ⚡ Emerson