feat(stats): introduce privacy-first gains tracking feature @hansipie#162

Merged
yoanbernabeu merged 12 commits into yoanbernabeu:main from hansipie:001-stats-gains
Mar 16, 2026

Conversation

@hansipie
Contributor

Summary

This PR introduces a privacy-first gains monitoring system for grepai. Every search and trace command now automatically records token usage metrics locally, and a new grepai stats command lets users visualize how much they save compared to a traditional grep-based workflow.


Motivation

Users had no visibility into the concrete value grepai brings over grep. This feature answers the question: "How many tokens (and dollars) have I actually saved by using grepai?"


What's new

New stats/ package

  • Entry — one record per search/trace command (timestamp, command type, output mode, result count, tokens consumed, grep-equivalent tokens)
  • Recorder — appends entries as NDJSON to .grepai/stats.json using a file lock (cross-platform: flock on Unix, LockFileEx on Windows)
  • ReadAll / Summarize / HistoryByDay — aggregation helpers
  • GrepEquivalentTokens(n) — estimates what grep would have cost: n × 512 tokens × 3 (expansion factor for context)
  • Cost estimation for cloud providers (OpenAI, OpenRouter) at $5/M tokens; nil for local providers (Ollama, LM Studio)
  • 13 unit tests with -race, all green
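The estimation rules in the list above can be sketched as follows. This is a minimal illustration, not the code from the stats/ package: the constant names and the CostSavedUSD helper are hypothetical, and the description does not say whether n counts queries or results. Passing 42 happens to reproduce the figures shown in the example output.

```go
package main

import "fmt"

// Values taken from the PR description; the names are hypothetical.
const (
	tokensPerUnit      = 512 // tokens assumed per unit of grep output
	expansionFactor    = 3   // expansion factor for context
	cloudUSDPerMillion = 5.0 // flat rate applied to cloud providers
)

// GrepEquivalentTokens estimates what a grep-based workflow would have cost.
func GrepEquivalentTokens(n int) int {
	return n * tokensPerUnit * expansionFactor
}

// CostSavedUSD returns nil for local providers, where no cost applies.
// (Hypothetical helper; the real signature is not shown in the PR text.)
func CostSavedUSD(tokensSaved int, cloud bool) *float64 {
	if !cloud {
		return nil
	}
	usd := float64(tokensSaved) / 1_000_000 * cloudUSDPerMillion
	return &usd
}

func main() {
	grepTokens := GrepEquivalentTokens(42)
	saved := grepTokens - 8320 // 8320 = tokens actually consumed in the example
	pct := float64(saved) / float64(grepTokens) * 100
	fmt.Printf("grep est. %d, saved %d (%.1f%%)\n", grepTokens, saved, pct)
	if usd := CostSavedUSD(saved, true); usd != nil {
		fmt.Printf("est. cost saved $%.4f\n", *usd)
	}
}
```

Returning a pointer is one way to represent "no cost" for local providers without overloading zero; the actual package may model this differently.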

New grepai stats command (cli/stats.go)

grepai stats [--json] [--history] [--limit N]
  • Human-readable output with lipgloss (rounded border, colors)
  • --json — structured JSON output for scripting/agents
  • --history — per-day breakdown table
  • --limit N — max days shown (default: 30)

Auto-recording in search and trace

Stats are recorded fire-and-forget in a goroutine with a 100 ms timeout — zero impact on search/trace latency. Output is captured to a string (instead of writing directly to stdout) to enable token counting without stdout hijacking.
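As a rough sketch of the fire-and-forget pattern described above, assuming the recorder accepts a context: recordAsync, writeWithin, and demo are hypothetical names used only for illustration, and the result channel exists here so the behavior can be observed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// recordAsync starts record in its own goroutine and returns immediately.
// The channel is buffered so the goroutine never blocks if nobody reads it.
func recordAsync(timeout time.Duration, record func(context.Context) error) <-chan error {
	done := make(chan error, 1)
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		done <- record(ctx)
	}()
	return done
}

// writeWithin simulates a stats write that takes d and aborts if ctx expires first.
func writeWithin(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo runs a fast and a slow write under the same 100 ms budget.
func demo() (fastErr, slowErr error) {
	fast := recordAsync(100*time.Millisecond, writeWithin(time.Millisecond))
	slow := recordAsync(100*time.Millisecond, writeWithin(time.Second))
	return <-fast, <-slow
}

func main() {
	fastErr, slowErr := demo()
	fmt.Println("fast write error:", fastErr)
	fmt.Println("slow write timed out:", errors.Is(slowErr, context.DeadlineExceeded))
}
```

A real caller would normally discard the channel. In a short-lived CLI process the goroutine may not finish before exit; how the PR handles that case is not stated here.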

MCP integration (mcp/server.go)

Stats are recorded for all MCP tool calls: grepai_search, grepai_trace_callers, grepai_trace_callees, grepai_trace_graph. A new grepai_stats MCP tool allows AI agents to query savings programmatically.


Example output

╭─────────────────────────────────────────────────────╮
│ grepai stats — Token Savings Report                 │
│                                                     │
│ Total queries      42                               │
│ Tokens (grepai)    8,320                            │
│ Tokens (grep est.) 64,512                           │
│ Tokens saved       56,192  ▲ 87.1%                  │
│ Est. cost saved    $0.2810  (cloud provider)        │
│                                                     │
│ By command:  search 38 · trace-callers 4            │
│ By mode:     compact 30 · full 12                   │
╰─────────────────────────────────────────────────────╯

Testing

make test        # all packages pass with -race
make build       # binary builds clean
grepai stats --help
grepai stats --json
grepai stats --history --limit 7

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

How Has This Been Tested?

  • Unit tests
  • Integration tests
  • Manual testing

Test Configuration:

  • OS: Linux, Windows
  • Go version:
  • Embedding provider:

Checklist

  • My code follows the project's code style
  • I have run golangci-lint run and fixed any issues
  • I have added tests that prove my fix/feature works
  • I have updated the documentation if needed
  • I have added an entry to CHANGELOG.md (if applicable)
  • My changes generate no new warnings
  • All new and existing tests pass

Screenshots (if applicable)

Additional Notes

config_test:
On Windows, creating symlinks requires elevated privileges or Developer
Mode. Skip the test gracefully on Windows; keep t.Fatalf on other platforms.

Introduce a privacy-first gains tracking feature that automatically
records token usage on every search/trace command and exposes a new
`grepai stats` command to visualize savings vs grep-based workflows.

All data is stored locally in .grepai/stats.json (NDJSON) and never
leaves the machine.

- New stats/ package: Entry, Recorder (flock-based append), ReadAll,
  Summarize, HistoryByDay, cross-platform file locking (Unix/Windows)
- New cli/stats.go: `grepai stats [--json] [--history] [--limit N]`
  with lipgloss human-readable output and JSON mode
- cli/search.go + cli/trace.go: fire-and-forget stats recording
  (100ms timeout goroutine, zero latency impact)
- mcp/server.go: stats recording on all MCP tool calls + new
  grepai_stats MCP tool for agent integration
- 13 unit tests, all passing with -race
@codecov

codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 38.65199% with 446 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.58%. Comparing base (a322537) to head (43b938c).
⚠️ Report is 95 commits behind head on main.

Files with missing lines    Patch %    Lines
cli/search.go               0.00%      114 Missing ⚠️
cli/stats.go                12.28%     99 Missing and 1 partial ⚠️
cli/status.go               0.00%      100 Missing ⚠️
mcp/server.go               31.16%     51 Missing and 2 partials ⚠️
cli/trace.go                9.43%      48 Missing ⚠️
cli/tui_stats.go            91.94%     9 Missing and 3 partials ⚠️
stats/recorder.go           66.66%     6 Missing and 4 partials ⚠️
stats/reader.go             90.27%     4 Missing and 3 partials ⚠️
cli/tui_runtime.go          0.00%      2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #162       +/-   ##
===========================================
+ Coverage   27.16%   46.58%   +19.41%     
===========================================
  Files          32       78       +46     
  Lines        3711    14680    +10969     
===========================================
+ Hits         1008     6838     +5830     
- Misses       2620     7153     +4533     
- Partials       83      689      +606     

@tinker495
Contributor

Thanks for the PR — I’m not a maintainer, but as a contributor/user I’m excited to see this merged.

Quick question/idea (feel free to ignore if it’s out of scope): since main recently introduced a Bubble Tea TUI for commands like status (#143, with a --no-ui fallback), would it make sense for grepai stats to follow the same pattern (or even surface stats inside the existing status UI) while keeping --json for scripting?

Totally fine if you’d rather keep this PR focused and tackle any TUI alignment as a follow-up.

The merge of main into 001-stats-gains produced a broken trace.go:
- outputJSON referenced undefined os.Stdout and buf.String()
- outputTOON and captureJSON were missing entirely

Restore the capture/output pattern from the stats-gains branch:
- captureJSON serializes to a buffer (testable, no stdout side-effect)
- outputJSON delegates to captureJSON
- outputTOON delegates to captureTOON
Introduce outputAndRecord() to centralize the repeated pattern
of capturing output and recording trace stats across the three
runTrace* functions (callers, callees, graph).

Restores the recordTraceStats calls lost during the merge of
main into 001-stats-gains.
@yoanbernabeu
Owner

Code Review — feat(stats): privacy-first gains tracking

Thanks for the excellent contribution @hansipie! 🎉 This is a well-thought-out feature with clean architecture, cross-platform support, and solid test coverage. Great work overall.


Issues to address before merge

1. Stats file location — stats.json placed in project root instead of .grepai/

StatsPath() in stats/stats.go joins projectRoot with StatsFileName directly:

func StatsPath(projectRoot string) string {
    return filepath.Join(projectRoot, StatsFileName)
}

This creates <project>/stats.json and <project>/stats.json.lock at the project root, which could be accidentally committed to git. It should be inside .grepai/ to match the existing convention (config.yaml, GOB store, etc.):

return filepath.Join(projectRoot, ".grepai", StatsFileName)
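Put together, the suggested fix might look like the sketch below. The value of StatsFileName and the body of LockPath are assumptions inferred from the file names mentioned in this review, and insideGrepaiDir is a hypothetical check added only for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// StatsFileName is assumed to be "stats.json", matching the file named in the review.
const StatsFileName = "stats.json"

// StatsPath places the stats file under .grepai/ instead of the project root.
func StatsPath(projectRoot string) string {
	return filepath.Join(projectRoot, ".grepai", StatsFileName)
}

// LockPath is an assumed implementation: the lock file sits next to the stats file.
func LockPath(projectRoot string) string {
	return StatsPath(projectRoot) + ".lock"
}

// insideGrepaiDir reports whether path's parent directory is named ".grepai".
func insideGrepaiDir(path string) bool {
	return filepath.Base(filepath.Dir(path)) == ".grepai"
}

func main() {
	root := filepath.Join("home", "me", "project")
	fmt.Println(StatsPath(root))
	fmt.Println(LockPath(root))
	fmt.Println(insideGrepaiDir(StatsPath(root)), insideGrepaiDir(LockPath(root)))
}
```

Deriving the lock path from StatsPath keeps both files in the same directory, so a single change to the location moves them together.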

2. Dead code _ = symStats in mcp/server.go

There's a leftover no-op after the stats → symStats rename in handleTraceGraph:

_ = symStats

This line should be removed.

3. Incomplete output capture in plain text mode (cli/search.go)

In runSearch, the plain text mode writes result headers directly to stdout (fmt.Printf("─── Result %d ...") while body lines go to buf (fmt.Fprintf(&buf, ...)). As a result, outputStr only contains partial output, making the token estimation inaccurate for this mode. All output should be captured through the buf builder before printing.
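One way to capture everything before printing is sketched below. The result type, its fields, and the exact header format are hypothetical stand-ins rather than the actual runSearch code; the point is that headers and body lines go through the same strings.Builder.

```go
package main

import (
	"fmt"
	"strings"
)

// result is a hypothetical stand-in for a search hit.
type result struct {
	Path    string
	Score   float64
	Snippet string
}

// renderResults writes headers and bodies to one builder, so the returned
// string is exactly what ends up on stdout.
func renderResults(results []result) string {
	var buf strings.Builder
	for i, r := range results {
		fmt.Fprintf(&buf, "─── Result %d (score: %.2f) ───\n", i+1, r.Score)
		fmt.Fprintf(&buf, "File: %s\n", r.Path)
		buf.WriteString(r.Snippet)
		buf.WriteString("\n")
	}
	return buf.String()
}

func main() {
	outputStr := renderResults([]result{
		{Path: "cli/search.go", Score: 0.91, Snippet: "func runSearch() error {"},
	})
	fmt.Print(outputStr) // single write to stdout
	// outputStr is now complete and could be handed to the stats recorder.
	fmt.Printf("captured %d bytes\n", len(outputStr))
}
```

Because printing happens once from the captured string, the token estimate and the visible output cannot drift apart.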

4. Workspace search not tracked

runWorkspaceSearch doesn't call recordSearchStats, so workspace searches are not recorded in the stats. This should be consistent with runSearch.

5. Minor: trimSuffix reimplements strings.TrimSuffix

The custom trimSuffix in cli/stats.go duplicates strings.TrimSuffix from the standard library.


On Bubble Tea TUI integration

Great suggestion from @tinker495 about aligning grepai stats with the Bubble Tea TUI pattern introduced in #143! I agree this is a good direction — surfacing stats in the existing status UI (or using the same Bubble Tea patterns) would provide a more consistent UX. That said, this can definitely be treated as a follow-up to keep this PR focused.


Summary

Really nice feature that adds tangible value for users. The issues above are fairly straightforward to fix. Looking forward to the updated version!

hansipie added 3 commits March 2, 2026 10:26
- Move stats.json and stats.json.lock from project root to .grepai/
- Auto-create .grepai/ directory in recorder if it doesn't exist
- Capture all text output in buf for accurate token estimation in runSearch
- Add recordSearchStats call to runWorkspaceSearch (was untracked)
- Remove dead code `_ = symStats` in mcp/server.go
- Replace local trimSuffix with strings.TrimSuffix in cli/stats.go
- Update stats_test.go to write test files in .grepai/ subdirectory
- Add --no-ui flag to stats command for plain text output
- Add viewTokenSavings state to status TUI with keyboard nav (s key)
- Add runStatsUI for interactive stats display
- Add tui_stats.go with stats TUI implementation
- Add tui_stats_test.go with tests
@hansipie
Contributor Author

hansipie commented Mar 2, 2026

Thanks for the thorough review! All 5 points have been addressed:

  1. Stats file location
    Fixed StatsPath() and LockPath() in stats/stats.go to store files under .grepai/ instead of the project root. .grepai/ is already in .gitignore so stats.json and stats.json.lock won't be accidentally committed.

  2. Dead code _ = symStats
    Removed from mcp/server.go. symStats is already consumed at line 1451 (symStats.TotalSymbols == 0), so Go doesn't complain.

  3. Incomplete output capture in plain text mode
    All fmt.Printf calls in the runSearch result loop (headers, file paths, feature/symbol lines) have been moved into buf, so outputStr now reflects the full output before being passed to recordSearchStats.

  4. Workspace search not tracked
    runWorkspaceSearch now builds its output through a strings.Builder (same pattern as runSearch) and calls recordSearchStats at the end — including the early-exit case when no results are found.

  5. trimSuffix reimplements strings.TrimSuffix
    Replaced both call sites with strings.TrimSuffix and deleted the custom helper.


Bonus — TUI implementation ... done!

@hansipie
Contributor Author

hansipie commented Mar 3, 2026

Hello
Less is sometimes better...
Perhaps I should remove the "stats" option now. What do you think?

@yoanbernabeu
Owner

Code Review — Updated Analysis

Great work addressing all 5 points from the previous review, and the TUI integration bonus is a nice touch!

I found one remaining issue:

Bug: runWorkspaceSearch does not record stats in JSON/TOON output modes

In cli/search.go, runWorkspaceSearch has 4 output branches but only 2 call recordSearchStats:

Output mode    runSearch      runWorkspaceSearch
JSON           ✅ Line 300    ❌ Missing (line 663: return nil without recording)
TOON           ✅ Line 317    ❌ Missing (line 679: return nil without recording)
No results     ✅ Line 323    ✅ Line 686
Plain text     ✅ Line 361    ✅ Line 725

Additionally, projectRoot is resolved at line 682, after the JSON/TOON branches (lines 650-679), so it's not available in those code paths. The fix is to move projectRoot, _ := config.FindProjectRoot() before the JSON output mode block, and add the recordSearchStats calls.
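The shape of the problem can be illustrated with a simplified sketch in which the project root is known before any output branch and every mode passes through the recorder. This is an alternative structure with a single exit point, not the fix as applied in the PR: call, render, and runSearchSketch are hypothetical names, and the "toon" branch is only a placeholder, not real TOON encoding.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// call captures what a stand-in recorder receives.
type call struct {
	Root, Mode, Output string
}

// render produces the full output for a mode without touching stdout.
func render(mode string, results []string) (string, error) {
	switch mode {
	case "json":
		b, err := json.Marshal(results)
		if err != nil {
			return "", err
		}
		return string(b) + "\n", nil
	case "toon":
		// Placeholder only; the real TOON format is not reproduced here.
		return fmt.Sprintf("results[%d]: %s\n", len(results), strings.Join(results, ",")), nil
	default:
		if len(results) == 0 {
			return "No results found.\n", nil
		}
		return strings.Join(results, "\n") + "\n", nil
	}
}

// runSearchSketch takes root up front, so it is available to every branch,
// and records stats once after rendering regardless of the output mode.
func runSearchSketch(root, mode string, results []string, record func(call)) error {
	out, err := render(mode, results)
	if err != nil {
		return err
	}
	fmt.Print(out)
	record(call{Root: root, Mode: mode, Output: out})
	return nil
}

func main() {
	record := func(c call) { fmt.Printf("recorded %s (%d bytes)\n", c.Mode, len(c.Output)) }
	for _, mode := range []string{"json", "toon", "plain"} {
		if err := runSearchSketch("/proj", mode, []string{"a.go", "b.go"}, record); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```

With one recording site, a newly added output mode cannot silently skip stats, which is the class of bug flagged in the table above.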

Everything else looks solid — the stats/ package architecture, cross-platform file locking, and Bubble Tea TUI are all well implemented.

hansipie added 3 commits March 9, 2026 09:14
Move projectRoot resolution before JSON/TOON branches so it
is available in all output paths. Add recordSearchStats calls
in the JSON and TOON branches of runWorkspaceSearch, which
were previously skipped.
@hansipie
Contributor Author

hansipie commented Mar 9, 2026

Hello

Indeed... it's done!

@yoanbernabeu
Owner

Thanks for this great contribution @hansipie! 🎉 The implementation is clean, well-tested, and integrates nicely across CLI, TUI, and MCP. Merging!

@yoanbernabeu yoanbernabeu merged commit feeb909 into yoanbernabeu:main Mar 16, 2026
10 checks passed
3 participants