
Lurker

  ██████╗ ███████╗██████╗ ██████╗ ██╗████████╗
  ██╔══██╗██╔════╝██╔══██╗██╔══██╗██║╚══██╔══╝
  ██████╔╝█████╗  ██║  ██║██║  ██║██║   ██║
  ██╔══██╗██╔══╝  ██║  ██║██║  ██║██║   ██║
  ██║  ██║███████╗██████╔╝██████╔╝██║   ██║
  ╚═╝  ╚═╝╚══════╝╚═════╝ ╚═════╝ ╚═╝   ╚═╝
  ██╗     ██╗   ██╗██████╗ ██╗  ██╗███████╗██████╗
  ██║     ██║   ██║██╔══██╗██║ ██╔╝██╔════╝██╔══██╗
  ██║     ██║   ██║██████╔╝█████╔╝ █████╗  ██████╔╝
  ██║     ██║   ██║██╔══██╗██╔═██╗ ██╔══╝  ██╔══██╗
  ███████╗╚██████╔╝██║  ██║██║  ██╗███████╗██║  ██║
  ╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝


Every comment. Every reply. 94% fewer tokens.

An 800-comment Reddit thread costs ~120K tokens as raw JSON. Lurk delivers the same thread — full depth, every expanded reply — in a fraction of that.

Most Reddit tools fetch top-level comments and stop. The useful stuff is buried 4-5 replies deep. Lurk expands every collapsed branch, resolves every "+N more replies" placeholder, and reconstructs the full comment tree, then compresses it into compact tab-delimited notation before it reaches your model.

Post: "Finally We have the best agentic AI at home"
 +-- Comment (180 pts)
 |   +-- Reply (46 pts)              <-- most tools stop here
 |   |   +-- Reply (34 pts)
 |   |       +-- Reply (29 pts)
 |   |           +-- Reply (8 pts)
 |   |               +-- Reply (20 pts)
 |   |                   +-- Reply (2 pts)
 |   |                       +-- Reply (4 pts)
 |   |                           +-- Reply (1 pt)
 |   |                               +-- Reply (2 pts)  <-- lurk gets all of it
 +-- Comment (82 pts)
 |   +-- Reply (45 pts)              <-- lurk gets all of this too
 |       +-- Reply ...
 +-- Comment (60 pts)
     +-- +47 more replies (expanded)  <-- and this

104 of 109 comments. 10 levels deep. Fully automatic.

How Token Savings Work

The Go binary preprocesses everything before tokens reach your model:

  1. Fetch — Hits Reddit's JSON endpoints, recursively expands every collapsed more placeholder
  2. Extract — Drops the 50+ unused fields per comment (gildings, awards, flair, metadata), keeping the 5-6 that matter
  3. Compress — Formats into compact tab-delimited notation: d0 180 Recent-Success-1520 If you can host Kimi 2.5...
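The extract and compress steps can be sketched in a few lines of Python. This is illustrative only: `body`, `score`, `author`, and `replies` are Reddit's real `t1` comment fields, but the walker is a simplification of what the Go binary does.

```python
def compress(comment: dict, depth: int = 0) -> list[str]:
    """Flatten one comment subtree into d<depth><TAB>score<TAB>author<TAB>body lines."""
    data = comment["data"]
    body = " ".join(data["body"].split())          # collapse internal whitespace
    lines = [f"d{depth}\t{data['score']}\t{data['author']}\t{body}"]
    replies = data.get("replies") or {}            # leaf comments carry replies == ""
    for child in replies.get("data", {}).get("children", []):
        if child.get("kind") == "t1":              # "more" placeholders are expanded earlier
            lines.extend(compress(child, depth + 1))
    return lines
```

Everything a model needs per comment fits on one line; everything else is dropped before tokenization.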

The result (benchmarked across 12 threads, 452 comments, 6 subreddits):

| Format | Total tokens | vs JSON | vs Markdown |
|---|---|---|---|
| Raw Reddit JSON | 286,425 | | |
| Markdown | 28,993 | -90% | |
| Lurk (compact) | 16,186 | -94% | -44% |

94% fewer tokens than JSON. 44% fewer than Markdown. Savings scale with thread depth: shallow quip threads save ~10-25% vs Markdown, while deep technical threads save 50-64%.
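The headline percentages follow directly from the benchmark totals above; the arithmetic is simple enough to check by hand:

```python
# Reproducing the headline savings from the benchmark totals.
json_tokens, md_tokens, lurk_tokens = 286_425, 28_993, 16_186

vs_json = 1 - lurk_tokens / json_tokens
vs_md = 1 - lurk_tokens / md_tokens

print(f"{vs_json:.0%} fewer than JSON")      # 94% fewer than JSON
print(f"{vs_md:.0%} fewer than Markdown")    # 44% fewer than Markdown
```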

Smart Comment Limiting

Threads with 200+ comments get a preview first instead of dumping everything:

#post   r/ClaudeAI   u/poster   422pts   93%   805cmt   2026-01-28
Finally We have the best agentic AI at home

#comments   461
d0  180  Recent-Success-1520  If you can host Kimi 2.5...
...

#warning   805 total comments, showing 461. Use limit=N for top N by score, or limit=0 for all (~31K tokens).

Claude sees the warning and decides whether to fetch everything or grab the top 50 by score. No surprise 31K-token dumps.

What You Get

  • Full comment trees at any depth — every collapsed branch expanded
  • 94% fewer tokens than JSON, 44% fewer than Markdown
  • Smart limiting — large threads preview first, expand on demand
  • Adaptive caching — new feeds: 2min, hot: 5min, threads: 10min, top: 30min, 50MB LRU cap
  • Multi-subreddit search — comma-separated subs, parallel fetch, deduped results
  • OAuth support — optional lurk auth for 6x rate limits (60 req/min vs 10)
  • All URL formats — reddit.com, old.reddit.com, new.reddit.com, np.reddit.com, redd.it short links, m.reddit.com, amp.reddit.com
  • Cross-posts traced to the original
  • Galleries, video, media URLs extracted as clickable links
  • Single Go binary, zero runtime dependencies
  • Read-only by design — no write operations, no account access

Install

One-Line Install (Recommended)

Linux / macOS:

curl -fsSL https://raw.githubusercontent.com/ProgenyAlpha/reddit-lurker/master/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/ProgenyAlpha/reddit-lurker/master/install.ps1 | iex

Downloads the binary for your platform and walks you through editor setup. Supports Claude Code, Cursor, Windsurf, VS Code (Copilot), Cline, and Zed.

Homebrew

brew install ProgenyAlpha/tap/lurk

npx

npx reddit-lurker

Convenient if you already have Node/npm. Also available on Smithery.

Go Install

go install github.com/ProgenyAlpha/reddit-lurker@latest

Builds from source. Requires Go 1.24+. Run ./install.sh afterward for editor configuration.

From Source

git clone https://github.com/ProgenyAlpha/reddit-lurker.git
cd reddit-lurker
./install.sh

The installer walks you through editor selection and integration mode.

Supported Editors

| Editor | Config location | MCP key |
|---|---|---|
| Claude Code | ~/.claude.json or ~/.claude/skills/reddit/ | mcpServers |
| Cursor | ~/.cursor/mcp.json | mcpServers |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | mcpServers |
| VS Code (Copilot) | ~/.config/Code/User/mcp.json (Linux) / ~/Library/.../Code/User/mcp.json (macOS) | servers |
| Cline | VS Code globalStorage (auto-detected) | mcpServers |
| Zed | ~/.config/zed/settings.json | context_servers |

Claude Code also supports a Skill mode (~20 tokens overhead vs ~438 for MCP). The installer will ask which you prefer.

Manual Configuration

Add lurk to your editor's MCP config:

Claude Code, Cursor, Windsurf, Cline:

{
  "mcpServers": {
    "lurk": {
      "command": "lurk",
      "args": ["serve"]
    }
  }
}

GitHub Copilot (VS Code):

{
  "servers": {
    "lurk": {
      "command": "lurk",
      "args": ["serve"]
    }
  }
}

Zed:

{
  "context_servers": {
    "lurk": {
      "command": "lurk",
      "args": ["serve"]
    }
  }
}

OAuth (Optional)

Lurk works without any authentication. But if you want 6x the rate limit (60 req/min instead of 10):

lurk auth

This opens Reddit's app creation page, walks you through the 5-minute setup, tests your credentials, and saves them. One-time process. Lurk handles token refresh automatically.

lurk auth --status   # Check if credentials are configured
lurk auth --clear    # Remove saved credentials

You can also set credentials via environment variables in your MCP config:

{
  "mcpServers": {
    "lurk": {
      "command": "npx",
      "args": ["-y", "reddit-lurker"],
      "env": {
        "LURK_CLIENT_ID": "your_client_id",
        "LURK_CLIENT_SECRET": "your_client_secret"
      }
    }
  }
}

Or via a credentials file at ~/.config/lurk/credentials.json:

{"client_id": "your_client_id", "client_secret": "your_client_secret"}

Usage

Just talk to Claude naturally:

  • "Read this thread" + paste a Reddit URL
  • "What's trending on r/ClaudeAI?"
  • "Search r/selfhosted,r/homelab for 'ZFS backup'"
  • "What has u/spez been up to?"

Claude handles the rest. No commands to memorize.

Real Example

Here's what lurk actually outputs for a 109-comment r/LocalLLM thread about running Kimi K2.5 at home.

What most tools give your LLM:

u/Recent-Success-1520 (180 pts)
  If you can host Kimi 2.5 1T+ model at home then it tells
  me you have a really big home

u/No_Conversation9561 (82 pts)
  not in my home

u/rookan (60 pts)
  yeah, my 16GB VRAM card can easily handle it /s

... 12 top-level comments, no replies

What lurk gives your LLM:

#post	r/LocalLLM	u/moks4tda	422pts	93%	109cmt	2026-01-28
Finally We have the best agentic AI at home

#comments	104
d0	180	Recent-Success-1520	If you can host Kimi 2.5 1T+ model at home...
d1	46	HenkPoley	Apparently it's a native 4 bit weights. So "only" 640 GB needed...
d2	34	TechnicalGeologist99	Sorry...you're going to run that model on RAM?
d3	29	HenkPoley	24 tokens per second on 2x 512GB Max Studio M3 Ultra
d4	8	doradus_novae	See you tomorrow when it answers your question
d5	20	Scrubbingbubblz	You are over exaggerating. 24 tokens per second...
d6	2	Infinite100p	But what is the prompt processing speed?
d7	4	Miserable-Dare5090	It's GPU inference, on two m3 ultras over TB5...
d8	1	Infinite100p	How?
d9	2	Eastern-Group-1993	Via usb-c networking, RDMA.
d0	82	No_Conversation9561	not in my home
d1	45	gonxot	[image] Maybe it's the same guy lol
d0	60	rookan	yeah, my 16GB VRAM card can easily handle it /s
d0	27	keypa_	"at home" we probably don't have the same home...
...

104 of 109 comments. 10 levels deep. ~3,050 tokens. The 5 missing are deleted comments Reddit still counts but no longer serves.

Benchmarks

Real numbers from live Reddit threads:

| Thread | Comments | JSON tokens | MD tokens | Lurk tokens | vs JSON | vs MD |
|---|---|---|---|---|---|---|
| r/ClaudeAI (32c) | 32 | 18,721 | 1,604 | 1,206 | -94% | -25% |
| r/homelab (32c) | 32 | 19,991 | 1,318 | 961 | -95% | -27% |
| r/linux (32c) | 32 | 21,746 | 3,081 | 1,416 | -93% | -54% |
| r/selfhosted (34c) | 34 | 19,874 | 1,949 | 1,288 | -94% | -34% |
| r/ClaudeAI (36c) | 35 | 20,936 | 2,160 | 1,479 | -93% | -32% |
| r/LocalLLaMA (36c) | 36 | 19,901 | 1,243 | 1,115 | -94% | -10% |
| r/selfhosted (37c) | 36 | 20,333 | 1,454 | 1,140 | -94% | -22% |
| r/selfhosted (40c) | 40 | 22,308 | 1,227 | 1,068 | -95% | -13% |
| r/ClaudeAI (43c) | 42 | 29,426 | 4,065 | 1,562 | -95% | -62% |
| r/LocalLLaMA (44c) | 44 | 25,797 | 1,922 | 1,302 | -95% | -32% |
| r/LocalLLaMA (45c) | 43 | 35,066 | 4,245 | 1,519 | -96% | -64% |
| r/LocalLLaMA (48c) | 46 | 32,326 | 4,725 | 2,130 | -93% | -55% |
| **Total** | 452 | 286,425 | 28,993 | 16,186 | -94% | -44% |

Markdown savings vs JSON vary by thread verbosity. Lurk's compact notation consistently saves 93-96% vs JSON and 10-64% vs Markdown, with deeper technical threads showing the largest gains.

Updates

Lurk checks for new versions once every 24 hours (background, non-blocking, 3-second timeout). If a newer release exists, you'll see a one-line notice after your command finishes.

lurk update              # Download and install latest
lurk update --check      # Check only, don't install

If you installed via npm, brew, or go install, lurk update will detect that and tell you to use your package manager instead.

Disable Update Checks

# Option 1: environment variable
export LURK_NO_UPDATE_CHECK=1

# Option 2: config file
mkdir -p ~/.config/lurk && echo "disabled" > ~/.config/lurk/no-update-check

Error Handling

Lurk fails cleanly with a reason, never with raw HTTP dumps or stack traces:

| Scenario | Error message |
|---|---|
| Deleted/nonexistent thread | not found — check the URL or subreddit name |
| Private/quarantined subreddit | access denied — subreddit may be private or quarantined |
| Malformed URL | not a valid thread URL — expected reddit.com/r/sub/comments/id/title |
| Reddit is down | Reddit server error (HTTP 5xx) — Reddit may be down |
| Rate limited | rate limited — too many requests, try again shortly |
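The status-to-message mapping above might look like this (a sketch, not the binary's actual internals; `friendly_error` is a hypothetical name and the messages are copied from the table):

```python
def friendly_error(status: int) -> str:
    """Map an HTTP status to a one-line, human-readable failure reason."""
    if status == 404:
        return "not found — check the URL or subreddit name"
    if status == 403:
        return "access denied — subreddit may be private or quarantined"
    if status == 429:
        return "rate limited — too many requests, try again shortly"
    if 500 <= status < 600:
        return f"Reddit server error (HTTP {status}) — Reddit may be down"
    return f"unexpected response (HTTP {status})"
```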

Limitations

  • Public content only (without OAuth). Private and quarantined subreddits require lurk auth.
  • Deleted comments are ghosts. Reddit counts them in the total but no longer serves content. A "1,092-comment" thread may only have ~805 live comments.
  • No media download. Media URLs (images, video, galleries) are extracted as clickable links — not downloaded or embedded.

Reference

Details for the curious.

CLI Commands

lurk thread "https://reddit.com/r/ClaudeAI/comments/..."   # Full thread + comments
lurk subreddit ClaudeAI --sort top --time week --limit 10   # Browse
lurk search "prompt engineering" --sub ClaudeAI --limit 5   # Search
lurk search "ZFS" --sub selfhosted,homelab,datahoarder      # Multi-sub search
lurk user spez --limit 5                                    # User activity
lurk subreddit ClaudeAI --info                              # Subreddit metadata
lurk auth                                                   # OAuth setup
lurk update                                                 # Self-update

Flags

| Flag | What it does | Works with |
|---|---|---|
| `--sort` | hot, new, top, rising, controversial, relevance, comments | subreddit, search |
| `--limit` | Max results (default 25) | subreddit, search, user |
| `--time` | hour, day, week, month, year, all | subreddit, search |
| `--sub` | Restrict search to subreddit(s); comma-separated for multi-sub search | search |
| `--after` | Pagination token for next page | subreddit, search (single-sub only) |
| `--info` | Subreddit metadata instead of posts | subreddit |
| `--json` | Raw JSON output | all |
| `--compact` | Compact notation (default in MCP mode) | all |
| `--no-cache` | Skip cache | all |

MCP Tools

| Tool | Purpose |
|---|---|
| `lurk` | Read threads, browse subreddits, search posts, view user activity |
| `lurk_info` | Get subreddit metadata (subscribers, active users, description) |

Understanding Skill vs MCP

Both modes use the same compact notation, so per-call token cost is identical. The differences:

Context overhead. With every message you send, Claude also receives hidden tool definitions. Skill adds ~20 tokens; MCP adds ~438. On subscription plans this overhead is cached and free. On the API, you pay for it with every message.

Caching. MCP runs as a background server. Its adaptive in-memory cache means hitting the same thread or subreddit twice is instant. Skill starts a fresh process each call — no cross-call cache.

Permissions. Skill works through Bash, so Claude needs shell permission. MCP is a native tool call. If you run with Bash restricted, MCP works without it.

Compact Notation

Tab-delimited output designed for LLMs. d0/d1/d2 = comment depth. Score before author. +N = collapsed comments not loaded. #next = pagination token. #warning = smart limit triggered.

#post   r/ClaudeAI   u/BusyBea2   1pts   57%   9cmt   2026-02-23
Email and Claude
Have you figured out how to use Claude to manage your inbox?

#comments   3
d0   6   Ok-Version-8996   I'm surprised gmail hasn't done this already
d1   3   BusyBea2   i hear you, that's one of my first clean up things
d0   2   turtle-toaster   Claude Settings lets you connect your Gmail
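On the consuming side, a compact comment line splits back into structure with a single `split` (a sketch assuming the tab-delimited form described above; `parse_comment_line` is a hypothetical helper):

```python
def parse_comment_line(line: str) -> dict:
    """Parse 'd<depth><TAB><score><TAB><author><TAB><body>' into a record."""
    depth, score, author, body = line.split("\t", 3)
    return {"depth": int(depth[1:]),   # "d2" -> 2
            "score": int(score),
            "author": author,
            "body": body}
```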

Adaptive Cache

| Content | TTL | Rationale |
|---|---|---|
| /new feeds | 2 min | Fresh content, stale quickly |
| /hot feeds | 5 min | Changes moderately |
| Threads & comments | 10 min | Stable once posted |
| Search results | 10 min | Results shift slowly |
| User profiles | 15 min | Rarely changes |
| /top feeds | 30 min | Rankings are stable |

50MB LRU cap with automatic eviction. OAuth-authenticated requests use oauth.reddit.com automatically.
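The TTL table reduces to a small lookup (a sketch; how the real binary classifies each request into a content kind is not shown here):

```python
TTL_MINUTES = {
    "new": 2, "hot": 5,          # feeds: fresh content goes stale fast
    "thread": 10, "search": 10,  # threads and search results shift slowly
    "user": 15, "top": 30,       # profiles and /top rankings are stable
}

def ttl_for(kind: str) -> int:
    """Cache TTL in minutes for a content kind."""
    return TTL_MINUTES.get(kind, 10)  # fall back to the thread TTL
```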

Under the Hood

  • Appends .json to any Reddit URL — no API keys needed for public content
  • Recursively walks comment trees to arbitrary depth
  • Fetches /api/morechildren to expand collapsed threads (batched, max 100 IDs per request)
  • Unauthenticated: 10 req/min with burst allowance (100 tokens / 10-min window)
  • Authenticated: 60 req/min (OAuth client_credentials grant)
  • Retries 429/5xx with exponential backoff (3 attempts)
  • Resolves redd.it short links via HTTP redirect
  • Single static Go binary, cross-compiled for linux/darwin/windows amd64/arm64
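Expanding collapsed branches means chunking placeholder IDs into groups of at most 100 per /api/morechildren call, per the limit noted above. The chunking itself is trivial (hypothetical helper, shown for clarity):

```python
def batch_ids(ids: list[str], batch_size: int = 100) -> list[list[str]]:
    """Split 'more' placeholder IDs into /api/morechildren-sized batches."""
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
```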

Build from Source

make build                # Local binary
make all                  # Cross-compile all platforms
make install-skill        # Install to ~/.claude/skills/reddit/

Uninstall

rm -rf ~/.claude/skills/reddit                    # Remove skill
lurk auth --clear                                 # Remove saved credentials
# For MCP: edit ~/.claude.json, delete "lurk" from mcpServers

License

MIT
