
Lurker

  ██████╗ ███████╗██████╗ ██████╗ ██╗████████╗
  ██╔══██╗██╔════╝██╔══██╗██╔══██╗██║╚══██╔══╝
  ██████╔╝█████╗  ██║  ██║██║  ██║██║   ██║
  ██╔══██╗██╔══╝  ██║  ██║██║  ██║██║   ██║
  ██║  ██║███████╗██████╔╝██████╔╝██║   ██║
  ╚═╝  ╚═╝╚══════╝╚═════╝ ╚═════╝ ╚═╝   ╚═╝
  ██╗     ██╗   ██╗██████╗ ██╗  ██╗███████╗██████╗
  ██║     ██║   ██║██╔══██╗██║ ██╔╝██╔════╝██╔══██╗
  ██║     ██║   ██║██████╔╝█████╔╝ █████╗  ██████╔╝
  ██║     ██║   ██║██╔══██╗██╔═██╗ ██╔══╝  ██╔══██╗
  ███████╗╚██████╔╝██║  ██║██║  ██╗███████╗██║  ██║
  ╚══════╝ ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝╚═╝  ╚═╝


Every comment. Every reply. 94% fewer tokens.

An 800-comment Reddit thread costs ~120K tokens as raw JSON. Lurk delivers the same thread — full depth, every expanded reply — in a fraction of that.

Most Reddit tools fetch top-level comments and stop. The useful stuff is buried 4-5 replies deep. Lurk expands every collapsed branch, resolves every "+N more replies" placeholder, and reconstructs the full comment tree, then compresses it into compact tab-delimited notation before it reaches your model.

Post: "Finally We have the best agentic AI at home"
 +-- Comment (180 pts)
 |   +-- Reply (46 pts)              <-- most tools stop here
 |   |   +-- Reply (34 pts)
 |   |       +-- Reply (29 pts)
 |   |           +-- Reply (8 pts)
 |   |               +-- Reply (20 pts)
 |   |                   +-- Reply (2 pts)
 |   |                       +-- Reply (4 pts)
 |   |                           +-- Reply (1 pt)
 |   |                               +-- Reply (2 pts)  <-- lurk gets all of it
 +-- Comment (82 pts)
 |   +-- Reply (45 pts)              <-- lurk gets all of this too
 |       +-- Reply ...
 +-- Comment (60 pts)
     +-- +47 more replies (expanded)  <-- and this

104 of 109 comments. 10 levels deep. Fully automatic.

How Token Savings Work

The Go binary preprocesses everything before tokens reach your model:

  1. Fetch — Hits Reddit's JSON endpoints, recursively expands every collapsed more placeholder
  2. Extract — Drops the 50+ unused fields per comment (gildings, awards, flair, metadata), keeping the 5-6 that matter
  3. Compress — Formats into compact tab-delimited notation: d0 180 Recent-Success-1520 If you can host Kimi 2.5...
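The extract and compress steps can be sketched in a few lines of Python. This is illustrative only: `body`, `score`, `author`, and `replies` are Reddit's real `t1` comment fields, but the walker is a simplification of what the Go binary does.

```python
def compress(comment: dict, depth: int = 0) -> list[str]:
    """Flatten one comment subtree into d<depth><TAB>score<TAB>author<TAB>body lines."""
    data = comment["data"]
    body = " ".join(data["body"].split())          # collapse internal whitespace
    lines = [f"d{depth}\t{data['score']}\t{data['author']}\t{body}"]
    replies = data.get("replies") or {}            # leaf comments carry replies == ""
    for child in replies.get("data", {}).get("children", []):
        if child.get("kind") == "t1":              # "more" placeholders are expanded earlier
            lines.extend(compress(child, depth + 1))
    return lines
```

Everything a model needs per comment fits on one line; everything else is dropped before tokenization.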

The result (benchmarked across 12 threads, 452 comments, 6 subreddits):

| Format | Total tokens | vs JSON | vs Markdown |
|---|---|---|---|
| Raw Reddit JSON | 286,425 | | |
| Markdown | 28,993 | -90% | |
| Lurk (compact) | 16,186 | -94% | -44% |

94% fewer tokens than JSON. 44% fewer than Markdown. Savings scale with thread depth: shallow quip threads save ~10-25% vs Markdown, while deep technical threads save 50-64%.
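The headline percentages follow directly from the benchmark totals above; the arithmetic is simple enough to check by hand:

```python
# Reproducing the headline savings from the benchmark totals.
json_tokens, md_tokens, lurk_tokens = 286_425, 28_993, 16_186

vs_json = 1 - lurk_tokens / json_tokens
vs_md = 1 - lurk_tokens / md_tokens

print(f"{vs_json:.0%} fewer than JSON")      # 94% fewer than JSON
print(f"{vs_md:.0%} fewer than Markdown")    # 44% fewer than Markdown
```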

Smart Comment Limiting

Threads with 200+ comments get a preview first instead of dumping everything:

#post   r/ClaudeAI   u/poster   422pts   93%   805cmt   2026-01-28
Finally We have the best agentic AI at home

#comments   461
d0  180  Recent-Success-1520  If you can host Kimi 2.5...
...

#warning   805 total comments, showing 461. Use limit=N for top N by score, or limit=0 for all (~31K tokens).

Claude sees the warning and decides whether to fetch everything or grab the top 50 by score. No surprise 31K-token dumps.

What You Get

  • Full comment trees at any depth — every collapsed branch expanded
  • 94% fewer tokens than JSON, 44% fewer than Markdown
  • Smart limiting — large threads preview first, expand on demand
  • Adaptive caching — new feeds: 2min, hot: 5min, threads: 10min, top: 30min, 50MB LRU cap
  • Multi-subreddit search — comma-separated subs, parallel fetch, deduped results
  • OAuth support — optional lurk auth for 6x rate limits (60 req/min vs 10)
  • All URL formats — reddit.com, old.reddit.com, new.reddit.com, np.reddit.com, redd.it short links, m.reddit.com, amp.reddit.com
  • Cross-posts traced to the original
  • Galleries, video, media URLs extracted as clickable links
  • Single Go binary, zero runtime dependencies
  • Read-only by design — no write operations, no account access

Install

One-Line Install (Recommended)

Linux / macOS:

curl -fsSL https://raw.githubusercontent.com/ProgenyAlpha/reddit-lurker/master/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/ProgenyAlpha/reddit-lurker/master/install.ps1 | iex

Downloads the binary for your platform and walks you through editor setup. Supports Claude Code, Cursor, Windsurf, VS Code (Copilot), Cline, and Zed.

Homebrew

brew install ProgenyAlpha/tap/lurk

npx

npx reddit-lurker

Convenient if you already have Node/npm. Also available on Smithery.

Go Install

go install github.com/ProgenyAlpha/reddit-lurker@latest

Builds from source. Requires Go 1.24+. Run ./install.sh afterward for editor configuration.

From Source

git clone https://github.com/ProgenyAlpha/reddit-lurker.git
cd reddit-lurker
./install.sh

The installer walks you through editor selection and integration mode.

Supported Editors

| Editor | Config location | MCP key |
|---|---|---|
| Claude Code | ~/.claude.json or ~/.claude/skills/reddit/ | mcpServers |
| Cursor | ~/.cursor/mcp.json | mcpServers |
| Windsurf | ~/.codeium/windsurf/mcp_config.json | mcpServers |
| VS Code (Copilot) | ~/.config/Code/User/mcp.json (Linux) / ~/Library/.../Code/User/mcp.json (macOS) | servers |
| Cline | VS Code globalStorage (auto-detected) | mcpServers |
| Zed | ~/.config/zed/settings.json | context_servers |

Claude Code also supports a Skill mode (~20 tokens overhead vs ~438 for MCP). The installer will ask which you prefer.

Manual Configuration

Add lurk to your editor's MCP config:

Claude Code, Cursor, Windsurf, Cline:

{
  "mcpServers": {
    "lurk": {
      "command": "lurk",
      "args": ["serve"]
    }
  }
}

GitHub Copilot (VS Code):

{
  "servers": {
    "lurk": {
      "command": "lurk",
      "args": ["serve"]
    }
  }
}

Zed:

{
  "context_servers": {
    "lurk": {
      "command": "lurk",
      "args": ["serve"]
    }
  }
}

OAuth (Optional)

Lurk works without any authentication. But if you want 6x the rate limit (60 req/min instead of 10):

lurk auth

This opens Reddit's app creation page, walks you through the 5-minute setup, tests your credentials, and saves them. One-time process. Lurk handles token refresh automatically.

lurk auth --status   # Check if credentials are configured
lurk auth --clear    # Remove saved credentials

You can also set credentials via environment variables in your MCP config:

{
  "mcpServers": {
    "lurk": {
      "command": "npx",
      "args": ["-y", "reddit-lurker"],
      "env": {
        "LURK_CLIENT_ID": "your_client_id",
        "LURK_CLIENT_SECRET": "your_client_secret"
      }
    }
  }
}

Or via a credentials file at ~/.config/lurk/credentials.json:

{"client_id": "your_client_id", "client_secret": "your_client_secret"}

Usage

Just talk to Claude naturally:

  • "Read this thread" + paste a Reddit URL
  • "What's trending on r/ClaudeAI?"
  • "Search r/selfhosted,r/homelab for 'ZFS backup'"
  • "What has u/spez been up to?"

Claude handles the rest. No commands to memorize.

Real Example

Here's what lurk actually outputs for a 109-comment r/LocalLLM thread about running Kimi K2.5 at home.

What most tools give your LLM:

u/Recent-Success-1520 (180 pts)
  If you can host Kimi 2.5 1T+ model at home then it tells
  me you have a really big home

u/No_Conversation9561 (82 pts)
  not in my home

u/rookan (60 pts)
  yeah, my 16GB VRAM card can easily handle it /s

... 12 top-level comments, no replies

What lurk gives your LLM:

#post	r/LocalLLM	u/moks4tda	422pts	93%	109cmt	2026-01-28
Finally We have the best agentic AI at home

#comments	104
d0	180	Recent-Success-1520	If you can host Kimi 2.5 1T+ model at home...
d1	46	HenkPoley	Apparently it's a native 4 bit weights. So "only" 640 GB needed...
d2	34	TechnicalGeologist99	Sorry...you're going to run that model on RAM?
d3	29	HenkPoley	24 tokens per second on 2x 512GB Max Studio M3 Ultra
d4	8	doradus_novae	See you tomorrow when it answers your question
d5	20	Scrubbingbubblz	You are over exaggerating. 24 tokens per second...
d6	2	Infinite100p	But what is the prompt processing speed?
d7	4	Miserable-Dare5090	It's GPU inference, on two m3 ultras over TB5...
d8	1	Infinite100p	How?
d9	2	Eastern-Group-1993	Via usb-c networking, RDMA.
d0	82	No_Conversation9561	not in my home
d1	45	gonxot	[image] Maybe it's the same guy lol
d0	60	rookan	yeah, my 16GB VRAM card can easily handle it /s
d0	27	keypa_	"at home" we probably don't have the same home...
...

104 of 109 comments. 10 levels deep. ~3,050 tokens. The 5 missing are deleted comments Reddit still counts but no longer serves.

Benchmarks

Real numbers from live Reddit threads:

| Thread | Comments | JSON tokens | MD tokens | Lurk tokens | vs JSON | vs MD |
|---|---|---|---|---|---|---|
| r/ClaudeAI (32c) | 32 | 18,721 | 1,604 | 1,206 | -94% | -25% |
| r/homelab (32c) | 32 | 19,991 | 1,318 | 961 | -95% | -27% |
| r/linux (32c) | 32 | 21,746 | 3,081 | 1,416 | -93% | -54% |
| r/selfhosted (34c) | 34 | 19,874 | 1,949 | 1,288 | -94% | -34% |
| r/ClaudeAI (36c) | 35 | 20,936 | 2,160 | 1,479 | -93% | -32% |
| r/LocalLLaMA (36c) | 36 | 19,901 | 1,243 | 1,115 | -94% | -10% |
| r/selfhosted (37c) | 36 | 20,333 | 1,454 | 1,140 | -94% | -22% |
| r/selfhosted (40c) | 40 | 22,308 | 1,227 | 1,068 | -95% | -13% |
| r/ClaudeAI (43c) | 42 | 29,426 | 4,065 | 1,562 | -95% | -62% |
| r/LocalLLaMA (44c) | 44 | 25,797 | 1,922 | 1,302 | -95% | -32% |
| r/LocalLLaMA (45c) | 43 | 35,066 | 4,245 | 1,519 | -96% | -64% |
| r/LocalLLaMA (48c) | 46 | 32,326 | 4,725 | 2,130 | -93% | -55% |
| **Total** | 452 | 286,425 | 28,993 | 16,186 | -94% | -44% |

Markdown savings vs JSON vary by thread verbosity. Lurk's compact notation consistently saves 93-96% vs JSON and 10-64% vs Markdown, with deeper technical threads showing the largest gains.

Updates

Lurk checks for new versions once every 24 hours (background, non-blocking, 3-second timeout). If a newer release exists, you'll see a one-line notice after your command finishes.

lurk update              # Download and install latest
lurk update --check      # Check only, don't install

If you installed via npm, brew, or go install, lurk update will detect that and tell you to use your package manager instead.

Disable Update Checks

# Option 1: environment variable
export LURK_NO_UPDATE_CHECK=1

# Option 2: config file
mkdir -p ~/.config/lurk && echo "disabled" > ~/.config/lurk/no-update-check

Error Handling

Lurk fails cleanly with a reason, never with raw HTTP dumps or stack traces:

| Scenario | Error message |
|---|---|
| Deleted/nonexistent thread | not found — check the URL or subreddit name |
| Private/quarantined subreddit | access denied — subreddit may be private or quarantined |
| Malformed URL | not a valid thread URL — expected reddit.com/r/sub/comments/id/title |
| Reddit is down | Reddit server error (HTTP 5xx) — Reddit may be down |
| Rate limited | rate limited — too many requests, try again shortly |
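The status-to-message mapping above might look like this (a sketch, not the binary's actual internals; `friendly_error` is a hypothetical name and the messages are copied from the table):

```python
def friendly_error(status: int) -> str:
    """Map an HTTP status to a one-line, human-readable failure reason."""
    if status == 404:
        return "not found — check the URL or subreddit name"
    if status == 403:
        return "access denied — subreddit may be private or quarantined"
    if status == 429:
        return "rate limited — too many requests, try again shortly"
    if 500 <= status < 600:
        return f"Reddit server error (HTTP {status}) — Reddit may be down"
    return f"unexpected response (HTTP {status})"
```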

Limitations

  • Public content only (without OAuth). Private and quarantined subreddits require lurk auth.
  • Deleted comments are ghosts. Reddit counts them in the total but no longer serves content. A "1,092-comment" thread may only have ~805 live comments.
  • No media download. Media URLs (images, video, galleries) are extracted as clickable links — not downloaded or embedded.

Reference

Details for the curious.

CLI Commands

lurk thread "https://reddit.com/r/ClaudeAI/comments/..."   # Full thread + comments
lurk subreddit ClaudeAI --sort top --time week --limit 10   # Browse
lurk search "prompt engineering" --sub ClaudeAI --limit 5   # Search
lurk search "ZFS" --sub selfhosted,homelab,datahoarder      # Multi-sub search
lurk user spez --limit 5                                    # User activity
lurk subreddit ClaudeAI --info                              # Subreddit metadata
lurk auth                                                   # OAuth setup
lurk update                                                 # Self-update

Flags

| Flag | What it does | Works with |
|---|---|---|
| `--sort` | hot, new, top, rising, controversial, relevance, comments | subreddit, search |
| `--limit` | Max results (default 25) | subreddit, search, user |
| `--time` | hour, day, week, month, year, all | subreddit, search |
| `--sub` | Restrict search to subreddit(s); comma-separated for multi-sub search | search |
| `--after` | Pagination token for next page | subreddit, search (single-sub only) |
| `--info` | Subreddit metadata instead of posts | subreddit |
| `--json` | Raw JSON output | all |
| `--compact` | Compact notation (default in MCP mode) | all |
| `--no-cache` | Skip cache | all |

MCP Tools

| Tool | Purpose |
|---|---|
| `lurk` | Read threads, browse subreddits, search posts, view user activity |
| `lurk_info` | Get subreddit metadata (subscribers, active users, description) |

Understanding Skill vs MCP

Both modes use the same compact notation, so per-call token cost is identical. The differences:

Context overhead. With every message you send, Claude also receives hidden tool definitions. Skill adds ~20 tokens; MCP adds ~438. On subscription plans this overhead is cached and free. On the API, you pay for it with every message.

Caching. MCP runs as a background server. Its adaptive in-memory cache means hitting the same thread or subreddit twice is instant. Skill starts a fresh process each call — no cross-call cache.

Permissions. Skill works through Bash, so Claude needs shell permission. MCP is a native tool call. If you run with Bash restricted, MCP works without it.

Compact Notation

Tab-delimited output designed for LLMs. d0/d1/d2 = comment depth. Score before author. +N = collapsed comments not loaded. #next = pagination token. #warning = smart limit triggered.

#post   r/ClaudeAI   u/BusyBea2   1pts   57%   9cmt   2026-02-23
Email and Claude
Have you figured out how to use Claude to manage your inbox?

#comments   3
d0   6   Ok-Version-8996   I'm surprised gmail hasn't done this already
d1   3   BusyBea2   i hear you, that's one of my first clean up things
d0   2   turtle-toaster   Claude Settings lets you connect your Gmail
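On the consuming side, a compact comment line splits back into structure with a single `split` (a sketch assuming the tab-delimited form described above; `parse_comment_line` is a hypothetical helper):

```python
def parse_comment_line(line: str) -> dict:
    """Parse 'd<depth><TAB><score><TAB><author><TAB><body>' into a record."""
    depth, score, author, body = line.split("\t", 3)
    return {"depth": int(depth[1:]),   # "d2" -> 2
            "score": int(score),
            "author": author,
            "body": body}
```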

Adaptive Cache

| Content | TTL | Rationale |
|---|---|---|
| /new feeds | 2 min | Fresh content, stale quickly |
| /hot feeds | 5 min | Changes moderately |
| Threads & comments | 10 min | Stable once posted |
| Search results | 10 min | Results shift slowly |
| User profiles | 15 min | Rarely changes |
| /top feeds | 30 min | Rankings are stable |

50MB LRU cap with automatic eviction. OAuth-authenticated requests use oauth.reddit.com automatically.
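The TTL table reduces to a small lookup (a sketch; how the real binary classifies each request into a content kind is not shown here):

```python
TTL_MINUTES = {
    "new": 2, "hot": 5,          # feeds: fresh content goes stale fast
    "thread": 10, "search": 10,  # threads and search results shift slowly
    "user": 15, "top": 30,       # profiles and /top rankings are stable
}

def ttl_for(kind: str) -> int:
    """Cache TTL in minutes for a content kind."""
    return TTL_MINUTES.get(kind, 10)  # fall back to the thread TTL
```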

Under the Hood

  • Appends .json to any Reddit URL — no API keys needed for public content
  • Recursively walks comment trees to arbitrary depth
  • Fetches /api/morechildren to expand collapsed threads (batched, max 100 IDs per request)
  • Unauthenticated: 10 req/min with burst allowance (100 tokens / 10-min window)
  • Authenticated: 60 req/min (OAuth client_credentials grant)
  • Retries 429/5xx with exponential backoff (3 attempts)
  • Resolves redd.it short links via HTTP redirect
  • Single static Go binary, cross-compiled for linux/darwin/windows amd64/arm64
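Expanding collapsed branches means chunking placeholder IDs into groups of at most 100 per /api/morechildren call, per the limit noted above. The chunking itself is trivial (hypothetical helper, shown for clarity):

```python
def batch_ids(ids: list[str], batch_size: int = 100) -> list[list[str]]:
    """Split 'more' placeholder IDs into /api/morechildren-sized batches."""
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
```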

Build from Source

make build                # Local binary
make all                  # Cross-compile all platforms
make install-skill        # Install to ~/.claude/skills/reddit/

Uninstall

rm -rf ~/.claude/skills/reddit                    # Remove skill
lurk auth --clear                                 # Remove saved credentials
# For MCP: edit ~/.claude.json, delete "lurk" from mcpServers

License

MIT
