Privacy-preserving, self-hosted HIBP Pwned Passwords k-anonymity API.
Check if a password has been compromised in a data breach — without ever leaking any information to third parties.
When you use the Have I Been Pwned API, you send the first 5 characters of your password's SHA-1 hash to an external service. While the k-anonymity model is clever, it still reveals:
- What prefix you're querying — and when, and how often
- Usage patterns — which passwords are being checked in your organization
- Timing information — that can correlate with user activity
For organizations handling sensitive authentication (password managers, SSO providers, internal security tools), this is a privacy risk. IncogniPwn runs entirely inside your infrastructure. No data leaves your network. No API keys. No tracking.
Threat model note: IncogniPwn eliminates the query-side privacy risk (what you search, when, and how often). However, an attacker with access to your infrastructure will discover that you possess the full HIBP password hash corpus. This is a trade-off: you gain query privacy at the cost of storing ~80GB of breach data locally. Assess whether this is acceptable for your threat model.
Logging warning: Application logs may record the hash prefixes being queried, effectively leaking internally the same information you are trying to protect from HIBP. In production, consider disabling debug logging, sending logs to a restricted-access destination, or configuring log aggregation with appropriate access controls.
IncogniPwn provides a fully compatible implementation of the HIBP Pwned Passwords /range/{prefix} endpoint. It uses the official PwnedPasswordsDownloader to download the full password hash corpus and keep it updated automatically.
Drop-in replacement for https://api.pwnedpasswords.com/range/ — works with any tool that expects the HIBP API format (Ory Kratos, Passbolt, custom password validators, browser extensions, etc.).
- HIBP-compatible API —
GET /range/{prefix}returns hash suffixes with prevalence counts - Add-Padding support — pads responses to 800-1000 entries, matching HIBP privacy behavior
- Automatic updates — CronJob downloads new data weekly via the official downloader
- Atomic updates — downloader validates file count before swapping, never serves partial data
- Kubernetes-ready — Helm chart with HPA, PDB, Ingress, ServiceMonitor
- Local development — docker-compose for testing without a cluster
- Zero dependencies — no database, reads files directly (O(1) lookup by filename)
# First run: download the full corpus (~80GB)
docker compose run --rm downloader
# Then start the API
docker compose up api
# Test it
curl http://localhost:8000/range/21BD1Pre-built images are available at ghcr.io/millaguie/incognipwn-api and ghcr.io/millaguie/incognipwn-downloader.
# Install with production values (images are pulled from GHCR by default)
helm install incognipwn ./chart/incognipwn \
-n incognipwn --create-namespace \
-f ./chart/incognipwn/values/prod.yaml
# Or override with your own registry
helm install incognipwn ./chart/incognipwn \
-n incognipwn --create-namespace \
-f ./chart/incognipwn/values/prod.yaml \
--set api.image.repository=your-registry/incognipwn-api \
--set downloader.image.repository=your-registry/incognipwn-downloader
# Test
curl http://incognipwn.example.com/range/21BD1Returns all SHA-1 hash suffixes matching the 5-character hex prefix, with their prevalence counts.
Parameters:
prefix— 5 hexadecimal characters (case-insensitive)
Headers:
| Header | Value | Description |
|---|---|---|
Add-Padding |
true |
Pads response to 800-1000 entries with random fake suffixes (count=0) |
Example:
$ curl http://localhost:8000/range/21BD1
0018A45C4D1DEF81644B54AB7F969B88D65:1
00D4F6E8FA6EECAD2A3AA415EEC418D38EC:2
011053FD0102E94D6AE2F8B83D76FAF94F6:1
...
Returns {"status": "ok"}. Always returns 200.
Returns {"status": "ready"} if hash data is available, 503 otherwise.
┌───────────────┐
│ Ingress │
│(nginx/traefik)│
└──────┬────────┘
│
┌──────▼──────┐
│ Service │
│ ClusterIP │
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌─────▼─────┐┌─────▼────┐┌──────▼───┐
│ API Pod ││ API Pod ││ API Pod │
│ (FastAPI) ││(FastAPI) ││(FastAPI) │
└─────┬─────┘└─────┬────┘└──────┬───┘
└────────────┼────────────┘
│
┌──────▼───────┐
│ PVC │
│ ReadWriteMany│
│ ~80GB │
│ /data/hashes │
└──────▲───────┘
│
┌──────┴──────┐
│ CronJob │
│ (Downloader)│
│ weekly │
└─────────────┘
How it works:
- The downloader fetches all ~1M hash range files (one per 5-char prefix,
00000.txtthroughFFFFF.txt) from HIBP - Files are stored on a shared PVC (~80GB)
- The API serves each file directly — O(1) lookup, no database needed
- A CronJob re-runs the downloader weekly to pick up new breach data
- Updates are atomic: download to temp dir, validate file count, then rsync into place
| Variable | Default | Description |
|---|---|---|
DATA_DIR |
/data/hashes |
Path to hash files directory |
HASH_OUTPUT_DIR |
/data/hashes |
Final hash output path (downloader) |
HASH_TEMP_DIR |
/data/hashes-new |
Temp download path (downloader) |
PARALLELISM |
64 |
Parallel download threads (downloader) |
OVERWRITE |
true |
Overwrite existing files (downloader) |
See chart/incognipwn/values.yaml for the full reference. Key values:
api:
replicas: 3
ingress:
host: incognipwn.example.com
hpa:
maxReplicas: 10
downloader:
schedule: "0 3 * * 0" # Weekly, Sunday 3AM
storage:
size: 25Gi
storageClassName: nfs # Must be ReadWriteManymake dev # Run full local environment
make dev-api # Run API only (data must exist)
make dev-download # Run downloader only
make test # Run tests
make build # Build Docker images
make chart-install # Install Helm chart
make chart-install-prod # Install with prod values
Contributions are welcome. To get started:
- Fork the repository
- Create a feature branch (
git checkout -b feature/my-feature) - Make your changes
- Run tests (
make test) - Ensure linting passes (
make lint) - Open a Pull Request
python -m venv .venv
source .venv/bin/activate
pip install fastapi uvicorn httpx pytest
PYTHONPATH=api pytest tests/ -vapi/ # FastAPI application
app/
routes/range.py # /range, /health, /ready endpoints
services/ # Hash lookup + padding logic
downloader/ # .NET downloader + atomic update script
chart/incognipwn/ # Helm chart
templates/ # K8s manifests
values/ # dev.yaml, prod.yaml overlays
tests/ # Pytest test suite
This project was vibecoded — the codebase was generated through an interactive AI-assisted session. The entire architecture, implementation, Helm chart, test suite, and documentation were produced collaboratively with GLM-5.1 as co-author.
I want to run a service on a Kubernetes cluster that is similar to the Have I Been Pwned API, specifically the
/rangeendpoint. The service should receive a partial SHA-1 hash prefix (first 5 characters) and return all matching hash suffixes with their occurrence counts — exactly like the HIBP Pwned Passwords k-anonymity API (GET /range/{prefix}).Additionally, the service must use the PwnedPasswordsDownloader tool to download and keep the password hash database updated.
| Decision | Choice | Why |
|---|---|---|
| Language | Python (FastAPI) | Ergonomic async, broad ecosystem, easy to containerize |
| Storage | Individual files (~1M, one per prefix) | O(1) lookup by filename, no database dependency, ~80GB total |
| Hash format | SHA-1 only | Matches HIBP; NTLM not needed |
| Updates | Weekly CronJob | HIBP updates quarterly; weekly catches changes promptly |
| Deployment | Helm + docker-compose | K8s for production, compose for local dev |
No existing project provides a well-maintained, Kubernetes-ready, HIBP /range/{prefix}-compatible API:
- radekg/hibp (Go, PostgreSQL) — requires 613M+ rows, no K8s
- felix-engelmann/haveibeenpwned-api (Python/Flask) — abandoned, flat file + binary search
- oschonrock/hibp (C++) — active, but NOT
/range/compatible - easybill/easypwned (Rust) — active, bloom filter, NOT
/range/compatible
This project is licensed under the GNU General Public License v3.0 — see the LICENSE file for details.
The password hash data is sourced from Have I Been Pwned by Troy Hunt, licensed under CC BY 4.0.