A production-minded, zero-cost MVP for automated daily monitoring of suspicious open-source packages across PyPI and npm using public GitHub Actions.
This repository discovers recently updated packages, applies explainable metadata heuristics, runs static analysis on a shortlist, scores suspicious behavior, and publishes compact analyst-friendly outputs.
- Public GitHub repository only
- GitHub Actions as scheduler and orchestrator
- Python 3.11 runtime
- Static analysis only: no dynamic malware detonation
- No long-term raw package storage
- Compact JSON + Markdown outputs only
- Low dependency count and bounded runtime
- PyPI
- npm
The architecture isolates ecosystem collectors so RubyGems, Go modules, GitHub Actions, and VS Code extensions can be added later without reworking scoring or reporting.
- collectors/: incremental collection and normalization
- scanners/: metadata heuristics, GuardDog integration, regex, AST, Semgrep, and YARA
- scoring/: suppression handling, weighted ranking, and report generation
- schemas/: JSON schema validation for candidates and findings
- docs/: methodology, rules, and operations guidance
- .github/workflows/: daily scan and weekly baseline refresh automation
- Collect recent PyPI updates via the public RSS updates feed and enrich with project JSON metadata.
- Collect recent npm changes via the public replication changes feed and filter to the last 24 hours.
- Normalize metadata to a common schema and annotate with baseline similarity.
- Score cheap metadata heuristics first.
- Shortlist candidates for deeper static analysis only when higher-signal conditions are present.
- Download artifacts to temporary storage, extract safely, and scan statically.
- Deduplicate rules, apply suppressions, rank by weighted score, and generate outputs.
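The baseline-similarity annotation in the pipeline above can be sketched with the standard library's difflib; the seed list and the 0.85 threshold here are illustrative placeholders, not the repository's actual baselines or tuning:

```python
from difflib import SequenceMatcher

# Hypothetical seed baseline; the real lists live in the repository's baseline files.
TOP_PACKAGES = {"requests", "numpy", "lodash", "express"}

def baseline_similarity(name: str, baseline: set[str] = TOP_PACKAGES) -> tuple[str, float]:
    """Return the closest baseline package name and its similarity ratio."""
    best_name, best_ratio = "", 0.0
    for candidate in baseline:
        ratio = SequenceMatcher(None, name.lower(), candidate).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = candidate, ratio
    return best_name, best_ratio

def looks_like_typosquat(name: str, threshold: float = 0.85) -> bool:
    """Flag names very close to, but not equal to, a popular package."""
    match, ratio = baseline_similarity(name)
    return name.lower() != match and ratio >= threshold
```

An exact match to a baseline name is never flagged; only near-misses cross the threshold, which keeps the heuristic cheap enough to run on every candidate before any deeper static analysis.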
Representative weights:
- Typosquat of top package: +25
- Install hook present: +20
- Obfuscation or encoded blob: +15 to +20
- Webhook/network exfil string: +15
- Credential path access: +25
- New publisher: +10
- Repository mismatch: +10
Thresholds:
- 70+: Critical
- 50-69: High
- 30-49: Medium
- Below 30: ignored in the Markdown report
- reports/YYYY-MM-DD.md: daily analyst report
- data/latest-findings.json: compact current findings and summary
- data/latest-summary.json: metrics suitable for GitHub Pages and workflow summaries
- Optional GitHub workflow job summary
python -m pip install --upgrade pip
pip install -r requirements.txt
python -m collectors.pypi --hours 24 --out data/raw/pypi.json
python -m collectors.npm --hours 24 --out data/raw/npm.json
python -m collectors.common --inputs data/raw/pypi.json data/raw/npm.json --out data/normalized/candidates.json
python -m scanners.guarddog_runner --in data/normalized/candidates.json --out data/normalized/guarddog_results.json
python -m scanners.metadata_rules --in data/normalized/candidates.json --guarddog data/normalized/guarddog_results.json --out data/latest-findings.json
python -m scoring.score --in data/latest-findings.json --report reports/$(date -u +%F).md --out data/latest-findings.json

- daily-scan.yml runs every day, builds the daily report, and commits updated outputs.
- weekly-baseline-refresh.yml refreshes popular package baselines once per week.
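A minimal sketch of what daily-scan.yml could look like, assuming the standard checkout and setup-python actions; the cron time, step contents, and commit message are illustrative, not the repository's actual workflow:

```yaml
# Hypothetical daily-scan.yml sketch; real step names and schedule may differ.
name: daily-scan
on:
  schedule:
    - cron: "0 6 * * *"   # once per day, 06:00 UTC
  workflow_dispatch: {}
jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      contents: write      # needed to commit updated reports and data files
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: |
          python -m collectors.pypi --hours 24 --out data/raw/pypi.json
          python -m collectors.npm --hours 24 --out data/raw/npm.json
          python -m collectors.common --inputs data/raw/pypi.json data/raw/npm.json --out data/normalized/candidates.json
          python -m scoring.score --in data/latest-findings.json --report reports/$(date -u +%F).md --out data/latest-findings.json
      - run: |
          git config user.name "github-actions"
          git config user.email "github-actions@users.noreply.github.com"
          git add reports data
          git commit -m "daily scan" || true
          git push
```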
- PyPI and npm both expose imperfect zero-cost “recent updates” views; this MVP uses practical public feeds rather than full registry mirrors.
- GuardDog and Semgrep are optional at runtime; the pipeline continues if those tools are unavailable.
- The baseline files in the repository are intentionally compact seed lists, while the weekly workflow refresh can expand them toward the top 1k package names.
- Heuristics are explainable but not definitive; analyst validation is always required.
- No package execution is performed.
- Temporary extraction is size-bounded and cleaned up automatically.
- Large archives are skipped to reduce zip-bomb risk and runner cost.
To add RubyGems, Go modules, GitHub Actions, or VS Code extensions later:
- Add a collector that emits the shared raw field set.
- Reuse collectors/common.py normalization and baseline matching.
- Add ecosystem-specific static scan hints only where needed.
- Keep scoring weights and schemas stable so reports remain comparable.