113 changes: 113 additions & 0 deletions PRIVACY.md
@@ -0,0 +1,113 @@
# Privacy & Data Handling

This document describes what data nightshift collects, where it is stored,
and what leaves your machine.

## Local Storage

All persistent data lives under XDG-compliant paths:

| Data | Default path | Format | Retention |
|------|-------------|--------|-----------|
| Database | `~/.local/share/nightshift/nightshift.db` | SQLite (WAL mode) | Permanent |
| Logs | `~/.local/share/nightshift/logs/nightshift-YYYY-MM-DD.log` | JSON or text | 7 days (configurable) |
| Audit log | `~/.local/share/nightshift/audit/audit-YYYY-MM-DD.jsonl` | JSONL | Permanent (append-only, no automatic cleanup) |
| Summaries | `~/.local/share/nightshift/summaries/summary-YYYY-MM-DD.md` | Markdown | Permanent |
| Config | `~/.config/nightshift/config.yaml` | YAML | Permanent |

The database directory is created with `0700` permissions (owner-only access).
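
The path resolution and permissions described above can be sketched as follows. This is a minimal illustration, not nightshift's actual code: `resolveDataDir` and `ensureDataDir` are hypothetical names, and the fallback to `$HOME/.local/share` follows the XDG Base Directory convention.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveDataDir returns the nightshift data directory: $XDG_DATA_HOME if it
// is set, otherwise $HOME/.local/share, with "nightshift" appended.
// The name and signature are assumptions made for this sketch.
func resolveDataDir(xdgDataHome, home string) string {
	base := xdgDataHome
	if base == "" {
		base = filepath.Join(home, ".local", "share")
	}
	return filepath.Join(base, "nightshift")
}

// ensureDataDir creates the directory and any missing parents with
// owner-only access (0700). The effective mode is still subject to the
// process umask, and an already existing directory keeps its current mode.
func ensureDataDir(dir string) error {
	return os.MkdirAll(dir, 0o700)
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	// Only print the resolved path; call ensureDataDir(dir) to create it.
	dir := resolveDataDir(os.Getenv("XDG_DATA_HOME"), home)
	fmt.Println("data directory would be:", dir)
}
```

Keeping the path resolution separate from the directory creation makes the default-path claim in the table easy to check without touching the filesystem.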

### What the database stores

- Project paths and execution history
- Task execution timestamps and assignments
- Run history (start/end times, project, tasks, tokens used, status, errors, provider, branch)
- Provider usage snapshots (token counts, daily/weekly usage, inferred budget)
- Bus-factor analysis results

### Provider data directories

Nightshift reads these provider CLI data directories to track token usage
locally. The only file it writes there is its own Copilot request counter:

- `~/.claude`: session history and `stats-cache.json` (read-only)
- `~/.codex`: session JSONL files and rate-limit info (read-only)
- `~/.copilot`: nightshift maintains a local request counter at `~/.copilot/nightshift-usage.json` (written by nightshift)

These paths are configurable via `providers.<name>.data_path` in config.

## External Transmission

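Nightshift sends data externally **only** through the provider CLI a task
runs with and through the optional integrations you explicitly configure.
Until you run a task or enable an integration, nothing leaves your machine.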

### AI provider CLIs

When nightshift runs a task, it invokes provider CLIs as subprocesses:

| Provider | Command | Data sent |
|----------|---------|-----------|
| Claude Code | `claude --print <prompt>` | Task prompt + selected file contents |
| Codex | `codex exec <prompt>` | Task prompt + selected file contents |
| Copilot | `gh copilot -- -p <prompt>` | Task prompt + selected file contents |

Each invocation is isolated — no session state persists between calls, and
no cross-project context is shared. The provider CLIs handle their own
authentication and network communication; nightshift does not transmit API
keys over the network itself.
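
The per-call subprocess model above might look like the following sketch. The argv values come from the table; the provider keys (`"claude"`, `"codex"`, `"copilot"`) and the function names are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// providerArgv maps a provider key to the command line listed in the table.
// The keys themselves are hypothetical; only the commands come from the doc.
func providerArgv(provider, prompt string) ([]string, error) {
	switch provider {
	case "claude":
		return []string{"claude", "--print", prompt}, nil
	case "codex":
		return []string{"codex", "exec", prompt}, nil
	case "copilot":
		return []string{"gh", "copilot", "--", "-p", prompt}, nil
	default:
		return nil, fmt.Errorf("unknown provider %q", provider)
	}
}

// newProviderCmd builds a fresh *exec.Cmd for every call, so no process
// state carries over between invocations. The command is not started here.
func newProviderCmd(ctx context.Context, provider, prompt string) (*exec.Cmd, error) {
	argv, err := providerArgv(provider, prompt)
	if err != nil {
		return nil, err
	}
	return exec.CommandContext(ctx, argv[0], argv[1:]...), nil
}

func main() {
	cmd, err := newProviderCmd(context.Background(), "codex", "summarize open TODOs")
	if err != nil {
		fmt.Println("cannot build command:", err)
		return
	}
	// Print the argv instead of running the CLI.
	fmt.Println(cmd.Args)
}
```

The prompt is passed as a single argv element rather than through a shell, so its contents are never interpreted as shell syntax.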

Dangerous permission flags (`--dangerously-skip-permissions`,
`--dangerously-bypass-approvals-and-sandbox`, `--allow-all-tools`) default
to **false** and require explicit opt-in.

### Slack notifications (optional)

When `reporting.slack_webhook` is configured, nightshift posts morning
summaries containing: budget usage, completed task list, project counts,
and failed/skipped task info.
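
As a rough sketch of what such a post could contain, the snippet below builds a Slack incoming-webhook body, which is a JSON object with a `text` field. The `MorningSummary` type, its field names, and the message layout are hypothetical; the source only lists which kinds of information are included.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// MorningSummary holds the kinds of data the summary is documented to
// contain. The type and field names are assumptions for this sketch.
type MorningSummary struct {
	BudgetUsedPercent int
	ProjectCount      int
	Completed         []string
	FailedOrSkipped   []string
}

// joinOrNone renders a task list, or "none" when the list is empty.
func joinOrNone(items []string) string {
	if len(items) == 0 {
		return "none"
	}
	return strings.Join(items, ", ")
}

// summaryText formats the summary as plain text (illustrative layout).
func summaryText(s MorningSummary) string {
	return fmt.Sprintf(
		"Nightshift morning summary\nBudget used: %d%%\nProjects: %d\nCompleted: %s\nFailed/skipped: %s",
		s.BudgetUsedPercent, s.ProjectCount,
		joinOrNone(s.Completed), joinOrNone(s.FailedOrSkipped),
	)
}

// slackPayload wraps the text in the JSON body an incoming webhook accepts.
func slackPayload(s MorningSummary) ([]byte, error) {
	return json.Marshal(map[string]string{"text": summaryText(s)})
}

func main() {
	payload, err := slackPayload(MorningSummary{
		BudgetUsedPercent: 42,
		ProjectCount:      2,
		Completed:         []string{"lint-fix", "docs"},
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(payload))
}
```

Everything in the payload is derived from the summary struct, which makes it straightforward to audit exactly what would leave the machine.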

### Email notifications (optional)

When SMTP environment variables are set (`NIGHTSHIFT_SMTP_HOST`, etc.),
nightshift sends the same morning summary via email.

### GitHub integration (optional)

When enabled, nightshift uses the `gh` CLI to read issues (filtered by
label) and post completion comments. It relies on `gh`'s existing
authentication — nightshift does not handle GitHub tokens directly.

## Credential Handling

- **API keys** (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) are read from
environment variables only and are never written to disk.
- **Config file credential protection**: nightshift actively scans config
files for credential patterns (`api_key:`, `secret:`, `sk-` prefixes)
and rejects them.
- **Credential masking**: when credentials appear in log output, they are
masked to show only the first 3 and last 3 characters.
- **SMTP credentials** (`NIGHTSHIFT_SMTP_USER`, `NIGHTSHIFT_SMTP_PASS`)
are read from environment variables only.
- **Slack webhook URL** is stored in plaintext in config YAML — consider
using an environment variable for sensitive deployments.
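
The masking and config-scanning rules above can be illustrated with a short sketch. It is not nightshift's implementation: the function names, the fixed `***` filler, the full masking of values of six characters or fewer, and the exact regular expressions are all assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// maskCredential keeps the first 3 and last 3 characters and hides the rest.
// Values of 6 characters or fewer are fully masked (an assumption; the
// source does not say how short values are handled).
func maskCredential(s string) string {
	r := []rune(s)
	if len(r) <= 6 {
		return strings.Repeat("*", len(r))
	}
	return string(r[:3]) + "***" + string(r[len(r)-3:])
}

// credentialPatterns approximates the documented patterns: `api_key:`,
// `secret:`, and values starting with `sk-`. The word boundary before
// `sk-` avoids matching ordinary words such as "task-".
var credentialPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(?i)api_key\s*:`),
	regexp.MustCompile(`(?i)secret\s*:`),
	regexp.MustCompile(`\bsk-[A-Za-z0-9]`),
}

// findCredentialLines returns the 1-based numbers of config lines that
// match any credential pattern.
func findCredentialLines(config string) []int {
	var hits []int
	for i, line := range strings.Split(config, "\n") {
		for _, p := range credentialPatterns {
			if p.MatchString(line) {
				hits = append(hits, i+1)
				break
			}
		}
	}
	return hits
}

func main() {
	cfg := "providers:\n  api_key: abc\ntask-name: x\ntoken: sk-abc123\n"
	fmt.Println("suspicious lines:", findCredentialLines(cfg))
	fmt.Println("masked:", maskCredential("sk-abcdef123456"))
}
```

A loader following the policy would refuse to start when `findCredentialLines` returns any hits, and would pass secrets through `maskCredential` before logging them.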

## Telemetry

Nightshift includes **zero** telemetry, analytics, crash reporting, or
phone-home functionality. All usage tracking is local-only, reading data
from provider CLI directories on disk.

## Deleting Your Data

```bash
# Remove all nightshift data
rm -rf ~/.local/share/nightshift

# Remove configuration
rm -rf ~/.config/nightshift

# Remove nightshift's copilot usage counter
rm -f ~/.copilot/nightshift-usage.json
```

Per-project config (`nightshift.yaml`) lives in each project directory and is
not removed by the commands above; delete it from each project separately.
79 changes: 75 additions & 4 deletions internal/tasks/tasks.go
@@ -531,10 +531,81 @@ Apply safe updates directly, and leave concise follow-ups for anything uncertain
DefaultInterval: 72 * time.Hour,
},
TaskPrivacyPolicy: {
Type: TaskPrivacyPolicy,
Category: CategoryAnalysis,
Name: "Privacy Policy Consistency Checker",
Description: "Check code against privacy policy claims",
Type: TaskPrivacyPolicy,
Category: CategoryAnalysis,
Name: "Privacy Policy Consistency Checker",
Description: `Cross-reference a project's privacy policy against its actual code behavior. ` +
`This task identifies inconsistencies between what a privacy policy claims and what the code actually does.` +
"\n\n" +
`STEP 1 — LOCATE THE PRIVACY POLICY` +
"\n" +
`Search the repository for privacy policy documents: PRIVACY.md, privacy-policy.md, ` +
`PRIVACY_POLICY.md, privacy.txt, docs/privacy*, website/*/privacy*. Also check README.md ` +
`for a privacy section. If no privacy policy is found, report a single finding of category ` +
`missing-policy with severity high and stop.` +
"\n\n" +
`STEP 2 — PARSE POLICY CLAIMS` +
"\n" +
`Extract each concrete claim from the privacy policy into a checklist. Claims typically cover: ` +
`what data is collected, where it is stored, what is transmitted externally, how credentials ` +
`are handled, data retention periods, third-party services, telemetry/analytics presence or ` +
`absence, and how to delete data.` +
"\n\n" +
`STEP 3 — INVENTORY ACTUAL CODE BEHAVIOR` +
"\n" +
`Scan the codebase for all data-handling code paths:` +
"\n" +
`- Local storage: database writes, file writes, log output, cache directories` +
"\n" +
`- External transmission: HTTP clients, webhook calls, SMTP/email sending, ` +
`CLI subprocess invocations that send data to external services, API calls` +
"\n" +
`- Credential handling: env var reads, config file parsing, secret storage, token management` +
"\n" +
`- Data retention: cleanup routines, TTL logic, log rotation, pruning jobs` +
"\n" +
`- Telemetry: analytics SDKs, usage tracking, crash reporters, phone-home calls` +
"\n" +
`- Third-party integrations: external service clients, SDK imports, webhook consumers` +
"\n\n" +
`STEP 4 — CROSS-REFERENCE AND REPORT` +
"\n" +
`Compare each policy claim against the code inventory. Flag every inconsistency.` +
"\n\n" +
`OUTPUT FORMAT — For each finding, report:` +
"\n" +
`- file: path relative to repo root (or "policy" if the issue is in the policy document)` +
"\n" +
`- line: line number(s) in code, or section heading in policy` +
"\n" +
`- category: one of [data-collection-undisclosed, data-transmission-undisclosed, ` +
`retention-mismatch, credential-handling-mismatch, third-party-undisclosed, ` +
`deletion-incomplete, telemetry-mismatch, missing-policy]` +
"\n" +
`- severity: critical / high / medium / low` +
"\n" +
`- claim: what the policy says (quote or paraphrase)` +
"\n" +
`- actual: what the code actually does` +
"\n" +
`- recommendation: specific fix (update policy, update code, or both)` +
"\n\n" +
`SEVERITY GUIDE:` +
"\n" +
`- critical: code sends data externally that the policy says is never sent, or policy ` +
`claims no telemetry but code includes analytics/tracking` +
"\n" +
`- high: missing policy entirely, undisclosed third-party data sharing, or credential ` +
`handling weaker than claimed` +
"\n" +
`- medium: retention periods differ from documented values, deletion instructions ` +
`incomplete, or storage locations not mentioned in policy` +
"\n" +
`- low: minor wording inaccuracies, optional features not clearly marked as optional, ` +
`or documented paths that differ from defaults` +
"\n\n" +
`Summarize total findings by category and severity at the end. If no inconsistencies ` +
`are found, confirm that the policy accurately reflects the code.`,
CostTier: CostMedium,
RiskLevel: RiskLow,
DefaultInterval: 72 * time.Hour,
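
The OUTPUT FORMAT and summary requirements in the prompt above could be modeled on the consuming side roughly as follows. This is only a sketch: the prompt defines field names and allowed values, while the `Finding` type, its JSON tags, `Validate`, and `summarize` are hypothetical.

```go
package main

import "fmt"

// Finding mirrors the per-finding fields named in the task prompt.
type Finding struct {
	File           string `json:"file"`
	Line           string `json:"line"`
	Category       string `json:"category"`
	Severity       string `json:"severity"`
	Claim          string `json:"claim"`
	Actual         string `json:"actual"`
	Recommendation string `json:"recommendation"`
}

// Allowed values, copied from the category list and severity scale in the prompt.
var validCategories = map[string]bool{
	"data-collection-undisclosed":   true,
	"data-transmission-undisclosed": true,
	"retention-mismatch":            true,
	"credential-handling-mismatch":  true,
	"third-party-undisclosed":       true,
	"deletion-incomplete":           true,
	"telemetry-mismatch":            true,
	"missing-policy":                true,
}

var validSeverities = map[string]bool{
	"critical": true, "high": true, "medium": true, "low": true,
}

// Validate rejects findings whose category or severity is not in the prompt.
func (f Finding) Validate() error {
	if !validCategories[f.Category] {
		return fmt.Errorf("unknown category %q", f.Category)
	}
	if !validSeverities[f.Severity] {
		return fmt.Errorf("unknown severity %q", f.Severity)
	}
	return nil
}

// summarize counts findings by category and by severity.
func summarize(findings []Finding) (byCategory, bySeverity map[string]int) {
	byCategory = make(map[string]int)
	bySeverity = make(map[string]int)
	for _, f := range findings {
		byCategory[f.Category]++
		bySeverity[f.Severity]++
	}
	return byCategory, bySeverity
}

func main() {
	findings := []Finding{
		{File: "policy", Line: "Telemetry", Category: "telemetry-mismatch", Severity: "critical"},
		{File: "policy", Line: "Deleting Your Data", Category: "deletion-incomplete", Severity: "medium"},
	}
	for _, f := range findings {
		if err := f.Validate(); err != nil {
			fmt.Println("invalid finding:", err)
			return
		}
	}
	byCategory, bySeverity := summarize(findings)
	fmt.Println(byCategory, bySeverity)
}
```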
2 changes: 1 addition & 1 deletion website/docs/task-reference.md
@@ -51,7 +51,7 @@ Completed analysis with conclusions. These tasks produce reports without modifyi
| `cost-attribution` | Cost Attribution Estimator | Estimate resource costs by component | Medium | Low | 72h |
| `security-footgun` | Security Foot-Gun Finder | Find common security anti-patterns | Medium | Low | 72h |
| `pii-scanner` | PII Exposure Scanner | Scan for potential PII exposure | Medium | Low | 72h |
| `privacy-policy` | Privacy Policy Consistency Checker | Check code against privacy policy claims | Medium | Low | 72h |
| `privacy-policy` | Privacy Policy Consistency Checker | Cross-reference privacy policy claims against actual code behavior | Medium | Low | 72h |
| `schema-evolution` | Schema Evolution Advisor | Analyze database schema changes | Medium | Low | 72h |
| `event-taxonomy` | Event Taxonomy Normalizer | Normalize event naming and structure | Medium | Low | 72h |
| `roadmap-entropy` | Roadmap Entropy Detector | Detect roadmap scope creep and drift | Medium | Low | 72h |