
feat(cli): add rustinel doctor #36

Description

@Karib0u

Use Case

First-run failures and misconfiguration issues are currently hard to diagnose. Users must manually inspect privileges, BTF availability, tracefs/debugfs mounts, configuration validity, log paths, rule files, service state, managed paths, and telemetry prerequisites. This creates friction during onboarding, managed deployment, and incident triage.

Proposed Solution

Add a rustinel doctor command, with a --json variant (rustinel doctor --json), that runs read-only preflight and health checks without starting the full agent.

Doctor should report pass, warn, or fail results with actionable fix hints. It should also expose a reusable diagnostic result model that rustinel setup can use for final health checks.
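
As a starting point for discussion, the reusable result model could look roughly like the sketch below. All names here (Status, CheckResult, Report, fix_hint, worst) are hypothetical and not taken from the Rustinel codebase; the only parts that come from this issue are the three statuses and the summary plus fix-hint fields.

```rust
/// Outcome of a single doctor check. Variant order matters: the derived
/// `Ord` gives Pass < Warn < Fail, so the worst result is simply the max.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Pass,
    Warn,
    Fail,
}

/// One diagnostic result (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CheckResult {
    pub name: String,
    pub status: Status,
    pub summary: String,
    /// Actionable fix guidance, present only where it is useful.
    pub fix_hint: Option<String>,
}

impl CheckResult {
    pub fn pass(name: &str, summary: &str) -> Self {
        Self::new(name, Status::Pass, summary, None)
    }

    pub fn warn(name: &str, summary: &str, fix_hint: &str) -> Self {
        Self::new(name, Status::Warn, summary, Some(fix_hint))
    }

    pub fn fail(name: &str, summary: &str, fix_hint: &str) -> Self {
        Self::new(name, Status::Fail, summary, Some(fix_hint))
    }

    fn new(name: &str, status: Status, summary: &str, fix_hint: Option<&str>) -> Self {
        CheckResult {
            name: name.to_string(),
            status,
            summary: summary.to_string(),
            fix_hint: fix_hint.map(|hint| hint.to_string()),
        }
    }
}

/// A full doctor run. `rustinel setup` could build the same report for its
/// final health check instead of duplicating the logic.
#[derive(Debug, Default)]
pub struct Report {
    pub checks: Vec<CheckResult>,
}

impl Report {
    /// Worst status across all checks; an empty report counts as Pass.
    pub fn worst(&self) -> Status {
        self.checks
            .iter()
            .map(|check| check.status)
            .max()
            .unwrap_or(Status::Pass)
    }
}
```

Keeping the model free of any printing or process-exit logic is what would let both doctor and setup consume it.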

Common Checks

  • Supported operating system and architecture.
  • Current Rustinel version.
  • Portable versus managed mode.
  • Required privilege level.
  • Configuration discovery and parsing.
  • Resolved paths.
  • Rule-pack installation and version.
  • Pack compatibility with Rustinel.
  • Rule and IOC parsing errors.
  • Log and alert directory writability.
  • Active-response safety configuration.
  • Service installation and runtime state.
  • Telemetry backend initialization or prerequisites.
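
To illustrate how small an individual check can be, here is a sketch of the first item, the operating system and architecture check. The function names are hypothetical, and the architecture list in particular is an assumption made for the example: this issue names Linux, Windows, and macOS but does not say which CPU architectures are supported.

```rust
use std::env::consts::{ARCH, OS};

/// Returns true if the given OS/architecture pair is on the supported list.
/// The OS list mirrors the platforms named in this issue; the architecture
/// list is an assumption for illustration.
fn platform_supported(os: &str, arch: &str) -> bool {
    const SUPPORTED_OS: [&str; 3] = ["linux", "windows", "macos"];
    const SUPPORTED_ARCH: [&str; 2] = ["x86_64", "aarch64"]; // assumption
    SUPPORTED_OS.contains(&os) && SUPPORTED_ARCH.contains(&arch)
}

/// Checks the platform the binary was compiled for, using the constants
/// exposed by the standard library.
fn current_platform_supported() -> bool {
    platform_supported(OS, ARCH)
}
```

Taking the OS and architecture as parameters keeps the check testable on any build host.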

Platform Checks

Linux

  • Kernel version.
  • BTF availability.
  • Required eBPF capabilities or root privileges.
  • Relevant filesystem mounts.
  • systemd availability and unit status.
  • DNS hook coverage for sendto, sendmsg, and sendmmsg where applicable.
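
Two of the Linux checks can be done with plain file reads, which keeps them read-only. The sketch below assumes BTF is detected through /sys/kernel/btf/vmlinux, where the kernel exposes its own BTF data when built with it, and that mounts are found by parsing text in the /proc/mounts format. Function names are hypothetical.

```rust
use std::path::Path;

/// True if the running kernel exposes its BTF type information.
fn btf_available() -> bool {
    Path::new("/sys/kernel/btf/vmlinux").exists()
}

/// Returns the mount points of every filesystem of type `fstype` found in
/// text formatted like /proc/mounts, where each line reads:
/// `device mount_point fstype options dump pass`.
fn mount_points<'a>(proc_mounts: &'a str, fstype: &str) -> Vec<&'a str> {
    proc_mounts
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let _device = fields.next()?;
            let mount_point = fields.next()?;
            let kind = fields.next()?;
            if kind == fstype {
                Some(mount_point)
            } else {
                None
            }
        })
        .collect()
}
```

A caller would read /proc/mounts once with std::fs::read_to_string and pass the contents in, for example mount_points(&contents, "tracefs") and mount_points(&contents, "debugfs").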

Windows

  • Administrator privileges.
  • Service registration.
  • Managed path accessibility.
  • ETW initialization prerequisites.

macOS

  • Correct signed application location.
  • Endpoint Security entitlement.
  • Endpoint Security authorization.
  • Full Disk Access state where detectable.
  • BPF access.
  • launchd service state.

Output Requirements

  • Human-readable output by default.
  • --json output for automation.
  • Each check includes status, summary, and fix guidance where useful.
  • Exit code 0 means healthy: no warnings and no failures.
  • Exit code 1 means one or more warnings and no failures.
  • Exit code 2 means one or more failures.
  • The command remains strictly read-only.
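
The exit-code rule and the default human-readable output could be sketched as follows. This assumes the worst status wins, so a run with both warnings and failures exits with 2. The [PASS]/[WARN]/[FAIL] line layout and the function names are hypothetical and only meant to make the requirement concrete.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Status {
    Pass,
    Warn,
    Fail,
}

/// Maps a set of check statuses to the process exit code:
/// 0 healthy, 1 warnings, 2 failures. The worst status decides.
fn exit_code(statuses: &[Status]) -> i32 {
    match statuses.iter().copied().max() {
        Some(Status::Fail) => 2,
        Some(Status::Warn) => 1,
        Some(Status::Pass) | None => 0,
    }
}

/// Renders one check for the human-readable mode. The fix hint is shown
/// only for checks that did not pass.
fn render_line(status: Status, summary: &str, fix_hint: Option<&str>) -> String {
    let tag = match status {
        Status::Pass => "PASS",
        Status::Warn => "WARN",
        Status::Fail => "FAIL",
    };
    match fix_hint {
        Some(hint) if status != Status::Pass => {
            format!("[{}] {}\n       fix: {}", tag, summary, hint)
        }
        _ => format!("[{}] {}", tag, summary),
    }
}
```

The command entry point would then print each rendered line and finish with std::process::exit(exit_code(&statuses)).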

Acceptance Criteria

  • Human-readable and --json output modes exist.
  • Clear pass, warn, or fail result for each check with fix hints.
  • Works without starting the full agent.
  • Startup failure messages reference rustinel doctor as a next step.
  • Detects portable versus managed mode.
  • Reports managed configuration location.
  • Reports installed pack ID and version.
  • Verifies pack compatibility and recorded checksum.
  • Checks native service installation and state.
  • Checks telemetry prerequisites for each platform.
  • Uses exit codes 0 healthy, 1 warnings, 2 failures.
  • Remains read-only.

Dependencies

Priority

P0

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request), p0 (Must ship in next cycle)

