The open-source AI SRE platform: Investigate production incidents, find root causes, and suggest fixes — automatically.
IncidentFox is the open-source AI SRE that automatically investigates production incidents, correlates alerts, analyzes logs, and finds root causes. It connects to your observability stack, infrastructure, and code — then reasons through to an answer.
IncidentFox lives in Slack (or Microsoft Teams / Google Chat). It can auto-respond to every alert, investigate in the thread, and post a root cause summary — or you can @mention it on demand. It learns from your codebase, Slack history, and past incidents to get smarter over time.
We're on a mission to make incident response faster and less painful for every on-call engineer, not just teams with dedicated SRE staff.
- Auto-Investigation: Automatically responds to every alert in Slack, investigates in-thread, and posts a root cause summary. Also supports on-demand @mention for any question.
- Chat-First Debugging: Investigate incidents without leaving Slack, Microsoft Teams, or Google Chat. Upload screenshots, attach logs, and get analysis inline.
- Multi-Agent Orchestration: Specialist agents for Kubernetes, AWS, metrics, code analysis, and more — routed automatically based on the problem.
- Smart Log Sampling: Statistics first, then targeted sampling. Stays useful where other tools hit context limits.
- Alert Correlation: 3-layer analysis (temporal + topology + semantic) reduces alert noise by 85-95%.
- Anomaly Detection: Meta's Prophet algorithm with seasonality-aware forecasting. Detects anomalies while accounting for daily and weekly patterns.
- Dependency Mapping: Automatic service topology discovery with blast radius analysis.
- Knowledge Base (RAPTOR): Hierarchical retrieval over runbooks and docs. Maintains context across long documents where standard RAG fails.
- Continuous Learning: Learns from every investigation. Persists patterns and builds team-specific context over time.
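To make the "statistics first, then targeted sampling" idea from the Smart Log Sampling bullet concrete, here is a minimal sketch. It is not IncidentFox's implementation: the `<LEVEL> <message>` log shape and the `template_of`, `summarize`, and `targeted_sample` helpers are hypothetical names chosen for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: each log line looks like "<LEVEL> <message>". Real formats differ.
_NUMBER = re.compile(r"\d+")


def template_of(message: str) -> str:
    """Collapse digits so similar messages share one template."""
    return _NUMBER.sub("<n>", message)


def summarize(lines):
    """Pass 1: cheap statistics over every line, keeping no raw lines."""
    level_counts = Counter()
    error_templates = Counter()
    for line in lines:
        level, _, message = line.partition(" ")
        level_counts[level] += 1
        if level == "ERROR":
            error_templates[template_of(message)] += 1
    return level_counts, error_templates


def targeted_sample(lines, templates, per_template=2):
    """Pass 2: keep only a few raw lines for the templates that matter."""
    wanted = set(templates)
    samples = defaultdict(list)
    for line in lines:
        level, _, message = line.partition(" ")
        key = template_of(message)
        if level == "ERROR" and key in wanted and len(samples[key]) < per_template:
            samples[key].append(line)
    return dict(samples)


logs = [
    "INFO request 1 ok",
    "ERROR timeout after 30 ms",
    "ERROR timeout after 45 ms",
    "ERROR timeout after 50 ms",
    "ERROR disk full on node 3",
    "INFO request 2 ok",
]
levels, templates = summarize(logs)
top = [name for name, _ in templates.most_common(1)]
print(levels, top, targeted_sample(logs, top))
```

Only the counts and a handful of representative lines would need to reach the model, which is why an approach like this can stay within context limits on large log volumes.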
| Category | Integrations |
|---|---|
| Logs & Metrics | |
| Incidents & Alerts | |
| Cloud & Infra | |
| Dev & Project Tools | |
| LLM Providers | Claude · OpenAI · Gemini · DeepSeek · Mistral · Groq · Ollama · Azure OpenAI · Amazon Bedrock · Vertex AI · and 14 more |
| Extensibility | Add new integrations as skills and scripts — no code changes needed |
- Web Console: Dashboard for managing agents, viewing investigations, configuring tools and prompts per team.
- Multi-Tenant Config: Hierarchical org/team configuration with deep merge, RBAC, and audit logging.
- Model Flexibility: 24 LLM providers supported (Claude, OpenAI, Gemini, DeepSeek, Mistral, Ollama, and more) — no vendor lock-in.
- Sandboxed Execution: Each investigation runs in an isolated gVisor Kubernetes sandbox. Credentials never touch the agent (Envoy proxy injects secrets at request time).
- Webhook Routing: Orchestrator routes GitHub, PagerDuty, Incident.io, Blameless, and FireHydrant events to the right team's agent.
- Self-Hosting: Deploy on your own infrastructure with Helm. Air-gapped support available.
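The hierarchical org/team configuration with deep merge mentioned above can be illustrated with a short sketch. The config keys (`llm`, `tools`) and the `deep_merge` helper are assumptions made for this example, and the actual merge rules in IncidentFox (for instance, how lists are handled) may differ.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins, merging nested dicts key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: recurse instead of replacing the whole subtree.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the override replaces the base.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical org-level defaults and a team-level override.
org_config = {
    "llm": {"provider": "claude", "temperature": 0.2},
    "tools": {"kubernetes": {"enabled": True}},
}
team_config = {
    "llm": {"temperature": 0.0},
    "tools": {"aws": {"enabled": True}},
}

effective = deep_merge(org_config, team_config)
print(effective)
```

With a merge like this, a team only needs to state what differs from the org defaults; everything else is inherited.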
Check out the Docs to learn more.
| Use IncidentFox Cloud | Deploy on your infrastructure |
|---|---|
| The fastest and most reliable way to get started with IncidentFox is adding it to Slack or signing up at incidentfox.ai. | View all deployment options |
Watch: How to set up & run IncidentFox locally
To set up and run IncidentFox locally, make sure Git and Docker are installed on your system, then run:
git clone https://github.com/incidentfox/incidentfox && cd incidentfox && cp .env.example .env && make dev
Then edit .env to add your ANTHROPIC_API_KEY (or any other LLM provider; see .env.example for options).
That's it. IncidentFox starts Postgres, config-service, sre-agent, and the web console. Migrations run automatically.
Want to test via Slack? Create a Slack app using the manifest, add SLACK_BOT_TOKEN and SLACK_APP_TOKEN to .env, and run make dev-slack. Full Slack setup guide.
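Before running make dev-slack, it can help to confirm that .env actually contains the Slack tokens. The sketch below is a hypothetical helper, not part of the repo. It assumes simple KEY=value lines and does not handle `export` prefixes or multi-line values.

```python
# Variable names taken from the setup steps above.
REQUIRED_FOR_SLACK = ("SLACK_BOT_TOKEN", "SLACK_APP_TOKEN")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_FOR_SLACK) -> list:
    """Return required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


sample = "ANTHROPIC_API_KEY=sk-test\nSLACK_BOT_TOKEN=\n# SLACK_APP_TOKEN not set yet\n"
print(missing_keys(sample))
```

To check a real file, pass its contents instead of the sample string, for example `missing_keys(open(".env").read())`.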
This repo is available under the Apache License 2.0, with the exception of the production security layer (sandbox isolation, credential proxy) which is under the Business Source License 1.1. See LICENSING.md for details.
The open-source agent is fully featured — same AI, same integrations, same intelligence. Enterprise adds the production security layer and management features for organizations with multiple teams.
| Feature | Open Source | Enterprise |
|---|---|---|
| All 45+ integrations | Yes | Yes |
| Auto-investigation on alerts | Yes | Yes |
| Codebase & Slack history learning | Yes | Yes |
| Knowledge base (RAPTOR) | Yes | Yes |
| Alert correlation & anomaly detection | Yes | Yes |
| Web console | Yes | Yes |
| Bring your own LLM keys | Yes | Yes |
| Self-hosted deployment | Yes | Yes |
| Multi-team management & RBAC | - | Yes |
| SSO / OIDC | - | Yes |
| SOC 2 compliance | - | Yes |
| Approval workflows | - | Yes |
| Dedicated support & SLAs | - | Yes |
If you are interested in managed IncidentFox Cloud or Enterprise, take a look at our website or contact us.
Please do not file GitHub issues or post on our public forum for security vulnerabilities, as they are public!
IncidentFox takes security issues very seriously. If you have any concerns about IncidentFox or believe you have uncovered a vulnerability, please get in touch via the e-mail address security@incidentfox.ai. In the message, try to provide a description of the issue and ideally a way of reproducing it. The security team will get back to you as soon as possible.
Note that this security address should be used only for undisclosed vulnerabilities. Please report any security problems to us before disclosing them publicly.
Whether it's big or small, we love contributions. Check out our guide to see how to get started.
Not sure where to get started? You can:
IncidentFox uses a dual-license model:
- Core platform (agent, Slack bot, config service, orchestrator, Helm chart, local dev tooling) is licensed under the Apache License 2.0.
- Security layer (sandbox isolation, credential proxy) is licensed under the Business Source License 1.1.
The BSL-licensed enterprise features automatically convert to Apache 2.0 on the Change Date (February 18, 2030). You can use them freely for development, testing, and evaluation.
See LICENSING.md for the complete breakdown of which directories are under which license.
Claude Code Plugin — Standalone SRE tools for individual developers using Claude Code CLI. Not connected to the IncidentFox platform above.
Built with ❤️ by the IncidentFox team

