Skip to content

Latest commit

 

History

History
172 lines (146 loc) · 8.22 KB

File metadata and controls

172 lines (146 loc) · 8.22 KB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Ballmer Agentic Conception - BAC — Agentic AI for Autonomous Network Operations

Project Overview

Agentic AI system for autonomous ISP network operations with observe→reason→decide→act→learn loop. An "AI Network Guardian" that continuously monitors a simulated ISP, detects anomalies, reasons about root causes using causal inference, verifies safety of actions mathematically, and learns from outcomes.

IMPORTANT: Use pgmpy (NOT causalnex) for causal inference — causalnex is incompatible with Python 3.11+.

Setup

# Python version: see .python-version (3.11+)
cp .env.example .env          # then set OPENAI_API_KEY
cd src/ui/dashboard && cp .env.example .env  # set VITE_API_URL

Commands

Backend

# Install dependencies
pip install -e .
# or
pip install -r requirements.txt

# Run the FastAPI server (starts simulator + agent loop + WebSocket)
uvicorn src.api.main:app --reload --port 8000

# Run the CLI demo (hackathon demo, no UI needed)
python demo.py

# Train RL traffic engineering model on synthetic data
python train_rl_synthetic.py

# Generate evaluation dataset
python -m src.evaluation.generate_dataset

# Run evaluation (precision, recall, F1, MTTD, MTTM)
python -m src.evaluation.evaluate

Frontend

cd src/ui/dashboard
npm install
npm run dev      # dev server at http://localhost:5173
npm run build    # production build

Tests

# Run all tests
pytest

# Run a single test file
pytest tests/test_observer_agent.py -v

# Run with coverage
pytest --cov=src tests/

Linting / Type-checking

ruff check src/
mypy src/

Tech Stack

  • Backend: Python 3.11+, FastAPI, WebSockets
  • Agent Framework: LangGraph (stateful agent orchestration with observe→reason→decide→act→learn graph)
  • LLM: OpenAI GPT-4o (via OPENAI_API_KEY in .env) — used for reasoning, hypothesis formation, debate agents, and explanations
  • Causal Inference: pgmpy (NOTEARS/BayesianNetwork), networkx (topology graph)
  • Safety Verification: z3-solver (formal proof that actions are safe before execution)
  • Anomaly Detection: scikit-learn IsolationForest, EWMA statistical detector, threshold-based rules
  • Traffic Forecasting: LSTM (PyTorch) for predicting congestion before it happens, Prophet as fallback
  • Data: Synthetic ISP telemetry generated with numpy + pandas (labeled anomaly scenarios)
  • Frontend: React + Vite + Tailwind CSS + Recharts + D3.js (force-directed topology graph)
  • Database: SQLite (audit log, action history), FAISS (RAG vector store)
  • RAG: LangChain + OpenAI embeddings + FAISS for network runbook retrieval
  • Streaming: WebSockets for real-time telemetry and agent event streaming to UI

Key Differentiators (What Makes Us Win)

  1. Causal Counterfactual Digital Twin — pgmpy + PC/NOTEARS algorithm. Agent reasons causally ("root cause of latency on link X is congestion on upstream link Y triggered by BGP change 12 min ago"). Runs counterfactual simulations before acting ("if I reroute A→B, what happens to C, D, E?").
  2. Formal Safety Verification with Z3 — Every autonomous action is mathematically proven safe against invariants (no link >85% utilization after reroute, rollback path must exist, blast radius caps). "Provable safety guarantees."
  3. Multi-Agent Adversarial Debate — High-risk decisions trigger a panel: ReliabilityAgent vs PerformanceAgent vs CostSLAAgent, judged by a JudgeAgent. Debate transcript = explainability layer.
  4. LSTM Traffic Forecasting — Predicts congestion 10-30 minutes ahead so the agent can act proactively, not reactively. Trained on synthetic diurnal traffic patterns.
  5. Graph-Based Reasoning — Network topology as a graph (NetworkX). Graph analytics for root cause analysis — anomaly propagation scoring across topology neighbors.

Architecture

src/
  simulator/           — Network topology + synthetic telemetry + anomaly injection + live engine
    topology.py        — NetworkX ISP topology (12 nodes: core, agg, edge, peering)
    telemetry.py       — Synthetic metric generation (diurnal patterns + noise)
    anomaly_injector.py — Labeled failure scenarios (congestion cascade, DDoS, fiber cut, etc.)
    engine.py          — Real-time simulation engine (async, configurable speed)
  agents/              — LangGraph agent orchestration
    orchestrator.py    — Main LangGraph StateGraph (observe→reason→decide→act→learn)
    observer.py        — Telemetry ingestion + multi-method anomaly detection
    reasoner.py        — Causal inference + LLM hypothesis formation
    decider.py         — Decision logic + utility scoring + Z3 verification gate
    actor.py           — Action execution with rollback tokens + auto-rollback monitoring
    learner.py         — Outcome tracking + threshold adjustment + training data export
    debate.py          — Multi-agent adversarial debate (3 specialists + judge)
  models/              — ML models
    anomaly_detection.py — IsolationForest + EWMA + threshold detectors
    forecasting.py     — LSTM traffic forecasting model (PyTorch) + Prophet fallback
    schemas.py         — Pydantic models for all data types
  causal/              — Causal graph + counterfactual engine
    causal_engine.py   — pgmpy NOTEARS, root cause analysis, do-calculus counterfactuals
  safety/              — Z3 safety constraints
    z3_verifier.py     — Formal verification of actions against safety invariants
  api/                 — FastAPI endpoints + WebSocket handlers
    main.py            — REST + WebSocket API, CORS, startup initialization
    routes.py          — Endpoint definitions
  rag/                 — RAG for network runbooks
    knowledge_base.py  — FAISS vector store + OpenAI embeddings + runbook documents
  ui/                  — React frontend (Vite)
    src/
      App.jsx          — Main dashboard layout
      components/
        TopologyGraph.jsx    — D3 force-directed network visualization
        TelemetryCharts.jsx  — Recharts real-time metric charts
        AgentFeed.jsx        — Live agent activity log
        DebateViewer.jsx     — Multi-agent debate transcript display
        CausalGraph.jsx      — Causal relationship visualization
        ControlPanel.jsx     — Start/stop/inject/kill-switch controls
        MetricsPanel.jsx     — MTTD, MTTM, precision, recall, F1
  utils/               — Shared utilities, logging config
  data/                — Generated datasets + topology definitions
  evaluation/          — Evaluation scripts (precision, recall, F1, MTTD, MTTM)

Environment Variables

OPENAI_API_KEY=<your-openai-key>

Agent Loop Architecture (LangGraph)

observe ──→ reason ──→ decide ──┬──→ verify ──→ act ──→ learn ──→ observe
                                │                              (loop)
                                └──→ debate ──→ verify
                                  (if high risk)

Decision Policy

  • confidence < 0.6 → create ticket only (passive)
  • confidence 0.6-0.85 AND low blast radius → auto-execute with monitoring
  • confidence >= 0.85 AND Z3 verified → auto-execute
  • high blast radius OR risk >= 0.7 → require human approval OR trigger debate
  • Z3 verification fails → BLOCK action, explain which constraint violated

Autonomy Boundaries

  • AUTOMATIC: Rate-limit suspicious flow (<0.5% sessions affected, confidence ≥0.85), adjust TE weights
  • AUTOMATIC_CANARY: Reroute on single edge router for 10 min, auto-rollback if metrics worsen
  • HUMAN_APPROVAL: BGP changes, mass route changes, core switch restart, config affecting ≥5% traffic
  • NEVER_AUTOMATE: Billing systems, PII stores, regulatory routing

Rollback Policy

All automated changes embed a rollback token. Auto-rollback triggers if key metrics worsen within 10 minutes.

Conventions

  • Type hints on all functions
  • Docstrings on all public functions
  • Pydantic models for data validation (all data flowing between agents)
  • Structured logging with loguru
  • All agent decisions logged to immutable audit trail
  • OpenAI calls use langchain ChatOpenAI wrapper