AI-Driven Action Planner & Troubleshooter (ADAPT)
- Overview
- Setup
- Running the Application
- Project Structure
- Features
- Configuration
- Observability
- Supported AI Models
- Roadmap and Known Issues
- Contributing
- Acknowledgements
ADAPT is an autonomous network troubleshooting system that uses AI-driven agents to diagnose and solve network issues. The system utilizes a multi-agent workflow powered by LangGraph and PydanticAI to provide intelligent, step-by-step troubleshooting of network problems.
The workflow consists of the following AI agents:
- Fault Summarizer: Analyzes network alerts and summarizes the issue
- Action Planner: Creates a detailed troubleshooting plan with specific commands
- Action Executor: Executes commands on network devices (real or simulated)
- Action Analyzer: Analyzes command outputs and determines next steps
- Result Summary: Provides a comprehensive troubleshooting report
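Conceptually, the agent chain can be sketched in plain Python. This is an illustrative sketch only; in the real project these stages are LangGraph nodes backed by PydanticAI agents, and the function names and outputs here are hypothetical:

```python
# Illustrative sketch of the ADAPT agent chain (not the actual ADAPT API).

def fault_summarizer(alert: dict) -> str:
    # Summarize the raw alert into a concise problem statement.
    return f"{alert['device']}: {alert['message']}"

def action_planner(summary: str) -> list[str]:
    # Produce an ordered list of troubleshooting commands.
    return ["show bgp summary", "show ip interface brief"]

def action_executor(commands: list[str]) -> dict[str, str]:
    # Run each command (on a real device or in simulation) and collect output.
    return {cmd: f"(simulated output of '{cmd}')" for cmd in commands}

def action_analyzer(outputs: dict[str, str]) -> str:
    # Decide whether the issue is resolved or more steps are needed.
    return "analysis complete"

def result_summary(summary: str, analysis: str) -> str:
    # Assemble the final troubleshooting report.
    return f"Report for '{summary}': {analysis}"

alert = {"device": "NCS5508-1", "message": "BGP neighbor 1.2.3.4 is Down"}
summary = fault_summarizer(alert)
outputs = action_executor(action_planner(summary))
report = result_summary(summary, action_analyzer(outputs))
print(report)
```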
We've provided a recorded walkthrough of the setup process and execution with a live device to help you get started quickly.
- Python 3.12 or later
- Docker (optional, for containerized deployment)
- OpenAI API key (for LLM access) - Sign up for OpenAI
- Logfire token (optional, for observability) - Sign up for Logfire
- Clone this repository
- Create and activate a virtual environment:
```
python -m venv venv
venv\Scripts\activate     # Windows
# OR
source venv/bin/activate  # Linux/Mac
```
- Install dependencies:
```
pip install -r requirements.txt
```
- Create a .env file with your configuration (copy from .env.example):
```
copy .env.example .env  # Windows
# OR
cp .env.example .env    # Linux/Mac
```
- Update the .env file with your settings:
```
# API settings
OPENAI_API_KEY=your_api_key_here
LOGFIRE_TOKEN=your_logfire_token_here  # optional

# Device credentials
DEVICE_USERNAME=admin
DEVICE_PASSWORD=password
DEVICE_SECRET=enable_password

# Configuration paths
INVENTORY_PATH=configuration/inventory.yml
```
- Clone this repository
- Prerequisites:
  - Docker installed on your system
- Environment Variables Configuration:
  - Copy the example environment file to create your own:
```
copy .env.example .env
```
  - Edit the .env file with your configuration values:
```
OPENAI_API_KEY=your_api_key_here
```
  - Configure any device hostname, type, and port settings
  - Add any other environment variables needed by the application
- Build and run with Docker Compose:
```
docker compose up -d
```
  To rebuild after code changes:
```
docker compose up --build -d
```
  To stop the container:
```
docker compose down
```
- Mounted Volumes: The following directories are mounted from the host into the container:
  - ./workbench:/app/workbench: For persistent data storage
  - ./configuration:/app/configuration: For device inventories and settings files

  Any changes made to these directories on the host will be immediately reflected in the container.
Launch the Streamlit application:
```
streamlit run streamlit_app.py
```
When using Docker, the application runs automatically with the following services:
- Streamlit Application: Accessible at http://localhost:8501
- Alert Queue Service: Exposed on port 8001 (can be started through the Streamlit interface when needed)
ADAPT provides an API endpoint for receiving network alerts:
- Endpoint: POST /alert
- Port: 8001 (default)
- Input: Any valid JSON content
- Response: Success or error message
To send an alert manually:
```
curl -X POST http://localhost:8001/alert \
  -H "Content-Type: application/json" \
  -d '{"alert_id":"BGPDOWN-0001","device":"NCS5508-1","severity":"high","message":"BGP neighbor 1.2.3.4 is Down","raw_event":"%ROUTING-BGP-5-ADJCHANGE : neighbor 1.2.3.4 - Hold timer expired"}'
```
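The same alert can be posted from Python using only the standard library. The endpoint and payload mirror the curl example above; the `send_alert` helper is illustrative, not part of ADAPT:

```python
import json
import urllib.request

alert = {
    "alert_id": "BGPDOWN-0001",
    "device": "NCS5508-1",
    "severity": "high",
    "message": "BGP neighbor 1.2.3.4 is Down",
    "raw_event": "%ROUTING-BGP-5-ADJCHANGE : neighbor 1.2.3.4 - Hold timer expired",
}

def send_alert(payload: dict, url: str = "http://localhost:8001/alert") -> urllib.request.Request:
    # Build the POST request; pass the result to urllib.request.urlopen()
    # to actually send it once the alert queue service is running.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )

req = send_alert(alert)
print(req.full_url, req.get_method())
```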
- agents/: Directory containing all agent implementations
  - hello_world/: Hello World agent implementation
  - fault_summary/: Fault Summary agent implementation
  - action_planner/: Action Planning agent implementation
  - action_executor/: Command execution agent implementation
  - action_analyzer/: Output analysis agent implementation
  - result_summary/: Result Summary agent implementation
- configuration/: Settings and network device inventory
  - inventory.yml: Network device inventory
  - settings.yml: Application settings
- graph.py: LangGraph implementation for the multi-agent workflow
- static/: Directory for files served via direct URLs
- streamlit_app.py: Main Streamlit application
- tests/: Test scenarios for simulation mode
- utils/: Utility functions and helpers
- workbench/: Storage for troubleshooting session logs
- alert_queue.py: API service for receiving network alerts
- Multiple Operation Modes:
  - Simulation Mode: Run commands without actual execution on devices
  - Test Mode: Use predefined test data from YAML files
  - Production Mode: Connect to and execute commands on real network devices
- Golden Rules: Configure safety rules that are always followed by agents
  - Edit the golden_rules section in settings.yml to add or modify rules
  - Rules are enforced during the generation of troubleshooting steps
- Direct Result Access: Troubleshooting results and response logs are saved as files with direct URL links for easy access and integration with other systems
- Multi-Agent Workflow: End-to-end troubleshooting using multiple specialized agents
- Approval System: Critical commands can be configured to require explicit user approval
- Individual Agent Testing: Each agent can be tested independently through the UI
- Step Mode: When enabled, requires approval before proceeding to each step in the workflow
- Custom Instructions: Add a specific set of custom instructions for a known issue. These instructions will heavily influence the agent's behavior when generating troubleshooting steps.
- Adaptive Mode: When enabled, allows the system to modify the action plan based on findings from previous steps in the workflow, dynamically adjusting its approach based on results.
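As an illustration of how golden rules might be applied, safety rules can simply be prepended to an agent's system prompt. This is a hypothetical sketch, not the actual ADAPT implementation; the rule text and function name are made up:

```python
# Hypothetical sketch: prepend configured golden rules to an agent prompt.
golden_rules = [
    "Never execute configuration-changing commands without approval.",
    "Limit diagnostics to the device named in the alert.",
]

def build_system_prompt(base_prompt: str, rules: list[str]) -> str:
    # Render the rules as a bulleted preamble ahead of the agent's own prompt.
    rule_text = "\n".join(f"- {rule}" for rule in rules)
    return f"Golden rules (always follow):\n{rule_text}\n\n{base_prompt}"

prompt = build_system_prompt("You are a network troubleshooting planner.", golden_rules)
print(prompt)
```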
Test scenarios are defined in YAML files in the tests/ directory. Each file contains:
- alert_payload: The simulated alert that triggers the workflow
- custom_instructions: Specific remediation guidelines for this scenario
- command_outputs: Simulated outputs for various network commands
To create a new test scenario, copy an existing file and modify it, or use the utils/generate_test.py script to generate a test scenario using a Test Generation AI Agent.
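A scenario follows roughly the shape below, using the three keys listed above. This in-memory Python version is illustrative (the real files are YAML, and the concrete values here are made up):

```python
# Illustrative in-memory version of a tests/ scenario (real files are YAML).
scenario = {
    "alert_payload": {
        "alert_id": "BGPDOWN-0001",
        "device": "NCS5508-1",
        "message": "BGP neighbor 1.2.3.4 is Down",
    },
    "custom_instructions": "Check BGP neighbor state before anything else.",
    "command_outputs": {
        "show bgp summary": "Neighbor 1.2.3.4 ... Idle",
    },
}

def simulated_output(scenario: dict, command: str) -> str:
    # In test mode, command execution is replaced by a lookup like this one.
    return scenario["command_outputs"].get(command, "(no simulated output recorded)")

print(simulated_output(scenario, "show bgp summary"))
```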
The settings.yml file in the configuration/ directory contains various settings:
- debug_mode: Enable verbose logging for debugging (currently unsupported)
- simulation_mode: Use LLM to simulate command outputs instead of executing them on devices
- test_mode: Use predefined test data from YAML files
- test_name: The name of the test file to use in test mode
- step_mode: Require approval between workflow steps
- adaptive_mode: Allow the system to adapt its troubleshooting action plan based on results
- golden_rules: Global rules that agents must follow
- max_steps: Maximum number of troubleshooting steps allowed in an action plan
- custom_instructions: Remediation guide for known issues
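For illustration, the documented keys map naturally onto a small settings object. This dataclass is a sketch, not ADAPT's actual code; ADAPT reads these values from settings.yml, and the defaults shown are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Settings:
    # Field names mirror the settings.yml keys documented above;
    # default values here are illustrative, not ADAPT's actual defaults.
    debug_mode: bool = False
    simulation_mode: bool = True
    test_mode: bool = False
    test_name: str = ""
    step_mode: bool = False
    adaptive_mode: bool = True
    golden_rules: list[str] = field(default_factory=list)
    max_steps: int = 10
    custom_instructions: str = ""

settings = Settings(test_mode=True, test_name="bgp_down")
print(settings.test_name, settings.max_steps)
```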
The inventory.yml file in the configuration/ directory defines network devices:
```
devices:
  DEVICE-NAME:
    hostname: "device_ip_address"
    device_type: "cisco_xr"  # Netmiko driver type
    username: "username"  # Default from .env if not specified
    password: "password"  # Default from .env if not specified
    optional_args:
      port: 22
      transport: "ssh"
      timeout: 60
```
ADAPT optionally integrates with Logfire for observability and API usage tracking. To enable this feature:
- Get your token from https://logfire.pydantic.dev
- Add it to your .env file:
```
LOGFIRE_TOKEN=your_logfire_token_here
```
By default, ADAPT uses OpenAI models for its AI agents. However, any model supported by PydanticAI can be used by updating the corresponding environment variables in your .env file:
```
# API keys for your chosen provider
OPENAI_API_KEY=your_api_key_here

# For providers other than OpenAI, set the corresponding environment variable:
# OPENROUTER_API_KEY=your_openrouter_api_key
# ANTHROPIC_API_KEY=your_anthropic_api_key
# OLLAMA_API_KEY=your_ollama_api_key

# Model configurations using <provider>:<model> format
REASONER_MODEL=openai:o4-mini          # For complex reasoning (action_planner, action_analyzer)
LARGE_MODEL=openai:gpt-4.1             # For standard operations (action_executor, results_summary)
SMALL_MODEL=openai:gpt-4.1-mini        # For simpler operations (fault_summary, hello_world)
```
Visit the PydanticAI Models documentation for a comprehensive list of supported models and providers, along with configuration details. Examples of supported model formats include openai:gpt-4o, anthropic:claude-3-opus-20240229, or openrouter:google/gemini-2.5-pro-preview.
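The `<provider>:<model>` strings split cleanly on the first colon. A minimal parser looks like this (illustrative only; PydanticAI accepts these strings directly, so ADAPT does not need to parse them itself):

```python
def parse_model(spec: str) -> tuple[str, str]:
    # Split "provider:model" on the first colon only, since model names
    # (e.g. "openrouter:google/gemini-2.5-pro-preview") may contain slashes.
    provider, _, model = spec.partition(":")
    return provider, model

assert parse_model("openai:o4-mini") == ("openai", "o4-mini")
print(parse_model("openrouter:google/gemini-2.5-pro-preview"))
```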
See the CHANGELOG.md for a list of known issues and planned improvements for upcoming releases.
We welcome contributions to ADAPT! Here's how you can contribute:
- Fork the repository on GitHub
- Create a branch for your changes
- Make your changes (new agents, bug fixes, documentation, etc.)
- Submit a pull request back to the main repository
For questions or suggestions, please open an issue on GitHub.
We'd like to thank the following people for their impact on this project:
- Cole Medin - Cole's YouTube videos were integral to helping us understand LangGraph and PydanticAI and come up with a strategy for tackling this project. We can't recommend his content enough for anyone looking to get into agentic AI development!
- Ralph Keyser - Ralph kindly volunteered to be the guinea pig for testing the initial version of ADAPT and provided valuable feedback.