Commit fea4bd0

Merge branch 'main' into kaiming/BackendBenchIntegration
2 parents: b357267 + d11026e

File tree: 3 files changed (+37, -29 lines)


.gitignore

Lines changed: 2 additions & 1 deletion

```diff
@@ -117,6 +117,7 @@ triton_kernel_logs/
 *.log
 session_*/
 worker_*/
+.fuse/
 
 # Generated kernels
 kernel.py
@@ -139,6 +140,6 @@ CLAUDE.md
 .Spotlight-V100
 .Trashes
 ehthumbs.db
-Thumbs.db
+Thumbs.db
 # Local batch runner
 scripts/run_kernelbench_batch.py
```

README.md

Lines changed: 32 additions & 26 deletions

````diff
@@ -7,7 +7,9 @@ KernelAgent turns PyTorch programs into verified Triton kernels. It was designed
 - Parallel Triton kernel generation with strict runtime verification
 - End‑to‑end composition that rebuilds the original forward pass using only the synthesized kernels
 
-Blog post: [TBD] • Additional docs: coming soon
+Blog post: [PyTorch KernelFalcon](https://pytorch.org/blog/kernelfalcon-autonomous-gpu-kernel-generation-via-deep-agents/)
+
+Additional docs: coming soon
 
 ## Pipeline Overview
 
@@ -18,42 +20,46 @@ Every stage writes artifacts to a run directory under `.fuse/<run_id>/`, includi
 ## Quickstart
 
 ### Requirements
+- Python 3.8 – 3.12
 - Linux or macOS; CUDA‑capable GPU for Triton execution
-- Python 3.8–3.12
-- Triton (install separately: `pip install triton` or nightly from source)
-- At least one LLM provider:
-  - OpenAI (`OPENAI_API_KEY`, models like `o4-mini`, `gpt-5`)
-  - Anthropic (`ANTHROPIC_API_KEY`; default fallback model is `claude-sonnet-4-20250514` when `OPENAI_MODEL` is unset)
-  - Any OpenAI‑compatible relay endpoint (`LLM_RELAY_URL`, optional `LLM_RELAY_API_KEY`; see `triton_kernel_agent/providers/relay_provider.py`)
-- Gradio (UI dependencies; installed as part of the core package)
+- Triton (installed separately: `pip install triton` or nightly from source)
 - PyTorch (https://pytorch.org/get-started/locally/)
+- LLM provider ([OpenAI](https://openai.com/api/), [Anthropic](https://www.anthropic.com/), or a self-hosted relay)
 
-### Installation
+### Install
 ```bash
-git clone https://github.com/pytorch-labs/KernelAgent.git
-cd KernelAgent
-python -m venv .venv && source .venv/bin/activate  # choose your own env manager
-pip install -e .[dev]  # project + tooling deps
-pip install triton     # not part of extras; install the version you need
+pip install -e .
+```
 
-# (optional) Install KernelBench for problem examples
+#### (Optional) Install KernelBench for problem examples
+```bash
 git clone https://github.com/ScalingIntelligence/KernelBench.git
 ```
+Note: By default, the KernelAgent UI searches for KernelBench at the same level as `KernelAgent` (i.e. `../KernelBench`).
 
-### Configure credentials
-You can export keys directly or use an `.env` file that the CLIs load automatically:
+### Configure
+You can export keys directly or use an `.env` file that the CLIs load automatically.
 
 ```bash
-OPENAI_API_KEY=sk-...
-OPENAI_MODEL=gpt-5        # override default fallback (claude-sonnet-4-20250514)
+OPENAI_MODEL=gpt-5        # default model for extraction
 NUM_KERNEL_SEEDS=4        # parallel workers per kernel
 MAX_REFINEMENT_ROUNDS=10  # retry budget per worker
-LOG_LEVEL=INFO
+LOG_LEVEL=INFO            # logging level
+```
+
+#### LLM Providers
+KernelAgent currently supports OpenAI and Anthropic out of the box. You can also use a custom OpenAI-compatible endpoint.
+These can be configured in `.env` or via environment variables.
+```bash
+# OpenAI (models like `o4-mini`, `gpt-5`)
+OPENAI_API_KEY=sk-...
+
+# Anthropic (default; `claude-sonnet-4-20250514` is used when `OPENAI_MODEL` is unset)
+ANTHROPIC_API_KEY=sk-ant-...
 
-# Optional relay configuration for self-hosted gateways
-# LLM_RELAY_URL=http://127.0.0.1:11434
-# LLM_RELAY_API_KEY=your-relay-token
-# LLM_RELAY_TIMEOUT_S=120
+# Relay configuration for self-hosted gateways
+LLM_RELAY_URL=http://127.0.0.1:11434
+LLM_RELAY_TIMEOUT_S=120
 ```
 
 More knobs live in `triton_kernel_agent/agent.py` and `Fuser/config.py`.
@@ -153,9 +159,9 @@ These artifacts are designed for reproducibility: you can re-run a single kernel
 
 ## Documentation & Community
 
-- Architecture and deep-dive docs: `docs/kernelfalcon_overview.html`, `docs/kernelfalcon_agents2_overview.html`, `docs/FuserAgent_sketch.html`, `docs/fuser_agent_compare.html`
+- Architecture and deep-dive docs: `Coming Soon`
 - Issues: https://github.com/pytorch-labs/KernelAgent/issues
-- Discussions & blog posts: [TBD]
+- Blog post: https://pytorch.org/blog/kernelfalcon-autonomous-gpu-kernel-generation-via-deep-agents/
 
 ## License
 
````
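The Configure step above relies on the CLIs auto-loading an `.env` file. A minimal sketch of that pattern, assuming the common `python-dotenv` package (the diff does not name which loader KernelAgent's CLIs actually use):

```python
import os

# Assumption: python-dotenv (`pip install python-dotenv`); the README diff
# above does not say which .env loader the CLIs use.
from dotenv import load_dotenv

load_dotenv()  # merge KEY=value pairs from ./.env into os.environ; existing vars win

num_seeds = int(os.environ.get("NUM_KERNEL_SEEDS", "4"))         # parallel workers per kernel
max_rounds = int(os.environ.get("MAX_REFINEMENT_ROUNDS", "10"))  # retry budget per worker
model = os.environ.get("OPENAI_MODEL")  # None -> claude-sonnet-4-20250514 fallback, per the README
print(model, num_seeds, max_rounds)
```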
triton_kernel_agent/providers/relay_provider.py

Lines changed: 3 additions & 2 deletions

```diff
@@ -18,6 +18,7 @@
 
 import requests
 import logging
+import os
 
 from .base import BaseProvider, LLMResponse
 
@@ -34,7 +35,7 @@ class RelayProvider(BaseProvider):
     """
 
     def __init__(self):
-        self.server_url = "http://127.0.0.1:11434"
+        self.server_url = os.environ.get("LLM_RELAY_URL", "http://127.0.0.1:11434")
         self.is_available_flag = False
         super().__init__()
 
@@ -68,7 +69,7 @@ def get_response(
             self.server_url,
             json=request_data,
             headers={"Content-Type": "application/json"},
-            timeout=120.0,
+            timeout=int(os.environ.get("LLM_RELAY_TIMEOUT_S", 120)),
         )
 
         if response.status_code != 200:
```
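With both changes applied, the relay endpoint and timeout are driven by the same `LLM_RELAY_URL` and `LLM_RELAY_TIMEOUT_S` variables the README now documents. A minimal sketch of how the lookups resolve (the override values here are illustrative, not project defaults):

```python
import os

# Illustrative overrides; when unset, the code falls back to the defaults in the diff.
os.environ["LLM_RELAY_URL"] = "http://gateway.internal:8080"  # hypothetical gateway
os.environ["LLM_RELAY_TIMEOUT_S"] = "300"

server_url = os.environ.get("LLM_RELAY_URL", "http://127.0.0.1:11434")
timeout = int(os.environ.get("LLM_RELAY_TIMEOUT_S", 120))

print(server_url, timeout)  # http://gateway.internal:8080 300
```

One behavioral note: the old hard-coded `timeout=120.0` was a float, while `int(...)` only parses whole-second strings, so a value like `LLM_RELAY_TIMEOUT_S=1.5` now raises `ValueError`.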
