Merged
2 changes: 1 addition & 1 deletion .github/models.yml
@@ -47,7 +47,7 @@ ml_models:

installation:
python: "pip install fastembed"
- typescript: "npm install @xenova/transformers"
+ typescript: "npm install @huggingface/transformers"

# Heuristic Models (No ML Required)
heuristic_models:
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
@@ -15,7 +15,7 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
- python-version: ['3.9', '3.10', '3.11', '3.12']
+ python-version: ['3.9', '3.10', '3.11', '3.12', '3.13']

steps:
- name: Checkout code
@@ -134,7 +134,7 @@ jobs:
pnpm --filter @cascadeflow/core build

- name: Run TypeScript tests
- run: pnpm --filter @cascadeflow/core test || echo "No tests defined yet"
+ run: pnpm --filter @cascadeflow/core test

- name: Upload coverage
if: always()
63 changes: 63 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,63 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.0] - 2026-03-07

### Added

- **Harness API** — `init()`, `run()`, `@agent()` for zero-change observability, scoped budget runs, and decorated agent policy. Three modes: `off`, `observe`, `enforce`.
- **SDK auto-instrumentation** — Patches OpenAI and Anthropic Python SDKs to intercept every LLM call for cost tracking, budget enforcement, compliance gating, and decision tracing.
- **Six-dimension optimization** — Cost, latency, quality, budget, compliance, and energy tracked across every model call.
- **KPI-weighted routing** — Inject business priorities (`quality`, `cost`, `latency`, `energy`) as weights into model selection decisions.
- **Compliance gating** — GDPR, HIPAA, PCI, and strict model allowlists; block non-compliant models before execution.
- **Energy tracking** — Deterministic compute-intensity coefficients for carbon-aware AI operations.
- **Decision traces** — Full per-step audit trail: action, reason, model, cost, budget state, enforcement status.
- **Budget enforcement** — Per-run and per-user budget caps with automatic stop actions when limits are exceeded.
- **Framework integrations** — LangChain (Python + TypeScript), OpenAI Agents SDK, CrewAI, Google ADK, n8n, Vercel AI SDK.
- **TypeScript SDK** — `@cascadeflow/core`, `@cascadeflow/langchain`, `@cascadeflow/vercel-ai`, `@cascadeflow/ml`, `@cascadeflow/n8n-nodes-cascadeflow` published on npm.
- **Proxy Gateway** — Drop-in OpenAI/Anthropic-compatible HTTP server with mock and agent modes, streaming, tool calling, and embeddings support.
- **OpenClaw Server** — Standalone OpenAI-compatible server for OpenClaw deployments with semantic routing.
- **Paygentic integration** — Usage reporting and billing proxy for Paygentic platform.
- **Tool risk classification** — `ToolRiskClassifier` for per-tool-call routing based on risk level.
- **Circuit breaker** — Per-provider circuit breaker with configurable thresholds and recovery.
- **Dynamic configuration** — Runtime config updates via file watcher with change events.
- **Rule engine** — `RuleEngine` for declarative routing and policy rules.
- **Agent loops** — Multi-turn tool execution with automatic tool call, result, re-prompt cycles.
- **Semantic quality validation** — Optional ML-based quality scoring via FastEmbed embeddings.
- **15-domain auto-detection** — Code, math, medical, legal, finance, data, and more with per-domain routing pipelines.
- **Complexity detection** — 500+ technical terms, mathematical notation detection, density-aware scoring for long documents.

### Changed

- **Lazy imports** — `import cascadeflow` no longer eagerly loads all providers, numpy, or heavyweight submodules. Import time reduced from ~1900ms to ~20ms via PEP 562 lazy loading.
- **`__all__` reduced** — From 127 to ~20 essential public symbols. Non-essential exports remain accessible but are not star-exported.
- **`rich` moved to optional** — No longer a core dependency; falls back to stdlib logging when not installed. Install with `pip install cascadeflow[rich]`.
- **Integration import errors** — Failed optional integration imports now return proxy objects that raise `ImportError` with install hints on use, instead of silently returning `None`.
- **Proxy CORS default** — `cors_allow_origin` changed from `"*"` to `None` (opt-in) for secure-by-default deployments.

### Removed

- **Deprecated `CascadeAgent` parameters** — `config`, `tiers`, `workflows`, `enable_caching`, `cache_size`, `enable_callbacks` removed from constructor. Use `HarnessConfig` or dedicated APIs instead.
- **Submodule `__version__` strings** — Removed from `quality`, `streaming`, `telemetry`, `ml`, `tools`, `routing`, `interface` subpackages. Use `cascadeflow.__version__` instead.
- **Benchmark infrastructure** — `tests/benchmarks/`, `benchmark_results/`, and related docs removed (moved to separate benchmark repo).

### Fixed

- **Thread safety** — Added `threading.Lock` around SDK patch/unpatch state. `HarnessRunContext` counters guarded with lock for multi-threaded use.
- **Trace buffer** — `_trace` changed from `list` with manual slicing to `collections.deque(maxlen=1000)` for bounded memory.
- **Regex pre-compilation** — `ComplexityDetector` now pre-compiles all regex patterns in `__init__()` instead of per-`detect()` call.
- **Proxy body limit** — Added `max_body_bytes` (default 10MB) to `ProxyConfig`; returns 413 for oversized requests.
- **Proxy auth** — Added optional `auth_token` to `ProxyConfig`; returns 401 for unauthenticated requests when set.
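
The thread-safety and trace-buffer entries above can be illustrated together. The sketch below assumes a hypothetical `TraceBuffer` class; only `threading.Lock` and `collections.deque(maxlen=...)` come from the changelog, and the real `HarnessRunContext` may be structured differently.

```python
import threading
from collections import deque


class TraceBuffer:
    """Hypothetical bounded, lock-guarded trace store (illustration only)."""

    def __init__(self, maxlen=1000):
        # deque(maxlen=N) discards the oldest entry once N entries are held,
        # so memory stays bounded without manual slicing.
        self._trace = deque(maxlen=maxlen)
        self._lock = threading.Lock()

    def record(self, entry):
        # The lock makes the intent explicit and covers compound updates.
        with self._lock:
            self._trace.append(entry)

    def snapshot(self):
        # Copy under the lock so callers never iterate a mutating deque.
        with self._lock:
            return list(self._trace)


buf = TraceBuffer(maxlen=3)
for step in range(5):
    buf.record({"step": step})

steps = [entry["step"] for entry in buf.snapshot()]  # oldest entries dropped
```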

### Security

- Proxy gateway CORS tightened to opt-in (`None` default).
- Request body size limit prevents memory exhaustion attacks.
- Bearer token authentication for proxy gateway endpoints.
- Updated `SECURITY.md` supported version to 1.0.x.
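
As a rough illustration of the body-limit and bearer-token entries above, the sketch below returns the status code a gateway might send. The function `gate_request`, the order of the checks, the exact header key, and the reading of 10MB as 10 × 1024 × 1024 bytes are all assumptions, not the gateway's actual behavior.

```python
import hmac

# Assumption: the changelog's "10MB" default is read here as 10 * 1024 * 1024.
DEFAULT_MAX_BODY_BYTES = 10 * 1024 * 1024


def gate_request(headers, body, auth_token=None,
                 max_body_bytes=DEFAULT_MAX_BODY_BYTES):
    """Return 401, 413, or 200 for a request (hypothetical helper)."""
    if auth_token is not None:
        supplied = headers.get("Authorization", "")
        expected = f"Bearer {auth_token}"
        # compare_digest avoids leaking the match position through timing.
        if not hmac.compare_digest(supplied.encode(), expected.encode()):
            return 401
    if len(body) > max_body_bytes:
        return 413
    return 200


status_open = gate_request({}, b"{}")                   # no token configured
status_denied = gate_request({}, b"{}", auth_token="s3cret")
status_too_big = gate_request(
    {"Authorization": "Bearer s3cret"}, b"x" * 5,
    auth_token="s3cret", max_body_bytes=4,
)
```

Checking authentication before body size means an unauthenticated caller learns nothing about the configured limit; that ordering is a design choice made for this sketch.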

[1.0.0]: https://github.com/lemony-ai/cascadeflow/releases/tag/v1.0.0
6 changes: 3 additions & 3 deletions README.md
@@ -39,11 +39,11 @@

cascadeflow works where external proxies can't: per-step model decisions based on agent state, per-tool-call budget gating, runtime stop/continue/escalate actions, and business KPI injection during agent loops. It accumulates insight from every model call, tool result, and quality score — the agent gets smarter the more it runs. Sub-5ms overhead. Works with LangChain, OpenAI Agents SDK, CrewAI, Google ADK, n8n, and Vercel AI SDK.

```python
```bash
pip install cascadeflow
```

```tsx
```bash
npm install @cascadeflow/core
```

@@ -289,7 +289,7 @@ For advanced quality validation, enable ML-based semantic similarity checking to
**Step 1:** Install the optional ML packages:

```bash
- npm install @cascadeflow/ml @xenova/transformers
+ npm install @cascadeflow/ml @huggingface/transformers
```

**Step 2:** Enable semantic validation in your cascade:
8 changes: 4 additions & 4 deletions SECURITY.md
@@ -6,8 +6,8 @@ We release security updates for the following versions of cascadeflow:

| Version | Supported |
| ------- | ------------------ |
| 0.7.x | :white_check_mark: |
| < 0.7 | :x: |
| 1.0.x | :white_check_mark: |
| < 1.0 | :x: |

We recommend always using the latest version for the best security and features.

@@ -351,5 +351,5 @@ Security researchers who have helped improve cascadeflow security:

This security policy may be updated from time to time. Please check back regularly for updates.

- **Last Updated:** October 2025
- **Version:** 1.0
+ **Last Updated:** March 2026
+ **Version:** 1.1