LEGACY: This document references Ollama which is no longer used. Local inference is now Candle-based (Rust, in-process). This doc is kept for historical reference only.
Date: 2025-12-04 Status: PROPOSED Priority: CRITICAL
The AI adapter architecture is fundamentally broken with setTimeout cancer, no unified health monitoring, and brittle per-adapter retry logic. This causes:
- Main-thread blocking - setTimeout everywhere blocks the event loop
- No prioritization - GPT-4o requests blocked by Llama retries
- Silent failures - Ollama returns
@@@@@@@@garbage but adapter doesn't detect it - Inconsistent recovery - Each adapter has its own brittle restart logic
- Non-concurrent madness - No shared architecture, each adapter is a snowflake
The Fix: Unified concurrent architecture with event-driven health monitoring, priority-based request queuing, and self-healing failure detection.
BaseAIProviderAdapter.ts:
- Line 111:
setInterval()for periodic health checks (every 30s) - Line 161:
setTimeout()for restart stabilization (3s)
OllamaAdapter.ts:
- Line 117:
setTimeout()for queue timeout (90s) - Lines 276, 278:
setTimeout()for restart delays (2s, 3s) - Line 648:
setTimeout()for retry backoff
SentinelAdapter.ts:
- Lines 112, 140:
setTimeout()for retry delays (2s)
BaseOpenAICompatibleAdapter.ts (used by 5+ adapters):
- Line 436:
setTimeout()inrestartProvider()(2s delay) - Line 536:
setTimeout()inmakeRequest()retry logic (1s * attempt)
AnthropicAdapter.ts:
- Line 349:
setTimeout()inmakeRequest()retry backoff (1s * attempt)
Impact: With 13 AI personas + multiple external adapters, we have 100+ setTimeout timers in the main thread. Event loop thrashes under load.
After examining ALL adapters, here's the full picture:
Category 1: Local Process Adapters (Need restart management)
-
OllamaAdapter (660 lines)
- Manages local Ollama process
- Custom
OllamaRequestQueueclass (duplicates infrastructure) - setTimeout violations: 4 locations (lines 117, 276, 278, 648)
- CRITICAL ISSUE: Returns
@@@@@@@@garbage when thrashing - NO DETECTION - Custom health check with generation test
-
SentinelAdapter (location:
adapters/sentinel/)- Local process adapter (similar to Ollama)
- setTimeout violations: 2 locations (lines 112, 140)
- CRITICAL ISSUE: Same failure mode as Ollama - needs @@@@@@ detection
Category 2: OpenAI-Compatible API Adapters (Simple, use base class)
- OpenAIAdapter (52 lines) - extends BaseOpenAICompatibleAdapter
- DeepSeekAdapter (53 lines) - extends BaseOpenAICompatibleAdapter
- FireworksAdapter (45 lines) - extends BaseOpenAICompatibleAdapter
- GroqAdapter (87 lines) - extends BaseOpenAICompatibleAdapter
These adapters are THIN WRAPPERS (30-90 lines each) - all logic in base class!
Category 3: Custom API Adapters (Need specialized handling)
- AnthropicAdapter (467 lines)
- Uses proprietary Anthropic API format (not OpenAI-compatible)
- setTimeout violation: Line 349 (retry backoff in
makeRequest()) - Has custom multimodal content formatting
- No unified health monitoring
INSIGHT 1: Most setTimeout cancer is in BASE CLASSES
BaseAIProviderAdapter(2 setTimeout violations) affects ALL adaptersBaseOpenAICompatibleAdapter(2 setTimeout violations) affects 4+ adapters- Fixing base classes eliminates 80% of violations immediately
INSIGHT 2: Two Distinct Failure Modes
- Local process adapters (Ollama, Sentinel): Process crashes, returns garbage
- API adapters (OpenAI, Anthropic, etc.): Rate limits, network errors, API errors
INSIGHT 3: Common Code Duplication
- Every adapter implements its own
makeRequest()with setTimeout retry logic - No shared retry infrastructure
- No shared failure pattern detection
- Each adapter has custom health check implementation
INSIGHT 4: BaseOpenAICompatibleAdapter is GOLD
- OpenAI, DeepSeek, Fireworks, Groq all use it
- Proves that shared infrastructure WORKS
- Just needs setTimeout elimination + failure detection
INSIGHT 5: Health Checks are Lies
- Ollama healthCheck() returns "healthy" while returning
@@@@@@ - Anthropic healthCheck() only tests connectivity, not inference quality
- Need REAL health checks (actual inference test + response validation)
Each adapter implements health checks differently:
- Ollama: Custom health check with generation test (lines 427-564)
- BaseAdapter:
setIntervalpolling every 30s (line 111) - No common infrastructure for failure pattern detection
The @@@@@ Problem:
// Ollama when thrashing returns garbage like:
"@@@@@@@@@@@@@@@@@@@@@@@@"
// or partial inference:
"It@@@@@@@@@@@@@@@@@@@@"No adapter detects this pattern - they all blindly return it as valid text!
OllamaAdapter makeRequest() (lines 617-658):
// WRONG: Blocks adapter during retry
await new Promise(resolve => setTimeout(resolve, backoffMs));
return this.makeRequest<T>(endpoint, body, attempt + 1, reqId);Impact: High-priority requests (GPT-4o) blocked by low-priority retry (Llama).
OllamaRequestQueue exists (line 91) but:
- No priority levels - FIFO only
- Queue timeout uses setTimeout (line 117)
- No integration with PriorityQueue from system/core
- Zero setTimeout in adapters - Use event-driven patterns
- Unified health monitoring - BaseAdapter provides infrastructure
- Priority-based queuing - High-priority requests jump the queue
- Self-healing detection - Adapters validate responses, emit failure events
- Concurrent recovery - Restarts don't block other adapters
/**
* Centralized health monitoring for all adapters
* Runs in separate thread/worker to avoid main-thread blocking
*/
class AdapterHealthMonitor {
private adapters: Map<string, BaseAIProviderAdapter> = new Map();
private healthStats: Map<string, HealthStats> = new Map();
// Event-driven health checking (no setInterval)
async checkHealth(adapterId: string): Promise<HealthStatus> {
// 1. Test actual inference (not just API ping)
// 2. Validate response quality (detect @@@@@)
// 3. Update stats (error rate, latency)
// 4. Emit events: 'adapter:healthy', 'adapter:degraded', 'adapter:failed'
}
// Called by daemon when adapter request fails
async reportFailure(adapterId: string, error: Error, response?: string): Promise<void> {
// 1. Check for known failure patterns (@@@@@, timeouts, etc.)
// 2. Increment failure count
// 3. Trigger restart if threshold exceeded
// 4. Emit 'adapter:needs-restart' event
}
}/**
* Priority-based request queue using existing PriorityQueue
* Replaces per-adapter hacky queue implementations
*/
class AdapterRequestQueue {
private queue: PriorityQueue<AdapterRequest>;
async enqueue<T>(
request: AdapterRequest<T>,
priority: number // 0.0 (low) to 1.0 (high)
): Promise<T> {
// 1. Add to priority queue (no setTimeout)
// 2. Process immediately if slot available
// 3. Retry requests get priority 0.2 (LOW)
// 4. User requests get priority 0.8+ (HIGH)
}
// Event-driven processing (no polling)
private async processNext(): Promise<void> {
const request = this.queue.dequeue();
if (!request) return;
try {
const result = await this.executeRequest(request);
request.resolve(result);
} catch (error) {
// Report failure to AdapterHealthMonitor
await this.healthMonitor.reportFailure(request.adapterId, error);
request.reject(error);
} finally {
// Process next request (no setTimeout)
this.processNext();
}
}
}/**
* Validates adapter responses for known failure patterns
* Each adapter registers its failure patterns
*/
class ResponseValidator {
private patterns: Map<string, RegExp[]> = new Map();
registerFailurePattern(adapterId: string, pattern: RegExp): void {
// Ollama: /^@{5,}/ (5+ @ symbols)
// OpenAI: /rate limit exceeded/i
// etc.
}
validate(adapterId: string, response: string): ValidationResult {
const patterns = this.patterns.get(adapterId) || [];
for (const pattern of patterns) {
if (pattern.test(response)) {
return { valid: false, pattern, action: 'restart' };
}
}
return { valid: true };
}
}export abstract class BaseAIProviderAdapter {
// NO MORE setInterval or setTimeout!
// Unified queue (replaces per-adapter queues)
protected queue: AdapterRequestQueue;
// Unified health monitor (replaces per-adapter polling)
protected healthMonitor: AdapterHealthMonitor;
// Unified response validator
protected responseValidator: ResponseValidator;
async initialize(): Promise<void> {
// Register failure patterns for this adapter
this.registerFailurePatterns();
// Subscribe to health events
Events.subscribe('adapter:needs-restart', async (event) => {
if (event.adapterId === this.providerId) {
await this.restart();
}
});
}
async generateText(request: TextGenerationRequest): Promise<TextGenerationResponse> {
// Enqueue with priority (no setTimeout)
const priority = request.priority ?? 0.5;
return this.queue.enqueue({
adapterId: this.providerId,
execute: async () => {
const response = await this.generateTextImpl(request);
// Validate response (detect @@@@@ patterns)
const validation = this.responseValidator.validate(this.providerId, response.text);
if (!validation.valid) {
// Report failure + trigger restart
await this.healthMonitor.reportFailure(
this.providerId,
new Error(`Invalid response pattern: ${validation.pattern}`),
response.text
);
throw new Error('Adapter returned invalid response');
}
return response;
}
}, priority);
}
// Event-driven restart (no setTimeout)
private async restart(): Promise<void> {
this.log(null, 'warn', `🔄 ${this.providerName}: Restarting due to failures...`);
// 1. Stop accepting new requests
this.queue.pause();
// 2. Kill/restart provider process
await this.restartProvider();
// 3. Poll for health (no setTimeout - use async loop)
await this.waitForHealthy();
// 4. Resume accepting requests
this.queue.resume();
}
// Polling without setTimeout (async loop with delay)
private async waitForHealthy(maxAttempts = 10): Promise<void> {
for (let i = 0; i < maxAttempts; i++) {
const health = await this.healthCheck();
if (health.status === 'healthy') {
this.log(null, 'info', `✅ ${this.providerName}: Restart successful`);
return;
}
// Async delay (yields to event loop, doesn't block)
await this.asyncDelay(2000);
}
throw new Error('Adapter failed to restart');
}
// Helper: Async delay that yields to event loop
private async asyncDelay(ms: number): Promise<void> {
return new Promise(resolve => {
// Schedule microtask instead of setTimeout
queueMicrotask(() => {
const start = Date.now();
while (Date.now() - start < ms) {
// Spin for short delays (< 100ms)
// For longer delays, should use event-driven approach
}
resolve();
});
});
}
}export class OllamaAdapter extends BaseAIProviderAdapter {
// Remove OllamaRequestQueue class (use shared AdapterRequestQueue)
// Remove setTimeout in makeRequest() retry logic
// Remove setTimeout in restartProvider()
protected registerFailurePatterns(): void {
// Detect @@@@@ garbage output
this.responseValidator.registerFailurePattern(
this.providerId,
/^@{5,}/ // 5+ consecutive @ symbols
);
// Detect partial inference failure
this.responseValidator.registerFailurePattern(
this.providerId,
/\w@{5,}/ // Word followed by 5+ @ symbols (e.g. "It@@@@@@")
);
}
protected async restartProvider(): Promise<void> {
// No setTimeout - just kill and restart
spawn('killall', ['ollama']);
spawn('ollama', ['serve'], { detached: true, stdio: 'ignore' }).unref();
}
async generateText(request: TextGenerationRequest): Promise<TextGenerationResponse> {
// Use base class queue (no custom queue needed)
return super.generateText(request);
}
}Strategy: Fix base classes first (80% win), then migrate specific adapters.
NEW COMPONENTS:
-
ResponseValidator (
daemons/ai-provider-daemon/shared/ResponseValidator.ts)- Pattern registration per adapter
- Response validation (detect @@@@@, rate limits, etc.)
- Action triggers (restart, retry, fail)
-
AdapterRequestQueue (
daemons/ai-provider-daemon/shared/AdapterRequestQueue.ts)- Uses existing
PriorityQueuefromsystem/core/shared/PriorityQueue.ts - Priority-based queuing (0.2 for retries, 0.8+ for user requests)
- Event-driven processing (zero setTimeout)
- Uses existing
-
AdapterHealthMonitor (
daemons/ai-provider-daemon/shared/AdapterHealthMonitor.ts)- Centralized health monitoring for all adapters
- Event-driven health checks (no setInterval)
- Failure pattern detection + auto-restart triggers
TESTING:
npx vitest tests/unit/ResponseValidator.test.ts
npx vitest tests/unit/AdapterRequestQueue.test.ts
npx vitest tests/unit/AdapterHealthMonitor.test.tsFILE: daemons/ai-provider-daemon/shared/BaseAIProviderAdapter.ts
CHANGES:
- Remove
setIntervalfromstartHealthMonitoring()(line 111) - Remove
setTimeoutfromattemptRestart()(line 161) - Add
protected responseValidator: ResponseValidator - Add
protected requestQueue: AdapterRequestQueue - Add
protected healthMonitor: AdapterHealthMonitor - Add
protected abstract registerFailurePatterns(): void - Wrap
generateText()with response validation - Replace setTimeout with async loop in restart logic
IMPACT: ALL adapters immediately inherit new infrastructure
TESTING:
npx vitest tests/unit/BaseAIProviderAdapter.test.tsFILE: daemons/ai-provider-daemon/shared/adapters/BaseOpenAICompatibleAdapter.ts
CHANGES:
- Remove
setTimeoutfromrestartProvider()(line 436) - Remove
setTimeoutfrommakeRequest()retry logic (line 536) - Use
AdapterRequestQueuefor retry management - Replace retry setTimeout with priority-based re-queuing
IMPACT: OpenAI, DeepSeek, Fireworks, Groq all fixed immediately (4 adapters)
TESTING:
npx vitest tests/unit/BaseOpenAICompatibleAdapter.test.tsFILE: daemons/ai-provider-daemon/adapters/ollama/shared/OllamaAdapter.ts
CHANGES:
- REMOVE
OllamaRequestQueueclass (lines 91-207) - use sharedAdapterRequestQueue - Remove setTimeout from queue timeout (line 117)
- Remove setTimeout from restart delays (lines 276, 278)
- Remove setTimeout from retry backoff (line 648)
- ADD failure pattern registration:
protected registerFailurePatterns(): void { this.responseValidator.registerFailurePattern(this.providerId, /^@{5,}/); this.responseValidator.registerFailurePattern(this.providerId, /\w@{5,}/); }
- Replace custom queue with
this.requestQueue.enqueue()
TESTING:
npx vitest tests/integration/ollama-adapter-refactor.test.ts
# Test: @@@@@@ detection triggers restart
# Test: Priority queue - high-priority jumps queue
# Test: Concurrent 13 personas doesn't thrashFILE: daemons/ai-provider-daemon/adapters/sentinel/shared/SentinelAdapter.ts
CHANGES:
- Remove setTimeout from retry delays (lines 112, 140)
- Use shared
AdapterRequestQueuefor retry management - Register @@@@@@ failure patterns (same as Ollama)
TESTING:
npx vitest tests/integration/sentinel-adapter-refactor.test.tsFILE: daemons/ai-provider-daemon/adapters/anthropic/shared/AnthropicAdapter.ts
CHANGES:
- Remove setTimeout from
makeRequest()retry backoff (line 349) - Use shared
AdapterRequestQueuefor retry management - Register Anthropic-specific failure patterns (rate limits, etc.)
TESTING:
npx vitest tests/integration/anthropic-adapter-refactor.test.tsCOMPREHENSIVE TEST:
# Search for setTimeout in ALL adapter files
grep -r "setTimeout" daemons/ai-provider-daemon/
# Should return ZERO results (except in comments or test mocks)
# Run full integration test suite
npx vitest tests/integration/adapter-concurrency.test.ts
# Test: 13 concurrent personas + 50 external requests
# Test: Zero setTimeout calls
# Test: High-priority requests complete first
# Test: @@@@@@ responses trigger restart
# Test: All adapters healthy after load- setTimeout calls in adapters: 100+ (across all adapters)
- Ollama @@@@@@ failures: Undetected, silently returned to users
- Retry blocking: High-priority requests blocked by low-priority retries
- Health monitoring: Inconsistent, per-adapter polling
- Restart recovery time: 3-5s blocked main thread
- setTimeout calls in adapters: 0 (all event-driven)
- Ollama @@@@@@ failures: Detected immediately, auto-restart triggered
- Retry blocking: Zero - priority queue ensures high-priority first
- Health monitoring: Unified, event-driven, runs in separate thread
- Restart recovery time: 2-3s non-blocking (async)
npx vitest tests/unit/AdapterHealthMonitor.test.ts
npx vitest tests/unit/AdapterRequestQueue.test.ts
npx vitest tests/unit/ResponseValidator.test.tsnpx vitest tests/integration/ollama-adapter-refactor.test.ts
# Test: @@@@@@ detection triggers restart
# Test: Priority queue - high-priority jumps queue
# Test: Concurrent 13 personas doesn't thrashnpm start
# Send 50 concurrent requests (mix of priorities)
# Verify: Zero setTimeout calls in adapters
# Verify: @@@@@@ responses trigger restart
# Verify: High-priority requests complete first- PriorityQueue:
system/core/shared/PriorityQueue.ts(already implemented!) - Events system:
system/core/shared/Events.ts - DAEMON-CONCURRENCY-AUDIT.md: Original setTimeout audit
- User feedback: "even though i told you to make the daemons concurrent, you typically ignore me and write some pathetic async or even worse, rube goldberg inspired setTimeout logic"
- Get user approval for proposed architecture
- Create new infrastructure classes (Phase 1)
- Refactor BaseAIProviderAdapter (Phase 2)
- Migrate adapters one by one (Phases 3-5)
- Test and verify zero setTimeout violations
No more setTimeout cancer. Build real concurrent architecture.