Problem
When using Ollama as the embedding provider, the embedQuery timeout (EMBED_TIMEOUT_MS = 10s) does not reliably abort stalled Ollama HTTP requests. This causes the gateway-level autoRecallTimeoutMs (120s) to fire instead.
Root Cause
embedder.ts uses the OpenAI SDK to call Ollama. The SDK HTTP client in Node.js does not reliably abort the underlying TCP connection when AbortController.abort() is called. Ollama keeps processing and the socket hangs until the 120s gateway timeout fires.
Evidence: CPU ~20%, Ollama CPU ~0%: the signature of a hanging HTTP connection, not a compute bottleneck.
Fix
Use native fetch for Ollama endpoints. Native fetch in Node.js 18+ correctly honors AbortController: the TCP connection is closed as soon as the signal fires.
Added isOllamaProvider() and embedWithNativeFetch() methods. Modified embedWithRetry() to route Ollama URLs through native fetch.
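A sketch of what those two pieces might look like. The method names come from the issue, but the bodies are assumptions: the URL heuristic, and the request/response shapes of Ollama's `/api/embeddings` endpoint (`{model, prompt}` in, `{embedding: number[]}` out), may differ from the actual implementation.

```typescript
// Illustrative sketch; method names from the issue, bodies assumed.

function isOllamaProvider(baseUrl: string): boolean {
  // Heuristic: Ollama listens on port 11434 by default, or the URL names it.
  return baseUrl.includes(":11434") || baseUrl.toLowerCase().includes("ollama");
}

async function embedWithNativeFetch(
  baseUrl: string,
  model: string,
  input: string,
  timeoutMs: number,
): Promise<number[]> {
  // Native fetch tears down the TCP connection when the signal aborts,
  // which is the behavior the SDK's HTTP client did not provide here.
  const res = await fetch(`${baseUrl}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt: input }),
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!res.ok) throw new Error(`Ollama embeddings request failed: ${res.status}`);
  const data = (await res.json()) as { embedding: number[] };
  return data.embedding;
}
```

With this in place, embedWithRetry() only needs a branch: if isOllamaProvider(baseUrl), call embedWithNativeFetch(); otherwise keep the existing SDK path.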
Test
30 iterations, all passed.
Abort time was consistently ~208-215ms with the signal set to fire at 200ms.
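The issue does not include the test harness itself; the sketch below reproduces one iteration of that kind of check under stated assumptions. A local HTTP server that accepts the connection but never replies stands in for a stalled Ollama request, and we measure how long native fetch takes to abort once AbortSignal.timeout(200) fires.

```typescript
import { createServer } from "node:http";

// One iteration of an abort-timing check (hypothetical harness, not the
// one used in the issue): a server that never responds simulates a stall.
async function measureAbortMs(timeoutMs: number): Promise<number> {
  const server = createServer(() => {
    /* intentionally never respond */
  });
  await new Promise<void>((resolve) => server.listen(0, resolve));
  const { port } = server.address() as { port: number };
  const start = Date.now();
  try {
    await fetch(`http://127.0.0.1:${port}/api/embeddings`, {
      method: "POST",
      body: "{}",
      signal: AbortSignal.timeout(timeoutMs),
    });
    throw new Error("request unexpectedly completed");
  } catch {
    // Native fetch rejects with a TimeoutError once the signal fires.
    return Date.now() - start;
  } finally {
    server.close();
  }
}
```

Looping this 30 times and asserting each measurement stays close to the 200ms deadline matches the shape of the result reported above.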
Environment
- memory-lancedb-pro: 1.1.0-beta.10
- Ollama: jina-v5-retrieval-test on localhost:11434
- Node.js: 24.x
- OS: Windows 11