fix: use native fetch for Ollama embedding to ensure AbortController works #383

Merged
rwmjhb merged 2 commits into CortexReach:master from jlin53882:fix/ollama-native-fetch-v2
Mar 29, 2026

@jlin53882
Contributor

Problem

When using Ollama as the embedding provider, the `embedQuery` timeout does not reliably abort stalled Ollama HTTP requests, so they hang until the gateway-level 120s timeout fires.

Root Cause

The OpenAI SDK's HTTP client does not reliably abort Ollama TCP connections when AbortController.abort() fires in Node.js.

Fix

Use native `fetch` for Ollama endpoints. Node.js 18+ native fetch correctly respects AbortController.

  • `isOllamaProvider()`: detects localhost:11434 / 127.0.0.1:11434 / /ollama URLs
  • `embedWithNativeFetch()`: calls Ollama via native fetch with proper signal handling
  • `embedWithRetry()` routes Ollama URLs through `embedWithNativeFetch()`
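
A minimal sketch of how the detection and native-fetch helpers listed above might look. The function names come from this PR. The bodies, the parameter lists, and the OpenAI-compatible `/embeddings` route with a `{ data: [{ embedding }] }` response are assumptions for illustration, not the code in `src/embedder.ts`.

```typescript
// Hypothetical sketch: names mirror the PR description, bodies are assumptions.

/** Returns true when the base URL looks like an Ollama endpoint. */
function isOllamaProvider(baseURL: string): boolean {
  return /localhost:11434|127\.0\.0\.1:11434|\/ollama\b/i.test(baseURL);
}

/** Requests embeddings with native fetch so that `signal` can cancel the request. */
async function embedWithNativeFetch(
  baseURL: string,
  model: string,
  input: string | string[],
  signal?: AbortSignal,
): Promise<number[][]> {
  // Assumes an OpenAI-compatible /embeddings route under baseURL (e.g. .../v1).
  const response = await fetch(`${baseURL.replace(/\/+$/, "")}/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input }),
    signal, // aborting this signal rejects the pending fetch
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed: HTTP ${response.status}`);
  }
  // Assumed response shape: { data: [{ embedding: number[] }, ...] }
  const payload = (await response.json()) as { data: { embedding: number[] }[] };
  return payload.data.map((item) => item.embedding);
}
```

A caller would then branch on `isOllamaProvider(baseURL)` and fall back to the SDK client for every other provider.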

Test

  • Test 8 added to cjk-recursion-regression.test.mjs
  • Assertion updated to match Embedding provider unreachable error (AliceLJY review fix)

Fixes #361.

…works

Root cause: OpenAI SDK HTTP client does not reliably abort Ollama
TCP connections when AbortController.abort() fires in Node.js. This
causes stalled sockets that hang until the gateway-level 120s timeout.

Fix: Add isOllamaProvider() to detect localhost:11434 endpoints, and
embedWithNativeFetch() using Node.js 18+ native fetch instead of the
OpenAI SDK. Native fetch properly closes TCP connections on abort.

Also adds Test 8 to cjk-recursion-regression.test.mjs with assertion
updated to match Embedding provider unreachable error (AliceLJY review fix).

Fixes CortexReach#361.
@jlin53882
Contributor Author

Summary of Changes

This is a clean rebase of PR #362 addressing your review feedback:

Applied fixes from your CI diagnosis:

  • Assertion regex updated: added `Embedding provider unreachable` to match the error that CI produces when no Ollama is running
  • Standalone test files removed: `pr354-standalone.mjs` and `pr354-30iter.mjs` are not included (they required a running Ollama instance)

Retained:

  • `embedder.ts`: `isOllamaProvider()` + `embedWithNativeFetch()` + `embedWithRetry()` routing
  • `cjk-recursion-regression.test.mjs`: Test 8 (in CI-safe form)

Excluded:

Original PR #362 has been closed. CI should now pass. Thanks for the detailed diagnosis @AliceLJY!


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6782a1dfee


// Use an unreachable port + localhost so isOllamaProvider() returns true
// (URL contains 127.0.0.1:11434) but nothing actually listens there.
// This forces native fetch to properly reject, validating the Ollama path.
const ollamaBaseURL = "http://127.0.0.1:11434/v1";

[P2] Point Ollama abort test at the local mock server

This test is meant to verify that abort signals propagate through the Ollama native-fetch path, but it hardcodes http://127.0.0.1:11434/v1 instead of using the baseURL from withServer. As written, the request usually fails immediately with connection refusal, so the slow handler (and its abort/destroy behavior) is never exercised; the test can pass even if abort propagation is broken.

Contributor Author


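The analysis is entirely correct; this is now fixed ✅

The diagnosis is precise:

http://127.0.0.1:11434/v1 is hardcoded, so the request always fails immediately with connection refused and never reaches the slow handler.

Root cause

withServer() allocates a random port, but the Embedder still points at 127.0.0.1:11434, so the request never touches the mock server's 11-second delay logic.

Fixes in the new commit (a26fd70)

  1. src/embedder.ts: withTimeout() accepts an external AbortSignal and merges it into the internal AbortController. embedQuery/embedPassage both accept signal?: AbortSignal and pass it down to the underlying request.

  2. test/cjk-recursion-regression.test.mjs: testOllamaAbortWithNativeFetch() now:

    • Binds the mock server directly to 127.0.0.1:11434 (so that isOllamaProvider() returns true)
    • Has the server delay its response by 5 seconds
    • Calls embedPassage(text, externalSignal) and aborts at 2 seconds
    • Asserts that the elapsed time is ≈ 2s (not 5s), proving that the abort interrupted the slow request
  3. test/embedder-ollama-abort.test.mjs: a standalone test file with better isolation.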
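
The signal merging described in item 1 could look roughly like the sketch below. Only the name `withTimeout()` and the idea of an optional external `AbortSignal` come from this comment. The signature (a `task` callback plus `timeoutMs`) and the body are hypothetical and may differ from `src/embedder.ts`.

```typescript
// Hypothetical sketch of merging an external signal into an internal controller.

/** Runs `task` with a signal that aborts on timeout or when `externalSignal` aborts. */
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  externalSignal?: AbortSignal,
): Promise<T> {
  const controller = new AbortController();
  const forwardAbort = () => controller.abort(externalSignal?.reason);

  if (externalSignal?.aborted) {
    forwardAbort(); // the caller cancelled before the task started
  } else {
    externalSignal?.addEventListener("abort", forwardAbort, { once: true });
  }

  const timer = setTimeout(
    () => controller.abort(new Error(`Timed out after ${timeoutMs}ms`)),
    timeoutMs,
  );
  try {
    // The task must pass this signal on to fetch for the abort to take effect.
    return await task(controller.signal);
  } finally {
    clearTimeout(timer);
    externalSignal?.removeEventListener("abort", forwardAbort);
  }
}
```

With this shape, whichever fires first (the timeout or the caller's signal) aborts the single signal handed to the request, so the request code only ever has to watch one signal.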

Test results

✔ Ollama embedWithNativeFetch aborts slow request within expected time (2029.86ms)
  PASSED (aborted in 2029ms < 5000ms threshold)

@jlin53882
Contributor Author

Note on CI failure: reflection-bypass-hook.test.mjs

The cli-smoke failure in this PR comes from test/reflection-bypass-hook.test.mjs, which is not part of our changes (this PR only touches src/embedder.ts and test/cjk-recursion-regression.test.mjs).

The test failures in reflection-bypass-hook are pre-existing on master. The error:

The input did not match /<derived-focus>/. Input: ''

appears to be caused by a date-based seed: the test uses `Date.UTC(2026, 2, 12)` for seed data, but the current date is 2026-03-26 or later. The derived memory entries therefore exceed the age filter threshold and get filtered out, so the result is empty.

Suggested fix (for a separate PR)

Update the hardcoded seed date in `test/reflection-bypass-hook.test.mjs` to a date within the active window (e.g., `Date.UTC(2026, 2, 26)`) or use a relative date.
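
A relative seed date might be computed as in the sketch below. `DAY_MS` and `relativeSeedTimestamp()` are hypothetical names, and the number of days to subtract is an assumption, since the age filter threshold is not stated here.

```typescript
// Hypothetical helper: derive the seed timestamp from "now" instead of hardcoding it.
const DAY_MS = 24 * 60 * 60 * 1000;

/** Returns a timestamp `daysAgo` days before `now`, so seeded entries stay recent. */
function relativeSeedTimestamp(daysAgo: number, now: number = Date.now()): number {
  return now - daysAgo * DAY_MS;
}

// Example: seed data dated one day before the test runs.
const seedTimestamp = relativeSeedTimestamp(1);
```

Because the seed moves with the clock, the test no longer starts failing once a fixed calendar date falls outside the filter window.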

I'll open a separate issue + PR for this. The Ollama fix itself (embedder.ts changes) is ready and unrelated to this failure.

@jlin53882
Contributor Author

PR Chain Update

See #362 for the full chain of related issues and PRs:

Recommended: merge #385 before #383 to clear the CI.

@rwmjhb
Collaborator

rwmjhb commented Mar 28, 2026

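The newly added testOllamaAbortWithNativeFetch() does not currently verify the core scenario, namely that AbortController can promptly interrupt a slow or hung request. The test does start a mock server with an 11-second slow response, but the Embedder still points at the hardcoded http://127.0.0.1:11434/v1 and does not use the baseURL provided by withServer(...). So what it verifies today is "the request takes the Ollama/native fetch branch and fails as expected when no server is running", not "abort correctly interrupts a slow request".

If you want to keep the test in its current form, I suggest narrowing the test name and comments to "verifies that Ollama URLs take the native fetch path". If you want it to serve as the core regression test for this fix, it needs to actually send the request to the slow mock server and assert that the abort happens within the expected time.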

@jlin53882
Contributor Author

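The analysis is entirely correct. Fixed in the new commit:

Root cause
withServer() dynamically allocates a random port but never passes that port to the Embedder; the Embedder still points at the hardcoded 127.0.0.1:11434. The request was therefore refused from start to finish and never reached the mock server's 11-second slow-response logic.

Fixes

  1. src/embedder.ts: signal propagation

    • withTimeout() now accepts externalSignal?: AbortSignal and merges it with the internal AbortController; an abort on either signal cancels the request
    • embedQuery / embedPassage / embedBatchQuery / embedBatchPassage all accept signal?: AbortSignal and pass it down to the underlying request
    • embedMany also passes the signal to embedSingle
  2. test/cjk-recursion-regression.test.mjs: the corrected testOllamaAbortWithNativeFetch()

    • The mock server binds directly to 127.0.0.1:11434 (so that isOllamaProvider() returns true)
    • The server waits 5 seconds before responding
    • embedPassage(text, externalSignal): the external signal triggers the abort at 2 seconds
    • Assertion: elapsed time ≈ 2 seconds (not 5 seconds), proving that the abort really interrupted the slow request
  3. test/embedder-ollama-abort.test.mjs: new standalone test file

    • Uses the official Node.js test() API for better isolation
    • Verifies the same thing: elapsed time ≈ 2s < 5s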

Verification results

✔ Ollama embedWithNativeFetch aborts slow request within expected time (2029ms)
  PASSED (aborted in 2029ms < 5000ms threshold)
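
The timing check reported above can be sketched in isolation as follows. This is not the project's test: it calls native `fetch` directly instead of the Embedder, listens on a random port rather than 127.0.0.1:11434, and `measureAbortLatency()` is a hypothetical helper. The `/v1/embeddings` path is only a placeholder, since the handler ignores it.

```typescript
import { createServer } from "node:http";
import type { AddressInfo } from "node:net";

/**
 * Starts a local server that answers after `delayMs`, aborts the request
 * after `abortMs`, and resolves with the elapsed time in milliseconds.
 */
async function measureAbortLatency(delayMs: number, abortMs: number): Promise<number> {
  const server = createServer((_req, res) => {
    const timer = setTimeout(() => res.end("{}"), delayMs);
    res.on("close", () => clearTimeout(timer)); // client went away early
  });
  await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
  const { port } = server.address() as AddressInfo;

  const controller = new AbortController();
  const abortTimer = setTimeout(() => controller.abort(), abortMs);
  const startedAt = Date.now();
  try {
    await fetch(`http://127.0.0.1:${port}/v1/embeddings`, {
      method: "POST",
      body: "{}",
      signal: controller.signal,
    });
    throw new Error("request completed; expected an abort");
  } catch (error) {
    // Anything other than an abort is a real failure and is rethrown.
    if ((error as Error).name !== "AbortError") throw error;
    return Date.now() - startedAt;
  } finally {
    clearTimeout(abortTimer);
    server.close();
  }
}
```

Calling `measureAbortLatency(5000, 2000)` mirrors the 5-second delay and 2-second abort described above; a result well below 5000 indicates the abort interrupted the slow request.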

@rwmjhb merged commit 6b2397c into CortexReach:master on Mar 29, 2026
5 checks passed
Development

Successfully merging this pull request may close these issues.

Bug: Ollama embedding AbortController doesn't abort — causes 120s gateway timeout

2 participants