fix(openai): use standard chat completions API for OpenRouter compatibility#1046

Open
NicolasArnouts wants to merge 3 commits into ItzCrazyKns:master from NicolasArnouts:fix/openrouter-compatibility

Conversation

@NicolasArnouts NicolasArnouts commented Mar 9, 2026

Summary

This PR fixes OpenRouter (and other OpenAI-compatible providers like LiteLLM) compatibility by replacing OpenAI-exclusive APIs with standard endpoints:

  • generateObject: Uses chat.completions.create with response_format: { type: 'json_object' } instead of chat.completions.parse (which returns 404 on OpenRouter)
  • streamObject: Uses chat.completions.create with streaming instead of responses.stream (OpenAI Responses API is not supported by other providers)
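The portable request shape the two bullets above rely on can be sketched as a plain params object. This is an illustration only — `buildJsonRequest` is a hypothetical helper, not code from this PR:

```typescript
// Sketch of the provider-agnostic request shape used with
// chat.completions.create; hypothetical helper, not the PR's code.
interface JsonRequestParams {
  model: string;
  messages: { role: 'system' | 'user'; content: string }[];
  response_format: { type: 'json_object' };
}

function buildJsonRequest(model: string, prompt: string): JsonRequestParams {
  return {
    model,
    messages: [
      // A system prompt nudges models that ignore response_format.
      {
        role: 'system',
        content: 'Respond with pure JSON only, no markdown code fences.',
      },
      { role: 'user', content: prompt },
    ],
    // Standard OpenAI-compatible field, accepted by OpenRouter, LiteLLM, etc.
    response_format: { type: 'json_object' },
  };
}
```

Because `response_format: { type: 'json_object' }` is part of the standard chat completions surface, the same params object works across compatible providers, unlike `chat.completions.parse`.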

Also adds a shared parseJson utility for stripping markdown code fences that LLMs sometimes wrap around JSON responses, even when json_object mode is set.
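A minimal sketch of such a utility, assuming the shape described above (the PR's actual `parseJson.ts` may differ in detail):

```typescript
// Hypothetical sketch of the fence-stripping helper: remove a leading
// ``` or ```json fence and the trailing fence before parsing.
function stripMarkdownFences(text: string): string {
  const trimmed = text.trim();
  if (!trimmed.startsWith('```')) return trimmed;
  return trimmed
    .replace(/^```(?:json)?\s*/i, '') // opening fence, optional json tag
    .replace(/```\s*$/, '') // closing fence
    .trim();
}

function safeParseJson(text: string): unknown | null {
  try {
    return JSON.parse(stripMarkdownFences(text));
  } catch {
    return null; // caller decides how to handle unparseable output
  }
}
```

Returning `null` instead of throwing lets each provider decide whether to retry, repair, or surface the error.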

Root Cause

The OpenAI SDK's chat.completions.parse() method calls the /chat/completions/parse endpoint, and responses.stream() uses the Responses API — both are OpenAI-exclusive and not implemented by OpenRouter or other compatible providers.

Tradeoffs

  • Lost: Server-side Zod schema validation via OpenAI's structured outputs
  • Gained: Compatibility with all OpenAI-compatible providers (OpenRouter, LiteLLM, Together, Fireworks, etc.)

The tradeoff is worthwhile since client-side validation with input.schema.parse() is already in place, and most non-OpenAI models don't support structured outputs anyway.

Related

Test Plan

  • Tested with OpenRouter using Grok 4.1 Fast, GPT-4o, and Claude Sonnet 4
  • Basic searches return valid JSON responses
  • Web search mode works correctly
  • Code follows existing patterns (used repairJson + stripMarkdownFences like other providers)

Summary by cubic

Switch to the standard chat completions API to restore compatibility with OpenRouter and other OpenAI‑compatible providers. Adds JSON cleaning and safer parsing, plus stronger guards to prevent crashes during streaming and search.

  • Bug Fixes
    • generateObject: use chat.completions.create with response_format: { type: 'json_object' }; add a system prompt to enforce pure JSON; handle empty content before parsing.
    • streamObject: stream via chat.completions.create; strip code fences; parse partial JSON incrementally, yielding {} until parseable.
    • JSON utils: add stripMarkdownFences/safeParseJson; apply in OpenAI and ollama providers to clean JSON before parsing.
    • Tool calls: add null-safety in convertToOpenAIMessages; parse streaming tool call arguments with try/catch and fall back to {} on errors.
    • Search: guard empty webSearch queries; default chatHistory to []; ensure SearXNG returns arrays on empty responses.
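The yield-`{}`-until-parseable behaviour described for streamObject can be sketched as follows. This is a simplified illustration: it only yields a real object once the accumulated buffer parses as complete JSON, whereas the PR describes parsing partial JSON incrementally:

```typescript
// Simplified sketch: accumulate streamed chunks and yield the parsed
// object when the buffer is valid JSON, or {} while it is incomplete.
function* streamPartialObjects(chunks: Iterable<string>): Generator<object> {
  let buffer = '';
  for (const chunk of chunks) {
    buffer += chunk;
    try {
      yield JSON.parse(buffer); // buffer is complete JSON so far
    } catch {
      yield {}; // not yet parseable; emit a placeholder
    }
  }
}
```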

Written for commit 424158b. Summary will update on new commits.

…bility

Replace OpenAI-exclusive APIs with standard endpoints that work across
OpenAI-compatible providers (OpenRouter, LiteLLM, etc.):

- generateObject: Use chat.completions.create with response_format
  instead of chat.completions.parse (returns 404 on OpenRouter)
- streamObject: Use chat.completions.create with streaming instead of
  responses.stream (OpenAI Responses API not supported by other providers)

Also adds shared parseJson utility for stripping markdown code fences
that LLMs sometimes wrap around JSON responses even with json_object mode.

Fixes ItzCrazyKns#959

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/lib/utils/parseJson.ts">

<violation number="1" location="src/lib/utils/parseJson.ts:21">
P2: Opening fence stripping only handles plain or `json` fences; other common language tags (```js, ```jsonc, etc.) leave the tag in the string and cause JSON parsing to fail.</violation>
</file>


Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

const trimmed = text.trim();
if (trimmed.startsWith('```')) {
  return trimmed
    .replace(/^```(?:json)?\s*/i, '')

@cubic-dev-ai cubic-dev-ai bot Mar 9, 2026


P2: Opening fence stripping only handles plain or json fences; other common language tags (js, jsonc, etc.) leave the tag in the string and cause JSON parsing to fail.

<file context>
@@ -0,0 +1,43 @@
+  const trimmed = text.trim();
+  if (trimmed.startsWith('```')) {
+    return trimmed
+      .replace(/^```(?:json)?\s*/i, '')
+      .replace(/```\s*$/, '')
+      .trim();
</file context>
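One way to address this violation (a suggestion, not code from the PR) is to drop the entire opening fence line regardless of its language tag:

```typescript
// Suggested broader fence stripping: accept any info string (json, js,
// jsonc, ...) after the opening ```; hypothetical, not the PR's code.
function stripAnyFence(text: string): string {
  const trimmed = text.trim();
  if (!trimmed.startsWith('```')) return trimmed;
  return trimmed
    .replace(/^```[^\n]*\n?/, '') // drop the fence line and any language tag
    .replace(/```\s*$/, '')
    .trim();
}
```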

Add check for empty content before calling repairJson to prevent
"is empty" error when the model returns null/empty response.
- Add null safety checks in convertToOpenAIMessages for tool call
  arguments that may be strings or objects
- Wrap tool call argument parsing in try-catch to handle malformed
  JSON gracefully, falling back to empty object on error
- Add guard against undefined/empty queries in webSearch action
- Add fallback for undefined chatHistory in researcher
- Ensure SearXNG always returns arrays even on empty responses

These fixes prevent crashes when OpenRouter/compatible providers
return unexpected data formats during streaming tool calls.
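The fall-back-to-`{}` behaviour this commit describes can be sketched as below (hypothetical helper name; cubic's review questions whether silently swallowing parse failures is desirable):

```typescript
// Hypothetical sketch of defensive tool-call argument parsing: malformed
// or missing streamed arguments fall back to {} instead of throwing.
function parseToolCallArgs(raw: string | null | undefined): object {
  if (!raw) return {}; // provider sent no arguments
  try {
    const parsed = JSON.parse(raw);
    // Guard against primitives like "42" or "null" parsing successfully.
    return typeof parsed === 'object' && parsed !== null ? parsed : {};
  } catch {
    return {}; // malformed JSON from the provider; avoid crashing the stream
  }
}
```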

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/lib/models/providers/openai/openaiLLM.ts">

<violation number="1" location="src/lib/models/providers/openai/openaiLLM.ts:187">
P2: Parse failures in streaming tool calls are now swallowed and emitted as a synthetic `{}` arguments tool call, which downstream consumers execute without validation. This can trigger incorrect tool execution instead of surfacing the parse error.</violation>

<violation number="2" location="src/lib/models/providers/openai/openaiLLM.ts:257">
P1: Raw model output is logged on JSON-repair failure, which can leak sensitive content into server logs.</violation>
</file>


            extractJson: true,
          }) as string;
        } catch (repairErr) {
          console.error('repairJson failed on content:', content);

@cubic-dev-ai cubic-dev-ai bot Mar 9, 2026


P1: Raw model output is logged on JSON-repair failure, which can leak sensitive content into server logs.

<file context>
@@ -229,13 +248,16 @@ class OpenAILLM extends BaseLLM<OpenAIConfig> {
+            extractJson: true,
+          }) as string;
+        } catch (repairErr) {
+          console.error('repairJson failed on content:', content);
+          throw new Error(`Failed to repair JSON: ${repairErr}`);
+        }
</file context>
Suggested change — replace:
console.error('repairJson failed on content:', content);
with:
console.error('repairJson failed', {
  error: repairErr instanceof Error ? repairErr.message : String(repairErr),
  contentLength: content.length,
});

existingCall.arguments += tc.function?.arguments || '';
return {
const argsToParse = existingCall.arguments || '{}';
parsedToolCalls.push({

@cubic-dev-ai cubic-dev-ai bot Mar 9, 2026


P2: Parse failures in streaming tool calls are now swallowed and emitted as a synthetic {} arguments tool call, which downstream consumers execute without validation. This can trigger incorrect tool execution instead of surfacing the parse error.

<file context>
@@ -163,27 +166,43 @@ class OpenAILLM extends BaseLLM<OpenAIConfig> {
                 existingCall.arguments += tc.function?.arguments || '';
-                return {
+                const argsToParse = existingCall.arguments || '{}';
+                parsedToolCalls.push({
                   ...existingCall,
-                  arguments: parse(existingCall.arguments),
</file context>



Development

Successfully merging this pull request may close these issues.

[BUG] v1.12.0: JSON parse error with Claude models via OpenAI-compatible endpoints (LiteLLM/OpenRouter)
