
feat(providers): add openai-oxide as alternative OpenAI provider#521

Open
fortunto2 wants to merge 3 commits into moltis-org:main from fortunto2:feat/openai-oxide-v2

Conversation


@fortunto2 fortunto2 commented Mar 30, 2026

Summary

  • Add openai-oxide 0.11.1 as standalone OpenAI provider ([providers.openai-oxide])
  • Supports both Chat Completions and Responses API via wire_api config
  • Full tool calling, vision, reasoning effort, WebSocket streaming
  • Switch between oxide and built-in by renaming config section — no rebuild needed
  • Both provider-async-openai and provider-openai-oxide compile by default

Config-based switching

Use oxide (Responses API + WebSocket):

[providers.openai-oxide]
enabled = true
api_key = "${OPENAI_API_KEY}"
wire_api = "responses"
stream_transport = "auto"

Use built-in (Chat Completions + SSE):

[providers.openai]
enabled = true
api_key = "${OPENAI_API_KEY}"

Feature comparison

| Feature | async_openai (built-in) | openai_oxide |
| --- | --- | --- |
| Chat Completions | ✓ | ✓ |
| Responses API | — | ✓ |
| Streaming usage tokens | ✓ | ✓ |
| Tool calling | ✓ | ✓ |
| Streaming tool calls | ✓ | ✓ |
| Vision | ✓ | ✓ |
| Reasoning effort | ✓ | ✓ |
| WebSocket transport | — | ✓ |
| SSE + Auto fallback | SSE only | ✓ |

Validation

Completed

  • cargo check passes (full workspace)
  • cargo build passes
  • moltis doctor — provider detected
  • Gateway starts, serves UI, WebSocket streaming works
  • Tested gpt-5-mini via Responses API + WebSocket
  • Config-based switching verified (oxide ↔ built-in)
  • openai-oxide 0.11.1 API compatibility (3 breaking changes fixed)

Remaining

  • cargo test (provider unit tests)
  • just lint / just format-check
  • E2E test with tool calling

Manual QA

  1. Add [providers.openai-oxide] to config with wire_api = "responses"
  2. Set OPENAI_API_KEY, start gateway
  3. Select any openai-oxide:: model in UI
  4. Send message — verify streaming response via WebSocket
  5. Rename section to [providers.openai] — verify built-in works

🤖 Generated with Claude Code

…cket

Add openai-oxide 0.11.1 as alternative OpenAI provider supporting:
- Chat Completions and Responses API (WireApi config)
- SSE, WebSocket, and Auto transport modes
- Full tool calling (extraction, streaming, replay)
- Vision, reasoning effort, model discovery
- 888 lines replacing 5300+ lines of manual HTTP/SSE code

Replaces PR moltis-org#487 (rebased on current main, oxide bumped 0.10.1 → 0.11.1).

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR introduces openai-oxide 0.11.1 as a compile-time-selectable replacement for async-openai, supporting both the Chat Completions and Responses APIs with streaming, tool calling, vision, reasoning effort, and WebSocket transport. The implementation is well-structured — the process_response_event helper cleanly centralises event mapping and is reused across SSE, WebSocket, and auto-fallback paths. Unit tests cover message-building and extraction helpers.

Two P1 bugs need resolution before merge:

  • Tool results dropped in Responses API path (openai_oxide_provider.rs): split_for_responses silently discards ChatMessage::Tool (tool-call results) and UserContent::Multimodal messages via the catch-all _ => {}. Any agent turn that calls a tool and receives a result will have the tool output missing from the next ResponseCreateRequest, breaking the agent loop when WireApi::Responses is in use.
  • Hardcoded display name and single-model registration (lib.rs): display_name is unconditionally set to "GPT-4o (openai-oxide)" regardless of the configured model, and only the first entry from configured_models_for_provider is registered. Users who configure additional models (e.g. gpt-4-turbo, o3-mini) will see them silently ignored.

Additionally, ToolCallComplete events in stream_chat are emitted by iterating a HashMap, producing non-deterministic ordering for multi-tool responses (P2). The provider-openai-oxide feature is added to the workspace defaults, which contradicts the "opt-in" framing in the PR description (P2).

Confidence Score: 4/5

Not safe to merge as-is — tool results are silently dropped in the Responses API path, which breaks agent loops using tool calling.

Two P1 issues block merge: the split_for_responses silent drop of ChatMessage::Tool results will cause agent tool-call loops to malfunction under WireApi::Responses, and the hardcoded "GPT-4o" display name plus single-model registration limit usability. The overall architecture and streaming implementation are solid; fixing these targeted issues should bring the PR to merge-ready.

crates/providers/src/openai_oxide_provider.rs (split_for_responses tool-result handling) and crates/providers/src/lib.rs (display_name + multi-model registration)

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/providers/src/openai_oxide_provider.rs | New 888-line provider implementation; clean overall structure, but split_for_responses silently drops ChatMessage::Tool results and multimodal user messages, breaking agent tool-call loops under WireApi::Responses. ToolCallComplete ordering in streaming is also non-deterministic for multiple simultaneous tool calls. |
| crates/providers/src/lib.rs | Registration function only registers the first configured model and hardcodes display_name to "GPT-4o (openai-oxide)" regardless of actual model; both are P1 correctness issues. |
| Cargo.toml | Adds openai-oxide 0.11.1 workspace dependency and enables provider-openai-oxide by default; contradicts PR's "opt-in" framing but is otherwise structurally sound. |
| crates/providers/Cargo.toml | Adds provider-openai-oxide feature flag and optional openai-oxide dependency; straightforward and correctly structured. |
| Cargo.lock | Lock file updated to add openai-oxide 0.11.1 and its transitive deps (including gloo-timers for WebSocket); also bumps several windows-sys entries from 0.48/0.59/0.60 to 0.61.2. |

Sequence Diagram

sequenceDiagram
    participant C as Caller
    participant P as OpenAiOxideProvider
    participant OAI as openai-oxide client

    C->>P: stream_with_tools(messages, tools)
    alt WireApi::ChatCompletions
        P->>P: build_chat_messages()
        P->>OAI: chat().completions().create_stream()
        OAI-->>P: SSE chunks (content / tool_calls / usage)
        P-->>C: StreamEvent::Delta / ToolCallStart / ToolCallArgumentsDelta / ToolCallComplete / Done
    else WireApi::Responses + SSE
        P->>P: split_for_responses() ⚠️ drops Tool & Multimodal
        P->>P: build_responses_request()
        P->>OAI: responses().create_stream()
        OAI-->>P: ResponseStreamEvent
        P->>P: process_response_event()
        P-->>C: StreamEvent::Delta / ToolCallStart / ToolCallArgumentsDelta / ToolCallComplete / Done
    else WireApi::Responses + WebSocket
        P->>OAI: ws_session()
        OAI-->>P: WsSession
        P->>OAI: session.send_stream()
        OAI-->>P: ResponseStreamEvent (WS)
        P->>P: process_response_event()
        P-->>C: StreamEvent::Delta / ... / Done
    else WireApi::Responses + Auto
        P->>OAI: ws_session() [attempt]
        alt WS succeeds
            OAI-->>P: WsSession
            P->>OAI: session.send_stream()
        else WS fails
            P->>P: fallback to stream_responses (SSE)
        end
        P-->>C: StreamEvent (from whichever path succeeded)
    end

Comments Outside Diff (1)

  1. Cargo.toml, line 228 (link)

    P2 "Opt-in" feature is enabled by default in workspace

    The PR description states provider-openai-oxide is "opt-in", but this line enables it unconditionally in the workspace's default feature set. If provider-async-openai is ever re-added here (or is enabled via another workspace member), both providers will register against the same "openai" config key and the first-one-wins race will silently suppress whichever loads second. The debug warning on the other side is easy to miss.

    Consider either: (a) clarifying the documentation that this is now the default OpenAI provider, or (b) removing it from the workspace defaults and requiring explicit opt-in as described.
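A sketch of option (b), with the oxide feature moved out of the workspace defaults; the exact default set and dependency wiring here are assumed for illustration:

```toml
# Hypothetical workspace Cargo.toml excerpt (default set assumed).
[features]
default = ["provider-async-openai"]          # oxide removed from defaults
# Opt in explicitly, e.g.: cargo build --features provider-openai-oxide
provider-openai-oxide = ["dep:openai-oxide"]
```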

Reviews (1): Last reviewed commit: "feat(providers): add openai-oxide provid..."

Comment on lines +537 to +574
#[async_trait]
impl LlmProvider for OpenAiOxideProvider {
    fn name(&self) -> &str {
        self.alias.as_deref().unwrap_or("openai-oxide")
    }

    fn id(&self) -> &str {
        &self.model
    }

    fn supports_tools(&self) -> bool {
        true
    }

    fn supports_vision(&self) -> bool {
        true
    }

    fn reasoning_effort(&self) -> Option<ReasoningEffort> {
        self.reasoning_effort
    }

    fn with_reasoning_effort(
        self: Arc<Self>,
        effort: ReasoningEffort,
    ) -> Option<Arc<dyn LlmProvider>> {
        Some(Arc::new(Self {
            model: self.model.clone(),
            client: self.client.clone(),
            alias: self.alias.clone(),
            reasoning_effort: Some(effort),
            wire_api: self.wire_api,
            stream_transport: self.stream_transport,
        }))
    }

    fn context_window(&self) -> u32 {
        128_000

P1 Tool results silently dropped in Responses API path

split_for_responses handles System, User { Text }, and Assistant { content } but the catch-all _ => {} silently discards both ChatMessage::Tool { ... } (tool-call results) and ChatMessage::User { content: UserContent::Multimodal(_) } (vision messages).

For any agent turn that involves tool calling with WireApi::Responses, the ChatMessage::Tool result messages produced by the tool executor will be dropped before the request is built. The resulting ResponseCreateRequest will contain an assistant message that references a function call with no corresponding output item, which causes the Responses API to return an error or silently loop — breaking the agent loop entirely.

The Responses API represents tool results as function_call_output input items. Something like:

ChatMessage::Tool { tool_call_id, content, .. } => {
    input.push(ResponseInputItem {
        role: openai_oxide::types::responses::Role::Tool,
        content: serde_json::json!({
            "type": "function_call_output",
            "call_id": tool_call_id,
            "output": content,
        }),
    });
}

Vision/multimodal messages (UserContent::Multimodal) also need handling — despite the PR's claim of vision support, vision currently works only via the Chat Completions path, not the Responses API.

Comment on lines 1778 to 1830
        );
    }

    #[cfg(feature = "provider-openai-oxide")]
    fn register_openai_oxide_providers(
        &mut self,
        config: &ProvidersConfig,
        env_overrides: &HashMap<String, String>,
    ) {
        if !config.is_enabled("openai") {
            return;
        }

        let Some(key) = resolve_api_key(config, "openai", "OPENAI_API_KEY", env_overrides) else {
            return;
        };

        let base_url = config
            .get("openai")
            .and_then(|e| e.base_url.clone())
            .or_else(|| env_value(env_overrides, "OPENAI_BASE_URL"))
            .unwrap_or_else(|| "https://api.openai.com/v1".into());

        let model_id = configured_models_for_provider(config, "openai")
            .into_iter()
            .next()
            .unwrap_or_else(|| "gpt-4o".to_string());

        let alias = config.get("openai").and_then(|e| e.alias.clone());
        let provider_label = alias.clone().unwrap_or_else(|| "openai-oxide".into());
        if self.has_model_any_provider(&model_id) {
            return;
        }

        let provider = Arc::new(openai_oxide_provider::OpenAiOxideProvider::with_alias(
            key,
            model_id.clone(),
            base_url,
            alias,
        ));
        self.register(
            ModelInfo {
                id: model_id,
                provider: provider_label,
                display_name: "GPT-4o (openai-oxide)".into(),
                created_at: None,
            },
            provider,
        );
    }

    #[cfg(feature = "provider-openai-codex")]
    fn register_openai_codex_providers(

P1 display_name hardcoded regardless of model; only first configured model registered

Two related issues in register_openai_oxide_providers:

1. Hardcoded display name: The display_name is unconditionally set to "GPT-4o (openai-oxide)" regardless of which model is actually selected. If a user sets model = "gpt-4-turbo" or model = "o3-mini" in their config, the UI will still show "GPT-4o (openai-oxide)", surfacing incorrect data.

// Current (wrong):
display_name: "GPT-4o (openai-oxide)".into(),

// Should be derived from the actual model_id:
display_name: format!("{model_id} (openai-oxide)"),

2. Only the first configured model is registered: configured_models_for_provider(config, "openai").into_iter().next() silently discards every configured model after the first. Other providers (e.g. async-openai) register all models; here a user who has configured ["gpt-4o", "gpt-4-turbo"] will only get gpt-4o registered, with no indication the others were ignored.

Comment on lines +615 to +650
                request.stream_options = Some(StreamOptions { include_usage: Some(true) });
                stream_chat(&self.client, request)
            }
            WireApi::Responses => {
                let request = self.build_responses_request(&messages, &tools);
                match self.stream_transport {
                    ProviderStreamTransport::Websocket => {
                        stream_responses_ws(&self.client, request)
                    }
                    ProviderStreamTransport::Auto => {
                        // Auto: try WS, fallback to SSE
                        stream_responses_auto(&self.client, request)
                    }
                    ProviderStreamTransport::Sse => {
                        stream_responses(&self.client, request)
                    }
                }
            }
        }
    }
}

impl OpenAiOxideProvider {
    fn build_responses_request(
        &self,
        messages: &[ChatMessage],
        tools: &[serde_json::Value],
    ) -> ResponseCreateRequest {
        let (instructions, input) = split_for_responses(messages);
        let mut request = ResponseCreateRequest::new(&self.model)
            .input(ResponseInput::Messages(input));
        if let Some(instr) = instructions {
            request = request.instructions(instr);
        }
        if !tools.is_empty() {
            request.tools = Some(tools_to_response_tools(tools));

P2 ToolCallComplete events emitted in non-deterministic order

In stream_chat, ToolCallComplete events for multiple simultaneous tool calls are emitted by iterating seen_starts.keys() — a HashMap<i32, bool> — whose iteration order is not guaranteed:

for &idx in seen_starts.keys() {
    yield StreamEvent::ToolCallComplete { index: idx as usize };
}

If a downstream consumer expects ToolCallComplete(n) to follow ToolCallStart(n) in index order (which is a reasonable assumption), this can produce out-of-order events when n > 1. Consider sorting the keys before iterating:

let mut sorted_keys: Vec<i32> = seen_starts.keys().copied().collect();
sorted_keys.sort_unstable();
for idx in sorted_keys {
    yield StreamEvent::ToolCallComplete { index: idx as usize };
}

…based switching

- Register oxide under its own config key `[providers.openai-oxide]`
- Both oxide and async-openai compile by default (no feature flag switching)
- Oxide discovers all OpenAI models (same catalog as built-in)
- Reads wire_api and stream_transport from provider config
- Switch providers by renaming config section, no rebuild needed
… ordering

- Map ChatMessage::Tool as function_call_output in Responses API path
- Map UserContent::Multimodal (flatten text) in Responses API path
- Replace HashMap with BTreeMap for deterministic ToolCallComplete ordering
