
@lehcode lehcode commented Dec 22, 2025

Summary

Add support for openai_generic LLM provider in the MCP server factory.

This provider uses OpenAIGenericClient, which calls /chat/completions with response_format for structured output instead of OpenAI's /responses endpoint.
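
For reference, a minimal sketch of the kind of request this routes to, using the OpenAI Python SDK against an OpenAI-compatible backend; the model name, base URL, and schema below are placeholders, not graphiti-core code:

```python
# Sketch only: structured output via response_format on /chat/completions.
# Model name, base URL, and schema are illustrative placeholders.
from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="http://localhost:4000")  # e.g. a LiteLLM proxy

completion = client.chat.completions.create(
    model="your-model",
    messages=[{"role": "user", "content": "Extract entities from: Alice met Bob in Paris."}],
    # Ask the backend to return JSON matching a schema. OpenAI-compatible
    # servers understand this on /chat/completions, unlike /responses.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "entities",
            "schema": {
                "type": "object",
                "properties": {"entities": {"type": "array", "items": {"type": "string"}}},
                "required": ["entities"],
            },
        },
    },
)
print(completion.choices[0].message.content)
```

Backends that don't support strict json_schema can be retried with response_format={"type": "json_object"}, which is the fallback this PR automates (see Solution below).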

Problem

The default openai provider uses OpenAIClient, which calls OpenAI's /responses endpoint. That endpoint is only available on OpenAI's native API, so it fails when used with:

  • LiteLLM proxy
  • Ollama
  • vLLM
  • Any other OpenAI-compatible API

When using these backends, users get errors like:

'NoneType' object has no attribute 'encode'

Solution

Add an openai_generic case to LLMClientFactory (a sketch follows this list) that:

  1. Imports OpenAIGenericClient (which already exists in graphiti-core)
  2. Uses the same OpenAI provider config (api_key, api_url)
  3. Routes to /chat/completions endpoint with response_format for structured output
  4. Implements automatic fallback from json_schema to json_object for providers that don't support strict schemas
  5. Adds robust JSON extraction for responses with trailing explanatory text
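
A minimal sketch of such a factory branch, assuming the surrounding function shape, config attribute names, and import paths; only the client and config classes come from graphiti-core, and the constructor arguments shown are a best guess:

```python
# Illustrative factory branch; function shape and config attributes are assumptions.
from graphiti_core.llm_client.config import LLMConfig
from graphiti_core.llm_client.openai_client import OpenAIClient
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient


def create_llm_client(provider: str, config):
    # Both providers share the same OpenAI credentials block from config.yaml.
    llm_config = LLMConfig(
        api_key=config.providers.openai.api_key,
        base_url=config.providers.openai.api_url,
        model=config.model,
    )
    if provider == "openai_generic":
        # Routes to /chat/completions with response_format instead of /responses.
        return OpenAIGenericClient(config=llm_config)
    # Default: OpenAI's native client, which uses the /responses endpoint.
    return OpenAIClient(config=llm_config)
```

Because both cases consume the same providers.openai block, switching backends is a one-line provider change in config.yaml.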

Usage

```yaml
llm:
  provider: "openai_generic"  # Instead of "openai"
  model: "your-model"
  providers:
    openai:
      api_key: ${OPENAI_API_KEY}
      api_url: ${OPENAI_BASE_URL}  # e.g., http://localhost:4000
```

Type of Change

  • Bug fix (JSON extraction, auto-fallback)
  • New feature (openai_generic provider)
  • Performance improvement
  • Documentation/Tests

Objective

Enable Graphiti to work with OpenAI-compatible APIs like LiteLLM proxy and Ollama. Some providers (e.g., Gemini via LiteLLM) don't support json_schema response format and return the schema definition instead of data. This PR adds automatic detection and fallback to json_object mode, plus robust JSON extraction for providers that append explanatory text after JSON output.
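
A sketch of the detection-and-fallback idea; the helper names mirror the PR description, but the bodies and the _call_chat_completions transport helper are illustrative, not the merged implementation:

```python
# Illustrative auto-fallback sketch; _call_chat_completions is hypothetical.
import json


def _is_schema_returned_as_data(parsed: dict) -> bool:
    # A backend that cannot honor json_schema sometimes echoes the schema
    # itself: JSON-Schema keywords instead of the requested data.
    return isinstance(parsed, dict) and parsed.get("type") == "object" and "properties" in parsed


class GenericClientSketch:
    def __init__(self) -> None:
        # Once a provider proves it cannot do strict schemas, stay in
        # json_object mode for the lifetime of the client.
        self._use_json_object_mode = False

    async def _generate_response(self, messages: list[dict], response_model) -> dict:
        schema = response_model.model_json_schema()  # expects a pydantic v2 model
        if self._use_json_object_mode:
            # Fallback: ask for plain JSON and embed the schema in the prompt.
            messages[-1]["content"] += "\n\nReply with JSON matching this schema:\n" + json.dumps(schema)
            response_format = {"type": "json_object"}
        else:
            response_format = {
                "type": "json_schema",
                "json_schema": {"name": response_model.__name__, "schema": schema},
            }

        raw = await self._call_chat_completions(messages, response_format)  # hypothetical helper
        parsed = json.loads(raw)
        if not self._use_json_object_mode and _is_schema_returned_as_data(parsed):
            # The provider echoed the schema back: switch modes and retry once.
            self._use_json_object_mode = True
            return await self._generate_response(messages, response_model)
        return parsed
```

The flag is instance-level, so after the first failed json_schema attempt every subsequent call goes straight to json_object mode.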

Testing

  • Unit tests added/updated (18 tests for _is_schema_returned_as_data, _extract_json)
  • Integration tests added/updated
  • All existing tests pass
  • Verified with LiteLLM proxy + Ollama backend
  • Confirmed /chat/completions endpoint is called (not /responses)
  • Tested add_memory and entity extraction working

Breaking Changes

  • This PR contains breaking changes

Checklist

  • Code follows project style guidelines (make lint passes)
  • Self-review completed
  • Documentation updated where necessary
  • No secrets or sensitive information committed

Related Issues

N/A


danielchalef commented Dec 22, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.


lehcode commented Dec 22, 2025

I have read the CLA Document and I hereby sign the CLA

danielchalef added a commit that referenced this pull request Dec 22, 2025

lehcode commented Dec 23, 2025

Added auto-fallback for providers without json_schema support: e680665

Add support for `openai_generic` LLM provider in the MCP server factory.

This provider uses `OpenAIGenericClient` which calls `/chat/completions`
with `response_format` for structured output instead of the `/responses`
endpoint. This enables compatibility with:
- LiteLLM proxy
- Ollama
- vLLM
- Any OpenAI-compatible API that doesn't support `/responses`

The `/responses` endpoint is only available on OpenAI's native API, so
this provider is essential for self-hosted LLM deployments.

Usage in config.yaml:
```yaml
llm:
  provider: "openai_generic"
  model: "your-model"
  providers:
    openai:
      api_key: ${OPENAI_API_KEY}
      api_url: ${OPENAI_BASE_URL}
```

Detect when providers (e.g., LiteLLM with Gemini) return the schema
definition instead of data, and automatically switch to json_object
mode with the schema embedded in the prompt.

- Add _is_schema_returned_as_data() detection helper
- Add instance-level _use_json_object_mode fallback state
- Modify _generate_response() to support dual modes
- Fallback persists for client lifetime after first trigger
- Add _extract_json() method to handle responses with trailing content (sketched after this list)
- Simplify _is_json_schema() detection logic
- Handle "Extra data" JSON parse errors gracefully
- Document openai_generic provider in README.md with LiteLLM and Ollama examples
- Add provider configuration to .env.example
- Add unit tests for _is_schema_returned_as_data() and _extract_json() methods
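
A sketch of the trailing-text handling described above; the function body is illustrative, not the actual OpenAIGenericClient method:

```python
# Illustrative _extract_json sketch: parse the leading JSON object and ignore
# any explanatory text a provider appends after it.
import json
from json import JSONDecoder


def _extract_json(raw: str) -> dict:
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        # "Extra data" means valid JSON followed by more text; raw_decode
        # returns the first JSON value and the offset where parsing stopped.
        if "Extra data" not in err.msg:
            raise
        obj, _end = JSONDecoder().raw_decode(raw.lstrip())
        return obj


# Example: a provider appends an explanation after the JSON payload.
raw = '{"entities": ["Alice", "Bob"]}\n\nHere is the extracted data you asked for.'
assert _extract_json(raw) == {"entities": ["Alice", "Bob"]}
```
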
@lehcode lehcode force-pushed the feat/mcp-openai-generic-provider branch from 453e8c0 to 45bfe80 on December 29, 2025 at 14:39
FlibbertyGibbitz pushed a commit to RuneLabs-ai/graphiti that referenced this pull request Jan 5, 2026
Adds configuration files for running Graphiti with LiteLLM proxy:

- docker-compose-local.yml: Isolated Neo4j (ports 7475/7688) + MCP server (8001)
- config-local.yaml: Uses openai_generic provider with gpt-oss-120b model
- .env.local.example: Template for LiteLLM credentials

This setup routes LLM calls through LiteLLM for unified logging and
local model support, while embeddings go to OpenAI via LiteLLM.

Requires the openai_generic provider from PR getzep#1120 to be merged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
