@Fuzzwah Fuzzwah commented Nov 28, 2025

Summary

Fixes the Replicate API integration to properly support Gemini models and makes the embed() function more robust.

Problem

When using Gemini models (e.g., google/gemini-2.5-flash) on Replicate:

  1. The system_prompt field is ignored - Gemini needs the system message in the main prompt
  2. The stream() method returns empty results - Gemini needs run() instead
  3. The embed() function would incorrectly use the chat model for embeddings

Changes

Gemini Model Support

  • Detect Gemini models by checking if model name contains 'gemini'
  • Combine system message into the main prompt (since system_prompt is ignored)
  • Use run() instead of stream() for Gemini models
  • Handle various output formats (string, array, object)
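The Gemini path above can be sketched as follows. This is an illustrative sketch, not the PR's actual code: the helper names (isGeminiModel, buildGeminiPrompt, joinOutput, chat) are made up here, and the Replicate client (from the "replicate" npm package, which provides run() and stream()) is passed in as a parameter rather than constructed.

```javascript
// Gemini models are detected by name (hypothetical helper).
function isGeminiModel(model) {
  return model.toLowerCase().includes("gemini");
}

// Gemini on Replicate ignores the system_prompt field, so the system
// message is folded into the main prompt (hypothetical helper).
function buildGeminiPrompt(systemMessage, userPrompt) {
  return systemMessage ? `${systemMessage}\n\n${userPrompt}` : userPrompt;
}

// Gemini output may be a string, an array of chunks, or an object;
// the object shape here (.output) is an assumption (hypothetical helper).
function joinOutput(output) {
  if (typeof output === "string") return output;
  if (Array.isArray(output)) return output.join("");
  if (output && typeof output.output === "string") return output.output;
  return String(output ?? "");
}

async function chat(replicate, model, systemMessage, userPrompt) {
  if (isGeminiModel(model)) {
    // stream() returns empty results for Gemini, so use run() instead.
    const output = await replicate.run(model, {
      input: { prompt: buildGeminiPrompt(systemMessage, userPrompt) },
    });
    return joinOutput(output);
  }
  // Non-Gemini models keep the existing streaming path.
  let text = "";
  for await (const event of replicate.stream(model, {
    input: { prompt: userPrompt, system_prompt: systemMessage },
  })) {
    text += event.toString();
  }
  return text;
}
```

With this shape, only the dispatch in chat() needs to know about Gemini; everything else stays on the existing streaming path.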

Improved embed() Function

  • Always use a dedicated embedding model, not the chat model
  • Detect if configured model is an embedding model (contains 'embed', 'gte', 'e5-')
  • Fall back to mark3labs/embeddings-gte-base for chat models
  • Add input validation for text parameter
  • Handle multiple output formats (vectors, embedding, embeddings, array)
  • Better error messages
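A sketch of the hardened embed() along the lines described above. The helper names (isEmbeddingModel, normalizeEmbeddingOutput) and the injected client parameter are assumptions for illustration; the marker list and the fallback model come from this PR's description.

```javascript
// Fallback embedding model named in this PR.
const FALLBACK_EMBEDDING_MODEL = "mark3labs/embeddings-gte-base";

// Detect embedding models by name markers (hypothetical helper).
function isEmbeddingModel(model) {
  const name = model.toLowerCase();
  return ["embed", "gte", "e5-"].some((marker) => name.includes(marker));
}

// Embedding models on Replicate differ in output shape: a bare array,
// { vectors }, { embedding }, or { embeddings: [[...]] } (hypothetical helper).
function normalizeEmbeddingOutput(output) {
  if (Array.isArray(output)) return output;
  if (output && Array.isArray(output.vectors)) return output.vectors;
  if (output && Array.isArray(output.embedding)) return output.embedding;
  if (output && Array.isArray(output.embeddings)) return output.embeddings[0];
  throw new Error(`Unrecognized embedding output: ${JSON.stringify(output)}`);
}

async function embed(replicate, configuredModel, text) {
  // Input validation for the text parameter.
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("embed() requires a non-empty string");
  }
  // Never embed with a chat model; fall back to a dedicated embedding model.
  const model = isEmbeddingModel(configuredModel)
    ? configuredModel
    : FALLBACK_EMBEDDING_MODEL;
  const output = await replicate.run(model, { input: { text } });
  return normalizeEmbeddingOutput(output);
}
```

The name-marker check is deliberately loose; a model not matching any marker is treated as a chat model and silently redirected to the fallback, which is the safe default for embeddings.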

Usage

To use Gemini on Replicate, configure your profile:

{
  "model": "replicate/google/gemini-2.5-flash",
  "embedding": "replicate/mark3labs/embeddings-gte-base"
}

Testing

Tested with:

  • google/gemini-2.5-flash for chat
  • mark3labs/embeddings-gte-base for embeddings

Both work correctly with this fix.

These changes enable using Gemini models (e.g., google/gemini-2.5-flash)
on Replicate alongside existing Llama/Mistral models.
Contributor

@Sweaterdog Sweaterdog left a comment


Code looks good, I haven't tested
