feat: implement google genai provider #134
Conversation
@evalstate I might have botched any semblance of code quality in this PR. Hopefully by moving the Google provider to |
|
THANKYOU!!!! I can't wait to play with this - it looks amazing. I'm going to get the other PRs and a couple of defect fixes released in the next day or 2 and then concentrate on bringing this in. I must confess, I did think implementing this would be quite good fun when you got to the different modalities - how has it been so far? |
Oh don't get your hopes too high; this was an hour and a half of work coded with …. I am pretty confident we've got all of the major features working. The rest will most likely be edge cases that aren't easily covered by the tests. |
|
Sorry for adding a comment not related to the thread here, but I couldn't help asking @monotykamary if you could share the coding agent referred to here - "super simple coding agent PoC I'm trying to make". Quite curious to know how fast-agent can be leveraged to make coding agents that perhaps complement other tools out there, including "Cline", which you mentioned. |
|
Oh I'll have a quick look 🚀 |
@rkunnamp An exercise up to the reader... I'm kidding, it's really abusing the DesktopCommander MCP and formatting the agents in a way that passes history to each other. One example would be using an infinite loop and passing history between the interactive sessions:

```python
import asyncio

from mcp_agent import RequestParams
from mcp_agent.core.fastagent import FastAgent

...

async def main():
    async with fast.run() as agent:
        while True:
            # Start an interactive session with architect
            await agent.interactive(agent="architect")
            # Pass architect's history to coder for generation
            await agent.coder.generate(agent.architect.message_history)
            # Start an interactive session with coder
            await agent.interactive(agent="coder")
            # Pass coder's history back to architect for generation
            await agent.architect.generate(agent.coder.message_history)


if __name__ == "__main__":
    asyncio.run(main())
```

There are much better patterns you can use that inject the history directly into a following user prompt instead of updating the system prompt like this. Really, the key player here is the MCP server, and the agent architecture is up to you. skydeckai-code, or originally called |
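For reference, here is a minimal sketch (not from this PR) of the "inject into a following user prompt" variant mentioned above. It assumes two agents named architect and coder defined with stacked @fast.agent decorators, and that message_history, last_text() and send() behave as in current fast-agent; treat it as an illustration rather than a tested recipe.

```python
import asyncio

from mcp_agent.core.fastagent import FastAgent

fast = FastAgent("handoff-example")  # hypothetical app name


@fast.agent(name="architect", instruction="Design a plan for the user's request.")
@fast.agent(name="coder", instruction="Implement the plan you are given.")
async def main():
    async with fast.run() as agent:
        # Talk to the architect first
        await agent.interactive(agent="architect")

        # Take only the architect's final message...
        plan = agent.architect.message_history[-1].last_text()

        # ...and inject it into the coder's next *user* prompt,
        # rather than replaying the whole conversation history.
        await agent.coder.send(f"Implement the following plan:\n\n{plan}")

        # Continue interactively with the coder
        await agent.interactive(agent="coder")


if __name__ == "__main__":
    asyncio.run(main())
```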
```diff
-    parallel_tool_calls=False,
+    systemPrompt=self.instruction,  # System instruction will be mapped in _google_completion
+    parallel_tool_calls=True,  # Assume parallel tool calls are supported by default with native API
+    max_iterations=10,
```
updated default
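If those defaults need overriding for a particular agent, fast-agent lets you pass your own RequestParams. A rough sketch follows, under the assumption that the @fast.agent decorator accepts a request_params keyword; the agent and app names are made up, and the RequestParams import comes from the snippet earlier in this thread:

```python
import asyncio

from mcp_agent import RequestParams
from mcp_agent.core.fastagent import FastAgent

fast = FastAgent("override-example")  # hypothetical app name


# Override the provider defaults shown in the diff above for this one agent
@fast.agent(
    name="careful_agent",
    instruction="Answer using the available tools.",
    request_params=RequestParams(
        parallel_tool_calls=False,  # force sequential tool calls
        max_iterations=5,           # fewer tool-call rounds than the default of 10
    ),
)
async def main():
    async with fast.run() as agent:
        print(await agent.careful_agent.send("hello"))


if __name__ == "__main__":
    asyncio.run(main())
```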
|
That's great - looks like structured generation is really solid now. And these models are fast through this native interface - really cool :) I've done another pass of testing and:
Anyway, great stuff so far - think this is a huge enhancement to fast-agent - thanks! |
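For reference, a minimal sketch of the kind of structured generation being praised above, assuming fast-agent's structured() call accepts a message list plus a Pydantic model class and returns the parsed object alongside the raw response; the agent name, app name and prompt are placeholders:

```python
import asyncio

from pydantic import BaseModel

from mcp_agent.core.fastagent import FastAgent
from mcp_agent.core.prompt import Prompt

fast = FastAgent("structured-example")  # hypothetical app name


class WeatherReport(BaseModel):
    """Shape we want the model's answer parsed into."""

    city: str
    temperature_c: float
    summary: str


@fast.agent(name="reporter", instruction="Answer concisely.")
async def main():
    async with fast.run() as agent:
        # structured() is assumed to return (parsed model or None, raw response)
        report, raw = await agent.reporter.structured(
            [Prompt.user("Give a short weather report for Tokyo.")],
            WeatherReport,
        )
        print(report)


if __name__ == "__main__":
    asyncio.run(main())
```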
Janspoerer google genai provider
|
I did what any sane senior engineer would do on a Monday: merge it and figure it out later 😆 |
|
This. This is the way. |
|
I got back from the MCP Dev Conference this morning. I've made a couple of tweaks to update to MCP 1.9.1 and add aliases for the new Sonnet models. Sounds like this is good to go then, pending a decision on whether we run with this as the default? (And I can see the workflow just failed! :P) Seems like we are all confident enough to have this as the default, and OpenAI as legacy, right? One question I have is whether the multimodal/tool calling works better or worse via the OpenAI interface. Anyway, I'd like to get this released in the next version tonight/tomorrow. |
|
I think the question is about whether to use OpenAI vs. the native Google genai provider, right? I would propose using the native Google genai provider as the default as there seem to be more features (e.g., PDF) when using the Google genai provider. |
|
Yep - just left a review comment to that effect. Also opens the door to native audio/image generation via this model, and it's faster too. A quick look makes me think the lint failure is the cause of the test failure, but don't have time to dig in right now - not sure which commit broke it! 🫣 |
|
Oh, naughty linter! Glancing at it, this issue is suspiciously close to where I made changes. I'll check this when I get home. Not sure if I can change the provider to native Google today though - don't have that much time. Depends on how involved this change will be. |
|
I think it's just swapping that text at the bottom where I left a comment; it might be nice to rename the class, but that's not essential. Not sure I'll be able to release tonight, but I'll merge/test and go as soon as it's ready. |
…_provider_model in -- should resolve the UnboundVariable error that came in tests/unit/mcp_agent/llm/test_model_factory
|
The latest commits address the linting and test issues. |
|
Understood that only the factory constant needs to be adjusted from GOOGLE to GOOGLE_NATIVE. The renaming is a good idea as well. Did you mean to rename GOOGLE_NATIVE to GOOGLE? And should we leave the OpenAI format in there and call it GOOGLE_OA? Where did you leave the in-code comments? I cannot see comments in this pull request. |
|
Got it, thanks. @monotykamary Plz yolo the PR :-) |
|
quick heads up - i've run out of time to finish this today, but am pushing the release for 1.9.1 so will aim to get this across tomorrow with another little feature i want :) |
|
Sounds great, thank you. Do let me know if you need help to push this PR through. |
Janspoerer google genai provider
|
Thanks very much both, I've merged that to main with it as the default provider for Google 🍾. The only thing I can't explain is why PDF doesn't work as part of a tool call, but it's not a blocker (and may well be an API limitation - there are a few edge cases in the Anthropic API around stuff like that). If I could ask that someone update the docs: https://github.com/evalstate/fast-agent-docs -> I think the rules over the API keys have changed (does it look for both VERTEX and GOOGLE_API_KEY?). Also, the old provider is now accessible as "googleoai." and takes API keys in the same way. Thanks again :) |
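To illustrate the prefixes mentioned above, a rough sketch of selecting the two routes by model string; the Gemini model id and app name are placeholders, not something this PR pins down:

```python
import asyncio

from mcp_agent.core.fastagent import FastAgent

fast = FastAgent("provider-prefix-example")  # hypothetical app name


# "google." now routes through the native google.genai provider (the new default)
@fast.agent(name="native", model="google.gemini-2.0-flash")
# "googleoai." keeps the legacy OpenAI-compatible route available
@fast.agent(name="legacy", model="googleoai.gemini-2.0-flash")
async def main():
    async with fast.run() as agent:
        # Both agents are expected to read the same API key configuration (e.g. GOOGLE_API_KEY)
        print(await agent.native.send("Say hello."))
        print(await agent.legacy.send("Say hello."))


if __name__ == "__main__":
    asyncio.run(main())
```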
|
Thank you very much, I'm very happy that this worked out and will soon try the Google provider for some of my use cases. Regarding further help with the documentation: I won't be able to get to it this week, I think, but will try to put in some time next week. Here is an issue for this small doc change: #203 |



This PR implements the Google provider for fast-agent using the native `google.genai` library and should resolve #6. Key changes include:

- `google.genai.Client` for interacting with Google's generative models.
- A `GoogleConverter` class to handle data structure conversions between `PromptMessageMultipart` and `google.genai.types.Content`.

Image test code sample

This work is a step towards fully leveraging the features of the `google.genai` library within fast-agent.
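As an illustration only (not the PR's actual GoogleConverter), the kind of mapping such a converter performs might look like the sketch below; the fast-agent import path, the PromptMessageMultipart constructor, and the handling of non-text content are all assumptions:

```python
from google.genai import types
from mcp.types import TextContent
from mcp_agent.mcp.prompt_message_multipart import PromptMessageMultipart  # import path assumed


def to_genai_content(message: PromptMessageMultipart) -> types.Content:
    """Map an MCP-style multipart message onto google.genai's Content/Part structure."""
    # google.genai uses "model" where MCP uses "assistant"
    role = "model" if message.role == "assistant" else "user"
    parts = [
        types.Part.from_text(text=block.text)
        for block in message.content
        if isinstance(block, TextContent)  # images, PDFs, etc. are omitted in this sketch
    ]
    return types.Content(role=role, parts=parts)


# usage sketch
msg = PromptMessageMultipart(role="user", content=[TextContent(type="text", text="hello")])
print(to_genai_content(msg))
```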