refactor: replace genAI model sdk by okHttp #206

Open
Aias00 wants to merge 105 commits into agentscope-ai:main from Aias00:feat/replace_genai_sdk

Conversation

Aias00 (Contributor) commented Dec 16, 2025

AgentScope-Java Version

[The version of AgentScope-Java you are working on, e.g. 1.0.2, check your pom.xml dependency version or run mvn dependency:tree | grep agentscope-parent:pom(only mac/linux)]

Description

[Please describe the background, purpose, changes made, and how to test this PR]
Introduce dedicated DTOs for Gemini API request and response structures and update related components.
fixes #96

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has been formatted with mvn spotless:apply
  • All tests are passing (mvn test)
  • Javadoc comments are complete and follow project conventions
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

Copilot AI (Contributor) left a comment

Pull request overview

This PR replaces the Google GenAI Java SDK with dedicated DTOs and direct HTTP calls to the Gemini API, simplifying the implementation and removing dependencies on the Google SDK and Vertex AI support.

  • Custom DTOs introduced for all Gemini API request/response structures
  • Direct HTTP implementation using OkHttp with SSE streaming support
  • All formatters, converters, and parsers updated to use new DTOs
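The SSE streaming mentioned above can be illustrated with a minimal sketch of event framing. This is not the PR's implementation: the class name `SseDataLines` and the method `parse` are hypothetical, and the sketch works on an already-buffered string for simplicity, whereas a real client would read lines incrementally from the HTTP response body. It only shows how `data:` lines are grouped into events, each of which would then be deserialized into a response DTO.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups the "data:" lines of an SSE body into event payloads. */
final class SseDataLines {
    private SseDataLines() {}

    static List<String> parse(String body) {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        for (String line : body.split("\\r?\\n", -1)) {
            if (line.isEmpty()) {
                // A blank line terminates the current event.
                if (hasData) {
                    events.add(data.toString());
                }
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith(":")) {
                // Comment line (often used as a keep-alive); ignored.
            } else if (line.startsWith("data:")) {
                String value = line.substring("data:".length());
                if (value.startsWith(" ")) {
                    value = value.substring(1); // strip the single optional space
                }
                if (hasData) {
                    data.append('\n'); // multiple data lines are joined with newlines
                }
                data.append(value);
                hasData = true;
            }
            // Other fields (event:, id:, retry:) are ignored in this sketch.
        }
        if (hasData) {
            // Leniency chosen for this sketch: flush a trailing event that was
            // not terminated by a blank line instead of discarding it.
            events.add(data.toString());
        }
        return events;
    }
}
```

Each returned string is one JSON chunk; in the design described above it would presumably be mapped onto `GeminiResponse` by the JSON layer.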

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| GeminiChatModel.java | Replaced SDK client with OkHttp client for direct API calls with streaming support |
| GeminiRequest.java, GeminiResponse.java | New DTOs for API request/response structures |
| GeminiContent.java, GeminiPart.java | DTOs for message content and parts |
| GeminiTool.java, GeminiToolConfig.java | DTOs for tool definitions and configuration |
| GeminiGenerationConfig.java | DTO for generation parameters |
| GeminiChatFormatter.java | Updated to work with DTOs instead of SDK types |
| GeminiMessageConverter.java | Updated conversion logic for new DTOs |
| GeminiMediaConverter.java | Updated media handling to use DTO Blob type |
| GeminiToolsHelper.java | Simplified tool conversion without SDK schema conversion |
| GeminiResponseParser.java | Updated response parsing for new DTO structures |
| ModelProviderType.java | Simplified Gemini config removing Vertex AI parameters |
| AgentScopeProducer.java | Removed Vertex AI configuration logic |
| pom.xml (core) | Removed google-genai dependency |
| pom.xml (mem0) | Added jackson-datatype-jsr310 dependency |
| Test files | Updated all tests to use new DTOs instead of SDK types |

Aias00 and others added 5 commits December 16, 2025 13:53
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ini/GeminiMediaConverter.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hatModel.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hatModel.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
AlbumenJ marked this pull request as draft December 16, 2025 08:25

Aias00 (Contributor, Author) commented Dec 17, 2025

gemini-2.0-flash
[Screenshot 2025-12-17 09:52:54]

gemini-3-pro-preview
[Screenshot 2025-12-17 09:57:20]

Aias00 marked this pull request as ready for review December 17, 2025 02:00
Signed-off-by: liuhy <liuhongyu@apache.org>
Aias00 changed the title from "feat: Introduce dedicated DTOs for Gemini API request and response structures and update related components." to "refactor: replace genAI model sdk by okHttp" Dec 17, 2025
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

agentscope-extensions/agentscope-extensions-mem0/pom.xml:1

  • The jackson-datatype-jsr310 version 2.15.2 may be outdated. Consider using a more recent version that matches the Jackson version used elsewhere in the project, or verify if 2.15.2 is the intended version for compatibility reasons.

AlbumenJ (Collaborator) commented:

[image]

Please make sure all the E2E Tests pass

AlbumenJ (Collaborator) commented:

Your current logic for handling the Base URL differs from the GenAI SDK's approach, specifically regarding the handling of the GOOGLE_API_BASE_URL key.

… multi-agent thinking

Signed-off-by: liuhy <liuhongyu@apache.org>
Aias00 added 4 commits March 10, 2026 14:17
# Conflicts:
#	agentscope-core/src/main/java/io/agentscope/core/agent/AgentBase.java
#	agentscope-core/src/test/java/io/agentscope/core/tool/mcp/McpClientBuilderTest.java
…mprove response handling

Signed-off-by: liuhy <liuhongyu@apache.org>
…ng logic and improve logging levels

Signed-off-by: liuhy <liuhongyu@apache.org>
LearningGp added the wait-for-response label Mar 13, 2026

github-actions commented:

Closing due to inactivity. Feel free to reopen when ready.

github-actions bot added the stale label Mar 21, 2026
github-actions bot closed this Mar 21, 2026
Aias00 (Contributor, Author) commented Mar 21, 2026

@AlbumenJ @LearningGp hi, pls help me reopen this pr

LearningGp reopened this Mar 24, 2026
LearningGp removed the wait-for-response and stale labels Mar 24, 2026
Aias00 added 8 commits March 24, 2026 12:27
# Conflicts:
#	agentscope-core/src/main/java/io/agentscope/core/formatter/gemini/GeminiChatFormatter.java
#	agentscope-core/src/test/java/io/agentscope/core/formatter/anthropic/AnthropicChatFormatterTest.java
#	agentscope-core/src/test/java/io/agentscope/core/formatter/anthropic/AnthropicResponseParserTest.java
#	agentscope-core/src/test/java/io/agentscope/core/formatter/gemini/GeminiChatFormatterGroundTruthTest.java
#	agentscope-core/src/test/java/io/agentscope/core/model/transport/websocket/JdkWebSocketConnectionTest.java
#	agentscope-extensions/agentscope-extensions-higress/src/test/java/io/agentscope/extensions/higress/HigressMcpClientWrapperTest.java
#	agentscope-extensions/agentscope-extensions-rag-simple/src/main/java/io/agentscope/core/rag/store/QdrantStore.java
#	agentscope-extensions/agentscope-extensions-rag-simple/src/test/java/io/agentscope/core/rag/store/QdrantStoreTest.java
…PI interactions

Signed-off-by: liuhy <liuhongyu@apache.org>
…nd cleanup in structured output processing

Signed-off-by: liuhy <liuhongyu@apache.org>
…gement and cleanup in structured output processing"

This reverts commit e40c902.
LearningGp

This comment was marked as outdated.

LearningGp (Collaborator) left a comment

For Gemini 3, are we supposed to use thinkingLevel instead of thinkingBudget now? Curious about the specific reason why "thinking" is restricted on Gemini 3 Flash.

Aias00 (Contributor, Author) commented Mar 28, 2026

> For Gemini 3, are we supposed to use thinkingLevel instead of thinkingBudget now? Curious about the specific reason why "thinking" is restricted on Gemini 3 Flash.

For Gemini 3, yes, Google now recommends using thinkingLevel rather than thinkingBudget. thinkingBudget is still accepted for backward compatibility, but the official guidance is: use thinkingLevel for Gemini 3 and later, and use thinkingBudget for Gemini 2.5. Google also notes that using thinkingBudget with Gemini 3 Pro can lead to unexpected performance, so it should not be treated as the preferred control going forward.
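As a rough illustration of that guidance, the choice between the two controls could be keyed on the model family. This is only a sketch under stated assumptions: the class `ThinkingConfigSketch`, the method `thinkingConfigFor`, and the name-prefix matching are hypothetical and are not taken from this PR; only the field names `thinkingLevel` and `thinkingBudget` come from the discussion above.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: picks the thinking control by model family. */
final class ThinkingConfigSketch {
    private ThinkingConfigSketch() {}

    /**
     * Returns the entries to place under "thinkingConfig" in the generation config.
     * Prefix matching on the model name is an assumption made for this sketch.
     */
    static Map<String, Object> thinkingConfigFor(String modelName, String level, int budget) {
        String name = modelName.toLowerCase(Locale.ROOT);
        if (name.startsWith("gemini-3")) {
            // Gemini 3 and later: prefer thinkingLevel.
            return Map.of("thinkingLevel", level);
        }
        if (name.startsWith("gemini-2.5")) {
            // Gemini 2.5: keep using thinkingBudget.
            return Map.of("thinkingBudget", budget);
        }
        // Other models: send no thinking configuration.
        return Map.of();
    }
}
```

For example, `thinkingConfigFor("gemini-3-pro-preview", "low", 1024)` yields a map containing only `thinkingLevel`, while a `gemini-2.5-*` model name yields only `thinkingBudget`.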

For the Gemini 3 Flash + structured output case, the current attempt to suppress thinking is not because Gemini Flash is officially defined as incompatible with thinking. The reason is that, in our earlier implementation, we observed that once thinking was enabled on gemini-3-flash, the tool-based structured output path became less stable and less deterministic. Since AgentScope currently implements structured output through the framework’s unified generate_response tool flow, we introduced this workaround to make Gemini 3 Flash align more reliably with the framework-defined structured output capability.

In other words, the current “disable thinking” behavior is a framework-compatibility mitigation: it was added to make gemini-3-flash + structured output tool behave more predictably within the existing framework contract, rather than because Gemini’s native design requires thinking to be turned off.

Going forward, we should evaluate whether Gemini structured output can be handled in a more native way, for example through responseSchema, while still adapting the result back into the framework’s common structured output semantics. If that works well, this workaround may no longer be necessary.
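A native path along those lines might start from a generation config that carries a JSON schema. The sketch below is illustrative only and is not part of this PR: the class `ResponseSchemaSketch` and its method are hypothetical, the schema is restricted to required string fields, and plain maps stand in for the DTOs. It assumes the Gemini REST field names `responseMimeType` and `responseSchema` and the uppercase type names `OBJECT` and `STRING`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds a generationConfig fragment for native structured output. */
final class ResponseSchemaSketch {
    private ResponseSchemaSketch() {}

    /** Builds a config requesting a JSON object whose fields are all required strings. */
    static Map<String, Object> structuredOutputConfig(List<String> requiredStringFields) {
        Map<String, Object> properties = new LinkedHashMap<>();
        for (String field : requiredStringFields) {
            properties.put(field, Map.of("type", "STRING"));
        }

        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", "OBJECT");
        schema.put("properties", properties);
        schema.put("required", List.copyOf(requiredStringFields));

        Map<String, Object> config = new LinkedHashMap<>();
        config.put("responseMimeType", "application/json");
        config.put("responseSchema", schema);
        return config;
    }
}
```

The framework would still need to adapt the returned JSON text back into its common structured output semantics, which this sketch does not cover.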

Official references:

Development

Successfully merging this pull request may close these issues.

[Refactor] Use okhttp to replace GenAI Model SDK

5 participants