fix(telemetry): serialize Pydantic BaseModel in _safe_json_serialize #4651
giulio-leone wants to merge 1 commit into google:main
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request resolves a critical data serialization problem within the telemetry system. Previously, Pydantic…
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Response from ADK Triaging Agent
Hello @giulio-leone, thank you for creating this PR! Before we can merge this, you'll need to sign the Contributor License Agreement (CLA). You can find more information in the "Details" link of the Also, could you please add a Thanks!
Code Review
This pull request correctly addresses the issue of serializing Pydantic BaseModel instances in _safe_json_serialize by introducing a custom default handler for json.dumps. The fix is well-targeted and effective.
My main feedback is that a very similar function, _safe_json_serialize_no_whitespaces in src/google/adk/telemetry/_experimental_semconv.py, has the same serialization problem but was not updated. To ensure consistency and prevent future bugs, I recommend applying the same fix there. To avoid code duplication, the new default handler logic could be extracted into a reusable, module-level function. I've left a specific comment with a suggestion on how to refactor this.
+  def _default(o: Any) -> Any:
+    if isinstance(o, BaseModel):
+      return o.model_dump(mode='json')
+    return '<not serializable>'
+
   try:
     # Try direct JSON serialization first
-    return json.dumps(
-        obj, ensure_ascii=False, default=lambda o: '<not serializable>'
-    )
+    return json.dumps(obj, ensure_ascii=False, default=_default)
This correctly handles Pydantic model serialization. To improve reusability and fix a similar issue in another file, I suggest extracting the _default logic into a module-level function, for example _pydantic_json_default.
# At module level in tracing.py
def _pydantic_json_default(o: Any) -> Any:
  if isinstance(o, BaseModel):
    return o.model_dump(mode='json')
  return '<not serializable>'

A nearly identical function, _safe_json_serialize_no_whitespaces in src/google/adk/telemetry/_experimental_semconv.py, still uses the old lambda and will also need this fix. Making the helper function module-level will allow you to import and reuse it there, avoiding code duplication.
Suggested change:
-  def _default(o: Any) -> Any:
-    if isinstance(o, BaseModel):
-      return o.model_dump(mode='json')
-    return '<not serializable>'
   try:
     # Try direct JSON serialization first
-    return json.dumps(obj, ensure_ascii=False, default=_default)
+    return json.dumps(obj, ensure_ascii=False, default=_pydantic_json_default)
99bd851 to 8cd9985
_safe_json_serialize replaces Pydantic BaseModel instances with '<not serializable>' because json.dumps cannot handle them natively. Any tool that returns a Pydantic model has its traced output lost. Replace the generic lambda default with a _default function that calls model_dump(mode='json') for BaseModel instances before falling back to '<not serializable>' for truly non-serializable objects. BaseModel is already imported in tracing.py. Fixes google#4629
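The mechanism behind the commit message can be shown with the standard library alone. In this sketch, `Opaque` is a hypothetical stand-in for any object `json.dumps` cannot encode natively (it is not a Pydantic model), and both `default` functions are illustrative names rather than ADK code. `json.dumps` calls `default` once for each object it cannot encode, so a hook that always returns the placeholder discards the data, while a hook that converts known types first preserves it.

```python
import json
from typing import Any


class Opaque:
  """Hypothetical stand-in for an object json.dumps cannot encode natively."""

  def __init__(self, value: int) -> None:
    self.value = value


def lossy_default(o: Any) -> Any:
  # Mirrors the old lambda: every unknown object becomes the placeholder.
  return '<not serializable>'


def aware_default(o: Any) -> Any:
  # Mirrors the shape of the fix: convert known types, then fall back.
  if isinstance(o, Opaque):
    return {'value': o.value}
  return '<not serializable>'


payload = {'ok': True, 'result': Opaque(42), 'other': {1, 2}}

lossy = json.dumps(payload, default=lossy_default)
aware = json.dumps(payload, default=aware_default)

print(lossy)
# {"ok": true, "result": "<not serializable>", "other": "<not serializable>"}
print(aware)
# {"ok": true, "result": {"value": 42}, "other": "<not serializable>"}
```

The set under `other` still falls through to the placeholder in both cases, which matches the intended behavior for truly non-serializable objects.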
8cd9985 to 33c0d84
Closing: CLA not yet signed. Will resubmit when ready.
Problem
_safe_json_serialize silently replaces Pydantic BaseModel instances with the string '<not serializable>' when serializing tool responses for tracing spans.
Any tool that returns a Pydantic model rather than a plain dict executes correctly, but the traced output is lost.
Root Cause
The default lambda in json.dumps unconditionally returns '<not serializable>' for any object that isn't natively JSON-serializable, including Pydantic models.
Fix
Replace the generic lambda with a _default function that calls model_dump(mode='json') for BaseModel instances before falling back to '<not serializable>'.
BaseModel is already imported in tracing.py (line 63).
Test
56 telemetry tests pass, 2 consecutive clean runs.
Fixes #4629
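One plausible reason the fix passes mode='json' to model_dump, sketched below under assumptions: in Pydantic v2, the default (Python) mode leaves values such as datetime as Python objects, so json.dumps hands them back to the default hook and they still end up as the placeholder. `ToolResult` and both hook functions are hypothetical names used only for this illustration; the PR does not state this rationale.

```python
import json
from datetime import datetime, timezone
from typing import Any

from pydantic import BaseModel


class ToolResult(BaseModel):  # hypothetical tool output
  name: str
  finished_at: datetime


def default_python_mode(o: Any) -> Any:
  if isinstance(o, BaseModel):
    # Python mode keeps datetime objects, which json.dumps cannot encode.
    return o.model_dump()
  return '<not serializable>'


def default_json_mode(o: Any) -> Any:
  if isinstance(o, BaseModel):
    # JSON mode converts field values to JSON-compatible types first.
    return o.model_dump(mode='json')
  return '<not serializable>'


result = ToolResult(
    name='lookup',
    finished_at=datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)

python_mode = json.loads(json.dumps(result, default=default_python_mode))
json_mode = json.loads(json.dumps(result, default=default_json_mode))

print(python_mode['finished_at'])  # <not serializable>
print(json_mode['finished_at'])    # an ISO 8601 timestamp string
```

In Python mode the model itself survives but its datetime field is still lost; in JSON mode the whole payload reaches the trace.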