
Conversation


@tbarbugli tbarbugli commented Oct 31, 2025

  • LLM metrics
  • STT metrics
  • Turn metrics
  • Built-in docker-compose with Grafana
  • Track end-to-end latency
  • Measure video processing latency
  • Measure latency for Realtime models
  • Measure tokens usage

How to play with this

  1. Run docker compose up
  2. uv add --dev opentelemetry-exporter-prometheus prometheus-client opentelemetry-sdk opentelemetry-exporter-otlp
  3. Run the simple agent example
  4. Grafana -> http://localhost:3000
  5. Python metrics -> http://localhost:9464/
  6. Prometheus -> http://localhost:9090/
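Under the hood, Prometheus scrapes the metrics endpoint that the example app exposes on port 9464. A minimal scrape-config sketch, illustrative only: the repo's actual file lives at observability/prometheus/prometheus.yml, and the job name and `host.docker.internal` target here are assumptions:

```yaml
# Illustrative only -- the real config is observability/prometheus/prometheus.yml.
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: vision-agents          # assumed job name
    static_configs:
      # host.docker.internal reaches the host-run Python process from the
      # Prometheus container (Docker Desktop; on Linux this may need an
      # extra_hosts entry in docker-compose.yml)
      - targets: ["host.docker.internal:9464"]
```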

Note: Do not edit the Grafana dashboards directly; ask AI to change them for you (everything lives under observability/ and has a README)!

See turn_detection.py for an example of metrics collection

Summary by CodeRabbit

  • New Features

    • End-to-end observability: Prometheus/Grafana/Jaeger stack, example app exposes metrics and starts telemetry by default.
    • Automatic latency and error metrics for LLM, STT, TTS, turn detection, YOLO pose and plugins, with provider/class metadata and prebuilt dashboards.
  • Improvements

    • Higher default processing resolution for pose detection (better quality).
  • Tests

    • Added comprehensive Timer tests covering sync/async, decorator/context, nesting, units, and error scenarios.
  • Documentation

    • Added observability README and Grafana provisioning/init assets.


coderabbitai bot commented Oct 31, 2025

Walkthrough

Adds OpenTelemetry-based observability: a metrics module with a reusable Timer and many histograms/counters; instruments LLM/STT/TTS/turn-detection entry points to record latency and errors; refactors public plugin LLM/STT methods to internal hooks; adds observability stack, dashboards, init scripts, and Timer unit tests.

Changes

Cohort / File(s) Change Summary
Observability core & tests
agents-core/vision_agents/core/observability/metrics.py, agents-core/tests/test_timer.py
Add tracer/meter, many histograms/counters/up-down counters, Timer utility (context/callable/decorator) with class-name detection; add comprehensive unit tests for Timer.
Core wrappers & hooks
agents-core/vision_agents/core/llm/llm.py, agents-core/vision_agents/core/stt/stt.py, agents-core/vision_agents/core/tts/tts.py, agents-core/vision_agents/core/turn_detection/turn_detection.py
Wrap public entrypoints with Timer latency recording and error counters; introduce internal/abstract hooks (_simple_response, _process_audio, detect_turn) and annotate metrics with class/provider attributes.
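The wrapper-to-hook delegation can be sketched roughly as follows. Class names, metric names, and the use of plain dicts are illustrative stand-ins for the OpenTelemetry instruments used in the actual code:

```python
# Illustrative wrapper/hook pattern; not the repository's actual base class.
import abc
import asyncio
import time
from collections import defaultdict

latency_ms = defaultdict(list)   # stand-in for an OTel latency histogram
errors = defaultdict(int)        # stand-in for an OTel error counter


class STT(abc.ABC):
    provider_name = "unknown"

    async def process_audio(self, pcm) -> None:
        """Public entrypoint: records latency and counts errors, then re-raises."""
        attrs = (self.provider_name, type(self).__qualname__)
        start = time.perf_counter()
        try:
            await self._process_audio(pcm)
        except Exception as exc:
            errors[(attrs, type(exc).__name__)] += 1
            raise
        finally:
            latency_ms[attrs].append((time.perf_counter() - start) * 1000.0)

    @abc.abstractmethod
    async def _process_audio(self, pcm) -> None:
        """Provider-specific hook implemented by each plugin."""


class DemoSTT(STT):
    provider_name = "demo"

    async def _process_audio(self, pcm) -> None:
        if pcm is None:
            raise ValueError("no audio")


asyncio.run(DemoSTT().process_audio(b"\x00\x00"))
assert latency_ms[("demo", "DemoSTT")]
```

The point of the split is that every plugin gets latency and error metrics for free by implementing only the private hook.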
Plugin LLMs
plugins/*/*/*_llm.py (e.g. plugins/anthropic/.../anthropic_llm.py, plugins/aws/.../aws_llm.py, plugins/gemini/.../gemini_llm.py, plugins/openai/.../openai_llm.py, plugins/openrouter/.../openrouter_llm.py, plugins/xai/.../xai/llm.py)
Rename public simple_response → _simple_response, add provider_name attributes on LLM implementations, plus small helper/event-formatting changes and realtime wrapper additions.
Plugin STT method renames
plugins/deepgram/.../deepgram_stt.py, plugins/fish/.../stt.py, plugins/wizper/.../stt.py
Rename process_audio → _process_audio to align with the core wrapper; mostly formatting-only edits.
Turn-detection plugins
plugins/krisp/.../turn_detection.py, plugins/smart_turn/.../smart_turn_detection.py, plugins/vogent/.../vogent_turn_detection.py
Rename/replace process_audio with detect_turn (adds conversation param where applicable), add Timer instrumentation around VAD/transcription/turn-end prediction, and introduce provider-specific latency metrics.
Pose processing plugin
plugins/ultralytics/.../yolo_pose_processor.py
Add YOLO pose inference histogram, instrument inference with Timer (frame/device/conf/persons_detected), and update default resolution constant.
Examples & telemetry init
examples/01_simple_agent_example/simple_agent_example.py
Switch TurnDetection source to smart_turn, enable telemetry setup by default, initialize MeterProvider and start Prometheus metrics server.
Observability stack & docs
docker-compose.yml, observability/* (e.g., observability/README.md, .gitignore, grafana/*, prometheus/prometheus.yml)
Add Prometheus/Grafana/Jaeger compose, Grafana provisioning and init script, dashboard JSON, Prometheus config, README and .gitignore.
Misc test updates
tests/test_function_calling.py
Add TestLLM subclass implementing _simple_response and update tests to use it; adjust imports/typing and test scaffolding accordingly.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Client
    participant Core as Core API
    participant Timer as Timer
    participant Impl as Implementation (plugin/_hook)
    participant OT as OpenTelemetry

    Client->>Core: call public API (e.g. simple_response / process_audio / send)
    Core->>Timer: start(metric, base_attrs)
    Timer->>Impl: invoke underlying _simple_response / _process_audio / detect_turn
    alt success
        Impl-->>Timer: return result
        Timer->>OT: record histogram with merged attributes
        Core-->>Client: return result
    else error
        Impl-->>Timer: raise error
        Timer->>OT: increment error counter (provider, error_type)
        Timer-->>Core: re-raise
        Core-->>Client: propagate exception
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Extra attention: Timer internals (attribute merging, class-name detection, decorator semantics, idempotent stop) and unit tests.
  • Verify wrapper → hook delegation across core and plugin LLM/STT/turn-detection changes.
  • Check metric names/units and consistent attribute keys across modules and dashboards.

Possibly related PRs

Suggested labels

tests, core-infrastructure

Suggested reviewers

  • maxkahan
  • Nash0x7E2
  • d3xvn

Poem

I time the small bright deaths beneath my hands,
each histogram a lined and patient room;
the class-name like a rib against the skin,
exceptions counted, boxed in muted bloom.
The bell goes: I stop, record, and name the gloom.

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name: Title check
Status: ❓ Inconclusive
Explanation: The pull request title 'Add more metrics' is vague and generic. While the PR description clarifies that this adds observability metrics for LLM, STT, and turn detection plus supporting infrastructure (docker-compose, Grafana dashboards), the title uses the non-descriptive phrase 'more metrics' without specifying what is added or the scope of the change. A developer scanning commit history would not immediately understand the primary purpose from the title alone.
Resolution: Consider a more descriptive title that captures the main objective, such as 'Implement observability stack with LLM, STT, and turn detection metrics' or 'Add OpenTelemetry metrics and Grafana dashboards for core observability'.
✅ Passed checks (1 passed)
Check name: Description check
Status: ✅ Passed
Explanation: Check skipped - CodeRabbit's high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch more-metrics

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)

111-117: Fix type mismatch when dispatching PCM frames

process_pcm_turn_taking expects a PcmData object (it reads sample_rate and samples), but detect_turn passes a raw ndarray. The first call to pcm.sample_rate will throw an AttributeError, so turn detection never runs. Keep the PcmData wrapper when resampling.

-        resampled_pcm = audio_data.resample(16_000, 1).samples
+        resampled_pcm = audio_data.resample(16_000, 1)

         try:
             loop = asyncio.get_event_loop()
             await loop.run_in_executor(
                 None, self.process_pcm_turn_taking, resampled_pcm, user_id, metadata
             )
plugins/fish/vision_agents/plugins/fish/stt.py (1)

63-70: Align docstring with the actual signature

The docstring still references user_metadata and returning a list of tuples, but the method only takes participant and returns None. Please update the Args/Returns sections to reflect reality so the Google-style doc stays accurate.

🧹 Nitpick comments (3)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)

32-35: Clean up duplicate default resolution assignments.

Line 32 redefines DEFAULT_WIDTH/DEFAULT_HEIGHT twice, leaving both 640x480 and 1920x1080 literals in place. Please keep only the intended defaults so the module state is unambiguous.

plugins/wizper/vision_agents/plugins/wizper/stt.py (1)

96-96: TODO indicates known service limitation.

Line 96 documents a bug in the fal-ai/wizper service where only "transcribe" task works. Consider tracking this with an issue if not already done, as the task and target_language constructor parameters suggest broader intended functionality.

Would you like me to create an issue to track this limitation and monitor for when the fal-ai/wizper service fixes the bug?

observability/grafana/init-home-dashboard.sh (1)

26-30: Hardcoded credentials acceptable for local development.

The use of admin:admin credentials is appropriate for the local Docker Compose development setup described in the PR. For production deployments, these credentials should be externalized via environment variables.

Based on learnings

To make this more flexible, consider this enhancement:

+GRAFANA_USER="${GRAFANA_USER:-admin}"
+GRAFANA_PASSWORD="${GRAFANA_PASSWORD:-admin}"
+
 echo ""
 echo "Setting admin user home dashboard..."
 curl -X PUT \
-  -u "admin:admin" \
+  -u "${GRAFANA_USER}:${GRAFANA_PASSWORD}" \
   -H "Content-Type: application/json" \
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d78a4a0 and 2755b9d.

📒 Files selected for processing (27)
  • agents-core/tests/test_timer.py (1 hunks)
  • agents-core/vision_agents/core/llm/llm.py (9 hunks)
  • agents-core/vision_agents/core/observability/metrics.py (2 hunks)
  • agents-core/vision_agents/core/stt/stt.py (5 hunks)
  • agents-core/vision_agents/core/tts/tts.py (1 hunks)
  • agents-core/vision_agents/core/turn_detection/turn_detection.py (4 hunks)
  • examples/01_simple_agent_example/simple_agent_example.py (5 hunks)
  • observability/.gitignore (1 hunks)
  • observability/README.md (1 hunks)
  • observability/grafana/dashboards/vision-agents.json (1 hunks)
  • observability/grafana/init-home-dashboard.sh (1 hunks)
  • observability/grafana/provisioning/dashboards/default.yml (1 hunks)
  • observability/grafana/provisioning/datasources/prometheus.yml (1 hunks)
  • observability/prometheus/prometheus.yml (1 hunks)
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (15 hunks)
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py (18 hunks)
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (7 hunks)
  • plugins/fish/vision_agents/plugins/fish/stt.py (2 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (12 hunks)
  • plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_llm.py (2 hunks)
  • plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py (4 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (2 hunks)
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (16 hunks)
  • plugins/wizper/vision_agents/plugins/wizper/stt.py (1 hunks)
  • plugins/xai/vision_agents/plugins/xai/llm.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/fish/vision_agents/plugins/fish/stt.py
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • plugins/wizper/vision_agents/plugins/wizper/stt.py
  • plugins/openrouter/vision_agents/plugins/openrouter/openrouter_llm.py
  • agents-core/vision_agents/core/tts/tts.py
  • plugins/xai/vision_agents/plugins/xai/llm.py
  • plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py
  • plugins/krisp/vision_agents/plugins/krisp/turn_detection.py
  • agents-core/tests/test_timer.py
  • agents-core/vision_agents/core/turn_detection/turn_detection.py
  • examples/01_simple_agent_example/simple_agent_example.py
  • agents-core/vision_agents/core/stt/stt.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
  • plugins/openai/vision_agents/plugins/openai/openai_llm.py
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
  • agents-core/vision_agents/core/observability/metrics.py
  • agents-core/vision_agents/core/llm/llm.py
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:00:34.300Z
Learnt from: dangusev
Repo: GetStream/Vision-Agents PR: 98
File: plugins/deepgram/vision_agents/plugins/deepgram/stt.py:135-150
Timestamp: 2025-10-13T22:00:34.300Z
Learning: In the Deepgram STT plugin (plugins/deepgram/vision_agents/plugins/deepgram/stt.py), the `started()` method is designed to wait for the connection attempt to complete, not to guarantee a successful connection. It's acceptable for the connection attempt to fail, and downstream code handles the case where `self.dg_connection` is `None`. The `_connected_once` event is set in the `finally` block intentionally to signal attempt completion.

Applied to files:

  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py
🧬 Code graph analysis (17)
plugins/fish/vision_agents/plugins/fish/stt.py (3)
agents-core/vision_agents/core/stt/stt.py (2)
  • _process_audio (151-166)
  • _emit_transcript_event (41-63)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/wizper/vision_agents/plugins/wizper/stt.py (1)
  • _process_audio (60-128)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (72-84)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
  • detect_turn (182-193)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
plugins/wizper/vision_agents/plugins/wizper/stt.py (3)
agents-core/vision_agents/core/stt/stt.py (1)
  • _process_audio (151-166)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/fish/vision_agents/plugins/fish/stt.py (1)
  • _process_audio (52-138)
plugins/xai/vision_agents/plugins/xai/llm.py (4)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (90-105)
  • LLMResponseCompletedEvent (109-115)
agents-core/vision_agents/core/llm/llm.py (2)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (1)
  • _simple_response (81-109)
plugins/xai/vision_agents/plugins/xai/events.py (1)
  • XAIChunkEvent (7-10)
plugins/ultralytics/vision_agents/plugins/ultralytics/yolo_pose_processor.py (1)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (3)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (72-84)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
  • detect_turn (158-170)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
  • detect_turn (182-193)
agents-core/tests/test_timer.py (2)
agents-core/vision_agents/core/observability/metrics.py (4)
  • Timer (62-199)
  • elapsed_ms (143-144)
  • stop (125-141)
  • _restart (146-150)
agents-core/vision_agents/core/stt/stt.py (1)
  • process_audio (116-148)
agents-core/vision_agents/core/turn_detection/turn_detection.py (4)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)
  • detect_turn (158-170)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)
  • detect_turn (182-193)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
  • detect_turn (78-119)
examples/01_simple_agent_example/simple_agent_example.py (1)
plugins/krisp/vision_agents/plugins/krisp/turn_detection.py (1)
  • TurnDetection (34-202)
agents-core/vision_agents/core/stt/stt.py (4)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/tts/tts.py (1)
  • send (238-343)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • process_audio (44-69)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
  • _process_audio (78-114)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (6)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseCompletedEvent (109-115)
  • LLMResponseChunkEvent (90-105)
agents-core/vision_agents/core/llm/llm.py (5)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
  • _dedup_and_execute (379-413)
  • _convert_tools_to_provider_format (139-153)
  • _extract_tool_calls_from_response (155-169)
agents-core/vision_agents/core/agents/conversation.py (1)
  • Message (14-24)
plugins/gemini/vision_agents/plugins/gemini/events.py (1)
  • GeminiResponseEvent (35-38)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _convert_tools_to_provider_format (509-532)
  • _extract_tool_calls_from_response (534-569)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • ToolSchema (64-67)
  • NormalizedToolCallItem (107-111)
plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (3)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (90-105)
  • LLMResponseCompletedEvent (109-115)
agents-core/vision_agents/core/llm/llm.py (7)
  • _simple_response (88-98)
  • _dedup_and_execute (379-413)
  • _sanitize_tool_output (415-426)
  • _convert_tools_to_provider_format (139-153)
  • _extract_tool_calls_from_response (155-169)
  • LLMResponseEvent (35-39)
  • _create_tool_result_message (187-202)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • NormalizedToolCallItem (107-111)
  • ToolSchema (64-67)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (5)
agents-core/vision_agents/core/llm/llm.py (1)
  • _simple_response (88-98)
plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (1)
  • _simple_response (75-95)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)
  • _simple_response (81-101)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
plugins/xai/vision_agents/plugins/xai/llm.py (1)
  • _simple_response (71-96)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)
agents-core/vision_agents/core/stt/stt.py (1)
  • _process_audio (151-166)
agents-core/vision_agents/core/llm/llm.py (5)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (4)
  • _simple_response (81-109)
  • _convert_tools_to_provider_format (372-404)
  • _extract_tool_calls_from_response (406-433)
  • _create_tool_result_message (435-465)
agents-core/vision_agents/core/llm/llm_types.py (2)
  • ToolSchema (64-67)
  • NormalizedToolCallItem (107-111)
agents-core/vision_agents/core/llm/function_registry.py (1)
  • get_tool_schemas (108-114)
agents-core/vision_agents/core/llm/events.py (2)
  • ToolStartEvent (119-125)
  • ToolEndEvent (129-138)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (5)
plugins/aws/tests/test_aws.py (1)
  • llm (37-44)
agents-core/vision_agents/core/llm/events.py (2)
  • LLMResponseChunkEvent (90-105)
  • LLMResponseCompletedEvent (109-115)
agents-core/vision_agents/core/llm/llm.py (4)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
  • _dedup_and_execute (379-413)
  • _sanitize_tool_output (415-426)
agents-core/vision_agents/core/llm/llm_types.py (1)
  • NormalizedToolCallItem (107-111)
plugins/aws/vision_agents/plugins/aws/events.py (1)
  • AWSStreamEvent (7-10)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (2)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4)
  • _prepare_silero_vad (150-156)
  • SileroVAD (419-486)
  • detect_turn (158-170)
  • predict_speech (454-455)
🪛 Gitleaks (8.28.0)
observability/grafana/init-home-dashboard.sh

[high] 26-27: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.

(curl-auth-user)

🪛 markdownlint-cli2 (0.18.1)
observability/README.md

19-19: Bare URL used

(MD034, no-bare-urls)


20-20: Bare URL used

(MD034, no-bare-urls)


21-21: Bare URL used

(MD034, no-bare-urls)


38-38: Bare URL used

(MD034, no-bare-urls)


51-51: Bare URL used

(MD034, no-bare-urls)


59-59: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


69-69: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


137-137: Bare URL used

(MD034, no-bare-urls)


138-138: Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (15)
plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (1)

78-115: LGTM! Method rename aligns with observability pattern.

The rename from process_audio to _process_audio is consistent with the broader refactoring to support metrics collection at the base class level. The docstring follows Google style, and the logic correctly handles connection readiness, resampling, and audio transmission.

plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (4)

68-68: Provider name attribute added correctly.

The addition of provider_name = "anthropic" enables consistent provider identification in metrics and events across the observability stack.


75-95: LGTM! Method visibility change aligns with base class pattern.

The rename to _simple_response is consistent with the standardized approach where the base class provides a public wrapper with metrics collection, delegating to this private implementation method.


132-217: Tool calling logic is well-structured.

The multi-hop tool execution with deduplication (via _dedup_and_execute) and bounded iteration (MAX_ROUNDS=3) provides robust handling of Claude's tool use pattern. The formatting improvements enhance readability without altering behavior.


218-349: Streaming tool execution correctly implemented.

The streaming path maintains consistency with the non-streaming approach, properly accumulating tool calls across chunks and executing them in multi-hop fashion. Event emissions correctly identify the provider.

plugins/xai/vision_agents/plugins/xai/llm.py (2)

62-62: LGTM! Provider standardization implemented correctly.

The addition of provider_name = "xai" and the rename to _simple_response align with the cross-provider standardization pattern. The unused participant parameter in the method signature matches the base class interface requirements.

Also applies to: 71-96


148-152: Event emissions correctly formatted.

The multi-line event emission formatting improves readability, and all events consistently include plugin_name="xai" for proper observability tracking.

Also applies to: 184-193, 199-205

observability/grafana/provisioning/dashboards/default.yml (2)

8-8: Verify dashboard deletion setting aligns with PR guidance.

Line 8 sets disableDeletion: false, allowing dashboards to be deleted via the UI. However, the PR description states "Do not modify Grafana directly; changes should be requested via AI." Consider setting this to true to prevent accidental deletions and enforce the file-based workflow.

If the intention is to allow UI modifications only for experimentation (not persistence), apply this diff:

-    disableDeletion: false
+    disableDeletion: true

1-13: Provisioning configuration is well-structured.

The dashboard provider configuration correctly points to the standard Grafana dashboards directory with folder structure preservation enabled. The 10-second update interval provides quick feedback during development.

plugins/wizper/vision_agents/plugins/wizper/stt.py (1)

60-129: LGTM! Implementation follows STT plugin pattern.

The rename to _process_audio is consistent with the base class design. The implementation properly handles temporary file lifecycle with cleanup in a finally block, and error handling emits appropriate events.

examples/01_simple_agent_example/simple_agent_example.py (2)

60-94: Observability stack integration is well-implemented.

The telemetry setup correctly configures OpenTelemetry with Prometheus metrics exposition on port 9464 and OTLP tracing to localhost:4317. The resource identification and graceful shutdown handling via atexit are appropriate.


6-6: Verification complete: Migration from vogent.TurnDetection to smart_turn.TurnDetection is successful.

The verification confirms no references to vogent.TurnDetection remain in the codebase. The file has been correctly updated to import and use smart_turn.TurnDetection() on line 25, with the corresponding import statement on line 6. This aligns with the intended module organization where VogentTurnDetection remains available as a separate plugin while SmartTurnDetection becomes the default in examples.

agents-core/vision_agents/core/tts/tts.py (1)

341-343: Improved metric identification with fully qualified class paths.

The change from self.__class__.__name__ to the fully qualified path (module.qualname) provides better metric disambiguation when multiple TTS providers might share simple class names. This aligns with the observability enhancements throughout the codebase.

observability/grafana/init-home-dashboard.sh (1)

1-33: Script correctly configures home dashboard.

The initialization script properly waits for Grafana health, then sets the Vision Agents dashboard as the default home view for both organization and admin user. The dashboard UID reference aligns with the provisioned dashboard configuration.

agents-core/tests/test_timer.py (1)

18-440: Comprehensive Timer coverage

Love how these tests hammer the context-manager, decorator, inheritance, and async paths—this suite should catch regressions fast.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (6)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)

131-146: Consider warning callers when ignored parameters are used.

The method accepts system_prompt, temperature, and max_tokens but silently ignores them. While this appears to be for interface compliance (matching the OpenAI Realtime pattern), callers who pass these parameters may expect them to have an effect.

Consider logging a warning when these parameters are provided:

 async def _simple_response(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ) -> str:
     """
     Internal simple response implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't return responses directly.
     This implementation sends the text and returns a placeholder.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
     return ""  # Realtime API doesn't return text synchronously

148-164: Consider adding parameter warnings and clarifying the generator pattern.

Similar to _simple_response, this method silently ignores system_prompt, temperature, and max_tokens. Additionally, the return followed by yield pattern (lines 163-164) creates an empty generator, which is technically correct but may be confusing.

Consider adding a warning for ignored parameters and a comment explaining the pattern:

 async def _simple_response_stream(
     self,
     text: str,
     system_prompt: Optional[str] = None,
     temperature: float = 0.7,
     max_tokens: Optional[int] = None,
     **kwargs: Any,
 ):
     """
     Internal simple response stream implementation required by LLM base class.

     Note: Gemini Realtime is event-driven and doesn't stream responses in this manner.
     This implementation sends the text but yields nothing.
     """
+    if system_prompt or max_tokens or temperature != 0.7:
+        self.logger.warning(
+            "Parameters system_prompt, temperature, and max_tokens are not supported "
+            "by Gemini Realtime and will be ignored"
+        )
     await self.send_realtime_input(text=text)
+    # Return immediately to create an empty generator (yield makes this a generator function)
     return
     yield  # Make this a generator
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

138-154: Unreachable yield statement after return.

The yield on line 154 is unreachable because it follows the return statement on line 153. While this still makes the function a generator (which is the intent), the unreachable code is a code smell.

Apply this diff to make the intent clearer:

-        await self.rtc.send_text(text)
-        return
-        yield  # Make this a generator
+        await self.rtc.send_text(text)
+        # Make this a generator that yields nothing
+        if False:
+            yield

Or alternatively:

-        await self.rtc.send_text(text)
-        return
-        yield  # Make this a generator
+        await self.rtc.send_text(text)
+        yield  # Generator that yields nothing

Note: This pattern is also present in the Gemini Realtime implementation (plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py, lines 147-163).
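For reference, the return-before-yield trick works because any yield in the body, reachable or not, makes Python compile the function as a generator; a small synchronous sketch (the async variant behaves the same way, and the `log.append` call stands in for the `send_text` side effect):

```python
# Demonstration of the "return before yield" empty-generator pattern.
def side_effect_then_empty(log):
    log.append("sent")   # stands in for the awaited send call
    return
    yield  # unreachable, but makes this a generator function

items = []
gen = side_effect_then_empty(items)
assert items == []            # nothing runs until iteration begins
assert list(gen) == []        # the generator yields nothing...
assert items == ["sent"]      # ...but the side effect ran on iteration
```

Note the subtlety the sketch exposes: in a generator (sync or async), the body, and therefore the send, only executes once the caller starts iterating, not at call time.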

docker-compose.yml (3)

17-17: Pin Grafana and Prometheus to specific versions for reproducibility.

Using :latest tags can lead to non-deterministic behavior across environments and time, potentially breaking observability setups unexpectedly. Pin these images to specific versions for consistent deployments.

-    image: prom/prometheus:latest
+    image: prom/prometheus:v2.52.0
-    image: grafana/grafana:latest
+    image: grafana/grafana:11.2.0

Check Prometheus releases and Grafana releases for appropriate versions to match your requirements.

Also applies to: 35-35


50-51: Add health checks to prevent race conditions during startup.

The depends_on declarations ensure service startup order but do not wait for readiness. The grafana-init service may attempt to initialize dashboards before Grafana is fully ready, and Grafana may start before Prometheus is accepting connections.

Consider adding health checks to ensure dependent services are truly ready before proceeding:

   prometheus:
     image: prom/prometheus:v2.52.0
     ...
+    healthcheck:
+      test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
   grafana:
     ...
     depends_on:
-      - prometheus
+      prometheus:
+        condition: service_healthy
+    healthcheck:
+      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
+      interval: 5s
+      timeout: 3s
+      retries: 5
   grafana-init:
     ...
     depends_on:
-      - grafana
+      grafana:
+        condition: service_healthy

Also applies to: 62-63


56-66: Grafana-init script robustness.

The grafana-init service uses a simple curl image to run a shell script. Consider adding error handling and explicit success/failure indicators in the script itself to make debugging easier if initialization fails.

Additionally, ensure the init-home-dashboard.sh script includes proper error handling and exits with appropriate status codes.

Please review the observability/grafana/init-home-dashboard.sh script and confirm it:

  • Exits with a non-zero status on failure
  • Has informative logging for debugging
  • Includes reasonable retry logic or timeout handling for API calls
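As a hypothetical sketch of what that retry/exit-code handling could look like (the `retry` helper and the Grafana URL are assumptions, not the actual contents of init-home-dashboard.sh):

```shell
#!/bin/sh
# Sketch of retry logic for an init script: fail loudly, exit non-zero.
set -eu

retry() {
  # retry <max_attempts> <delay_seconds> <command...>
  attempts=$1; delay=$2; shift 2
  i=1
  until "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      echo "FAILED after $i attempts: $*" >&2
      return 1
    fi
    echo "Attempt $i failed, retrying in ${delay}s..." >&2
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example: wait for Grafana's health endpoint before calling its API.
# retry 10 3 curl -fsS http://grafana:3000/api/health > /dev/null
```

Because the function returns non-zero after exhausting attempts and the script runs under `set -eu`, any failed initialization step propagates to the container's exit status, which makes `docker compose up` failures visible.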
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2755b9d and 3646cd0.

📒 Files selected for processing (4)
  • docker-compose.yml (1 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (1 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • tests/test_function_calling.py (10 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
  • tests/test_function_calling.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic

Files:

  • tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _simple_response (131-146)
  • _simple_response_stream (148-164)
plugins/openai/vision_agents/plugins/openai/openai_llm.py (1)
  • _simple_response (81-109)
plugins/openai/vision_agents/plugins/openai/rtc_manager.py (1)
  • send_text (419-440)
tests/test_function_calling.py (3)
agents-core/vision_agents/core/llm/llm.py (1)
  • LLM (46-426)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
  • _simple_response (131-146)
  • _simple_response_stream (148-164)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
  • _simple_response (121-136)
  • _simple_response_stream (138-154)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
  • _simple_response (121-136)
  • _simple_response_stream (138-154)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

121-136: Implementation correctly aligns with base LLM API.

The method properly delegates to rtc.send_text and returns a placeholder empty string, consistent with the event-driven nature of the Realtime API. The unused parameters (system_prompt, temperature, max_tokens) are accepted for interface compatibility.

tests/test_function_calling.py (2)

16-41: Well-structured test LLM implementation.

The TestLLM class provides appropriate stub implementations of the required abstract methods. The _simple_response returns a fixed string, and _simple_response_stream actually reaches its yield, unlike the unreachable yield in the production code.


162-162: Correctly replaces abstract class instantiation with test implementation.

Replacing LLM() with TestLLM() throughout the tests properly addresses the issue of instantiating an abstract base class. The test implementation provides the required concrete methods.

Also applies to: 176-176, 392-392, 416-416, 435-435, 481-481, 511-511, 544-544

docker-compose.yml (1)

22-22: All referenced configuration files are present and correctly mounted.

Verification confirms that all observability configuration files and scripts referenced in the docker-compose.yml file exist in the repository:

  • observability/prometheus/prometheus.yml
  • observability/grafana/init-home-dashboard.sh
  • observability/grafana/provisioning/
  • observability/grafana/dashboards/

No issues found.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (1)

272-278: Replace magic numbers with defined constants

The calculation uses literals 512 and 16000, but CHUNK and RATE constants are defined at lines 41-42 and used elsewhere in the file (e.g., lines 238, 242, 243).

Apply this diff:

                 trailing_silence_ms = (
                     self._silence.trailing_silence_chunks
-                    * 512
-                    / 16000
+                    * CHUNK
+                    / RATE
                     * 1000
                     * 5  # DTX correction
                 )

📥 Commits

Reviewing files that changed from the base of the PR and between 3646cd0 and f42466c.

📒 Files selected for processing (5)
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py (15 hunks)
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2 hunks)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • tests/test_function_calling.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/anthropic/vision_agents/plugins/anthropic/anthropic_llm.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • tests/test_function_calling.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

tests/**/*.py: Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files
Write tests using pytest (avoid unittest.TestCase or other frameworks)
Mark integration tests with @pytest.mark.integration
Do not use @pytest.mark.asyncio; async support is automatic

Files:

  • tests/test_function_calling.py
🧬 Code graph analysis (3)
plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (4)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (2)
  • _simple_response (121-137)
  • simple_response (100-119)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
agents-core/vision_agents/core/llm/llm.py (1)
  • LLMResponseEvent (35-39)
agents-core/vision_agents/core/agents/agents.py (1)
  • simple_response (266-279)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/turn_detection/turn_detection.py (1)
  • detect_turn (72-84)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4)
  • detect_turn (182-193)
  • _process_audio_packet (216-342)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
tests/test_function_calling.py (4)
agents-core/vision_agents/core/llm/llm.py (6)
  • LLM (46-426)
  • LLMResponseEvent (35-39)
  • register_function (218-231)
  • get_available_functions (233-235)
  • _dedup_and_execute (379-413)
  • _run_one_tool (280-352)
agents-core/vision_agents/core/llm/function_registry.py (3)
  • FunctionRegistry (35-237)
  • get_tool_schemas (108-114)
  • get_callable (144-160)
agents-core/vision_agents/core/llm/events.py (2)
  • ToolStartEvent (119-125)
  • ToolEndEvent (129-138)
agents-core/vision_agents/core/events/manager.py (1)
  • wait (470-484)
🔇 Additional comments (7)
tests/test_function_calling.py (1)

16-30: LGTM: Clean test double implementation.

The TestLLM class provides a concrete implementation of the now-abstract LLM base class, which is necessary after _simple_response became abstract. This is a legitimate test double that doesn't rely on mocking frameworks.

plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py (2)

131-147: LGTM! Internal method aligns with Realtime API pattern.

The _simple_response implementation correctly follows the pattern established in openai_realtime.py. Returning an empty LLMResponseEvent is appropriate for event-driven APIs where actual responses arrive through the event system rather than return values.

The inline import of LLMResponseEvent mirrors the OpenAI implementation and likely avoids circular dependencies.


362-364: LGTM! Formatting improvement.

The multi-line formatting improves readability without changing behavior.

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (4)

16-20: LGTM: Observability imports

The metrics imports are clean and consistent with the broader instrumentation patterns across the codebase.


177-208: Excellent lifecycle management with proper cleanup

The nested try/finally structure elegantly resolves the shutdown race condition from the previous review. The outer finally block (lines 205-208) ensures both _processing_active and _shutdown_event are cleared even when cancellation occurs mid-processing, preventing hung waits and enabling clean restart cycles.


252-256: LGTM: VAD latency instrumentation

The Timer wrapping captures essential context (samples, implementation) and follows the established observability pattern seen in vogent.


294-303: LGTM: Comprehensive turn-end detection metrics

The Timer captures rich context (audio duration, samples, trailing silence, prediction probability, and turn-end decision), enabling effective observability for turn detection behavior.

"""

import pytest
from unittest.mock import Mock, patch

⚠️ Potential issue | 🔴 Critical

Remove unittest.mock usage per coding guidelines.

This file violates the coding guideline: "Never use mocking utilities (e.g., unittest.mock, pytest-mock) in test files" for tests/**/*.py. The Mock and patch imports are used extensively in TestOpenAIFunctionCalling, TestClaudeFunctionCalling, and TestGeminiFunctionCalling classes.

Consider rewriting these tests without mocking frameworks—use real instances with test configurations, in-memory fakes, or integration test markers if external dependencies are required.

As per coding guidelines
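A hand-rolled fake is usually enough to replace `Mock`: it records calls and returns canned values without any mocking framework. The sketch below only mirrors the shapes mentioned in this review (`register_function`, `_simple_response`); `FakeLLM` and `get_weather` are illustrative names, not the project's real classes:

```python
import asyncio
from typing import Any, Callable, Dict, List


class FakeLLM:
    """Test double that records calls instead of relying on unittest.mock."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self._functions: Dict[str, Callable[..., Any]] = {}

    def register_function(self, fn: Callable[..., Any]) -> None:
        self._functions[fn.__name__] = fn

    async def _simple_response(self, text: str, **kwargs: Any) -> str:
        self.calls.append(text)
        return f"echo: {text}"


def get_weather(city: str) -> str:
    return f"sunny in {city}"


llm = FakeLLM()
llm.register_function(get_weather)
reply = asyncio.run(llm._simple_response("hello"))
print(reply)      # -> echo: hello
print(llm.calls)  # -> ['hello']
```

Assertions then run against `llm.calls` and `llm._functions` directly, so the tests stay within the no-mocking guideline while still verifying interactions.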

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (1)

182-193: Update call sites in vogent plugin to use detect_turn method.

The refactoring is incomplete. Tests and examples within the vogent plugin still call the old process_audio method, which will cause failures:

  • plugins/vogent/tests/test_vogent_td.py lines 45, 46
  • plugins/vogent/example/basic_usage.py lines 90, 166
🧹 Nitpick comments (4)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

164-168: Consider extracting duplicated constants.

The max_concurrency=8 and timeout_s=30 parameters appear identically in both tool execution loops (lines 164-168 and 408-412).

Apply this diff to extract the constants:

     def __init__(
         self,
         model: str,
         region_name: str = "us-east-1",
         aws_access_key_id: Optional[str] = None,
         aws_secret_access_key: Optional[str] = None,
         aws_session_token: Optional[str] = None,
     ):
         """
         Initialize the BedrockLLM class.

         Args:
             model: The Bedrock model ID (e.g., "anthropic.claude-3-5-sonnet-20241022-v2:0")
             region_name: AWS region name (default: "us-east-1")
             aws_access_key_id: Optional AWS access key ID
             aws_secret_access_key: Optional AWS secret access key
             aws_session_token: Optional AWS session token
         """
         super().__init__()
         self.events.register_events_from_module(events)
         self.model = model
         self._pending_tool_uses_by_index: Dict[int, Dict[str, Any]] = {}
+        self._tool_max_concurrency = 8
+        self._tool_timeout_s = 30

Then reference these constants in both locations:

                 while current_calls and rounds < MAX_ROUNDS:
                     # Execute calls concurrently with dedup
                     triples, seen = await self._dedup_and_execute(
                         cast(List[Dict[str, Any]], current_calls),
                         seen=seen,
-                        max_concurrency=8,
-                        timeout_s=30,
+                        max_concurrency=self._tool_max_concurrency,
+                        timeout_s=self._tool_timeout_s,
                     )

And similarly at lines 408-412.

Also applies to: 408-412

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)

149-153: Simplify model initialization to match smart_turn pattern.

The lambda wrappers around model constructors add unnecessary complexity. The asyncio.to_thread function accepts any callable, so you can pass constructors directly as shown in the smart_turn plugin.

Apply this diff to simplify _prepare_silero_vad:

-        self.vad = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: SileroVAD(  # type: ignore[arg-type]
-                path, reset_interval_seconds=self.vad_reset_interval_seconds
-            )
+        self.vad = await asyncio.to_thread(
+            SileroVAD, path, reset_interval_seconds=self.vad_reset_interval_seconds
         )

Apply similar changes to _prepare_whisper:

-        self.whisper = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: WhisperModel(
-                self.whisper_model_size, device="cpu", compute_type="int8"
-            )
+        self.whisper = await asyncio.to_thread(
+            WhisperModel, self.whisper_model_size, device="cpu", compute_type="int8"
         )

And to _prepare_vogent:

-        self.vogent = await asyncio.to_thread(  # type: ignore[func-returns-value]
-            lambda: VogentDetector(
-                compile_model=True,
-                warmup=True,
-                device=None,
-                model_name="vogent/Vogent-Turn-80M",
-                revision="main",
-            )
+        self.vogent = await asyncio.to_thread(
+            VogentDetector,
+            compile_model=True,
+            warmup=True,
+            device=None,
+            model_name="vogent/Vogent-Turn-80M",
+            revision="main",
         )

Note: The type: ignore comments may become unnecessary after this refactor. If they remain, investigate the underlying type issues rather than suppressing them.

Also applies to: 159-162, 171-179


376-403: Consider checking self.whisper before starting the timer.

The Timer currently records latency even when self.whisper is None and an empty string is returned immediately. This could skew your metrics if the early-exit path is taken frequently.

Apply this diff to avoid recording no-op latency:

     async def _transcribe_segment(self, pcm: PcmData) -> str:
         """
         Transcribe audio segment using faster-whisper.
 
         Args:
             pcm: PcmData containing audio samples
 
         Returns:
             Transcribed text
         """
+        if self.whisper is None:
+            return ""
+
         with Timer(vogent_whisper_latency_ms) as timer:
             # Ensure it's 16khz and f32 format
             pcm = pcm.resample(16000).to_float32()
             audio_array = pcm.samples
             timer.attributes["audio_duration_ms"] = pcm.duration_ms
             timer.attributes["samples"] = len(audio_array)
 
-            if self.whisper is None:
-                return ""
-
             # Run transcription in thread pool to avoid blocking
             segments, info = await asyncio.to_thread(

422-456: Consider checking self.vogent before starting the timer.

Similar to the transcription method, the Timer records latency even when self.vogent is None and False is returned immediately. Moving the guard clause outside the Timer context would provide more accurate metrics.

Apply this diff:

     async def _predict_turn_completed(
         self,
         pcm: PcmData,
         prev_line: str,
         curr_line: str,
     ) -> bool:
         """
         Predict whether the current turn is complete using Vogent.
 
         Args:
             pcm: PcmData containing audio samples
             prev_line: Previous speaker's text (for context)
             curr_line: Current speaker's text
 
         Returns:
             True if turn is complete, False otherwise
         """
+        if self.vogent is None:
+            return False
+
         with Timer(turn_end_detection_latency_ms) as timer:
             # Ensure it's 16khz and f32 format
             pcm = pcm.resample(16000).to_float32()
 
             # Truncate to 8 seconds
             audio_array = pcm.tail(8, False).samples
             timer.attributes["implementation"] = "vogent"
             timer.attributes["audio_duration_ms"] = len(audio_array) / 16000 * 1000
             timer.attributes["prev_line_length"] = len(prev_line)
             timer.attributes["curr_line_length"] = len(curr_line)
 
-            if self.vogent is None:
-                return False
-
             # Run vogent prediction in thread pool

📥 Commits

Reviewing files that changed from the base of the PR and between 06aea31 and 8869d65.

📒 Files selected for processing (3)
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py (3 hunks)
  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (10 hunks)
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (15 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/python.mdc)

**/*.py: Do not modify sys.path in Python code
Docstrings must follow the Google style guide

Files:

  • plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py
  • plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py
  • plugins/aws/vision_agents/plugins/aws/aws_llm.py
🧬 Code graph analysis (3)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (4)
  • detect_turn (182-193)
  • _process_audio_packet (216-342)
  • predict_speech (518-546)
  • _predict_turn_completed (405-456)
plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)
agents-core/vision_agents/core/observability/metrics.py (1)
  • Timer (62-199)
agents-core/vision_agents/core/utils/utils.py (1)
  • ensure_model (115-157)
plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (3)
  • _prepare_silero_vad (151-157)
  • SileroVAD (429-496)
  • detect_turn (159-171)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (2)
plugins/gemini/vision_agents/plugins/gemini/gemini_llm.py (1)
  • _simple_response (69-86)
agents-core/vision_agents/core/llm/llm.py (2)
  • _simple_response (88-98)
  • LLMResponseEvent (35-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
  • GitHub Check: unit / Ruff & mypy
  • GitHub Check: unit / Test "not integration"
🔇 Additional comments (6)
plugins/aws/vision_agents/plugins/aws/aws_llm.py (1)

82-87: LGTM - Method signature aligns with the refactoring pattern.

The change from public simple_response to private _simple_response with explicit return type annotation is consistent with the broader refactoring across LLM implementations. The signature correctly matches the base class abstract method.

plugins/vogent/vision_agents/plugins/vogent/vogent_turn_detection.py (3)

19-24: LGTM! Clean metrics imports and definition.

The observability imports are well-organized, and the new vogent_whisper_latency_ms histogram follows proper naming conventions with clear metadata.

Also applies to: 31-36


259-264: LGTM! VAD instrumentation follows best practices.

The Timer usage correctly wraps the VAD prediction, and the attributes provide useful context for performance analysis (sample count and implementation type).


476-476: LGTM! Punctuation stripping improves model input.

Using rstrip(".!?") to remove terminal punctuation before passing text to Vogent is a sensible preprocessing step that should improve turn detection accuracy.

plugins/smart_turn/vision_agents/plugins/smart_turn/smart_turn_detection.py (2)

178-210: Lifecycle cleanup looks solid.

Resetting _processing_active and _shutdown_event in the shutdown path prevents the cancellation hang we saw earlier and keeps the detector restartable.


253-305: Telemetry instrumentation is on point.

Wrapping the VAD and turn-completion calls in Timer gives us per-call latency with helpful context attributes, which will drop straight into the new dashboards.
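For readers outside the codebase, the usage pattern praised above can be sketched with a minimal stand-in. This is illustrative only, not the implementation in agents-core/vision_agents/core/observability/metrics.py; the real `Timer` records into an OpenTelemetry histogram:

```python
import time
from typing import Any, Dict, List, Tuple


class Histogram:
    """Stand-in for an OTel histogram: just collects (value, attributes)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.points: List[Tuple[float, Dict[str, Any]]] = []

    def record(self, value: float, attributes: Dict[str, Any]) -> None:
        self.points.append((value, attributes))


class Timer:
    """Context manager that records elapsed milliseconds on exit."""

    def __init__(self, histogram: Histogram) -> None:
        self._histogram = histogram
        self.attributes: Dict[str, Any] = {}

    def __enter__(self) -> "Timer":
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc_info: Any) -> None:
        elapsed_ms = (time.perf_counter() - self._start) * 1000
        self._histogram.record(elapsed_ms, self.attributes)


vad_latency_ms = Histogram("vad_latency_ms")
with Timer(vad_latency_ms) as timer:
    timer.attributes["implementation"] = "smart_turn"
    timer.attributes["samples"] = 512
    time.sleep(0.01)  # stand-in for the VAD prediction

value, attrs = vad_latency_ms.points[0]
print(attrs["implementation"], round(value))
```

Attributes attached during the timed block ride along with the latency sample, which is what lets the dashboards slice per implementation, sample count, and so on.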
