feat: Implement TTS Streaming Latency Analysis and Optimization Plan #81
- Added a comprehensive document outlining the critical latency issues in TTS playback within the Speech Cascade architecture.
- Identified root causes including processing loop deadlock, sentence buffering delays, queue-based event processing, and full synthesis before streaming.
- Proposed a multi-phase optimization strategy to address identified issues, including:
  - Phase 0: Fix processing loop deadlock by creating a dedicated TTS processing task.
  - Phase 1: Reduce sentence buffer threshold for earlier TTS chunk dispatch.
  - Phase 2: Implement parallel TTS prefetching to synthesize the next sentence while streaming.
  - Phase 3: Enable streaming TTS synthesis to stream audio while synthesizing.
  - Phase 4: Achieve full pipeline parallelism for LLM to TTS to WebSocket streaming.
- Created a detailed test implementation plan with metrics and success criteria to validate improvements.

test: Add unit tests for HandoffService
- Created unit tests for the HandoffService, covering handoff detection, target resolution, and handoff resolution methods.
- Implemented tests for greeting selection and context building to ensure proper functionality.
- Added tests for the HandoffResolution dataclass to verify properties and default values.
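The Phase 2 idea in the commit message above (synthesize the next sentence while the current one is streaming) can be illustrated with a small asyncio sketch. This is not the PR's implementation: `stream_with_prefetch`, `synthesize`, and `send` are hypothetical names, and the bounded queue is just one possible way to cap how far synthesis runs ahead of playback.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def stream_with_prefetch(
    sentences: Iterable[str],
    synthesize: Callable[[str], Awaitable[bytes]],  # hypothetical TTS call
    send: Callable[[bytes], Awaitable[None]],       # hypothetical WebSocket send
    *,
    prefetch: int = 1,
) -> None:
    """Stream audio for each sentence while the next one is being synthesized."""
    # At most `prefetch` finished chunks wait in the queue at any time.
    queue: asyncio.Queue = asyncio.Queue(maxsize=prefetch)
    done = object()  # sentinel marking the end of the stream

    async def producer() -> None:
        try:
            for sentence in sentences:
                await queue.put(await synthesize(sentence))
        except Exception as exc:
            await queue.put(exc)  # hand the failure over to the consumer
        else:
            await queue.put(done)

    producer_task = asyncio.create_task(producer())
    try:
        while True:
            item = await queue.get()
            if item is done:
                break
            if isinstance(item, Exception):
                raise item
            await send(item)  # the producer keeps synthesizing during this await
    finally:
        producer_task.cancel()  # no-op if the producer already finished
```

Because synthesis and sending run as separate tasks, a slow `send` no longer delays the start of the next synthesis, which is the overlap the plan is after.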
- Introduced ScenarioBuilder component for visual orchestration of agent flows.
- Implemented drag-and-drop functionality for agents and handoff configuration.
- Added buttons in RealTimeVoiceApp for accessing Agent and Scenario Builders.
- Enhanced state management for agent scenarios, including creation and updates.
- Integrated new handoff editor for configuring agent interactions.
…handoffs Hotfix/agent context and handoffs
… management
- Implement tests to verify cleanup functionality in LiveOrchestrator.
- Ensure proper registration and unregistration of orchestrators in the registry.
- Test background task tracking and cleanup mechanisms.
- Validate greeting task cancellation during orchestrator cleanup.
- Introduce memory leak detection tests to prevent unbounded growth in orchestrator registry.
- Verify user message history deque is properly bounded and cleared on cleanup.
- Add scenario update tests to ensure correct agent management during updates.
- Optimize hot path functions to ensure non-blocking behavior during network calls.
…mable pool, Redis manager, speech auth manager, speech recognizer, and text-to-speech modules for improved log verbosity control. Remove outdated greeting context tests and add comprehensive scenario orchestration contract tests to ensure functional contracts are preserved during refactoring. Update session agent manager tests to use set comparison for agent listing to avoid dict ordering issues.
- Added `metrics_factory.py` to provide a common infrastructure for OpenTelemetry metrics.
- Implemented `LazyMeter`, `LazyHistogram`, and `LazyCounter` for lazy initialization of metrics.
- Updated `speech_cascade/metrics.py` to utilize the new shared metrics factory, simplifying metric initialization.
- Refactored `voicelive/metrics.py` to use the shared factory for consistent metric handling.
- Enhanced orchestrator classes in `speech_cascade/orchestrator.py` and `voicelive/orchestrator.py` to cache orchestrator configurations, improving performance and reducing redundant calls.
- Introduced utility functions for building common metric attributes, ensuring consistency across metrics.
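For readers unfamiliar with the lazy-initialization pattern this commit mentions, here is a minimal sketch of how a lazily created counter could work. It is an illustration only: the class below reuses the name `LazyCounter` from the commit message, but its constructor, the `factory` argument, and the `_resolve` helper are assumptions, not the contents of `metrics_factory.py`. The wrapped instrument only needs an `add(amount, attributes)` method, which is the shape of an OpenTelemetry counter.

```python
import threading
from typing import Any, Callable, Mapping, Optional


class LazyCounter:
    """Defer creating the real counter until the first measurement is recorded."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        # `factory` is a hypothetical zero-argument callable that builds the
        # real instrument, e.g. a closure around a meter's counter constructor.
        self._factory = factory
        self._instrument: Optional[Any] = None
        self._lock = threading.Lock()

    def _resolve(self) -> Any:
        # Double-checked locking: build the instrument once, even if several
        # threads record their first measurement at the same time.
        if self._instrument is None:
            with self._lock:
                if self._instrument is None:
                    self._instrument = self._factory()
        return self._instrument

    def add(self, amount: int, attributes: Optional[Mapping[str, Any]] = None) -> None:
        self._resolve().add(amount, attributes)
```

The benefit is that importing a metrics module stays cheap and does not depend on the telemetry provider being configured yet; the cost is a small check on every call.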
…sistent behavior across orchestrators and enhance documentation
feat: Implement TTS Streaming Latency Analysis and Optimization Plan
… quickstart guide
…-provisioning process
Docs/user flows - and adding easyauth as part of postprovision
Pull request overview
This PR implements comprehensive TTS streaming latency analysis and optimization planning, along with several infrastructure improvements and frontend enhancements for the real-time voice application.
Key changes:
- Introduces detailed TTS latency analysis documentation identifying root causes and optimization strategies
- Refactors logging from `info` to `debug` for hot-path operations to reduce noise
- Adds new test suites for handoff service and generic handoff functionality
- Enhances frontend with scenario builder integration and custom scenario support
Reviewed changes
Copilot reviewed 84 out of 110 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `tests/test_handoff_service.py` | New comprehensive test suite (1186 lines) for unified handoff resolution service |
| `tests/test_generic_handoff_tool.py` | New test suite (467 lines) for generic handoff tool and scenario integration |
| `src/speech/text_to_speech.py` | Reduced logging verbosity for TTS warm connection |
| `src/speech/speech_recognizer.py` | Reduced logging verbosity for STT preparation and warm connection |
| `src/speech/auth_manager.py` | Reduced logging verbosity for token pre-fetch |
| `src/redis/manager.py` | Added error handling for Redis cluster and OSError exceptions; reduced logging verbosity |
| `src/pools/warmable_pool.py` | Reduced logging verbosity for pool operations |
| `src/pools/connection_manager.py` | Reduced logging verbosity for connection manager initialization |
| `pyproject.toml` | Changed PyYAML to lowercase pyyaml for consistency |
| `docs/proposals/tts-streaming-latency-analysis.md` | New comprehensive TTS latency analysis and optimization plan document |
| `docs/proposals/scenario-orchestration-simplification.md` | New analysis of scenario orchestration complexity with simplification recommendations |
| `docs/proposals/handoff-consolidation-plan.md` | Progress tracking for handoff orchestration consolidation |
| `docs/mkdocs.yml` | Added new documentation sections for orchestration and handoff service |
| `docs/guides/agent-voice-model-config.md` | Updated model deployment examples from preview to production model names |
| `docs/getting-started/quickstart.md` | Enhanced with detailed deployment hooks, profile creation, agent builder, and scenario builder guides |
| `docs/getting-started/local-development.md` | Added UI orientation screenshots |
| `docs/getting-started/README.md` | Minor formatting update |
| `docs/architecture/orchestration/handoff-service.md` | New comprehensive handoff service documentation |
| `docs/architecture/orchestration/README.md` | Updated orchestration overview with scenario-based routing and unified handoff service |
| `devops/scripts/azd/postprovision.sh` | Added EasyAuth configuration task; commented out Cosmos DB initialization |
| `devops/scripts/azd/helpers/sync-appconfig.sh` | Added endpoint format validation and improved error reporting |
| `devops/scripts/azd/helpers/seed_data/financial.py` | Added new demo user profile (jin_lee_cfs) |
| `devops/scripts/azd/helpers/preflight-checks.sh` | Added interactive prompt for Azure preflight checks |
| `devops/scripts/azd/helpers/enable-easyauth.sh` | New script for enabling EasyAuth with Federated Identity Credentials |
| `apps/artagent/frontend/src/hooks/useRealTimeVoiceApp.js` | Changed hot-path console.log to console.debug for reduced noise |
| `apps/artagent/frontend/src/components/App.jsx` | Added scenario builder integration, custom scenario support, and streaming performance optimizations |
| `apps/artagent/frontend/src/components/AgentScenarioBuilder.jsx` | New unified builder component combining agent and scenario builders |
| `apps/artagent/frontend/src/components/AgentBuilder.jsx` | Enhanced with separate cascade/voicelive model configuration and improved template display |
| `apps/artagent/backend/voice/voicelive/session_loader.py` | Deleted - wrapper file no longer needed |
| `apps/artagent/backend/voice/voicelive/metrics.py` | Refactored to use shared metrics factory for lazy initialization |
| `apps/artagent/backend/voice/voicelive/handler.py` | Added session agent support, orchestrator registration, improved cleanup, and audio event prioritization |
| `apps/artagent/backend/voice/voicelive/__init__.py` | Added orchestrator registry exports |
| `apps/artagent/backend/voice/speech_cascade/metrics.py` | Refactored to use shared metrics factory; added TTS synthesis and streaming metrics |
| `apps/artagent/backend/voice/__init__.py` | Updated module documentation |
| `apps/artagent/backend/registries/toolstore/registry.py` | Reduced logging verbosity for tool registry initialization |
        if attempt >= retries:
            break
        self._create_client()
    except RedisClusterException as cluster_err:
Copilot
AI
Dec 18, 2025
The new exception handlers for `RedisClusterException` and `OSError` duplicate the retry logic that already exists in the generic `Exception` handler. Consider consolidating this by handling these exceptions in the existing `except` block with improved error messages, or by extracting the retry logic into a helper function to avoid code duplication.
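One way to read the "extract the retry logic into a helper" suggestion is sketched below. It is a generic illustration, not the code in `src/redis/manager.py`: `call_with_retry`, its parameters, and the choice to re-raise once attempts are exhausted (the original snippet uses `break`) are all assumptions made for the example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    *,
    retries: int = 2,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    on_retry: Callable[[], None] = lambda: None,
    delay_s: float = 0.0,
) -> T:
    """Run ``operation``, retrying up to ``retries`` times on ``retry_on`` errors."""
    attempt = 0
    while True:
        try:
            return operation()
        except retry_on:
            if attempt >= retries:
                raise  # out of attempts: surface the last error to the caller
            attempt += 1
            on_retry()  # e.g. rebuild the client before the next attempt
            if delay_s:
                time.sleep(delay_s)
```

With a helper of this shape, each handler would only need to list its exception types in `retry_on` and pass something like the client-recreation method as `on_retry`, so the attempt counting lives in one place.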
    const REALTIME_STREAM_MODE_FALLBACK = 'realtime';
    const PANEL_MARGIN = 16;
    // Avoid noisy logging in hot-path streaming handlers unless explicitly enabled
    const ENABLE_VERBOSE_STREAM_LOGS = false;
Copilot
AI
Dec 18, 2025
The ENABLE_VERBOSE_STREAM_LOGS constant should be configurable via environment variables rather than hardcoded. This would allow enabling verbose logs in production for debugging without requiring code changes.
Suggested change:

    - const ENABLE_VERBOSE_STREAM_LOGS = false;
    + const ENABLE_VERBOSE_STREAM_LOGS =
    +   typeof process !== 'undefined' &&
    +   process.env &&
    +   process.env.REACT_APP_ENABLE_VERBOSE_STREAM_LOGS === 'true';

    def _log_outcome(t: asyncio.Task) -> None:
    def _background_task(coro: Awaitable[Any], *, label: str) -> asyncio.Task:
        """Create a tracked background task that will be cleaned up on handler stop."""
Copilot
AI
Dec 18, 2025
The _background_task function now returns an asyncio.Task but this return value appears unused in the codebase. If callers don't need to track tasks individually (since cleanup is centralized), consider documenting why the return value exists or making it explicit that callers can ignore it.
Suggested change:

    - """Create a tracked background task that will be cleaned up on handler stop."""
    + """
    + Create a tracked background task that will be cleaned up on handler stop.
    + The returned :class:`asyncio.Task` is registered in ``_pending_background_tasks``
    + and will be cancelled by ``_cancel_all_background_tasks`` when the handler
    + shuts down. Callers are not required to keep or use the returned task; it is
    + exposed only for optional observability (e.g., attaching custom callbacks or
    + diagnostics) and can be safely ignored in normal usage.
    + """

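The tracked-task pattern that the docstring describes can be sketched as follows. This is a simplified stand-in, not the handler's code: the `BackgroundTasks` class and its `spawn` and `cancel_all` methods are hypothetical names that mirror the roles of `_background_task`, `_pending_background_tasks`, and `_cancel_all_background_tasks`.

```python
import asyncio
from typing import Any, Coroutine, Set


class BackgroundTasks:
    """Keep strong references to fire-and-forget tasks so they can be cancelled."""

    def __init__(self) -> None:
        self._pending: Set[asyncio.Task] = set()

    def spawn(self, coro: Coroutine[Any, Any, Any], *, label: str) -> asyncio.Task:
        task = asyncio.create_task(coro, name=label)
        self._pending.add(task)
        # Drop the reference once the task finishes, whatever the outcome.
        task.add_done_callback(self._pending.discard)
        return task  # callers may ignore this; cleanup is centralized

    async def cancel_all(self) -> None:
        tasks = list(self._pending)
        for task in tasks:
            task.cancel()
        # Wait for the cancellations to land; swallow the CancelledError results.
        await asyncio.gather(*tasks, return_exceptions=True)
```

Holding the tasks in a set matters even when callers discard the return value, because the event loop keeps only weak references to tasks and an unreferenced task can be garbage-collected before it completes.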
    ### The Processing Loop Deadlock

    **Location:** [handler.py#L615-L766](../apps/artagent/backend/voice/speech_cascade/handler.py#L615)
Copilot
AI
Dec 18, 2025
The document contains hardcoded line number references to source files which will become stale as the code evolves. Consider using more stable references like function names or section headers, or add a note that line numbers are approximate and may change.
Suggested change:

    - **Location:** [handler.py#L615-L766](../apps/artagent/backend/voice/speech_cascade/handler.py#L615)
    + **Location:** [handler.py (SpeechCascadeHandler._processing_loop)](../apps/artagent/backend/voice/speech_cascade/handler.py)

    },
    ];

    // Legacy: combined options for backward compatibility
Copilot
AI
Dec 18, 2025
The comment says this is for "backward compatibility", but the code redefines `MODEL_OPTIONS` to point to `CASCADE_MODEL_OPTIONS` after both model option arrays are defined. If this is truly legacy, consider deprecating it or making the backward compatibility reason more explicit (e.g., which external code depends on `MODEL_OPTIONS`).
Suggested change:

    - // Legacy: combined options for backward compatibility
    + /**
    +  * Legacy alias kept for backward compatibility.
    +  *
    +  * Some existing code paths and/or persisted configurations still reference
    +  * `MODEL_OPTIONS`. New code should prefer `CASCADE_MODEL_OPTIONS` (for text/chat)
    +  * or `VOICELIVE_MODEL_OPTIONS` (for realtime voice).
    +  *
    +  * @deprecated Use `CASCADE_MODEL_OPTIONS` or `VOICELIVE_MODEL_OPTIONS` directly.
    +  */

…agement and improve handoff context
- Consolidated TTS playback logic into a unified class for speech cascade.
- Removed deprecated VoiceSessionContext and related compatibility shims.
- Enhanced error handling during tool initialization and event handler registration.
- Updated model configuration handling in UnifiedAgent to prioritize mode-specific settings.
- Improved logging for TTS synthesis and streaming processes.
- Added new handoff tool registration for dynamic routing.
… interactive prompts
pablosalvador10
left a comment
LGTM
…-local-quickstart-80 docs: add local development guide (closes Azure-Samples#80)
No description provided.