feat: Implement TTS Streaming Latency Analysis and Optimization Plan #81

JinLee794 · 2025-12-18T22:29:04Z

No description provided.

- Added a comprehensive document outlining the critical latency issues in TTS playback within the Speech Cascade architecture.
- Identified root causes including processing loop deadlock, sentence buffering delays, queue-based event processing, and full synthesis before streaming.
- Proposed a multi-phase optimization strategy to address identified issues, including:
  - Phase 0: Fix processing loop deadlock by creating a dedicated TTS processing task.
  - Phase 1: Reduce sentence buffer threshold for earlier TTS chunk dispatch.
  - Phase 2: Implement parallel TTS prefetching to synthesize the next sentence while streaming.
  - Phase 3: Enable streaming TTS synthesis to stream audio while synthesizing.
  - Phase 4: Achieve full pipeline parallelism for LLM to TTS to WebSocket streaming.
- Created a detailed test implementation plan with metrics and success criteria to validate improvements.

test: Add unit tests for HandoffService

- Created unit tests for the HandoffService, covering handoff detection, target resolution, and handoff resolution methods.
- Implemented tests for greeting selection and context building to ensure proper functionality.
- Added tests for the HandoffResolution dataclass to verify properties and default values.
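
The Phase 2 idea (synthesize the next sentence while the current one is streaming) can be illustrated with a minimal asyncio sketch. The `synthesize` and `stream` callables and the `play_with_prefetch` name are hypothetical stand-ins, not names taken from this PR; the actual handler may be structured quite differently.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def play_with_prefetch(
    sentences: Iterable[str],
    synthesize: Callable[[str], Awaitable[bytes]],  # hypothetical TTS call
    stream: Callable[[bytes], Awaitable[None]],     # hypothetical audio sender
) -> None:
    """Stream each sentence's audio while the next sentence is synthesized."""
    it = iter(sentences)
    first = next(it, None)
    if first is None:
        return
    # Start synthesizing the first sentence.
    pending = asyncio.create_task(synthesize(first))
    for upcoming in it:
        audio = await pending
        # Kick off synthesis of the next sentence before streaming this one,
        # so the two operations overlap.
        pending = asyncio.create_task(synthesize(upcoming))
        await stream(audio)
    # Stream the last sentence once its synthesis completes.
    await stream(await pending)
```

This sketch keeps at most one synthesis in flight ahead of playback and does not handle cancellation (for example on barge-in); a real implementation would likely need to cancel `pending` when playback is interrupted.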

- Introduced ScenarioBuilder component for visual orchestration of agent flows.
- Implemented drag-and-drop functionality for agents and handoff configuration.
- Added buttons in RealTimeVoiceApp for accessing Agent and Scenario Builders.
- Enhanced state management for agent scenarios, including creation and updates.
- Integrated new handoff editor for configuring agent interactions.

…handoffs Hotfix/agent context and handoffs

… management

- Implement tests to verify cleanup functionality in LiveOrchestrator.
- Ensure proper registration and unregistration of orchestrators in the registry.
- Test background task tracking and cleanup mechanisms.
- Validate greeting task cancellation during orchestrator cleanup.
- Introduce memory leak detection tests to prevent unbounded growth in orchestrator registry.
- Verify user message history deque is properly bounded and cleared on cleanup.
- Add scenario update tests to ensure correct agent management during updates.
- Optimize hot path functions to ensure non-blocking behavior during network calls.

…I elements

…mable pool, Redis manager, speech auth manager, speech recognizer, and text-to-speech modules for improved log verbosity control. Remove outdated greeting context tests and add comprehensive scenario orchestration contract tests to ensure functional contracts are preserved during refactoring. Update session agent manager tests to use set comparison for agent listing to avoid dict ordering issues.

…rchestration

- Added `metrics_factory.py` to provide a common infrastructure for OpenTelemetry metrics.
- Implemented `LazyMeter`, `LazyHistogram`, and `LazyCounter` for lazy initialization of metrics.
- Updated `speech_cascade/metrics.py` to utilize the new shared metrics factory, simplifying metric initialization.
- Refactored `voicelive/metrics.py` to use the shared factory for consistent metric handling.
- Enhanced orchestrator classes in `speech_cascade/orchestrator.py` and `voicelive/orchestrator.py` to cache orchestrator configurations, improving performance and reducing redundant calls.
- Introduced utility functions for building common metric attributes, ensuring consistency across metrics.
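
The lazy-initialization pattern named above can be sketched without any OpenTelemetry dependency. `LazyInstrument` and its injected `factory` are hypothetical; the actual `LazyHistogram` and `LazyCounter` classes in `metrics_factory.py` may differ in shape and naming.

```python
import threading
from typing import Any, Callable, Dict, Optional


class LazyInstrument:
    """Defers creating the underlying instrument until the first record call."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        # `factory` is an assumption: any zero-argument callable that returns
        # an object exposing record(value, attributes=...).
        self._factory = factory
        self._instrument: Optional[Any] = None
        self._lock = threading.Lock()

    def _get(self) -> Any:
        # Double-checked locking so concurrent first calls create one instrument.
        if self._instrument is None:
            with self._lock:
                if self._instrument is None:
                    self._instrument = self._factory()
        return self._instrument

    def record(self, value: float, attributes: Optional[Dict[str, str]] = None) -> None:
        self._get().record(value, attributes=attributes)
```

Deferring creation this way means importing a metrics module has no side effects, which matters when the meter provider is configured after import time.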

…sistent behavior across orchestrators and enhance documentation

feat: Implement TTS Streaming Latency Analysis and Optimization Plan

…agent setup

…ance

… quickstart guide

…t guide

…-provisioning process

…tputs

Docs/user flows - and adding easyauth as part of postprovision

Copilot

Pull request overview

This PR implements comprehensive TTS streaming latency analysis and optimization planning, along with several infrastructure improvements and frontend enhancements for the real-time voice application.

Key changes:

- Introduces detailed TTS latency analysis documentation identifying root causes and optimization strategies
- Refactors logging from info to debug for hot-path operations to reduce noise
- Adds new test suites for handoff service and generic handoff functionality
- Enhances frontend with scenario builder integration and custom scenario support

Reviewed changes

Copilot reviewed 84 out of 110 changed files in this pull request and generated 5 comments.


| File | Description |
| --- | --- |
| `tests/test_handoff_service.py` | New comprehensive test suite (1186 lines) for unified handoff resolution service |
| `tests/test_generic_handoff_tool.py` | New test suite (467 lines) for generic handoff tool and scenario integration |
| `src/speech/text_to_speech.py` | Reduced logging verbosity for TTS warm connection |
| `src/speech/speech_recognizer.py` | Reduced logging verbosity for STT preparation and warm connection |
| `src/speech/auth_manager.py` | Reduced logging verbosity for token pre-fetch |
| `src/redis/manager.py` | Added error handling for Redis cluster and OSError exceptions; reduced logging verbosity |
| `src/pools/warmable_pool.py` | Reduced logging verbosity for pool operations |
| `src/pools/connection_manager.py` | Reduced logging verbosity for connection manager initialization |
| `pyproject.toml` | Changed `PyYAML` to lowercase `pyyaml` for consistency |
| `docs/proposals/tts-streaming-latency-analysis.md` | New comprehensive TTS latency analysis and optimization plan document |
| `docs/proposals/scenario-orchestration-simplification.md` | New analysis of scenario orchestration complexity with simplification recommendations |
| `docs/proposals/handoff-consolidation-plan.md` | Progress tracking for handoff orchestration consolidation |
| `docs/mkdocs.yml` | Added new documentation sections for orchestration and handoff service |
| `docs/guides/agent-voice-model-config.md` | Updated model deployment examples from preview to production model names |
| `docs/getting-started/quickstart.md` | Enhanced with detailed deployment hooks, profile creation, agent builder, and scenario builder guides |
| `docs/getting-started/local-development.md` | Added UI orientation screenshots |
| `docs/getting-started/README.md` | Minor formatting update |
| `docs/architecture/orchestration/handoff-service.md` | New comprehensive handoff service documentation |
| `docs/architecture/orchestration/README.md` | Updated orchestration overview with scenario-based routing and unified handoff service |
| `devops/scripts/azd/postprovision.sh` | Added EasyAuth configuration task; commented out Cosmos DB initialization |
| `devops/scripts/azd/helpers/sync-appconfig.sh` | Added endpoint format validation and improved error reporting |
| `devops/scripts/azd/helpers/seed_data/financial.py` | Added new demo user profile (jin_lee_cfs) |
| `devops/scripts/azd/helpers/preflight-checks.sh` | Added interactive prompt for Azure preflight checks |
| `devops/scripts/azd/helpers/enable-easyauth.sh` | New script for enabling EasyAuth with Federated Identity Credentials |
| `apps/artagent/frontend/src/hooks/useRealTimeVoiceApp.js` | Changed hot-path console.log to console.debug for reduced noise |
| `apps/artagent/frontend/src/components/App.jsx` | Added scenario builder integration, custom scenario support, and streaming performance optimizations |
| `apps/artagent/frontend/src/components/AgentScenarioBuilder.jsx` | New unified builder component combining agent and scenario builders |
| `apps/artagent/frontend/src/components/AgentBuilder.jsx` | Enhanced with separate cascade/voicelive model configuration and improved template display |
| `apps/artagent/backend/voice/voicelive/session_loader.py` | Deleted - wrapper file no longer needed |
| `apps/artagent/backend/voice/voicelive/metrics.py` | Refactored to use shared metrics factory for lazy initialization |
| `apps/artagent/backend/voice/voicelive/handler.py` | Added session agent support, orchestrator registration, improved cleanup, and audio event prioritization |
| `apps/artagent/backend/voice/voicelive/__init__.py` | Added orchestrator registry exports |
| `apps/artagent/backend/voice/speech_cascade/metrics.py` | Refactored to use shared metrics factory; added TTS synthesis and streaming metrics |
| `apps/artagent/backend/voice/__init__.py` | Updated module documentation |
| `apps/artagent/backend/registries/toolstore/registry.py` | Reduced logging verbosity for tool registry initialization |

Copilot · 2025-12-18T22:30:34Z

src/redis/manager.py

                if attempt >= retries:
                    break
                self._create_client()
+            except RedisClusterException as cluster_err:


The new exception handlers for RedisClusterException and OSError duplicate the retry logic that already exists in the generic Exception handler. Consider consolidating this by handling these exceptions in the existing catch block with improved error messages, or extracting the retry logic into a helper function to avoid code duplication.
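
One way to read the "extract the retry logic into a helper" suggestion is sketched below. `call_with_retry`, `on_retry`, and `retry_on` are hypothetical names; the sketch re-raises once attempts are exhausted, which may differ from the `break` in the existing loop, and it does not use any Redis client API.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    op: Callable[[], T],
    *,
    retries: int,
    on_retry: Callable[[], None],  # e.g. a client re-creation hook (assumption)
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    delay: float = 0.0,
) -> T:
    """Run `op`, calling `on_retry` and trying again on the listed exceptions."""
    attempt = 0
    while True:
        try:
            return op()
        except retry_on:
            if attempt >= retries:
                # Out of attempts: surface the last error to the caller.
                raise
            attempt += 1
            on_retry()
            if delay:
                time.sleep(delay)
```

With a helper like this, the cluster, `OSError`, and generic handlers would differ only in the `retry_on` tuple and the log message, rather than each repeating the attempt bookkeeping.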

Copilot · 2025-12-18T22:30:34Z

apps/artagent/frontend/src/components/App.jsx

 const REALTIME_STREAM_MODE_FALLBACK = 'realtime';
 const PANEL_MARGIN = 16;
+// Avoid noisy logging in hot-path streaming handlers unless explicitly enabled
+const ENABLE_VERBOSE_STREAM_LOGS = false;


The ENABLE_VERBOSE_STREAM_LOGS constant should be configurable via environment variables rather than hardcoded. This would allow enabling verbose logs in production for debugging without requiring code changes.

Suggested change

-const ENABLE_VERBOSE_STREAM_LOGS = false;
+const ENABLE_VERBOSE_STREAM_LOGS =
+  typeof process !== 'undefined' &&
+  process.env &&
+  process.env.REACT_APP_ENABLE_VERBOSE_STREAM_LOGS === 'true';

Copilot · 2025-12-18T22:30:35Z

apps/artagent/backend/voice/voicelive/handler.py


-    def _log_outcome(t: asyncio.Task) -> None:
+def _background_task(coro: Awaitable[Any], *, label: str) -> asyncio.Task:
+    """Create a tracked background task that will be cleaned up on handler stop."""


The _background_task function now returns an asyncio.Task but this return value appears unused in the codebase. If callers don't need to track tasks individually (since cleanup is centralized), consider documenting why the return value exists or making it explicit that callers can ignore it.

Suggested change

-    """Create a tracked background task that will be cleaned up on handler stop."""
+    """
+    Create a tracked background task that will be cleaned up on handler stop.
+    The returned :class:`asyncio.Task` is registered in ``_pending_background_tasks``
+    and will be cancelled by ``_cancel_all_background_tasks`` when the handler
+    shuts down. Callers are not required to keep or use the returned task; it is
+    exposed only for optional observability (e.g., attaching custom callbacks or
+    diagnostics) and can be safely ignored in normal usage.
+    """

Copilot · 2025-12-18T22:30:35Z

docs/proposals/tts-streaming-latency-analysis.md

+
+### The Processing Loop Deadlock
+
+**Location:** [handler.py#L615-L766](../apps/artagent/backend/voice/speech_cascade/handler.py#L615)


The document contains hardcoded line number references to source files which will become stale as the code evolves. Consider using more stable references like function names or section headers, or add a note that line numbers are approximate and may change.

Suggested change

-**Location:** [handler.py#L615-L766](../apps/artagent/backend/voice/speech_cascade/handler.py#L615)
+**Location:** [handler.py (SpeechCascadeHandler._processing_loop)](../apps/artagent/backend/voice/speech_cascade/handler.py)

Copilot · 2025-12-18T22:30:35Z

apps/artagent/frontend/src/components/AgentBuilder.jsx

+  },
+];
+
+// Legacy: combined options for backward compatibility


The comment indicates this is for 'backward compatibility' but the code redefines MODEL_OPTIONS to point to CASCADE_MODEL_OPTIONS after both model option arrays are defined. If this is truly legacy, consider deprecating it or making the backward compatibility reason more explicit (e.g., which external code depends on MODEL_OPTIONS).

Suggested change

-// Legacy: combined options for backward compatibility
+/**
+ * Legacy alias kept for backward compatibility.
+ *
+ * Some existing code paths and/or persisted configurations still reference
+ * `MODEL_OPTIONS`. New code should prefer `CASCADE_MODEL_OPTIONS` (for text/chat)
+ * or `VOICELIVE_MODEL_OPTIONS` (for realtime voice).
+ *
+ * @deprecated Use `CASCADE_MODEL_OPTIONS` or `VOICELIVE_MODEL_OPTIONS` directly.
+ */

…agement and improve handoff context

- Consolidated TTS playback logic into a unified class for speech cascade.
- Removed deprecated VoiceSessionContext and related compatibility shims.
- Enhanced error handling during tool initialization and event handler registration.
- Updated model configuration handling in UnifiedAgent to prioritize mode-specific settings.
- Improved logging for TTS synthesis and streaming processes.
- Added new handoff tool registration for dynamic routing.

… interactive prompts

pablosalvador10

LGTM

…-local-quickstart-80 docs: add local development guide (closes Azure-Samples#80)

JinLee794 and others added 25 commits December 13, 2025 23:55

Refactor code structure for improved readability and maintainability

ab64c70

Merge pull request #6 from AIappsGBBFactory/hotfix/agent-context-and-…

1017542

…handoffs Hotfix/agent context and handoffs

feat: Enhance AgentBuilder with consistent field names and improved U…

055374d

…I elements

feat: Add predefined handoff condition patterns to enhance scenario o…

2c67532

…rchestration

feat: Consolidate handoff logic into a unified HandoffService for con…

26bff43

…sistent behavior across orchestrators and enhance documentation

Merge branch 'staging' into feat/scenario-orch

718e5ff

fix: Simplify environment determination logic in deployment workflow

ace89cf

Merge pull request #7 from AIappsGBBFactory/feat/scenario-orch

956da4f

feat: Implement TTS Streaming Latency Analysis and Optimization Plan

feat: Add user flow screenshots and enhance documentation for guided …

c577619

…agent setup

feat: Enhance scenario testing instructions for clarity and user guid…

23600f4

…ance

fix: Correct image paths in quickstart guide for accurate rendering

d02b935

feat: Add initial agent builder and template selection screenshots to…

7d4a16d

… quickstart guide

feat: Add demo profile creation steps and related images to quickstar…

aa4faa6

…t guide

feat: Implement EasyAuth configuration script and integrate into post…

84fbb8d

…-provisioning process

refactor: Remove backend IP restrictions configuration and related ou…

18fc95f

…tputs

chore: Remove unused workflow images for demo profiles

f3b9561

fix: Update demo profile creation images in quickstart guide

1c1f7ae

fix: Update home screen image in quickstart guide

83f1e8e

fix: Update home screen and scenario images in quickstart guide

e5c592d

Merge pull request #8 from AIappsGBBFactory/docs/user-flows

48f8991

Docs/user flows - and adding easyauth as part of postprovision

Copilot AI review requested due to automatic review settings December 18, 2025 22:29

Copilot AI reviewed Dec 18, 2025


Merge branch 'staging' into staging

221a9e4

JinLee794 requested a review from pablosalvador10 December 18, 2025 22:44

add opentelemetry import for tracing support in TTS module

bb177b4

Jin Lee (HLS US SE) added 3 commits December 18, 2025 17:24

refactor: update LiveOrchestrator to enhance user message history man…

f66516b

…agement and improve handoff context

refactor: streamline EasyAuth enabling process in CI mode and improve…

8cb8d0c

… interactive prompts

pablosalvador10 approved these changes Dec 19, 2025


pablosalvador10 merged commit c257768 into Azure-Samples:staging Dec 19, 2025
4 of 8 checks passed

JinLee794 pushed a commit to AIappsGBBFactory/art-voice-agent-accelerator that referenced this pull request Dec 19, 2025

Merge pull request Azure-Samples#81 from pablosalvador10/docs/minimal…

67e5e18

…-local-quickstart-80 docs: add local development guide (closes Azure-Samples#80)

JinLee794 pushed a commit to AIappsGBBFactory/art-voice-agent-accelerator that referenced this pull request Dec 22, 2025

Merge pull request Azure-Samples#81 from AIappsGBBFactory/staging

5b17497
