@JinLee794

No description provided.

JinLee794 and others added 25 commits December 13, 2025 23:55
- Added a comprehensive document outlining the critical latency issues in TTS playback within the Speech Cascade architecture.
- Identified root causes including processing loop deadlock, sentence buffering delays, queue-based event processing, and full synthesis before streaming.
- Proposed a multi-phase optimization strategy to address identified issues, including:
  - Phase 0: Fix processing loop deadlock by creating a dedicated TTS processing task.
  - Phase 1: Reduce sentence buffer threshold for earlier TTS chunk dispatch.
  - Phase 2: Implement parallel TTS prefetching to synthesize the next sentence while streaming.
  - Phase 3: Enable streaming TTS synthesis to stream audio while synthesizing.
  - Phase 4: Achieve full pipeline parallelism for LLM to TTS to WebSocket streaming.
- Created a detailed test implementation plan with metrics and success criteria to validate improvements.
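A minimal sketch of the Phase 2 idea (synthesize the next sentence while the current one is being streamed), assuming an asyncio-based pipeline. `fake_synthesize`, `stream_with_prefetch`, and `run_prefetch_demo` are hypothetical names for illustration and are not taken from the repository; error handling and cancellation are omitted.

```python
import asyncio
from typing import Any, Callable, Coroutine, Iterable, List, Optional


async def fake_synthesize(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS call; returns fake audio bytes."""
    await asyncio.sleep(0.01)  # simulate synthesis latency
    return sentence.encode("utf-8")


async def stream_with_prefetch(
    sentences: Iterable[str],
    synthesize: Callable[[str], Coroutine[Any, Any, bytes]],
    send: Callable[[bytes], Coroutine[Any, Any, None]],
) -> None:
    """Send audio in order while the following sentence synthesizes in the background."""
    pending: Optional["asyncio.Task[bytes]"] = None
    for sentence in sentences:
        # Start synthesizing this sentence before the previous one is sent.
        upcoming = asyncio.create_task(synthesize(sentence))
        if pending is not None:
            await send(await pending)  # overlaps with `upcoming`
        pending = upcoming
    if pending is not None:
        await send(await pending)  # flush the last sentence


async def run_prefetch_demo() -> List[bytes]:
    sent: List[bytes] = []

    async def send(chunk: bytes) -> None:
        sent.append(chunk)

    await stream_with_prefetch(["Hello.", "How can I help?"], fake_synthesize, send)
    return sent
```

Only one sentence is prefetched at a time, which keeps ordering trivial; a deeper prefetch window would need a bounded queue.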

test: Add unit tests for HandoffService

- Created unit tests for the HandoffService, covering handoff detection, target resolution, and handoff resolution methods.
- Implemented tests for greeting selection and context building to ensure proper functionality.
- Added tests for the HandoffResolution dataclass to verify properties and default values.
- Introduced ScenarioBuilder component for visual orchestration of agent flows.
- Implemented drag-and-drop functionality for agents and handoff configuration.
- Added buttons in RealTimeVoiceApp for accessing Agent and Scenario Builders.
- Enhanced state management for agent scenarios, including creation and updates.
- Integrated new handoff editor for configuring agent interactions.
…handoffs

Hotfix/agent context and handoffs
… management

- Implement tests to verify cleanup functionality in LiveOrchestrator.
- Ensure proper registration and unregistration of orchestrators in the registry.
- Test background task tracking and cleanup mechanisms.
- Validate greeting task cancellation during orchestrator cleanup.
- Introduce memory leak detection tests to prevent unbounded growth in orchestrator registry.
- Verify user message history deque is properly bounded and cleared on cleanup.
- Add scenario update tests to ensure correct agent management during updates.
- Optimize hot path functions to ensure non-blocking behavior during network calls.
…mable pool, Redis manager, speech auth manager, speech recognizer, and text-to-speech modules for improved log verbosity control. Remove outdated greeting context tests and add comprehensive scenario orchestration contract tests to ensure functional contracts are preserved during refactoring. Update session agent manager tests to use set comparison for agent listing to avoid dict ordering issues.
- Added `metrics_factory.py` to provide a common infrastructure for OpenTelemetry metrics.
- Implemented `LazyMeter`, `LazyHistogram`, and `LazyCounter` for lazy initialization of metrics.
- Updated `speech_cascade/metrics.py` to utilize the new shared metrics factory, simplifying metric initialization.
- Refactored `voicelive/metrics.py` to use the shared factory for consistent metric handling.
- Enhanced orchestrator classes in `speech_cascade/orchestrator.py` and `voicelive/orchestrator.py` to cache orchestrator configurations, improving performance and reducing redundant calls.
- Introduced utility functions for building common metric attributes, ensuring consistency across metrics.
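The lazy-initialization pattern described for `LazyCounter` might look roughly like the sketch below. It does not reproduce `metrics_factory.py`: `LazyCounterSketch` and `FakeCounter` are hypothetical, and the fake stands in for a real OpenTelemetry counter so the example runs without the SDK installed.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class FakeCounter:
    """Hypothetical stand-in for a metrics counter; records every add() call."""

    def __init__(self) -> None:
        self.calls: List[Tuple[int, Dict[str, Any]]] = []

    def add(self, amount: int, attributes: Dict[str, Any]) -> None:
        self.calls.append((amount, attributes))


class LazyCounterSketch:
    """Defers creating the underlying counter until the first add()."""

    def __init__(self, factory: Callable[[], FakeCounter]) -> None:
        self._factory = factory
        self._counter: Optional[FakeCounter] = None

    def add(self, amount: int, attributes: Optional[Dict[str, Any]] = None) -> None:
        if self._counter is None:
            # First use: build the real instrument exactly once.
            self._counter = self._factory()
        self._counter.add(amount, attributes or {})


created: List[FakeCounter] = []


def make_counter() -> FakeCounter:
    counter = FakeCounter()
    created.append(counter)
    return counter


lazy = LazyCounterSketch(make_counter)
created_before_use = len(created)  # nothing is built at import time
lazy.add(1, {"mode": "cascade"})
lazy.add(2)
```

Deferring construction this way means importing a metrics module does not require the telemetry provider to be configured yet.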
…sistent behavior across orchestrators and enhance documentation
feat: Implement TTS Streaming Latency Analysis and Optimization Plan
Docs/user flows - and adding easyauth as part of postprovision
Copilot AI review requested due to automatic review settings December 18, 2025 22:29

Copilot AI left a comment

Pull request overview

This PR implements comprehensive TTS streaming latency analysis and optimization planning, along with several infrastructure improvements and frontend enhancements for the real-time voice application.

Key changes:

  • Introduces detailed TTS latency analysis documentation identifying root causes and optimization strategies
  • Refactors logging from info to debug for hot-path operations to reduce noise
  • Adds new test suites for handoff service and generic handoff functionality
  • Enhances frontend with scenario builder integration and custom scenario support

Reviewed changes

Copilot reviewed 84 out of 110 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `tests/test_handoff_service.py` | New comprehensive test suite (1186 lines) for unified handoff resolution service |
| `tests/test_generic_handoff_tool.py` | New test suite (467 lines) for generic handoff tool and scenario integration |
| `src/speech/text_to_speech.py` | Reduced logging verbosity for TTS warm connection |
| `src/speech/speech_recognizer.py` | Reduced logging verbosity for STT preparation and warm connection |
| `src/speech/auth_manager.py` | Reduced logging verbosity for token pre-fetch |
| `src/redis/manager.py` | Added error handling for Redis cluster and OSError exceptions; reduced logging verbosity |
| `src/pools/warmable_pool.py` | Reduced logging verbosity for pool operations |
| `src/pools/connection_manager.py` | Reduced logging verbosity for connection manager initialization |
| `pyproject.toml` | Changed PyYAML to lowercase pyyaml for consistency |
| `docs/proposals/tts-streaming-latency-analysis.md` | New comprehensive TTS latency analysis and optimization plan document |
| `docs/proposals/scenario-orchestration-simplification.md` | New analysis of scenario orchestration complexity with simplification recommendations |
| `docs/proposals/handoff-consolidation-plan.md` | Progress tracking for handoff orchestration consolidation |
| `docs/mkdocs.yml` | Added new documentation sections for orchestration and handoff service |
| `docs/guides/agent-voice-model-config.md` | Updated model deployment examples from preview to production model names |
| `docs/getting-started/quickstart.md` | Enhanced with detailed deployment hooks, profile creation, agent builder, and scenario builder guides |
| `docs/getting-started/local-development.md` | Added UI orientation screenshots |
| `docs/getting-started/README.md` | Minor formatting update |
| `docs/architecture/orchestration/handoff-service.md` | New comprehensive handoff service documentation |
| `docs/architecture/orchestration/README.md` | Updated orchestration overview with scenario-based routing and unified handoff service |
| `devops/scripts/azd/postprovision.sh` | Added EasyAuth configuration task; commented out Cosmos DB initialization |
| `devops/scripts/azd/helpers/sync-appconfig.sh` | Added endpoint format validation and improved error reporting |
| `devops/scripts/azd/helpers/seed_data/financial.py` | Added new demo user profile (jin_lee_cfs) |
| `devops/scripts/azd/helpers/preflight-checks.sh` | Added interactive prompt for Azure preflight checks |
| `devops/scripts/azd/helpers/enable-easyauth.sh` | New script for enabling EasyAuth with Federated Identity Credentials |
| `apps/artagent/frontend/src/hooks/useRealTimeVoiceApp.js` | Changed hot-path console.log to console.debug for reduced noise |
| `apps/artagent/frontend/src/components/App.jsx` | Added scenario builder integration, custom scenario support, and streaming performance optimizations |
| `apps/artagent/frontend/src/components/AgentScenarioBuilder.jsx` | New unified builder component combining agent and scenario builders |
| `apps/artagent/frontend/src/components/AgentBuilder.jsx` | Enhanced with separate cascade/voicelive model configuration and improved template display |
| `apps/artagent/backend/voice/voicelive/session_loader.py` | Deleted - wrapper file no longer needed |
| `apps/artagent/backend/voice/voicelive/metrics.py` | Refactored to use shared metrics factory for lazy initialization |
| `apps/artagent/backend/voice/voicelive/handler.py` | Added session agent support, orchestrator registration, improved cleanup, and audio event prioritization |
| `apps/artagent/backend/voice/voicelive/__init__.py` | Added orchestrator registry exports |
| `apps/artagent/backend/voice/speech_cascade/metrics.py` | Refactored to use shared metrics factory; added TTS synthesis and streaming metrics |
| `apps/artagent/backend/voice/__init__.py` | Updated module documentation |
| `apps/artagent/backend/registries/toolstore/registry.py` | Reduced logging verbosity for tool registry initialization |

                if attempt >= retries:
                    break
                self._create_client()
            except RedisClusterException as cluster_err:
Copilot AI Dec 18, 2025

The new exception handlers for RedisClusterException and OSError duplicate the retry logic that already exists in the generic Exception handler. Consider consolidating this by handling these exceptions in the existing catch block with improved error messages, or extracting the retry logic into a helper function to avoid code duplication.
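One way the retry logic could be extracted into a helper, as the comment suggests, is sketched below. This is an assumption-laden illustration rather than the code in `src/redis/manager.py`: `run_with_retry` and `FakeClusterError` are hypothetical, with the latter standing in for `RedisClusterException` so the example runs without the Redis client installed.

```python
import time
from typing import Callable, Dict, Tuple, Type, TypeVar

T = TypeVar("T")


class FakeClusterError(Exception):
    """Hypothetical stand-in for RedisClusterException."""


def run_with_retry(
    operation: Callable[[], T],
    *,
    retries: int,
    retry_on: Tuple[Type[BaseException], ...],
    on_retry: Callable[[], None],
    delay_s: float = 0.0,
) -> T:
    """Run `operation`, retrying up to `retries` times on the listed exceptions."""
    attempt = 0
    while True:
        try:
            return operation()
        except retry_on:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            on_retry()  # e.g. recreate the client before the next attempt
            time.sleep(delay_s)


calls: Dict[str, int] = {"attempts": 0, "reconnects": 0}


def flaky_operation() -> str:
    calls["attempts"] += 1
    if calls["attempts"] == 1:
        raise FakeClusterError("cluster topology changed")
    if calls["attempts"] == 2:
        raise OSError("connection reset")
    return "ok"


def reconnect() -> None:
    calls["reconnects"] += 1


result = run_with_retry(
    flaky_operation,
    retries=2,
    retry_on=(FakeClusterError, OSError),
    on_retry=reconnect,
)
```

Passing the exception types as a tuple keeps a single retry path while still letting callers log type-specific messages inside `on_retry` if needed.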

Copilot uses AI. Check for mistakes.
const REALTIME_STREAM_MODE_FALLBACK = 'realtime';
const PANEL_MARGIN = 16;
// Avoid noisy logging in hot-path streaming handlers unless explicitly enabled
const ENABLE_VERBOSE_STREAM_LOGS = false;
Copilot AI Dec 18, 2025

The ENABLE_VERBOSE_STREAM_LOGS constant should be configurable via environment variables rather than hardcoded. This would allow enabling verbose logs in production for debugging without requiring code changes.

Suggested change
- const ENABLE_VERBOSE_STREAM_LOGS = false;
+ const ENABLE_VERBOSE_STREAM_LOGS =
+   typeof process !== 'undefined' &&
+   process.env &&
+   process.env.REACT_APP_ENABLE_VERBOSE_STREAM_LOGS === 'true';


def _log_outcome(t: asyncio.Task) -> None:
def _background_task(coro: Awaitable[Any], *, label: str) -> asyncio.Task:
"""Create a tracked background task that will be cleaned up on handler stop."""
Copilot AI Dec 18, 2025

The _background_task function now returns an asyncio.Task but this return value appears unused in the codebase. If callers don't need to track tasks individually (since cleanup is centralized), consider documenting why the return value exists or making it explicit that callers can ignore it.
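For context, the centralized tracking that makes the return value optional can be sketched as below. `TaskTracker` and its method names are hypothetical; the handler's real `_pending_background_tasks` bookkeeping is not shown in this conversation.

```python
import asyncio
from typing import Any, Coroutine, Set, Tuple


class TaskTracker:
    """Illustrative tracker: every spawned task is registered and cancellable in bulk."""

    def __init__(self) -> None:
        self._pending: Set["asyncio.Task[Any]"] = set()

    @property
    def pending_count(self) -> int:
        return len(self._pending)

    def spawn(self, coro: Coroutine[Any, Any, Any], *, label: str) -> "asyncio.Task[Any]":
        task = asyncio.create_task(coro, name=label)
        self._pending.add(task)
        # Finished tasks remove themselves, so the set cannot grow unbounded.
        task.add_done_callback(self._pending.discard)
        return task  # callers may ignore this; cleanup does not depend on it

    async def cancel_all(self) -> None:
        tasks = list(self._pending)
        for task in tasks:
            task.cancel()
        # Wait for cancellations to settle; swallow CancelledError results.
        await asyncio.gather(*tasks, return_exceptions=True)


async def run_tracker_demo() -> Tuple[int, int]:
    tracker = TaskTracker()
    tracker.spawn(asyncio.sleep(60), label="long-running")  # return value ignored
    quick = tracker.spawn(asyncio.sleep(0), label="quick")
    await quick
    await asyncio.sleep(0)  # give done-callbacks a chance to run
    before = tracker.pending_count
    await tracker.cancel_all()
    return before, tracker.pending_count
```

Returning the task costs nothing and allows a caller to attach its own callback, which matches the optional-observability rationale in the proposed docstring.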

Suggested change
-     """Create a tracked background task that will be cleaned up on handler stop."""
+     """
+     Create a tracked background task that will be cleaned up on handler stop.
+     The returned :class:`asyncio.Task` is registered in ``_pending_background_tasks``
+     and will be cancelled by ``_cancel_all_background_tasks`` when the handler
+     shuts down. Callers are not required to keep or use the returned task; it is
+     exposed only for optional observability (e.g., attaching custom callbacks or
+     diagnostics) and can be safely ignored in normal usage.
+     """


### The Processing Loop Deadlock

**Location:** [handler.py#L615-L766](../apps/artagent/backend/voice/speech_cascade/handler.py#L615)
Copilot AI Dec 18, 2025

The document contains hardcoded line number references to source files which will become stale as the code evolves. Consider using more stable references like function names or section headers, or add a note that line numbers are approximate and may change.

Suggested change
- **Location:** [handler.py#L615-L766](../apps/artagent/backend/voice/speech_cascade/handler.py#L615)
+ **Location:** [handler.py (SpeechCascadeHandler._processing_loop)](../apps/artagent/backend/voice/speech_cascade/handler.py)

},
];

// Legacy: combined options for backward compatibility
Copilot AI Dec 18, 2025

The comment indicates this is for 'backward compatibility' but the code redefines MODEL_OPTIONS to point to CASCADE_MODEL_OPTIONS after both model option arrays are defined. If this is truly legacy, consider deprecating it or making the backward compatibility reason more explicit (e.g., which external code depends on MODEL_OPTIONS).

Suggested change
- // Legacy: combined options for backward compatibility
+ /**
+  * Legacy alias kept for backward compatibility.
+  *
+  * Some existing code paths and/or persisted configurations still reference
+  * `MODEL_OPTIONS`. New code should prefer `CASCADE_MODEL_OPTIONS` (for text/chat)
+  * or `VOICELIVE_MODEL_OPTIONS` (for realtime voice).
+  *
+  * @deprecated Use `CASCADE_MODEL_OPTIONS` or `VOICELIVE_MODEL_OPTIONS` directly.
+  */

Jin Lee (HLS US SE) added 3 commits December 18, 2025 17:24
- Consolidated TTS playback logic into a unified class for speech cascade.
- Removed deprecated VoiceSessionContext and related compatibility shims.
- Enhanced error handling during tool initialization and event handler registration.
- Updated model configuration handling in UnifiedAgent to prioritize mode-specific settings.
- Improved logging for TTS synthesis and streaming processes.
- Added new handoff tool registration for dynamic routing.

@pablosalvador10 pablosalvador10 left a comment

LGTM

@pablosalvador10 pablosalvador10 merged commit c257768 into Azure-Samples:staging Dec 19, 2025
4 of 8 checks passed
JinLee794 pushed a commit to AIappsGBBFactory/art-voice-agent-accelerator that referenced this pull request Dec 19, 2025
…-local-quickstart-80

docs: add local development guide (closes Azure-Samples#80)
JinLee794 pushed a commit to AIappsGBBFactory/art-voice-agent-accelerator that referenced this pull request Dec 22, 2025