
feat(channels): add SIP voice channel with pyVoIP/LiveKit dual-mode, streaming STT/TTS, and multi-turn conversation support#3449

Open
shaohuaxi wants to merge 1 commit into agentscope-ai:main from shaohuaxi:dev_sip

Conversation

@shaohuaxi

@shaohuaxi shaohuaxi commented Apr 15, 2026

Description

Roadmap: QwenPaw Roadmap / Task List (#2291) — Task #15 "Support SIP protocol registration and voice call channel integration"
Related Issue: #3448

Add a SIP voice channel to QwenPaw, enabling real-time voice conversations via standard SIP phones and softphones (e.g., Linphone, MicroSIP, IP desk phones).

SIPChannel implements BaseChannel with a pluggable SipBackend protocol, supporting two modes:

BaseChannel
├── ConsoleChannel, DingTalkChannel, VoiceChannel (Twilio), ...
└── SIPChannel (this PR)
    ├── PyVoIPBackend (sip_mode="dev"): pure-Python, zero infra
    └── LiveKitBackend (sip_mode="livekit"): production-grade

SIP Phone → [SIP/RTP] → Backend (pyVoIP or LiveKit SIP)

SIPChannel (BaseChannel)
├─ STT (DashScope Paraformer streaming)
├─ Agent (_process)
└─ TTS (DashScope Sambert) → audio playback

Key capabilities:

  • Dual-mode backend: Dev mode (pyVoIP, pip install and go) for local testing; Production mode (LiveKit SIP Server) for
    high-concurrency deployment
  • Streaming STT/TTS — DashScope Paraformer real-time ASR with on_sentence_end callback; DashScope Sambert TTS with mode-aware PCM
    conversion
  • Multi-turn conversation — Non-blocking transcript dispatch ensures agent processing doesn't stall subsequent STT results
  • Welcome greeting — Configurable TTS greeting played on call answer
  • Session management — Per-call lifecycle tracking with sip:{call_id} session IDs
  • Fake STT/TTS — Built-in test engines for CI-friendly testing without cloud APIs
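
The non-blocking transcript dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes the STT callback hands finished sentences to an asyncio.Queue and a separate worker task calls the agent, so enqueueing returns immediately even while a reply is still being generated. The names fake_agent, transcript_worker, and demo are hypothetical.

```python
import asyncio
import time


async def fake_agent(text: str) -> str:
    """Hypothetical stand-in for the agent call; sleeps to simulate LLM latency."""
    await asyncio.sleep(0.05)
    return f"reply:{text}"


async def transcript_worker(queue: asyncio.Queue, replies: list) -> None:
    """Process transcripts one at a time, in arrival order, until a None sentinel."""
    while True:
        text = await queue.get()
        if text is None:
            break
        replies.append(await fake_agent(text))


async def demo(sentences):
    queue: asyncio.Queue = asyncio.Queue()
    replies: list = []
    worker = asyncio.create_task(transcript_worker(queue, replies))

    start = time.monotonic()
    for sentence in sentences:
        # put_nowait returns immediately, so the STT side is never stalled
        # by a slow agent turn.
        queue.put_nowait(sentence)
    enqueue_seconds = time.monotonic() - start

    queue.put_nowait(None)  # sentinel: no more transcripts for this call
    await worker
    return replies, enqueue_seconds


if __name__ == "__main__":
    print(asyncio.run(demo(["hello", "what time is it"])))
```

A single worker keeps replies in the order the sentences were recognized, which matters when each reply is played back as TTS on the same call.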

Type of Change

  • ☐ Bug fix
  • ☑ New feature
  • ☐ Breaking change
  • ☐ Documentation
  • ☐ Refactoring

Component(s) Affected

  • ☐ Core / Backend (app, agents, config, providers, utils, local_models)
  • ☐ Console (frontend web UI)
  • ☑ Channels (DingTalk, Feishu, QQ, Discord, iMessage, etc.)
  • ☐ Skills
  • ☐ CLI
  • ☑ Documentation (website)
  • ☐ Tests
  • ☐ CI/CD
  • ☐ Scripts / Deploy

Changes Overview

New files (SIP-specific, self-contained)

| File | Lines | Description |
| --- | --- | --- |
| app/channels/sip/__init__.py | +486 | SIPChannel implementing BaseChannel: lifecycle, send, audio reader, transcript processing, PCM conversion |
| app/channels/sip/backend.py | +56 | SipBackend Protocol definition with IncomingCallCallback / CallEndedCallback types |
| app/channels/sip/pyvoip_backend.py | +193 | Dev mode: pure-Python SIP UA via pyVoIP, threaded audio bridged to asyncio |
| app/channels/sip/livekit_backend.py | +289 | Production mode: LiveKit room polling, participant_connected event handling, audio track bridging |
| app/channels/sip/stt_engine.py | +105 | STTStreamEngine Protocol + DashScope Paraformer real-time ASR implementation |
| app/channels/sip/stt_tts.py | +64 | STT/TTS factory functions (aliyun / fake providers) |
| app/channels/sip/fake_stt_tts.py | +78 | Scripted fake STT + sine-wave fake TTS for testing |
| app/channels/sip/session.py | +67 | SIPCallSession dataclass + SIPCallSessionManager |
| app/channels/sip/sip_client.py | +33 | Outbound call management with CallFailedError and hangup cause codes |
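
As an illustration of "threaded audio bridged to asyncio", here is a minimal sketch of one common way to do it. It does not use pyVoIP itself: the reader thread below is a hypothetical stand-in for a blocking SIP library's audio loop, and the function names are assumptions. The idea is that the thread never touches the queue directly and instead schedules each put on the event loop with loop.call_soon_threadsafe.

```python
import asyncio
import threading


def audio_reader_thread(loop, queue, frames):
    """Runs in a plain thread, as a blocking SIP stack's audio loop would.

    asyncio.Queue is not thread-safe, so every put is scheduled on the
    event loop rather than called from this thread.
    """
    for frame in frames:
        loop.call_soon_threadsafe(queue.put_nowait, frame)
    loop.call_soon_threadsafe(queue.put_nowait, None)  # end-of-call sentinel


async def collect_audio(frames):
    """Start the reader thread and consume its frames from async code."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    thread = threading.Thread(
        target=audio_reader_thread, args=(loop, queue, frames), daemon=True
    )
    thread.start()

    received = []
    while (frame := await queue.get()) is not None:
        received.append(frame)  # a real channel would forward this to STT
    thread.join()
    return received


if __name__ == "__main__":
    demo_frames = [b"\x00" * 160, b"\x01" * 160]
    print(len(asyncio.run(collect_audio(demo_frames))))
```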

Minimal modifications to upstream files

| File | Lines changed | Description |
| --- | --- | --- |
| app/channels/registry.py | +1 | Add "sip": (".sip", "SIPChannel") to _BUILTIN_SPECS |
| config/config.py | +41 | Add SIPChannelConfig(BaseChannelConfig) with dual-mode fields, register in ChannelConfig and ChannelConfigUnion |
| pyproject.toml | +9 | Add sip (pyVoIP, dashscope-realtime, dashscope) and sip-livekit (livekit, livekit-api) optional dependency groups |

Upstream files NOT modified

  • base.py (BaseChannel) — zero changes
  • manager.py (ChannelManager) — zero changes
  • voice/channel.py (VoiceChannel) — zero changes
  • runner.py — zero changes
  • react_agent.py — zero changes

Architecture Design: Why Dual-Track?

Pure Python is a poor fit for production SIP/RTP: under the GIL, jitter buffering and codec resampling compete with application code for the interpreter, which makes it hard to keep real-time audio on schedule across concurrent calls. Industry leaders (OpenAI
Realtime API, LiveKit, Vapi, Retell) all adopt a "media gateway terminates SIP/RTP → clean audio stream → Python AI node" pattern.

This PR follows the same proven architecture while preserving QwenPaw's "zero-config, easy to start" philosophy:

  • Track 1 (Dev): pyVoIP — pip install, dial from softphone, AI conversation works. No external infra.
  • Track 2 (Production): LiveKit SIP Server — a single Go binary handles gigabit RTP traffic; Python focuses on LLM and business
    logic.

Both tracks implement the same SipBackend protocol. Switching requires only changing sip_mode in config.
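
To make the "same protocol, switch by config" idea concrete, here is a minimal sketch. Only the names SipBackend, IncomingCallCallback, CallEndedCallback, and the sip_mode values "dev" and "livekit" come from this PR; the method names, callback signatures, stub classes, and make_backend factory are assumptions for illustration, not the actual definitions in backend.py.

```python
from typing import Awaitable, Callable, Protocol, runtime_checkable

# Assumed signatures: each callback receives a call ID string.
IncomingCallCallback = Callable[[str], Awaitable[None]]
CallEndedCallback = Callable[[str], Awaitable[None]]


@runtime_checkable
class SipBackend(Protocol):
    """Structural interface every backend is expected to satisfy (assumed shape)."""

    async def start(
        self, on_call: IncomingCallCallback, on_end: CallEndedCallback
    ) -> None: ...

    async def stop(self) -> None: ...


class DevBackendStub:
    """Hypothetical placeholder for the pyVoIP-based dev backend."""

    name = "dev"

    async def start(self, on_call, on_end) -> None:
        pass

    async def stop(self) -> None:
        pass


class LiveKitBackendStub:
    """Hypothetical placeholder for the LiveKit-based production backend."""

    name = "livekit"

    async def start(self, on_call, on_end) -> None:
        pass

    async def stop(self) -> None:
        pass


_BACKENDS = {"dev": DevBackendStub, "livekit": LiveKitBackendStub}


def make_backend(sip_mode: str) -> SipBackend:
    """Pick a backend class from the configured sip_mode value."""
    try:
        return _BACKENDS[sip_mode]()
    except KeyError:
        raise ValueError(f"unknown sip_mode: {sip_mode!r}") from None
```

Because the protocol is structural, a future backend only needs to provide the same methods and a new entry in the mapping; the channel code that calls make_backend would not change.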

Checklist

  • ☑ I ran pre-commit
  • ☑ If pre-commit auto-fixed files, I committed those changes and reran checks
  • ☑ I ran tests locally
  • ☐ Documentation updated (will add after discussion)
  • ☑ Ready for review

Testing

Manual E2E testing performed:

  • Dev mode: Linphone → mini SIP proxy → QwenPaw pyVoIP. Verified: call establishment, welcome greeting playback, STT recognition,
    agent reply, TTS playback, multi-turn conversation, call teardown.
  • Production mode: Linphone → mini SIP proxy → LiveKit SIP Server → LiveKit Room → QwenPaw LiveKitBackend. Full chain verified.
  • Verified fake STT/TTS providers work for automated testing without cloud APIs.
  • Verified concurrent call independence (dev mode, single-concurrency by design).
  • Verified graceful handling when STT/TTS providers are unavailable.
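
For readers wondering what a cloud-free STT/TTS pair might look like in a test, here is a minimal sketch in the spirit of the "scripted fake STT + sine-wave fake TTS" listed in this PR. It is not the code in fake_stt_tts.py: the 8 kHz sample rate, 16-bit little-endian mono format, and the names fake_tts and ScriptedSTT are assumptions.

```python
import math
import struct

SAMPLE_RATE = 8000  # assumption: narrowband telephony rate


def fake_tts(text: str, samples_per_char: int = 80, freq_hz: float = 440.0) -> bytes:
    """Return a sine tone as 16-bit little-endian mono PCM.

    The tone length scales with the text length, so tests can check
    duration without calling any speech service.
    """
    n = len(text) * samples_per_char
    amplitude = 0.3 * 32767
    samples = [
        int(amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    ]
    return struct.pack(f"<{n}h", *samples)


class ScriptedSTT:
    """Ignore the audio content and return pre-written sentences in order."""

    def __init__(self, script):
        self._script = list(script)

    def feed(self, pcm: bytes):
        """Return the next scripted sentence for a non-empty chunk, else None."""
        if pcm and self._script:
            return self._script.pop(0)
        return None


if __name__ == "__main__":
    audio = fake_tts("hello")
    stt = ScriptedSTT(["what is the weather"])
    print(len(audio), stt.feed(audio))
```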

Additional Notes

  • Default config has sip.enabled = false — no change for existing users
  • SIP dependencies are optional (pip install qwenpaw[sip] / pip install qwenpaw[sip-livekit])
  • All SIP-specific code is isolated in app/channels/sip/, minimizing upstream merge conflicts
  • The SipBackend protocol is extensible — future backends (Twilio SIP Trunking, Agora, TRTC) can be added without modifying SIPChannel

@github-project-automation github-project-automation bot moved this to Todo in QwenPaw Apr 15, 2026
@github-actions github-actions bot added the first-time-contributor PR created by a first time contributor label Apr 15, 2026
@github-actions

Welcome to QwenPaw! 🐾

Hi @shaohuaxi, thank you for your first Pull Request! 🎉

🙌 Join Developer Community

Thanks so much for your contribution! We'd love to invite you to join the official QwenPaw developer group! You can find the Discord and DingTalk group links under the "Developer Community" section on our docs page:
https://qwenpaw.agentscope.io/docs/community

We truly appreciate your enthusiasm—and look forward to your future contributions! 😊

We'll review your PR soon.

@xieyxclack xieyxclack added the OpenTask Refer to https://github.com/agentscope-ai/CoPaw/issues/2291 label Apr 16, 2026
@shaohuaxi shaohuaxi requested a deployment to maintainer-approved April 16, 2026 02:02 — with GitHub Actions Waiting
@xieyxclack
Member

Please format the code via pre-commit run --all-files

@shaohuaxi shaohuaxi force-pushed the dev_sip branch 2 times, most recently from 333a566 to 2f26ac5 April 16, 2026 06:02
…streaming STT/TTS, and multi-turn conversation support
@shaohuaxi
Author

shaohuaxi commented Apr 16, 2026

Please format the code via pre-commit run --all-files

@xieyxclack I've run pre-commit run --all-files locally with Python 3.10 (same as CI) and all hooks pass cleanly. Could you please approve the workflow run so CI can verify?


@shaohuaxi shaohuaxi temporarily deployed to maintainer-approved April 16, 2026 06:23 — with GitHub Actions Inactive
@xieyxclack
Member

@shaohuaxi Thank you for your contribution! It seems that the SIP channel cannot be configured through the console (or am I misunderstanding something?)

Maybe you can refer to this PR (adding the Telegram channel): #147


Labels

first-time-contributor PR created by a first time contributor OpenTask Refer to https://github.com/agentscope-ai/CoPaw/issues/2291 Under Review

Projects

Status: Todo


2 participants