perf: QwenPaw's application startup #3386

Open
rayrayraykk wants to merge 3 commits into agentscope-ai:main from rayrayraykk:weirui/dev/lazy0414

@rayrayraykk
Member

Description

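This PR improves QwenPaw's application startup performance through lazy loading and parallel initialization. The time until the server is ready to accept requests drops from ~4.5 seconds to ~0.05 seconds, while the remaining initialization (about 2.8 seconds in the run shown under Performance Results) completes in the background. Full functionality is maintained.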

Key Improvements

  1. Two-Phase Startup Architecture

    • Phase 1 (Fast): Essential setup completes in <100ms, allowing HTTP server to start accepting requests immediately
    • Phase 2 (Background): Heavy initialization (agents, plugins, services) runs asynchronously without blocking
  2. True Parallel Agent Initialization

    • Refactored MultiAgentManager.get_agent() with fine-grained locking
    • Introduced _pending_starts coordination with asyncio.Event to prevent duplicate initialization
    • Multiple agents now start truly in parallel instead of sequentially
  3. Non-Blocking Service Initialization

    • Wrapped synchronous service constructors with asyncio.to_thread() to prevent event loop blocking
    • Services like ReMeLightMemoryManager and MCP clients no longer block the main event loop during initialization
  4. Cleaner Startup Logs

    • Reduced log noise by moving repetitive initialization logs from INFO to DEBUG level
    • Key milestones remain visible: server ready time, agent startup summary, background completion
    • Added final ✨ QwenPaw ready! message with server URL after all background tasks complete
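
A minimal sketch of the two-phase idea, using plain asyncio rather than the PR's actual lifespan code in _app.py. The names `heavy_init`, `lifespan`, and the `state` dict are hypothetical and exist only for illustration: Phase 1 does cheap setup and schedules Phase 2 as a background task, so control returns to the server almost immediately.

```python
import asyncio
import contextlib
import time


async def heavy_init(state: dict) -> None:
    """Stand-in for Phase 2 work (agents, plugins, services)."""
    await asyncio.sleep(0.2)  # simulated slow initialization
    state["agents_ready"] = True


@contextlib.asynccontextmanager
async def lifespan(state: dict):
    t0 = time.perf_counter()
    # Phase 1: only cheap, essential setup happens before yielding.
    state["agents_ready"] = False
    # Phase 2: schedule the heavy work without awaiting it.
    task = asyncio.create_task(heavy_init(state))
    state["ready_in"] = time.perf_counter() - t0
    try:
        yield  # a real server would start accepting requests here
    finally:
        # On shutdown, stop background work if it is still running.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task


async def main() -> dict:
    state: dict = {}
    async with lifespan(state):
        state["ready_at_serve_time"] = state["agents_ready"]
        await asyncio.sleep(0.3)  # pretend to serve requests for a while
    return state


if __name__ == "__main__":
    result = asyncio.run(main())
    print(result["ready_at_serve_time"], result["agents_ready"])
```

In this sketch the server is "ready" before the agents are, which is the trade-off the Known Limitations section describes: a request that needs an agent may still have to wait for it.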

Performance Results

Before:

INFO:     Uvicorn running on http://127.0.0.1:8088 (Press CTRL+C to quit)
... (4.446 seconds of blocking initialization logs)

After:

INFO: Server ready in 0.043s (agents loading in background)
INFO: Uvicorn running on http://127.0.0.1:8088 (Press CTRL+C to quit)
... (background initialization logs)
INFO: Agent startup complete: 4/4 agents started successfully, 5 disabled
INFO: Background startup completed in 2.772 seconds
INFO: ✨ QwenPaw ready! → http://127.0.0.1:8088
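
The "4/4 agents started successfully" line suggests that per-agent results are collected after a concurrent start. One way such a summary could be produced is sketched below; `start_agent` and `start_all` are hypothetical helpers, not functions from this PR, and the delays are simulated.

```python
import asyncio
import time


async def start_agent(agent_id: str, delay: float, fail: bool = False) -> str:
    """Hypothetical agent startup; `delay` simulates initialization work."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{agent_id} failed to start")
    return agent_id


async def start_all(specs: list[tuple[str, float, bool]]) -> tuple[int, int, float]:
    """Start all agents concurrently and return (succeeded, total, seconds)."""
    t0 = time.perf_counter()
    # return_exceptions=True keeps one failing agent from aborting the rest.
    results = await asyncio.gather(
        *(start_agent(*spec) for spec in specs),
        return_exceptions=True,
    )
    ok = sum(1 for r in results if not isinstance(r, BaseException))
    return ok, len(results), time.perf_counter() - t0


if __name__ == "__main__":
    specs = [("a", 0.1, False), ("b", 0.1, False), ("c", 0.1, True)]
    ok, total, elapsed = asyncio.run(start_all(specs))
    print(f"Agent startup complete: {ok}/{total} agents started ({elapsed:.3f}s)")
```

Because the starts overlap, total time is close to the slowest agent rather than the sum of all agents.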

Related Issue: N/A (Performance optimization)

Security Considerations: None. Changes only affect initialization order and concurrency control, not security boundaries.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Refactoring

Component(s) Affected

  • Core / Backend (app, agents, config, providers, utils, local_models)
  • Console (frontend web UI)
  • Channels (DingTalk, Feishu, QQ, Discord, iMessage, etc.)
  • Skills
  • CLI
  • Documentation (website)
  • Tests
  • CI/CD
  • Scripts / Deploy

Checklist

  • I ran pre-commit run --all-files locally and it passes
  • If pre-commit auto-fixed files, I committed those changes and reran checks
  • I ran tests locally (pytest or as relevant) and they pass
  • Documentation updated (if needed)
  • Ready for review

Testing

Manual Testing Steps

  1. Startup Performance Verification

    qwenpaw app --log-level info
    • Verify server shows "ready" in <100ms
    • Verify Uvicorn line appears early
    • Verify "Background startup completed" appears ~2-3s later
    • Verify final "QwenPaw ready!" message appears with correct URL
  2. Functional Verification

    # Start server
    qwenpaw app
    
    # In another terminal, wait for "Background startup completed", then:
    curl http://127.0.0.1:8088/api/agents
    curl http://127.0.0.1:8088/api/health
    • Verify all endpoints respond correctly
    • Verify agents are properly initialized
  3. Parallel Initialization Verification

    qwenpaw app --log-level debug
    • Check DEBUG logs show multiple agents starting simultaneously
    • Verify workspace creation timestamps overlap (parallelism)
    • Verify no deadlocks or race conditions
  4. UI Responsiveness

    • Open http://127.0.0.1:8088 immediately after server starts
    • Verify UI loads (may show loading state for agents)
    • Verify UI becomes fully functional after background startup completes

Modified Files

Core Changes:

  • src/qwenpaw/app/_app.py: Two-phase lifespan with background initialization
  • src/qwenpaw/app/multi_agent_manager.py: Fine-grained locking and parallel agent startup
  • src/qwenpaw/app/workspace/service_manager.py: asyncio.to_thread() for sync constructors

Log Level Adjustments (INFO → DEBUG):

  • src/qwenpaw/app/channels/command_registry.py
  • src/qwenpaw/app/channels/unified_queue_manager.py
  • src/qwenpaw/app/channels/console/channel.py
  • src/qwenpaw/app/channels/manager.py
  • src/qwenpaw/app/runner/manager.py
  • src/qwenpaw/app/runner/control_commands/__init__.py
  • src/qwenpaw/app/agent_config_watcher.py
  • src/qwenpaw/app/mcp/stateful_client.py
  • src/qwenpaw/app/workspace/service_factories.py
  • src/qwenpaw/app/workspace/workspace.py
  • src/qwenpaw/agents/memory/reme_light_memory_manager.py
  • src/qwenpaw/providers/provider_manager.py

Local Verification Evidence

# Pre-commit checks
pre-commit run --all-files
# (pending - to be run before merge)

# Unit tests
pytest tests/unit/
# (pending - existing tests should pass, no behavioral changes)

# Performance measurement
time qwenpaw app --log-level info
# Expected: "Server ready" message appears in <100ms
# Expected: "Background startup completed" in 2-3s

Additional Notes

Technical Details

Concurrency Strategy:

  • Lock is held only during dictionary access in MultiAgentManager.get_agent()
  • Actual Workspace.start() runs outside the lock, allowing parallel execution
  • asyncio.Event coordination prevents duplicate initialization attempts
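
The three points above can be sketched roughly as follows. This is an illustrative class, not the PR's MultiAgentManager: `AgentManager`, `_start`, and `start_calls` are assumptions made for the example, and the string return value stands in for a workspace instance.

```python
import asyncio


class AgentManager:
    """Illustrative only; not the PR's MultiAgentManager."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._agents: dict[str, str] = {}
        self._pending_starts: dict[str, asyncio.Event] = {}
        self.start_calls = 0  # counts real initializations, for demonstration

    async def _start(self, agent_id: str) -> str:
        self.start_calls += 1
        await asyncio.sleep(0.05)  # simulated Workspace.start()
        return f"workspace:{agent_id}"

    async def get_agent(self, agent_id: str) -> str:
        while True:
            async with self._lock:  # held only for dictionary access
                if agent_id in self._agents:
                    return self._agents[agent_id]
                event = self._pending_starts.get(agent_id)
                if event is None:
                    event = asyncio.Event()
                    self._pending_starts[agent_id] = event
                    break  # this caller becomes the starter
            # Another caller is starting this agent: wait, then re-check.
            await event.wait()
        try:
            instance = await self._start(agent_id)  # runs outside the lock
            async with self._lock:
                self._agents[agent_id] = instance
            return instance
        finally:
            # Always clear the pending entry and wake waiters.
            async with self._lock:
                self._pending_starts.pop(agent_id, None)
            event.set()
```

Concurrent calls for the same agent share one initialization, while calls for different agents proceed independently because the lock is released before the slow start begins.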

Event Loop Protection:

  • All synchronous blocking operations (service constructors, start methods) are wrapped with asyncio.to_thread()
  • This allows the event loop to remain responsive for HTTP requests during background initialization
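
A small demonstration of the effect, assuming Python 3.9+ for asyncio.to_thread(). `SlowService` and `heartbeat` are hypothetical stand-ins: the heartbeat task keeps running while the blocking constructor executes in a worker thread.

```python
import asyncio
import time


class SlowService:
    """Hypothetical service with a blocking constructor."""

    def __init__(self, name: str) -> None:
        time.sleep(0.2)  # simulated blocking I/O or model loading
        self.name = name


async def heartbeat(ticks: list[float], stop: asyncio.Event) -> None:
    # Records a timestamp every 10 ms for as long as the loop is free to run.
    while not stop.is_set():
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)


async def main() -> tuple[SlowService, int]:
    ticks: list[float] = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    # The constructor runs in a worker thread; the loop keeps ticking.
    service = await asyncio.to_thread(SlowService, "memory")
    stop.set()
    await beat
    return service, len(ticks)


if __name__ == "__main__":
    svc, n = asyncio.run(main())
    print(svc.name, "constructed while the loop ticked", n, "times")
```

Calling `SlowService("memory")` directly inside `main()` instead would freeze the heartbeat for the full 0.2 seconds.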

Backward Compatibility:

  • No API changes
  • No configuration changes
  • No breaking changes to plugins or channels
  • Only internal initialization flow modified

Future Enhancements

Potential follow-ups (not included in this PR):

  1. True on-demand agent initialization (lazy load only when first accessed via API)
  2. Service dependency graph for optimal parallel initialization order
  3. Startup progress reporting via SSE endpoint for UI loading indicators
  4. Configurable startup behavior (eager vs lazy) per agent

Known Limitations

  • First API request requiring an agent will still wait for that agent's initialization to complete (by design - lazy loading waits on first access)
  • Background initialization logs can still be verbose in DEBUG mode (intentional for troubleshooting)
  • Plugin startup hooks remain sequential (parallel execution may break assumptions)

Copilot AI review requested due to automatic review settings April 14, 2026 13:00
@github-project-automation github-project-automation bot moved this to Todo in QwenPaw Apr 14, 2026
@rayrayraykk rayrayraykk requested a deployment to maintainer-approved April 14, 2026 13:00 — with GitHub Actions Waiting
@github-actions

Welcome to QwenPaw! 🐾

Hi @rayrayraykk, this is your 114th Pull Request.

🙌 Join Developer Community

Thanks so much for your contribution! We'd love to invite you to join the official QwenPaw developer group! You can find the Discord and DingTalk group links under the "Developer Community" section on our docs page:
https://qwenpaw.agentscope.io/docs/community

We truly appreciate your enthusiasm—and look forward to your future contributions! 😊

We'll review your PR soon.

@rayrayraykk rayrayraykk linked an issue Apr 14, 2026 that may be closed by this pull request
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors QwenPaw’s backend startup sequence to reduce time-to-serve by splitting initialization into a fast “server-ready” phase and a background “heavy init” phase, and by enabling more parallel startup work (notably agent workspace initialization).

Changes:

  • Refactors FastAPI lifespan into a two-phase startup: minimal synchronous setup first, heavy initialization in a background task.
  • Refactors MultiAgentManager.get_agent() to deduplicate concurrent workspace creation and reduce lock hold time to enable parallel agent startup.
  • Reduces startup log noise by demoting many INFO logs to DEBUG and adds timing/debug instrumentation for service startup.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Summary per file (file: description):

  • src/qwenpaw/app/_app.py: Introduces two-phase lifespan and background initialization task; reorganizes plugin/agent/service startup order.
  • src/qwenpaw/app/multi_agent_manager.py: Adds per-agent pending-start coordination to avoid duplicate initialization and allow parallel startup.
  • src/qwenpaw/app/workspace/service_manager.py: Adds yielding/timing logs; offloads constructors/start methods to thread pool to avoid blocking the event loop.
  • src/qwenpaw/app/workspace/workspace.py: Demotes workspace lifecycle logs from INFO to DEBUG.
  • src/qwenpaw/app/workspace/service_factories.py: Demotes ChatManager reuse/creation logs from INFO to DEBUG.
  • src/qwenpaw/app/runner/manager.py: Demotes ChatManager constructor log from INFO to DEBUG.
  • src/qwenpaw/app/runner/control_commands/__init__.py: Demotes command registration logs from INFO to DEBUG.
  • src/qwenpaw/app/mcp/stateful_client.py: Demotes MCP "connected" logs from INFO to DEBUG.
  • src/qwenpaw/app/channels/unified_queue_manager.py: Demotes queue manager initialization/cleanup loop logs from INFO to DEBUG.
  • src/qwenpaw/app/channels/manager.py: Demotes workspace injection log from INFO to DEBUG.
  • src/qwenpaw/app/channels/console/channel.py: Demotes console channel started log from INFO to DEBUG.
  • src/qwenpaw/app/channels/command_registry.py: Demotes command registry logs from INFO to DEBUG.
  • src/qwenpaw/app/agent_config_watcher.py: Demotes watcher started log from INFO to DEBUG.
  • src/qwenpaw/agents/memory/reme_light_memory_manager.py: Demotes memory manager init/config logs from INFO to DEBUG.
  • src/qwenpaw/providers/provider_manager.py: Demotes "background local model restore completed" log from INFO to DEBUG.

Comment on lines +290 to +294
# Offload synchronous constructor to thread pool to avoid blocking
# the event loop during background startup.
service = await asyncio.to_thread(
    partial(service_cls, **init_kwargs),
)
Copilot AI Apr 14, 2026

Service constructors are now always executed via asyncio.to_thread(). This will break services whose __init__ creates asyncio primitives or expects a running event loop (e.g., MCPClientManager.__init__ creates asyncio.Lock at src/qwenpaw/app/mcp/manager.py:34-38; CronManager.__init__ creates AsyncIOScheduler and asyncio.Lock at src/qwenpaw/app/crons/manager.py:52-58). In a worker thread, these typically raise RuntimeError("no running event loop") or bind to the wrong loop. Consider making thread-offloading opt-in per ServiceDescriptor (default off), or refactor affected services so event-loop-bound objects are created in an async start() that runs on the main loop (and keep __init__ lightweight).
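
The opt-in suggestion could look roughly like the sketch below. Everything here is hypothetical: `ServiceSpec`, its `offload_init` flag, `build`, and the two example services are invented for illustration and are not the project's ServiceDescriptor API.

```python
import asyncio
import threading
import time
from dataclasses import dataclass, field
from functools import partial
from typing import Any, Callable


@dataclass
class ServiceSpec:
    """Hypothetical stand-in for a service descriptor."""

    factory: Callable[..., Any]
    init_kwargs: dict[str, Any] = field(default_factory=dict)
    # Opt-in flag: by default, construction stays on the event loop thread.
    offload_init: bool = False


class LoopBoundService:
    def __init__(self) -> None:
        self.thread = threading.current_thread()
        self.lock = asyncio.Lock()  # asyncio primitive, built on the loop thread


class BlockingService:
    def __init__(self, delay: float = 0.05) -> None:
        self.thread = threading.current_thread()
        time.sleep(delay)  # simulated blocking setup


async def build(spec: ServiceSpec) -> Any:
    make = partial(spec.factory, **spec.init_kwargs)
    if spec.offload_init:
        # Only services that declare themselves thread-safe are offloaded.
        return await asyncio.to_thread(make)
    return make()
```

With this shape, a service that creates event-loop-bound objects in its constructor keeps the default, and only constructors known to be blocking and loop-independent set `offload_init=True`.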

Comment on lines +121 to +135
        self._pending_starts.pop(agent_id, None)

    elapsed = time.perf_counter() - t0
    event.set()
    logger.debug(
        f"Workspace created and started: {agent_id} "
        f"({elapsed:.3f}s)",
    )
    return instance
except Exception as e:
    async with self._lock:
        self._pending_starts.pop(agent_id, None)
    event.set()
    logger.error(f"Failed to start workspace {agent_id}: {e}")
    raise
Copilot AI Apr 14, 2026

get_agent() cleans up _pending_starts and signals waiters only in the success path and in except Exception. Since Python 3.8, asyncio.CancelledError inherits from BaseException, so a cancellation during await instance.start() will skip this handler, leaving _pending_starts populated and the Event never set; subsequent callers will then await forever. Ensure cleanup + event.set() happens in a finally (or handle CancelledError explicitly) and then re-raise the cancellation.

Suggested change
-         self._pending_starts.pop(agent_id, None)
-     elapsed = time.perf_counter() - t0
-     event.set()
-     logger.debug(
-         f"Workspace created and started: {agent_id} "
-         f"({elapsed:.3f}s)",
-     )
-     return instance
- except Exception as e:
-     async with self._lock:
-         self._pending_starts.pop(agent_id, None)
-     event.set()
-     logger.error(f"Failed to start workspace {agent_id}: {e}")
-     raise
+     elapsed = time.perf_counter() - t0
+     logger.debug(
+         f"Workspace created and started: {agent_id} "
+         f"({elapsed:.3f}s)",
+     )
+     return instance
+ except Exception as e:
+     logger.error(f"Failed to start workspace {agent_id}: {e}")
+     raise
+ finally:
+     async with self._lock:
+         self._pending_starts.pop(agent_id, None)
+     event.set()
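
The difference between the two handler shapes can be reproduced in isolation. The functions below are hypothetical stand-ins for the start path, with a long sleep simulating `instance.start()`; they show that cleanup placed in `except Exception` is skipped on cancellation, while cleanup in `finally` runs.

```python
import asyncio


async def start_with_except_only(log: list[str]) -> None:
    try:
        await asyncio.sleep(10)  # simulated instance.start()
    except Exception:
        # Not reached on cancellation: CancelledError is a BaseException.
        log.append("cleanup")
        raise


async def start_with_finally(log: list[str]) -> None:
    try:
        await asyncio.sleep(10)  # simulated instance.start()
    finally:
        # Runs on success, on error, and on cancellation.
        log.append("cleanup")


async def cancel_and_collect(coro_fn) -> list[str]:
    """Start coro_fn as a task, cancel it mid-await, and return its log."""
    log: list[str] = []
    task = asyncio.create_task(coro_fn(log))
    await asyncio.sleep(0)  # let the task reach its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log


if __name__ == "__main__":
    print(asyncio.run(cancel_and_collect(start_with_except_only)))
    print(asyncio.run(cancel_and_collect(start_with_finally)))
```

In the manager, the skipped cleanup corresponds to a pending entry that is never removed and an Event that is never set.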

Comment thread src/qwenpaw/app/multi_agent_manager.py
@xieyxclack xieyxclack self-requested a review April 15, 2026 08:41
@rayrayraykk rayrayraykk requested a deployment to maintainer-approved April 15, 2026 09:37 — with GitHub Actions Waiting

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

[Question]: Optimization suggestions

3 participants