Codex/audio cpp tts provider#2599
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
PR Summary by QodoAdd audio.cpp TTS provider with optional managed sidecar, config, and tests
AI Description
Diagram
High-Level Assessment
Files changed (21)
|
There was a problem hiding this comment.
Code Review
This pull request introduces the audio_cpp text-to-speech (TTS) provider integration, allowing tldw_server to interface with an external or managed audiocpp_server. It includes the core adapter, configuration models, an HTTP client, a managed sidecar supervisor, an installer helper script, and comprehensive unit and integration tests. The review feedback highlights opportunities to improve asynchronous performance by offloading synchronous file I/O operations—such as writing the server configuration and cleaning up reference audio files—to background threads using asyncio.to_thread to avoid blocking the main event loop.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| selected_port = self._select_port() | ||
| self._write_server_config(selected_port) |
There was a problem hiding this comment.
The synchronous method _write_server_config is called directly on the event loop, which blocks execution. Consider making this method asynchronous or offloading the file I/O to a background thread using asyncio.to_thread.
| selected_port = self._select_port() | |
| self._write_server_config(selected_port) | |
| selected_port = self._select_port() | |
| await self._write_server_config(selected_port) |
|
|
||
| def _write_server_config(self, port: int) -> None: | ||
| config = self._audio_cpp_config.render_server_config() | ||
| config["host"] = self._host | ||
| config["port"] = int(port) | ||
| self.server_config_path.parent.mkdir(parents=True, exist_ok=True) | ||
| self.server_config_path.write_text( | ||
| json.dumps(config, indent=2, sort_keys=True) + "\n", | ||
| encoding="utf-8", |
There was a problem hiding this comment.
To prevent blocking the event loop during file system writes, refactor _write_server_config to perform the synchronous directory creation and file writing inside asyncio.to_thread.
| def _write_server_config(self, port: int) -> None: | |
| config = self._audio_cpp_config.render_server_config() | |
| config["host"] = self._host | |
| config["port"] = int(port) | |
| self.server_config_path.parent.mkdir(parents=True, exist_ok=True) | |
| self.server_config_path.write_text( | |
| json.dumps(config, indent=2, sort_keys=True) + "\n", | |
| encoding="utf-8", | |
| async def _write_server_config(self, port: int) -> None: | |
| config = self._audio_cpp_config.render_server_config() | |
| config["host"] = self._host | |
| config["port"] = int(port) | |
| await asyncio.to_thread(self._write_server_config_sync, config) | |
| def _write_server_config_sync(self, config: dict[str, Any]) -> None: | |
| self.server_config_path.parent.mkdir(parents=True, exist_ok=True) | |
| self.server_config_path.write_text( | |
| json.dumps(config, indent=2, sort_keys=True) + "\n", | |
| encoding="utf-8", | |
| ) |
| def _cleanup_reference(self, staged_reference: Path | None) -> None: | ||
| if staged_reference is None or self._audio_cpp_config.retain_request_artifacts: | ||
| return | ||
| with suppress(OSError, ValueError): | ||
| staged_reference.unlink(missing_ok=True) |
There was a problem hiding this comment.
Deleting the reference audio file synchronously using staged_reference.unlink blocks the event loop. Offload this file system operation to a background thread using asyncio.to_thread or schedule it as a background task.
def _cleanup_reference(self, staged_reference: Path | None) -> None:
if staged_reference is None or self._audio_cpp_config.retain_request_artifacts:
return
asyncio.create_task(self._cleanup_reference_async(staged_reference))
async def _cleanup_reference_async(self, staged_reference: Path) -> None:
with suppress(OSError, ValueError):
await asyncio.to_thread(staged_reference.unlink, missing_ok=True)
Code Review by Qodo
Context used✅ Compliance rules (platform):
74 rules 1. AudioCppTTSAdapter.__init__ missing return type
|
| def __init__(self, config: dict[str, Any] | None = None): | ||
| super().__init__(config) | ||
| cfg = config or {} | ||
| self._audio_cpp_config = AudioCppConfig.from_provider_config(cfg, repo_root=Path.cwd()) | ||
| self.sample_rate = int(cfg.get("sample_rate") or 24000) | ||
| self.max_text_length = int(cfg.get("max_text_length") or 5000) | ||
| self._client: Any | None = cfg.get("client") or cfg.get("_client") | ||
| self._owns_client = self._client is None | ||
| self._sidecar_supervisor: Any | None = cfg.get("sidecar_supervisor") or cfg.get("_sidecar_supervisor") | ||
| self._owns_sidecar_supervisor = self._sidecar_supervisor is None | ||
| self._available_models: list[str] = [] | ||
| self._voices = self._parse_voice_catalog(cfg) |
There was a problem hiding this comment.
1. audiocppttsadapter.init missing return type 📘 Rule violation ✧ Quality
AudioCppTTSAdapter.__init__ is missing an explicit return type annotation (-> None). This violates the project requirement for full type hints and can degrade static analysis and tooling quality.
Agent Prompt
## Issue description
`AudioCppTTSAdapter.__init__` lacks an explicit return type annotation (`-> None`), violating the rule that all functions/methods must have parameter and return type hints.
## Issue Context
Constructors should still declare an explicit return type (`-> None`) per the checklist.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_adapter.py[72-83]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| def _write_server_config(self, port: int) -> None: | ||
| config = self._audio_cpp_config.render_server_config() | ||
| config["host"] = self._host | ||
| config["port"] = int(port) | ||
| self.server_config_path.parent.mkdir(parents=True, exist_ok=True) | ||
| self.server_config_path.write_text( | ||
| json.dumps(config, indent=2, sort_keys=True) + "\n", | ||
| encoding="utf-8", | ||
| ) |
There was a problem hiding this comment.
2. Blocking write_text() in ensure_started 📘 Rule violation ➹ Performance
AudioCppSidecarSupervisor.ensure_started() performs synchronous filesystem I/O via Path.mkdir() and Path.write_text() while running in an async flow. This can block the event loop and reduce concurrency under load.
Agent Prompt
## Issue description
An async code path writes to disk synchronously (`mkdir`/`write_text`), which can block the event loop.
## Issue Context
`ensure_started()` is `async` and calls `_write_server_config()`, which currently performs direct filesystem writes.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_sidecar_supervisor.py[94-137]
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_sidecar_supervisor.py[212-220]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| with contextlib.suppress(Exception): | ||
| await process.wait() | ||
| finally: | ||
| self._clear_process_state(process) | ||
|
|
||
| def _clear_process_state(self, process: asyncio.subprocess.Process | None = None) -> None: | ||
| if process is not None and self._process is not process: | ||
| return | ||
| self._process = None |
There was a problem hiding this comment.
3. suppress(exception) hides wait errors 📘 Rule violation ☼ Reliability
The sidecar shutdown path uses contextlib.suppress(Exception) around await process.wait(), which can silently hide unexpected errors. This can make failures harder to diagnose and violates the rule against swallowing exceptions without justification/logging.
Agent Prompt
## Issue description
`contextlib.suppress(Exception)` is used to ignore all exceptions during process shutdown without logging or justification.
## Issue Context
The compliance rule requires exceptions to be logged or re-raised, and any intentional suppression should be narrow and justified.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_sidecar_supervisor.py[309-317]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| def __init__( | ||
| self, | ||
| *, | ||
| base_url: str, | ||
| http_client: httpx.AsyncClient | None = None, | ||
| timeout: float = 300.0, | ||
| allow_remote_base_url: bool = False, | ||
| ) -> None: | ||
| self.base_url = validate_base_url( | ||
| base_url, | ||
| allow_remote_base_url=allow_remote_base_url, | ||
| ) | ||
| self.timeout = timeout | ||
| self._http_client = http_client or httpx.AsyncClient(timeout=timeout, trust_env=False) | ||
| self._owns_client = http_client is None |
There was a problem hiding this comment.
4. audiocppclient.init lacks docstring 📘 Rule violation ✧ Quality
AudioCppClient.__init__ has no function docstring as its first statement. This violates the requirement that all modules/classes/functions (including methods) have non-empty docstrings.
Agent Prompt
## Issue description
New methods are missing required docstrings.
## Issue Context
The compliance checklist requires docstrings for every function/method, with the docstring as the first statement in the body.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_client.py[53-67]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| @pytest.mark.integration | ||
| @pytest.mark.asyncio | ||
| async def test_factory_routes_audio_cpp_explicit_provider_and_namespaced_model(): |
There was a problem hiding this comment.
5. Extra pytest marker asyncio 📘 Rule violation ▣ Testability
New tests are decorated with more than one marker (e.g., @pytest.mark.integration plus @pytest.mark.asyncio). This violates the requirement that each test has exactly one of the approved markers.
Agent Prompt
## Issue description
Tests include multiple markers, but policy requires exactly one of: `unit`, `integration`, `external_api`, `local_llm_service`.
## Issue Context
Async tests can still be async, but marker usage must comply with the rule (e.g., rely on pytest-asyncio configuration instead of an extra marker, or refactor to keep exactly one approved marker).
## Fix Focus Areas
- tldw_Server_API/tests/TTS_NEW/integration/test_audio_cpp_tts_service.py[119-152]
- tldw_Server_API/tests/TTS_NEW/unit/adapters/test_audio_cpp_client.py[13-16]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| metadata: dict[str, Any] = field(default_factory=dict) | ||
|
|
||
|
|
||
| def _sanitize_error_text(text: str | None, *, redaction_terms: list[str] | None = None) -> str: |
There was a problem hiding this comment.
6. _sanitize_error_text line too long 📘 Rule violation ✧ Quality
The _sanitize_error_text function definition exceeds typical PEP 8 line-length limits, introducing a new formatting violation. This can cause formatter/linter failures in CI.
Agent Prompt
## Issue description
A new line exceeds the project PEP 8 line-length limit (function signature).
## Issue Context
The compliance checklist requires no new PEP 8 violations in changed code unless explicitly ignored.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_client.py[36-36]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| async def get_capabilities(self) -> TTSCapabilities: | ||
| return TTSCapabilities( | ||
| provider_name="audio.cpp", | ||
| supported_languages=_SUPPORTED_LANGUAGES, | ||
| supported_voices=list(self._voices.values()), | ||
| supported_formats=self.SUPPORTED_FORMATS, | ||
| max_text_length=self.max_text_length, | ||
| supports_streaming=True, | ||
| supports_voice_cloning=True, | ||
| supports_emotion_control=False, | ||
| supports_speech_rate=False, | ||
| supports_pitch_control=False, | ||
| supports_volume_control=False, | ||
| supports_ssml=False, | ||
| supports_phonemes=False, | ||
| supports_multi_speaker=False, |
There was a problem hiding this comment.
7. Voice cloning misadvertised 🐞 Bug ≡ Correctness
AudioCppTTSAdapter reports supports_voice_cloning=True even when voice references are rejected unless managed mode is enabled or external_voice_reference_mode is shared_path. This can cause clients to rely on capability discovery and then hit runtime TTSValidationError for voice_reference requests under the default external configuration.
Agent Prompt
## Issue description
`AudioCppTTSAdapter.get_capabilities()` always sets `supports_voice_cloning=True`, but `_stage_reference_audio()` rejects `request.voice_reference` unless `managed` is enabled or `external_voice_reference_mode == "shared_path"`. Capability discovery should reflect the actual configured behavior to avoid clients sending reference audio that will be rejected.
## Issue Context
- Default YAML config sets `extra_params.managed: false` and `external_voice_reference_mode: "disabled"`.
- In that configuration, `voice_reference` requests should be treated as unsupported.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_adapter.py[141-167]
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_adapter.py[336-352]
## Implementation notes
- Compute `supports_voice_cloning = self._audio_cpp_config.managed or self._audio_cpp_config.external_voice_reference_mode == "shared_path"`.
- Optionally override `validate_request()` to fail early with a clear message when `voice_reference` is provided but not supported by current config.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| @staticmethod | ||
| def _write_reference_audio_sync(path: Path, audio_bytes: bytes) -> None: | ||
| path.parent.mkdir(parents=True, exist_ok=True) | ||
| path.write_bytes(audio_bytes) |
There was a problem hiding this comment.
8. Voice reference file permissions 🐞 Bug ⛨ Security
Voice reference audio is staged via Path.write_bytes without explicitly setting restrictive permissions, so confidentiality depends on umask/directory permissions until cleanup. On hosts with permissive defaults or shared runtime directories, this can expose sensitive voice-reference audio to other local users/processes.
Agent Prompt
## Issue description
`_write_reference_audio_sync()` writes reference audio using `Path.write_bytes()`, which does not guarantee restrictive permissions. Voice reference audio is potentially sensitive and should be written with explicit 0600 permissions (or equivalent) in the shared scratch directory.
## Issue Context
Other adapters (e.g., OmniVoice) materialize reference audio using `tempfile.NamedTemporaryFile(...)`, which is typically created with restrictive permissions by default.
## Fix Focus Areas
- tldw_Server_API/app/core/TTS/adapters/audio_cpp_adapter.py[354-384]
## Implementation notes
- Use a secure creation method:
- Option A: `tempfile.NamedTemporaryFile(dir=..., prefix=..., suffix=..., delete=False)` in `shared_scratch_dir`, then write bytes.
- Option B: `os.open(path, os.O_WRONLY|os.O_CREAT|os.O_EXCL, 0o600)` and `os.fdopen` to write bytes.
- Keep returning a filesystem path the sidecar can read.
- Preserve existing cleanup behavior (`unlink`) when `retain_request_artifacts` is false.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Summary
Add initial support for audio.cpp