feat: add MiniMax Cloud TTS as a built-in engine #331
octo-patch wants to merge 1 commit into jamiepine:main
Add MiniMax as a cloud-based TTS engine alongside existing local engines (Qwen, LuxTTS, Chatterbox, TADA, Kokoro). MiniMax requires only an API key (MINIMAX_API_KEY); no model downloads are needed.

Backend:
- New MiniMaxTTSBackend in backend/backends/minimax_backend.py
- Calls MiniMax TTS API (POST https://api.minimax.io/v1/t2a_v2)
- Default model: speech-2.8-hd, 24kHz PCM output
- 12 preset voice IDs (English_Graceful_Lady, Deep_Voice_Man, etc.)
- Registered in TTS_ENGINES and backend factory
- Preset voice API endpoint returns MiniMax voices

Frontend:
- Added to engine selector dropdown, profile form, and type definitions
- Language support for 10 languages
- Listed as preset-only engine (no voice cloning)

Tests:
- 16 unit tests covering backend lifecycle, voice prompts, API mocking, payload verification, error handling, and engine registration
- All 38 tests pass (22 existing + 16 new)

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
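To make the backend call concrete, here is a minimal sketch of how such a request could be assembled with `urllib.request`. The endpoint, default model, and default voice ID come from this PR, and the `voice_setting.voice_id` path matches the test quoted in the review. The `audio_setting` field names, the Bearer authorization scheme, and the helper name `build_tts_request` are assumptions for illustration, not the PR's actual code.

```python
import json
import urllib.request

MINIMAX_TTS_URL = "https://api.minimax.io/v1/t2a_v2"  # endpoint named in the PR
DEFAULT_MODEL = "speech-2.8-hd"                        # default model named in the PR
DEFAULT_VOICE_ID = "English_Graceful_Lady"             # default voice named in the PR


def build_tts_request(text, api_key, voice_id=DEFAULT_VOICE_ID):
    """Build (but do not send) a POST request for the TTS endpoint.

    Hypothetical helper: the real backend may structure this differently.
    """
    payload = {
        "model": DEFAULT_MODEL,
        "text": text,
        "voice_setting": {"voice_id": voice_id},
        # Assumed field names for requesting 24 kHz PCM output.
        "audio_setting": {"sample_rate": 24000, "format": "pcm"},
    }
    return urllib.request.Request(
        MINIMAX_TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed auth scheme: API key sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

In practice the key would be read from the `MINIMAX_API_KEY` environment variable and the request passed to `urllib.request.urlopen`; building the request separately keeps it easy to inspect in tests without touching the network.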
📝 Walkthrough
This pull request adds a new MiniMax Cloud TTS engine to the application, extending frontend engine selectors, API type definitions, language support configurations, and backend infrastructure to support a cloud-based text-to-speech backend with preset voice support.
Sequence Diagram
sequenceDiagram
participant Client as Client (Browser)
participant Frontend as Frontend App
participant Server as Backend Server
participant MiniMax as MiniMax API
Client->>Frontend: Submit TTS generation request<br/>(engine: 'minimax', text, voice_prompt)
Frontend->>Frontend: Validate form<br/>(generationSchema includes minimax)
Frontend->>Frontend: Map engine to modelName<br/>('minimax' → 'minimax-cloud-tts')
Frontend->>Server: POST /generate<br/>(GenerationRequest with engine: 'minimax')
Server->>Server: Route to MiniMaxTTSBackend.generate()
Server->>Server: Extract voice_id from voice_prompt<br/>(default: 'English_Graceful_Lady')
Server->>Server: Construct JSON payload<br/>(model, text, voice_id, audio format)
Server->>MiniMax: POST /t2a_v2<br/>(JSON request with TTS parameters)
MiniMax->>MiniMax: Process TTS synthesis
MiniMax-->>Server: HTTP Response<br/>(status_code, data.audio [hex PCM])
Server->>Server: Validate response status
Server->>Server: Decode hex PCM to int16<br/>Normalize to float32 [-1.0, 1.0]
Server->>Server: Return (audio_array, 24000 Hz)
Server-->>Frontend: Audio stream / file
Frontend->>Client: Play/download audio
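The decode step in the diagram (hex PCM to int16, then normalization to floats) can be sketched as follows. This is a standard-library illustration under stated assumptions: the API is assumed to return little-endian signed 16-bit mono PCM, and `decode_hex_pcm` is a hypothetical name. The actual backend returns a float32 array, which suggests NumPy; a plain Python list is used here to keep the sketch dependency-free.

```python
import sys
from array import array

SAMPLE_RATE = 24000  # Hz, per the PR description


def decode_hex_pcm(hex_audio):
    """Convert hex-encoded 16-bit PCM into floats in [-1.0, 1.0).

    Hypothetical helper illustrating the decode step; not the PR's code.
    """
    raw = bytes.fromhex(hex_audio)   # hex string -> raw bytes
    samples = array("h")             # signed 16-bit integers
    samples.frombytes(raw)           # raises ValueError on a partial sample
    if sys.byteorder == "big":
        samples.byteswap()           # assumption: payload is little-endian
    # Dividing by 32768 maps the int16 range onto [-1.0, 1.0).
    return [s / 32768.0 for s in samples], SAMPLE_RATE
```

For example, `decode_hex_pcm("000000400080")` decodes the three samples 0, 16384, and -32768. Dividing by 32768 rather than 32767 keeps the most negative sample at exactly -1.0.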
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/src/components/VoiceProfiles/ProfileForm.tsx (1)
842-845: ⚠️ Potential issue | 🟠 Major
Reset selected voice when preset engine changes.
Line 844 updates the engine but keeps the previous selectedPresetVoiceId. With two engines now available, a stale voice ID can be submitted against the wrong engine.
💡 Proposed fix
-<Select
-  value={selectedPresetEngine}
-  onValueChange={setSelectedPresetEngine}
->
+<Select
+  value={selectedPresetEngine}
+  onValueChange={(engine) => {
+    setSelectedPresetEngine(engine);
+    setSelectedPresetVoiceId('');
+  }}
+>
Also applies to: 853-853
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/VoiceProfiles/ProfileForm.tsx` around lines 842 - 845, When the preset engine selection changes, reset the selected preset voice to avoid submitting a stale voice ID: update the onValueChange handler for the engine Select (where selectedPresetEngine and setSelectedPresetEngine are used) to also call setSelectedPresetVoiceId(null or undefined) so selectedPresetVoiceId is cleared; do the same update for the other engine Select instance (the second occurrence that currently mirrors lines ~853) to ensure both engine changes clear the voice selection.
🧹 Nitpick comments (1)
backend/tests/test_minimax_backend.py (1)
189-206: Avoid hard-coded default voice ID in the assertion.
Use DEFAULT_VOICE_ID here to prevent brittle tests if the backend default changes.
Proposed tweak
     @patch.dict(os.environ, {"MINIMAX_API_KEY": "test-key"})
     @patch("urllib.request.urlopen")
     def test_generate_uses_default_voice_id(self, mock_urlopen):
         import asyncio
+        from backend.backends.minimax_backend import DEFAULT_VOICE_ID
         mock_resp = MagicMock()
         mock_resp.read.return_value = self._make_mock_response()
         mock_resp.__enter__ = lambda s: s
         mock_resp.__exit__ = MagicMock(return_value=False)
         mock_urlopen.return_value = mock_resp
         backend = self._make_backend()
         asyncio.get_event_loop().run_until_complete(
             backend.generate("test", {})
         )
         call_args = mock_urlopen.call_args
         request = call_args[0][0]
         payload = json.loads(request.data.decode("utf-8"))
-        self.assertEqual(payload["voice_setting"]["voice_id"], "English_Graceful_Lady")
+        self.assertEqual(payload["voice_setting"]["voice_id"], DEFAULT_VOICE_ID)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_minimax_backend.py` around lines 189 - 206, The test test_generate_uses_default_voice_id currently asserts a hard-coded string "English_Graceful_Lady"; update it to import and use DEFAULT_VOICE_ID so the assertion checks payload["voice_setting"]["voice_id"] == DEFAULT_VOICE_ID. Locate the test function test_generate_uses_default_voice_id and replace the hard-coded expected value with the module-level constant DEFAULT_VOICE_ID (imported from the backend module that defines it) to keep the test resilient to default changes while still verifying backend.generate uses the default when none is provided.
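The pattern behind this nitpick can be shown in isolation. The following is a self-contained sketch, not the PR's test: `resolve_voice_id` is a hypothetical stand-in for the backend's fallback logic, and `DEFAULT_VOICE_ID` is redefined locally rather than imported from `backend.backends.minimax_backend`.

```python
import unittest

DEFAULT_VOICE_ID = "English_Graceful_Lady"  # stand-in for the backend constant


def resolve_voice_id(voice_prompt):
    """Hypothetical helper: use the prompt's voice ID or fall back to the default."""
    return voice_prompt.get("voice_id") or DEFAULT_VOICE_ID


class ResolveVoiceIdTest(unittest.TestCase):
    def test_falls_back_to_default(self):
        # Assert against the constant, not a copied string literal, so the
        # test keeps passing if the default voice is changed later.
        self.assertEqual(resolve_voice_id({}), DEFAULT_VOICE_ID)

    def test_explicit_voice_wins(self):
        prompt = {"voice_id": "Deep_Voice_Man"}
        self.assertEqual(resolve_voice_id(prompt), "Deep_Voice_Man")
```

Referencing the constant means the test checks the behavior ("the default is used") rather than the current value of the default.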
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5c0b9975-446e-49a7-8b99-6f83422bc9ff
📒 Files selected for processing (10)
app/src/components/Generation/EngineModelSelector.tsx
app/src/components/VoiceProfiles/ProfileForm.tsx
app/src/lib/api/types.ts
app/src/lib/constants/languages.ts
app/src/lib/hooks/useGenerationForm.ts
backend/backends/__init__.py
backend/backends/minimax_backend.py
backend/models.py
backend/routes/profiles.py
backend/tests/test_minimax_backend.py
Summary
Add MiniMax as a cloud-based TTS engine alongside existing local engines (Qwen, LuxTTS, Chatterbox, TADA, Kokoro).
- Requires only the MINIMAX_API_KEY environment variable
- Default model: speech-2.8-hd (high-quality, maximized timbre similarity)

Changes
Backend (4 files modified, 2 new):
- backend/backends/minimax_backend.py: New MiniMaxTTSBackend implementation
  - Calls the MiniMax TTS API (POST https://api.minimax.io/v1/t2a_v2)
  - Uses urllib.request: zero new dependencies
- backend/backends/__init__.py: Registered in TTS_ENGINES dict and factory function
- backend/models.py: Added "minimax" to engine validation pattern
- backend/routes/profiles.py: Added MiniMax voices to preset voice API endpoint

Frontend (5 files modified):

Tests (1 new file):
- backend/tests/test_minimax_backend.py: 16 unit tests covering backend lifecycle, voice prompts, API mocking, payload verification, error handling, and engine registration

Test Results
All 38 tests pass (22 existing + 16 new), no regressions.
Integration tested against real MiniMax API — generates audio with correct sample rate and duration.
Test plan
- Set the MINIMAX_API_KEY environment variable
- Run pytest backend/tests/test_minimax_backend.py: all 16 tests should pass