Problem or Motivation
Currently, there is no native support for Xiaomi MiMo TTS in OpenMAIC. When attempting to add it via the generic Custom TTS provider, the test fails with a "Not Found" error.
This happens because the custom provider assumes the standard OpenAI format and always sends requests to the /v1/audio/speech endpoint. The Xiaomi MiMo API handles text-to-speech differently: it generates audio through the /v1/chat/completions endpoint, with a dedicated audio object passed in the request payload. Because of this structural difference, MiMo's high-quality TTS models (such as mimo-v2.5-tts) cannot be used as a drop-in replacement through the existing custom provider interface.
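To make the mismatch concrete, here is a minimal sketch of the URL resolution involved. The `ProviderKind` type and the `resolveTtsUrl` function are hypothetical names used only for illustration and are not taken from the OpenMAIC codebase; only the two endpoint paths come from the description above.

```typescript
// Hypothetical provider kinds; not taken from the OpenMAIC codebase.
type ProviderKind = "openai-compatible" | "mimo";

// Returns the full URL a TTS request would be sent to.
// baseUrl is expected without the /v1 suffix, e.g. "https://api.example.com".
function resolveTtsUrl(kind: ProviderKind, baseUrl: string): string {
  // Drop trailing slashes so the joined URL never contains "//v1".
  const root = baseUrl.replace(/\/+$/, "");
  const path = kind === "mimo" ? "/v1/chat/completions" : "/v1/audio/speech";
  return root + path;
}
```

With the generic custom provider, every request takes the second branch, which would explain the "Not Found" response from a server that only serves the chat completions route.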
Proposed Solution
I propose adding native provider support for Xiaomi MiMo TTS. This would require:
Dedicated UI Provider: Adding a "Xiaomi MiMo TTS" option in the Text-to-Speech provider list.
Endpoint Routing: Routing the API requests for this specific provider to the /v1/chat/completions endpoint instead of the standard /v1/audio/speech path.
Payload Formatting: Adjusting the request payload to include the required audio object format as per Xiaomi's official API documentation.
Model Support: Adding support for MiMo-specific TTS models (e.g., mimo-v2.5-tts, mimo-v2-tts) and allowing users to configure a valid voice (such as 茉莉).
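The payload formatting step could look roughly like the sketch below. Everything in it is an assumption for illustration: `MimoTtsRequest` and `buildMimoTtsRequest` are hypothetical names, the `voice` and `format` fields inside `audio` are guesses at the shape, and sending the text as a chat message is also assumed. The actual field names and message layout need to be taken from Xiaomi's official API documentation.

```typescript
// Hypothetical request shape. The field names inside `audio` are assumptions
// and must be checked against Xiaomi's official API documentation.
interface MimoTtsRequest {
  model: string;
  messages: { role: string; content: string }[];
  audio: { voice: string; format: string };
}

// Builds the JSON body for a POST to /v1/chat/completions.
function buildMimoTtsRequest(
  text: string,
  model: string,
  voice: string,
  format: string = "wav", // assumed default; not confirmed by the source
): MimoTtsRequest {
  if (text.trim() === "") {
    throw new Error("text must not be empty");
  }
  return {
    model,
    // Assumption: the text to synthesize travels as a chat message.
    messages: [{ role: "user", content: text }],
    audio: { voice, format },
  };
}
```

For example, `buildMimoTtsRequest("Hello", "mimo-v2.5-tts", "茉莉")` yields a body whose `audio.voice` is `茉莉`, which the provider would then serialize with `JSON.stringify` before sending.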
Alternatives Considered
No response
Area
TTS / Voice
Additional Context
No response