Add voice input and text-to-speech for chat interactions #12

@hoangsonww

Description:
Enable users to interact with Lumina by voice: speech-to-text for chat input, and text-to-speech to read AI responses aloud. This will improve accessibility, support hands-free usage, and create a more natural conversational experience.

Acceptance Criteria:

  1. Integrate the Web Speech API (or a suitable polyfill) for speech-to-text user input in the chat UI.
  2. Add a “microphone” button in the chat input area to start/stop recording, with a clear visual recording indicator.
  3. Convert captured audio to text and populate the chat input before sending as a message.
  4. Use the Web Speech API's SpeechSynthesis interface for client-side TTS to read AI responses aloud automatically.
  5. Provide playback controls (play/pause, volume) for spoken responses.
  6. Add settings toggles so users can enable/disable voice input and/or TTS separately.
  7. Ensure graceful fallback to text-only mode on browsers that don’t support the Speech APIs.
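Criteria 6 and 7 together suggest a small decision step: a voice feature is active only when the browser supports it and the user has left its toggle on. The following is a minimal sketch of that check, not an agreed design. The names `detectVoiceSupport`, `resolveVoiceFeatures`, `voiceInputEnabled`, and `ttsEnabled` are hypothetical and do not come from the Lumina codebase. The only browser names it relies on are `SpeechRecognition`, the prefixed `webkitSpeechRecognition`, `speechSynthesis`, and `SpeechSynthesisUtterance`.

```typescript
// What the current browser can do, independent of user preferences.
interface VoiceSupport {
  speechToText: boolean;
  textToSpeech: boolean;
}

// Hypothetical shape of the two settings toggles from criterion 6.
interface VoiceSettings {
  voiceInputEnabled: boolean;
  ttsEnabled: boolean;
}

// The features the chat UI should actually render.
interface ActiveVoiceFeatures {
  voiceInput: boolean;
  tts: boolean;
}

// Inspect a global-like object for the Web Speech API entry points.
// Taking the scope as a parameter keeps the check testable outside a browser.
function detectVoiceSupport(scope: Record<string, unknown>): VoiceSupport {
  // Some browsers expose recognition only under the webkit prefix.
  const hasRecognition =
    typeof scope["SpeechRecognition"] === "function" ||
    typeof scope["webkitSpeechRecognition"] === "function";

  // Synthesis needs both the speechSynthesis controller and the utterance constructor.
  const synth = scope["speechSynthesis"];
  const hasSynthesis =
    typeof synth === "object" &&
    synth !== null &&
    typeof scope["SpeechSynthesisUtterance"] === "function";

  return { speechToText: hasRecognition, textToSpeech: hasSynthesis };
}

// A feature is active only if it is both supported and enabled by the user.
// When both come back false, the UI stays in text-only mode.
function resolveVoiceFeatures(
  support: VoiceSupport,
  settings: VoiceSettings,
): ActiveVoiceFeatures {
  return {
    voiceInput: support.speechToText && settings.voiceInputEnabled,
    tts: support.textToSpeech && settings.ttsEnabled,
  };
}
```

In a browser, the scope argument would be `window` (cast to `Record<string, unknown>`). Checking the two capabilities separately matters because a browser may offer synthesis without recognition, in which case the mic button can be hidden while read-aloud keeps working.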

Tasks:

  • Research cross-browser compatibility for Web Speech APIs and choose polyfills if necessary.
  • Create a VoiceInput React component with mic button and recording state UI.
  • Hook up speech recognition to feed transcribed text into the existing chat input.
  • Build a TextToSpeech service using the Web Speech API's SpeechSynthesis interface, with UI controls for playback.
  • Extend user profile settings to include voice input and TTS toggles.
  • Add Playwright tests (or mocks) to verify the voice recording and playback flows.
  • Update the README and in-app tooltips to guide users on using voice features.
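As a starting point for the TextToSpeech service task, here is one possible shape for the service, written as a sketch under assumptions rather than a final design. The class name `TextToSpeechService` and its methods are hypothetical. The `SynthLike` and `UtteranceLike` interfaces mirror only the parts of `window.speechSynthesis` (`speak`, `pause`, `resume`, `cancel`) and `SpeechSynthesisUtterance` (`text`, `volume`) that the sketch uses, so the service can be driven by a fake in tests.

```typescript
// Subset of SpeechSynthesisUtterance used by this sketch.
interface UtteranceLike {
  text: string;
  volume: number; // 0 (silent) to 1 (full), as in the Web Speech API
}

// Subset of window.speechSynthesis used by this sketch.
interface SynthLike {
  speak(utterance: UtteranceLike): void;
  pause(): void;
  resume(): void;
  cancel(): void;
}

// Hypothetical service; the name and methods are illustrative only.
class TextToSpeechService {
  private readonly synth: SynthLike;
  private readonly makeUtterance: (text: string) => UtteranceLike;
  private volume = 1;
  private paused = false;

  constructor(synth: SynthLike, makeUtterance: (text: string) => UtteranceLike) {
    this.synth = synth;
    this.makeUtterance = makeUtterance;
  }

  // Clamp to the 0..1 range; ignore non-finite input such as NaN.
  setVolume(value: number): void {
    if (!Number.isFinite(value)) return;
    this.volume = Math.min(1, Math.max(0, value));
  }

  // Read a response aloud, replacing anything still queued or playing.
  speak(text: string): void {
    this.synth.cancel();
    this.paused = false;
    const utterance = this.makeUtterance(text);
    utterance.volume = this.volume;
    this.synth.speak(utterance);
  }

  // Play/pause control; returns true when playback is now paused.
  togglePause(): boolean {
    if (this.paused) {
      this.synth.resume();
    } else {
      this.synth.pause();
    }
    this.paused = !this.paused;
    return this.paused;
  }

  // Stop playback and clear the queue.
  stop(): void {
    this.synth.cancel();
    this.paused = false;
  }
}
```

In the browser, the service would presumably be constructed with `window.speechSynthesis` and a factory that returns `new SpeechSynthesisUtterance(text)`. In this sketch a volume change applies to the next utterance rather than the one already playing, so a live volume slider may need a different approach.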

Estimated Effort: ~2 sprints

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • good first issue: Good for newcomers
  • help wanted: Extra attention is needed
