A simple dialect collection tool supporting both real-time recording and audio upload, with text-based timeline annotation.
- Recording dialect speech from a prepared script
- Uploading existing dialect recordings
- Free conversation recording
- Audio-text alignment annotation
- Recording controls (Start/Pause/Stop)
- Real-time volume visualization
- Recording duration display
- Playback function
- Re-recording option
- Drag-and-drop upload
- File selection upload
- Supported formats: MP3, WAV
- File size limit: 50 MB
- Basic audio trimming
- Waveform display
- Audio timeline annotation
- Text input/paste
- Speech segment and text alignment
- Timestamp marking
- Two main entry points: Live Recording/Upload Audio
- Recent recordings list (local storage)
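The "basic audio trimming" feature listed above can be sketched as a pure function over decoded samples. This is a minimal illustration, not the tool's implementation: `trimSamples` is a hypothetical name, and it assumes the audio has already been decoded to mono PCM in a `Float32Array`.

```typescript
// Hypothetical helper: returns a copy of the samples between startSec and endSec.
// Assumes mono PCM, so one sample corresponds to one frame.
function trimSamples(
  samples: Float32Array,
  sampleRateHz: number,
  startSec: number,
  endSec: number
): Float32Array {
  // Convert times to sample indices and clamp them to the buffer bounds.
  const clamp = (n: number): number => Math.min(samples.length, Math.max(0, n));
  const start = clamp(Math.round(startSec * sampleRateHz));
  const end = clamp(Math.round(endSec * sampleRateHz));
  if (end <= start) return new Float32Array(0);
  // slice() copies, so the original recording is left untouched.
  return samples.slice(start, end);
}
```

Returning a copy keeps the original buffer available, which makes it straightforward to undo a trim or re-trim with different bounds.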
[Figure: Recording interface mockup]
Recording interface layout:
- Top: Recording Control Area
  - Recording button (Start/Pause/Stop)
  - Recording duration
  - Volume indicator
- Middle: Text Display Area (optional)
  - Script display
  - Font size adjustment
- Bottom: Audio Visualization Area
  - Waveform display
  - Timeline
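The volume indicator and duration display in the control area could be driven by small pure functions like the ones below. This is a sketch under stated assumptions: the function names are hypothetical, samples are assumed to be floats in [-1, 1] (as the Web Audio API provides them), and the -60 dB meter floor is an illustrative choice, not a value from this spec.

```typescript
// Root-mean-square level of one block of PCM samples in [-1, 1].
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Map an RMS level to a 0..1 meter value on a decibel scale.
// floorDb (assumed -60 dB) is the level shown as an empty meter.
function meterValue(rms: number, floorDb: number = -60): number {
  if (rms <= 0) return 0;
  const db = 20 * Math.log10(rms);
  return Math.min(1, Math.max(0, 1 - db / floorDb));
}

// Format elapsed milliseconds as mm:ss for the duration display.
function formatDuration(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return pad(minutes) + ":" + pad(seconds);
}
```

A decibel scale is used for the meter because perceived loudness is roughly logarithmic; a linear RMS bar tends to look nearly empty for normal speech.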
Upload interface layout:
- Drag-and-drop zone
- File selection button
- Upload progress indicator
- Format/size guidelines
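The format and size guidelines shown in the upload area can be enforced before any upload work begins. The sketch below is one possible check, with hypothetical names (`validateUpload`, `UploadCandidate`). It assumes "50MB" means 50 × 1024 × 1024 bytes and tests only the file extension.

```typescript
// Limits taken from the spec: MP3/WAV only, at most 50 MB.
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024; // assumption: binary megabytes
const ALLOWED_EXTENSIONS = ["mp3", "wav"];

// Minimal shape shared with the browser File object (name and size).
interface UploadCandidate {
  name: string;
  size: number;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateUpload(file: UploadCandidate): ValidationResult {
  // Extract the extension after the last dot, case-insensitively.
  const dot = file.name.lastIndexOf(".");
  const ext = dot >= 0 ? file.name.slice(dot + 1).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    return { ok: false, reason: "Unsupported format: use MP3 or WAV." };
  }
  if (file.size === 0) {
    return { ok: false, reason: "The file is empty." };
  }
  if (file.size > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: "File exceeds the 50 MB limit." };
  }
  return { ok: true };
}
```

Because a browser `File` already has `name` and `size`, it can be passed in directly. An extension check is only a first filter; a file can still fail to decode, so the decoding step needs its own error handling.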
Annotation interface layout:
- Audio Waveform Area
  - Waveform visualization
  - Draggable timeline
  - Playback controls
- Text Alignment Area
  - Text input field
  - Timestamp marking buttons
  - Alignment preview
- Select "Live Recording"
- (Optional) Input/paste script
- Click record button to start
- Display waveform and duration during recording
- Auto-play preview after stopping
- Confirm save or re-record
- Select "Upload Audio"
- Drag-and-drop or select audio file
- Open the editing interface automatically after upload
- Input/paste corresponding text
- Perform audio-text alignment annotation
- Play audio
- Click mark button at key points
- Auto-generate timestamps
- Adjust which text each marked segment corresponds to
- Preview and confirm the alignment
- Export annotation results
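The mark, timestamp, and export steps above might be modeled as follows. This is a sketch, not the tool's actual data format: the `AlignedSegment` shape, the function names, and the JSON layout (including the `version` field) are all assumptions. Each mark is treated as a boundary between two consecutive segments, and segments are paired with text lines in order.

```typescript
// Hypothetical annotation record: one stretch of audio and its text.
interface AlignedSegment {
  startMs: number;
  endMs: number;
  text: string;
}

// Turn mark-button timestamps into segments, pairing each with one text line.
function buildSegments(
  marksMs: number[],
  durationMs: number,
  lines: string[]
): AlignedSegment[] {
  if (durationMs <= 0) return [];
  // Keep only marks strictly inside the audio, in ascending order.
  const inside = marksMs
    .filter((m) => m > 0 && m < durationMs)
    .sort((a, b) => a - b);
  // Build the boundary list 0, marks..., duration, skipping duplicates.
  const bounds: number[] = [0];
  for (const m of inside) {
    if (m !== bounds[bounds.length - 1]) bounds.push(m);
  }
  bounds.push(durationMs);

  const segments: AlignedSegment[] = [];
  for (let i = 0; i + 1 < bounds.length; i++) {
    segments.push({
      startMs: bounds[i],
      endMs: bounds[i + 1],
      // Segments without a matching line get empty text for later editing.
      text: i < lines.length ? lines[i] : "",
    });
  }
  return segments;
}

// Serialize the annotation result for export (assumed JSON layout).
function exportAnnotations(segments: AlignedSegment[]): string {
  return JSON.stringify({ version: 1, segments: segments }, null, 2);
}
```

Sorting and de-duplicating the marks means the user can click them in any order, or twice at the same point, without producing overlapping or zero-length segments.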
- Sample rate: 16 kHz / 48 kHz
- Format: WAV/MP3
- Mono channel
- Browser IndexedDB for audio files
- LocalStorage for configuration
- Support for audio and annotation data export
- Recording latency: < 200 ms
- Audio processing response time: < 1 s
- Maximum supported audio length: 30 minutes
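To see what the 30-minute maximum implies for storage, the size of an uncompressed recording can be estimated from the audio specification. The sketch below assumes 16-bit samples and a 44-byte canonical WAV header; the spec states neither, and the function names are hypothetical.

```typescript
// Approximate size of an uncompressed PCM WAV file.
// Assumptions: 16-bit samples (2 bytes) and a 44-byte canonical header.
function wavSizeBytes(
  sampleRateHz: number,
  seconds: number,
  channels: number = 1,
  bytesPerSample: number = 2
): number {
  return 44 + sampleRateHz * seconds * channels * bytesPerSample;
}

// Longest whole-second mono 16-bit recording that fits in a byte budget.
function maxSecondsWithin(sampleRateHz: number, budgetBytes: number): number {
  return Math.max(0, Math.floor((budgetBytes - 44) / (sampleRateHz * 2)));
}

const THIRTY_MINUTES_SEC = 30 * 60;
const sizeAt16k = wavSizeBytes(16000, THIRTY_MINUTES_SEC); // 57,600,044 bytes
const sizeAt48k = wavSizeBytes(48000, THIRTY_MINUTES_SEC); // 172,800,044 bytes
```

Under these assumptions, a 30-minute mono WAV is about 57.6 MB at 16 kHz and about 172.8 MB at 48 kHz. Both exceed a 50 MB limit, so full-length WAV recordings would not fit the upload limit unless they are compressed to MP3 first. At 16 kHz, 50 × 1024 × 1024 bytes holds roughly 27 minutes of audio.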
- Basic recording functionality
- Audio upload
- Waveform display
- Simple audio-text alignment
- Audio trimming
- Real-time volume display
- Timeline annotation
- Audio format conversion
- Noise reduction
- Batch processing