Blog post: here
A lightweight, local yap-to-text tool for Linux. It listens for a global hotkey, records audio, transcribes it locally using faster-whisper, and opens the text in your preferred editor. The model is pre-loaded in RAM via a background service for instant recording.
- Python 3.12+
- Poetry
- ffmpeg (required by whisper)
```bash
sudo apt install ffmpeg # Debian/Ubuntu
# sudo pacman -S ffmpeg # Arch
poetry install
```
Open server.py to change defaults:
- `OUTPUT_DIR`: Where transcription text files are saved (default: `~/transcriptions/`). If you don't want to keep them, set this to `/tmp/`.
- `EDITOR_CMD`: The text editor to open (default: `gnome-text-editor -n`).
- `MODEL_SIZE`: Whisper model size (default: `base.en`). For options, see here. I recommend `base` or `small`, or `tiny` if you have a slow CPU.
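For orientation, the defaults described above could look roughly like this near the top of `server.py`. This is only a sketch built from the names and default values listed here; the actual file may define them differently (for example, `EDITOR_CMD` might be a list rather than a string), and the `editor_argv` helper is hypothetical.

```python
import shlex
from pathlib import Path

# Where transcription text files are saved; set to Path("/tmp") if you don't want to keep them.
OUTPUT_DIR = Path("~/transcriptions/").expanduser()

# Editor command used to open each transcription.
EDITOR_CMD = "gnome-text-editor -n"

# Whisper model size, e.g. "tiny", "base", "base.en", "small".
MODEL_SIZE = "base.en"


def editor_argv(text_file: Path) -> list[str]:
    """Hypothetical helper: build the argv used to open a transcription."""
    return shlex.split(EDITOR_CMD) + [str(text_file)]
```

Splitting the command with `shlex.split` keeps editor flags such as `-n` as separate arguments, so the command can be run without going through a shell.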
Then edit `yaptype.service` to change the value for `OMP_NUM_THREADS`, if you want something other than the default of 8.
Edit `yaptype.service` and ensure the paths to python (inside the poetry env) and `server.py` are correct. To get the python executable path, use `poetry run which python` or `poetry env info`.
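A quick way to double-check those two paths before installing the unit is to parse the `ExecStart=` line and test that each path exists. The sketch below assumes the unit has a plain `ExecStart=/path/to/python /path/to/server.py` line without systemd prefix characters. The function name `check_exec_start` is hypothetical and not part of this repo.

```python
import shlex
from pathlib import Path


def check_exec_start(unit_text: str) -> dict[str, bool]:
    """Map each absolute path on the first ExecStart= line to whether it exists."""
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith("ExecStart="):
            args = shlex.split(line.removeprefix("ExecStart="))
            # Only absolute paths are checked; flags and other arguments are skipped.
            return {a: Path(a).exists() for a in args if a.startswith("/")}
    return {}


if __name__ == "__main__":
    unit = Path("yaptype.service")
    if unit.exists():
        for path, ok in check_exec_start(unit.read_text()).items():
            print(("ok      " if ok else "MISSING ") + path)
```

Run it from the repo directory; any path reported as `MISSING` needs fixing in `yaptype.service` before you copy the file.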

    cp yaptype.service ~/.config/systemd/user/
    systemctl --user daemon-reload
    systemctl --user enable --now yaptype.service

If you change the `yaptype.service` file, you need to copy it again, run `systemctl --user daemon-reload` again, and then run `systemctl --user restart yaptype.service`.
Go to your System Settings -> Keyboard -> Shortcuts (or your Window Manager config). Create a new custom shortcut:
- Command: `/path/to/poetry/venv/python /path/to/repo/client.py` (this python path is the same as above)
- Shortcut: `Ctrl + Alt + -` (or your preference)
- Press your shortcut (e.g., `Ctrl + Alt + -`) to start recording. A microphone icon will light up in your taskbar.
- Speak your thought. While the recording is running, the microphone privacy indicator stays lit in the system menu (top-right corner), at least on GNOME.
- Press the shortcut again to stop.
- The transcription will process in the background and pop up in your text editor automatically.
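The press-to-start, press-to-stop behaviour described above amounts to a two-state toggle. The sketch below illustrates only that control flow and is not the code in `server.py` or `client.py`. The `Recorder` class and its methods are hypothetical, and audio capture, transcription, and opening the editor are replaced by placeholder log entries.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()


class Recorder:
    """Hypothetical toggle: each shortcut press flips between idle and recording."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.events: list[str] = []

    def toggle(self) -> State:
        if self.state is State.IDLE:
            # First press: start capturing audio (placeholder).
            self.events.append("start recording")
            self.state = State.RECORDING
        else:
            # Second press: stop, then transcribe and open the editor (placeholders).
            self.events.append("stop recording")
            self.events.append("transcribe and open editor")
            self.state = State.IDLE
        return self.state
```

Keeping the state in the long-running service rather than in the shortcut command means the command bound to the hotkey can stay a short-lived script that only signals "toggle".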
Check the service logs if recording doesn't start or the editor doesn't open:

    journalctl --user -u yaptype.service -f

The code downloads the model from HuggingFace on first start and caches it in `~/.cache/huggingface/hub/`:

    du -sh ~/.cache/huggingface/hub/models--Systran--faster-whisper*

You can see the memory usage with `systemctl --user status yaptype.service`. With the base model, about 250 MB of RAM is used.
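If you would rather check the model cache size from Python, the following sketch sums the file sizes under each cached model directory, which is approximately what `du -sh` reports. It assumes the default cache location mentioned above, and `dir_size_bytes` is a hypothetical helper name. Symlinks are skipped so that files the Hugging Face cache links from `snapshots/` into `blobs/` are counted only once.

```python
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Total size of regular files under root, skipping symlinks."""
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


if __name__ == "__main__":
    hub = Path("~/.cache/huggingface/hub").expanduser()
    for model_dir in sorted(hub.glob("models--Systran--faster-whisper*")):
        print(f"{dir_size_bytes(model_dir) / 1e6:.1f} MB  {model_dir.name}")
```

The result is the sum of file lengths rather than allocated disk blocks, so it can differ slightly from the `du` figure.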