VoiceType

Push-to-talk voice dictation for macOS - Type with your voice, powered by local AI.

VoiceType is a lightweight menu bar app that lets you dictate text into any application using a simple push-to-talk hotkey. It uses WhisperKit for fully local, private speech-to-text transcription. After the initial model download, no internet connection is required, and your voice never leaves your Mac.

Features

Push-to-Talk - Hold your hotkey, speak, release to transcribe and type
100% Local & Private - Uses WhisperKit for on-device transcription, no data sent to servers
Works Everywhere - Types into any focused application (editors, browsers, chat apps, etc.)
Multiple Languages - Auto-detect or choose from 99 supported languages
Customizable Hotkey - Set your preferred key combination
Multiple Models - Trade speed for accuracy across five models (tiny, base, small, medium, large-v3)
Menu Bar App - Lives in your menu bar, no dock icon clutter
Launch at Login - Start automatically when you log in

Installation

Download DMG (Recommended)

Download the latest VoiceType-x.x.x.dmg from Releases
Open the DMG and drag VoiceType to your Applications folder
Open VoiceType from Applications
Grant the required permissions when prompted

The app is signed and notarized by Apple, so it will open without any security warnings.

Build from Source

Requirements:

macOS 14.0 (Sonoma) or later
Xcode 15+ or Swift 5.9+ toolchain
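
The macOS requirement above can be checked from a terminal before building. This is a minimal sketch and not part of the repository: the helper name `meets_macos_requirement` is hypothetical, and it compares only the major version reported by `sw_vers`.

```shell
# Hypothetical helper: succeeds if a macOS version string such as "14.5"
# has a major version of 14 or higher.
meets_macos_requirement() {
    major=${1%%.*}
    [ "$major" -ge 14 ]
}

# sw_vers exists only on macOS, so the check is skipped elsewhere.
if command -v sw_vers >/dev/null 2>&1; then
    if meets_macos_requirement "$(sw_vers -productVersion)"; then
        echo "macOS version OK"
    else
        echo "VoiceType requires macOS 14.0 (Sonoma) or later" >&2
    fi
fi
```

For the toolchain, `swift --version` and `xcodebuild -version` print the installed versions to compare against Swift 5.9 and Xcode 15.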

# Clone the repository
git clone https://github.com/twissmueller/voice-type.git
cd voice-type

# Build the app
./Scripts/build-app.sh

# Run the app
open .build/release/VoiceType.app

# Or create a DMG for distribution
./Scripts/create-dmg.sh

Usage

Quick Start

Launch VoiceType - The microphone icon appears in your menu bar
Grant Permissions - Click the menu bar icon and grant all required permissions in Settings
Wait for Model - The Whisper model downloads on first launch (download size depends on the model, from ~40MB for tiny to ~1.5GB for large-v3; see Model Comparison)
Start Dictating - Hold Option+Shift+Space (default), speak, then release

Permissions Required

VoiceType needs three macOS permissions to function:

Permission        Why It's Needed
Microphone        To record your voice for transcription
Accessibility     To type the transcribed text into applications
Input Monitoring  To detect the global hotkey in any app

All permissions can be granted from the Settings window (click the menu bar icon → Settings).
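
If you prefer to jump straight to the matching pane in System Settings, the sketch below builds the deep links commonly used for the Privacy & Security panes. The `x-apple.systempreferences:` URL scheme and the `Privacy_*` anchors are assumptions here: Apple does not formally document them and they may change between macOS releases. The helper name `privacy_pane_url` is hypothetical.

```shell
# Hypothetical helper: build the System Settings deep link for one
# Privacy & Security pane. Anchors assumed for VoiceType's permissions:
#   Microphone, Accessibility, ListenEvent (Input Monitoring)
privacy_pane_url() {
    echo "x-apple.systempreferences:com.apple.preference.security?Privacy_$1"
}
```

On macOS, `open "$(privacy_pane_url ListenEvent)"` should then bring up the Input Monitoring pane. Swap in `Microphone` or `Accessibility` for the other two permissions.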

Changing Settings

Click the VoiceType icon in your menu bar, then click Settings to:

Change the hotkey - Click the hotkey field and press your new combination
Select a model - Larger models are more accurate but slower
Choose a language - Or use auto-detect for any of 99 languages
Enable launch at login - Start VoiceType automatically

Model Comparison

Model     Size    Speed    Accuracy  Best For
tiny      ~40MB   Fastest  Basic     Quick notes, simple commands
base      ~75MB   Fast     Good      General dictation
small     ~250MB  Medium   Better    Most users (recommended)
medium    ~750MB  Slow     Great     When accuracy matters
large-v3  ~1.5GB  Slowest  Best      Professional transcription

Troubleshooting

Hotkey not working?

Check that Input Monitoring permission is granted in System Settings → Privacy & Security
Make sure VoiceType is enabled (green status in menu bar)
Try a different hotkey combination if the current one conflicts with other apps

No transcription output?

Check that Accessibility permission is granted
Ensure the model has finished loading (no loading indicator in menu)
Try speaking closer to your microphone

App won't start?

Confirm that you are running macOS 14 (Sonoma) or later, which is required
Try removing and re-adding permissions in System Settings
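
Removing permissions can also be done from a terminal with `tccutil`. The sketch below is a dry run that only prints the commands for review. It rests on several assumptions: that the TCC service names `Microphone`, `Accessibility`, and `ListenEvent` (Input Monitoring) are accepted by `tccutil` on your macOS version, and that the app is installed under the name "VoiceType" so its bundle identifier can be looked up rather than guessed. The helper name `reset_commands` is hypothetical.

```shell
# Hypothetical helper: print one tccutil command per permission VoiceType uses.
# $1 is the app's bundle identifier.
reset_commands() {
    for service in Microphone Accessibility ListenEvent; do
        echo "tccutil reset $service $1"
    done
}

# On macOS, read the bundle identifier from the installed app and print
# the commands. Nothing is reset at this point.
if [ "$(uname)" = "Darwin" ]; then
    bundle_id=$(osascript -e 'id of app "VoiceType"') && reset_commands "$bundle_id"
fi
```

After reviewing the printed commands, run them individually. macOS should then prompt for each permission again the next time VoiceType needs it.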

Tech Stack

Swift 5.9 & SwiftUI - Modern Apple development
WhisperKit - On-device speech recognition
AVAudioEngine - Low-latency audio capture
CGEvent - Global hotkey detection and keystroke emulation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Support the Project

If you find VoiceType useful, consider supporting its development:

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

WhisperKit by Argmax for the amazing on-device Whisper implementation
OpenAI Whisper for the original speech recognition model

Made with ❤️ for the macOS community
