A native macOS voice assistant that listens to your commands and executes actions like typing text, opening apps, and more.
Voice Input → OpenAI Whisper → Cerebras LLM → Tool Execution
- Voice Capture: Press and hold the global hotkey to record your voice
- Transcription: Audio is sent to OpenAI Whisper for speech-to-text
- Command Routing: Transcript is processed by Cerebras (Qwen model) to determine the action
- Execution: The appropriate tool is executed locally on your Mac
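The four steps above can be sketched as one async chain. Everything in this sketch is illustrative: `transcribe`, `routeCommand`, `execute`, and the `Action` type are stand-ins, not the app's actual API.

```swift
import Foundation

// Illustrative result of command routing; the app's real types differ.
struct Action {
    let tool: String
    let argument: String
}

func transcribe(_ audio: Data) async throws -> String {
    // Would POST the recording to OpenAI Whisper; stubbed here.
    return "open safari"
}

func routeCommand(_ transcript: String) async throws -> Action {
    // Would ask the Cerebras-hosted Qwen model which tool to run; stubbed here.
    return Action(tool: "open_app", argument: "Safari")
}

func execute(_ action: Action) {
    // Would dispatch to the matching local tool (typing, opening apps, ...).
}

func handleRecording(_ audio: Data) async throws {
    let transcript = try await transcribe(audio)    // speech → text
    let action = try await routeCommand(transcript) // text → tool call
    execute(action)                                 // tool call → macOS action
}
```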
| Tool | Description | Example |
|---|---|---|
| `type` | Types text into the active window | "Type hello world" |
| `open_app` | Opens an application or URL | "Open Safari" or "Open github.com" |
| `switch_to` | Switches focus to a running app | "Switch to Slack" |
| `deep_research` | Opens a Google search | "Research Swift concurrency" |
- Build and launch the app
- Complete onboarding:
  - Enter your OpenAI API Key (for Whisper transcription)
  - Enter your Hugging Face Token (for Cerebras routing)
  - Grant Microphone permission
  - Grant Accessibility permission (for typing)
  - Configure your preferred hotkey (default: Shift+Space)
  - Press "Finish" when all checks pass
- Press and hold your hotkey
- Speak your command
- Release the hotkey
- Watch the popup show your transcript and action result
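Under the hood, press-and-hold behavior like this can be driven by a global `NSEvent` monitor. This is only a sketch of the idea, not the app's actual hotkey code, and global key monitors require the Accessibility permission granted during onboarding:

```swift
import AppKit

// Sketch: start recording on Shift+Space key-down, stop on key-up.
final class HotkeyRecorder {
    private var monitor: Any?
    private var held = false
    var onStart: () -> Void = {}
    var onStop: () -> Void = {}

    func install() {
        monitor = NSEvent.addGlobalMonitorForEvents(matching: [.keyDown, .keyUp]) { [weak self] event in
            guard let self, event.keyCode == 49 else { return } // 49 = Space
            if event.type == .keyDown,
               event.modifierFlags.contains(.shift), !self.held {
                self.held = true
                self.onStart()  // begin audio capture
            } else if event.type == .keyUp, self.held {
                self.held = false
                self.onStop()   // stop capture, kick off transcription
            }
        }
    }
}
```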
SecretaryApp/
├── Sources/
│ ├── SecretaryApp.swift # Entry point, AppState, hotkey handling
│ ├── Core/
│ │ ├── AudioRecorder.swift # Microphone capture with level monitoring
│ │ ├── TranscriptionClient.swift # OpenAI Whisper API integration
│ │ ├── ThinkingClient.swift # HuggingFace/Cerebras command routing
│ │ └── ToolManager.swift # macOS tool execution (type, open, switch)
│ ├── Views/
│ │ ├── Theme.swift # Theming system, fonts, colors, icons
│ │ ├── SettingsView.swift # Main dashboard with tabs (Home, Dictionary, Style, Settings)
│ │ ├── MenuPopup.swift # Floating overlay during recording
│ │ ├── OnboardingView.swift # First-launch setup wizard
│ │ ├── ConfigSections.swift # API keys, permissions, language selection UI
│ │ ├── ShortcutRecorder.swift # Hotkey configuration
│ │ ├── WaveformView.swift # Real-time audio visualization
│ │ └── LogsView.swift # Log file viewer
│ └── Utils/
│ ├── Logger.swift # File-based logging
│ ├── DictionaryStore.swift # Custom word/correction storage
│ ├── LanguageStore.swift # Language selection for transcription
│ ├── StyleStore.swift # Writing style examples storage
│ ├── SoundPlayer.swift # Audio feedback
│ └── CrashLogger.swift # Exception handling
- macOS 14.0+ (Sonoma)
- OpenAI API Key
- Hugging Face Token
Build the app:

```bash
./build_app.sh
```

Then launch it:

```bash
open Secretary.app
```

Settings are stored in `UserDefaults`:

- `openaiApiKey`: OpenAI API key
- `hfApiKey`: Hugging Face token
- `SecretaryShortcutModifier`: Hotkey modifier
- `SecretaryShortcutKey`: Hotkey key code
- `hasCompletedOnboarding`: Setup completion flag
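Reading these settings back is the standard `UserDefaults` pattern; the key names come from the list above, and all values are empty until onboarding stores them:

```swift
import Foundation

let defaults = UserDefaults.standard

// nil / false on a fresh install, populated by the onboarding wizard.
let openAIKey = defaults.string(forKey: "openaiApiKey")
let hfToken = defaults.string(forKey: "hfApiKey")
let onboarded = defaults.bool(forKey: "hasCompletedOnboarding")

if !onboarded {
    // First launch: show the setup wizard before accepting commands.
}
```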
Audio recordings are saved to `~/Documents/Secretary/recording.m4a`.
Logs are written to `Secretary_Log.txt` in the app directory.
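A minimal file-based logger along these lines would append timestamped lines to that file (a sketch only; `Logger.swift` in the repo may work differently, and the relative path is an assumption):

```swift
import Foundation

// Append a timestamped line to Secretary_Log.txt, creating the file if needed.
func log(_ message: String) {
    let url = URL(fileURLWithPath: "Secretary_Log.txt")
    let line = "\(ISO8601DateFormatter().string(from: Date())) \(message)\n"
    if let handle = try? FileHandle(forWritingTo: url) {
        handle.seekToEndOfFile()
        handle.write(line.data(using: .utf8)!)
        handle.closeFile()
    } else {
        // File doesn't exist yet: write the first line.
        try? line.write(to: url, atomically: true, encoding: .utf8)
    }
}

log("App launched")
```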
To package the app as a DMG installer, see: https://www.itech4mac.net/2025/04/how-to-create-a-dmg-installer-for-you-applications-on-macos/
Apache 2.0. Please credit Aymeric Roucher for the app and Raphaël Doan for the visual theme!