Talk in. Markdown out.
TokDown is a macOS menu bar app that records system audio and transcribes it to markdown — entirely on-device, using Apple's new SpeechTranscriber API introduced in macOS Tahoe (macOS 26).
At WWDC in June 2025, Apple introduced SpeechTranscriber — a ground-up replacement for the decade-old SFSpeechRecognizer. It shipped with macOS 26 Tahoe in fall 2025. TokDown is built on it.
| | Whisper (local) | Whisper API (OpenAI) | Otter / Fireflies | TokDown |
|---|---|---|---|---|
| Install | Python + ~3 GB model | None (API call) | None (SaaS) | None — macOS has the model |
| Speed (30 min audio) | 3-10 min | ~1 min | Real-time | ~30-45 sec |
| Cost | Free | ~$0.18/hr | $8-24/mo | Free |
| Audio leaves your Mac | No | Yes | Yes | No |
| Quality (vs Whisper Large V3) | Baseline | Baseline | Comparable | Comparable |
| Languages | 99 | 57 | ~30 | 41 |
| Dependencies | Python, ffmpeg, model weights | API key | Account + subscription | None |
The model runs on the Neural Engine. macOS downloads the language model on first use (~150 MB, shared across all apps) and updates it automatically. No Python, no Homebrew, no Docker, no model files to manage.
Quality is comparable to Whisper Large V3 on conversational speech. It handles distant-mic scenarios well — meetings where you're not wearing a headset. Proper nouns are the main weakness, same as every other engine.
41 languages are supported, including English, Spanish, French, German, Japanese, Korean, Mandarin, Cantonese, Portuguese, and Arabic.
Most transcription tools trap your notes in another app or SaaS dashboard. TokDown writes plain markdown files to a folder — searchable, versionable, and ready to feed into agents, prompts, and automations.
- Record meetings, calls, demos, and research audio from system audio
- Get timestamped markdown with YAML front matter (calendar metadata, attendees, links)
- No audio files kept — transcription in, markdown out, audio deleted
- No dependencies, no accounts, no API keys
- ~1,200 lines of Swift, no external packages
- Click the menu bar icon
- Pick an upcoming calendar meeting or start recording immediately
- Stop when done
- TokDown transcribes and saves a `.md` file, typically in under a minute
- The audio file is deleted automatically
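The last two steps (save the markdown, then delete the audio) can be sketched roughly as follows. This is a minimal illustration using only Foundation, not TokDown's actual `StorageService`; the function name `saveTranscript` and its parameters are assumptions made for this example.

```swift
import Foundation

/// Writes the transcript as a markdown file, then deletes the temporary audio.
/// The audio is removed only after the write succeeds, so a failed save
/// never loses the recording.
/// (Hypothetical helper: names are illustrative, not taken from TokDown's source.)
func saveTranscript(_ markdown: String,
                    named fileName: String,
                    in directory: URL,
                    deletingAudioAt audioURL: URL) throws -> URL {
    let fileManager = FileManager.default

    // Create the output folder (for example ~/Documents/Transcripts/) if needed.
    try fileManager.createDirectory(at: directory, withIntermediateDirectories: true)

    let destination = directory.appendingPathComponent(fileName)
    try markdown.write(to: destination, atomically: true, encoding: .utf8)

    // The transcript is on disk, so the audio is no longer needed.
    if fileManager.fileExists(atPath: audioURL.path) {
        try fileManager.removeItem(at: audioURL)
    }
    return destination
}
```

Deleting only after a successful write is a design choice of this sketch: if the write throws, the audio file is left in place and could be transcribed again.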
Transcripts are saved to ~/Documents/Transcripts/ by default:
2026-03-09_17-38_Standup.md
2026-03-09_18-00_Quarterly_planning_kickoff.md
Meeting recordings use the calendar event title. Manual recordings infer a title from the transcript text.
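The file names above follow a `date_time_Title.md` pattern. A minimal sketch of how such a name could be built is shown here; the function `transcriptFileName`, the sanitizing rule, and the `Untitled` fallback are assumptions for illustration, not TokDown's actual behavior.

```swift
import Foundation

/// Builds a transcript file name such as "2026-03-09_17-38_Standup.md".
/// (Hypothetical helper: the sanitizing rule and fallback are assumptions.)
func transcriptFileName(title: String,
                        startedAt date: Date,
                        timeZone: TimeZone = .current) -> String {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX") // fixed locale for stable output
    formatter.timeZone = timeZone
    formatter.dateFormat = "yyyy-MM-dd_HH-mm"

    // Keep letters and digits; every other run of characters becomes one underscore.
    let words = title
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
    let slug = words.isEmpty ? "Untitled" : words.joined(separator: "_")

    return "\(formatter.string(from: date))_\(slug).md"
}
```

Stripping everything except letters and digits keeps the name safe for the file system and easy to match with shell globs.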
---
title: "Standup"
source: "calendar_selection"
audio_source: "system_audio"
recording_started_at: "2026-03-09T14:00:00-04:00"
recording_ended_at: "2026-03-09T14:30:00-04:00"
calendar: "Work"
event_start: "2026-03-09T14:00:00-04:00"
event_end: "2026-03-09T14:30:00-04:00"
location: "Zoom"
url: "https://zoom.us/j/123"
organizer:
  name: "Jane Doe"
  email: "jane@example.com"
attendees:
  - name: "Jane Doe"
    email: "jane@example.com"
  - name: "Alex Smith"
    email: "alex@example.com"
notes: |
  Agenda and invite notes.
---
# Standup
2026-03-09 14:00–14:30
[00:05] First chunk of transcribed text grouped by ~5s windows.
[00:10] Next chunk continues here with natural grouping.
Manual recordings use the same shape but omit calendar-specific fields.
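The `[MM:SS]` lines in the example group recognized text into windows of roughly five seconds. One way to do that grouping is sketched below. The `Segment` type, the function names, and the choice to label each line with the start of its window are assumptions for this example, not a description of `TranscriptFormatter.swift`.

```swift
import Foundation

/// A piece of recognized text with its start offset in seconds (hypothetical type).
struct Segment {
    let start: TimeInterval
    let text: String
}

/// Formats seconds as "MM:SS".
func timestamp(_ seconds: TimeInterval) -> String {
    let total = Int(seconds)
    return String(format: "%02d:%02d", total / 60, total % 60)
}

/// Groups segments into fixed windows and renders one "[MM:SS] text" line
/// per non-empty window. Segments are assumed to arrive in chronological order.
func renderBody(_ segments: [Segment], window: TimeInterval = 5) -> String {
    var lines: [(index: Int, words: [String])] = []

    for segment in segments {
        let index = Int(segment.start / window) // which window this segment falls into
        if let last = lines.last, last.index == index {
            lines[lines.count - 1].words.append(segment.text)
        } else {
            lines.append((index: index, words: [segment.text]))
        }
    }

    // Label each line with the start time of its window.
    return lines
        .map { "[\(timestamp(Double($0.index) * window))] \($0.words.joined(separator: " "))" }
        .joined(separator: "\n")
}
```

Windows that contain no speech produce no line, so silence does not add empty timestamps to the file.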
- macOS 26+ (Tahoe)
- Apple Silicon or Intel Mac with Neural Engine support
swift test
bash scripts/build-app.sh debug
open TokDown.app
TokDown is a menu bar app: it lives in the menu bar, not the Dock.
Release build:
bash scripts/build-app.sh release
- Download `TokDown.app.zip` from the Releases page
- Unzip and move `TokDown.app` to `/Applications`
- Launch. If macOS warns on first run, right-click → Open
On first launch, macOS will prompt for:
- Screen Recording — captures system audio via ScreenCaptureKit
- Speech Recognition — runs the on-device transcription model
- Calendar (optional) — shows upcoming meetings in the menu
~1,200 lines of Swift 6. No external dependencies.
| Framework | Purpose |
|---|---|
| Speech (SpeechTranscriber) | On-device transcription — new in macOS 26 |
| ScreenCaptureKit | System audio capture |
| AVFoundation | Audio recording and file I/O |
| EventKit | Calendar meeting integration |
Sources/TokDown/
├── TokDownApp.swift # App entry point
├── MenuBarCoordinator.swift # State machine (idle → recording → transcribing)
├── MenuBarIconView.swift # Custom menu bar icon states
├── MenuBarViews.swift # Menu bar + Settings window
├── SystemAudioService.swift # ScreenCaptureKit audio capture
├── RecordingService.swift # AVAudioRecorder (mic fallback)
├── TranscriptionService.swift # SpeechTranscriber + SpeechAnalyzer pipeline
├── TranscriptFormatter.swift # Front matter + markdown rendering
├── StorageService.swift # File output and audio cleanup
├── CalendarService.swift # EventKit meetings
├── SettingsStore.swift # User preferences
├── AppModels.swift # Data types
└── Resources/
    ├── Info.plist
    ├── TokDown.entitlements
    └── TokDownIcon.svg           # App icon source (→ .icns at build time)
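The file tree describes `MenuBarCoordinator.swift` as a state machine (idle → recording → transcribing). A minimal sketch of such a machine is shown here. The type names, the event set, and the rule that a failure returns to idle are assumptions for illustration; the real coordinator may be structured differently.

```swift
/// The three states named in the file tree.
enum RecorderState: Equatable {
    case idle
    case recording
    case transcribing
}

/// Events that move the machine between states (hypothetical set).
enum RecorderEvent {
    case startRecording
    case stopRecording
    case transcriptSaved
    case failed
}

/// Returns the next state, or nil when the event is not valid in the current state.
func nextState(from state: RecorderState, on event: RecorderEvent) -> RecorderState? {
    switch (state, event) {
    case (.idle, .startRecording):
        return .recording
    case (.recording, .stopRecording):
        return .transcribing
    case (.transcribing, .transcriptSaved):
        return .idle
    case (.recording, .failed), (.transcribing, .failed):
        return .idle // assumption: any failure resets the app to idle
    default:
        return nil   // e.g. stopping while idle is ignored
    }
}
```

Keeping the transition table in one pure function makes invalid combinations, such as starting a second recording while one is being transcribed, explicit and easy to test.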
MIT