autotalk

Hands-free voice interface for Claude Code (and any terminal).

Talk to your terminal. It hears you.

mic → voice activity detection → local Whisper STT → terminal injection

Full duplex with Claude Code: Pair with voxtral-mcp for two-way voice conversations — you talk, Claude talks back. No cloud APIs. Fully local.

Install

git clone https://github.com/ibliminse/autotalk.git
cd autotalk

# Set up environment
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

# Grant macOS permissions:
# System Settings > Privacy & Security > Microphone: allow Terminal/iTerm2
# System Settings > Privacy & Security > Accessibility: allow Terminal/iTerm2

Quick Start

# Open Mic — always listening, speak naturally
./run.sh

# Target a specific app
./run.sh --target iTerm2

# Dry run — transcribe only, no injection
./run.sh --mode dry-run

# Better accuracy (slower)
./run.sh --model small.en

# List available microphones
./run.sh --list-devices

Modes

Open Mic (default)

Always listening. Speak naturally — 2 seconds of silence triggers transcription and injection. Designed for continuous conversation with Claude Code.

Background noise filtering combines two detection layers with a post-transcription filter:

- WebRTC VAD at maximum aggressiveness filters non-speech
- Energy thresholding ignores quiet ambient noise
- Whisper hallucination filter catches phantom transcriptions ("Thank you for watching", etc.)
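
To make the energy threshold and the hallucination filter concrete, here is a minimal sketch of both checks. It is not autotalk's actual code: the threshold value, the phrase list beyond the examples quoted in this README, and all function names are assumptions chosen for illustration.

```python
import math
import re
from array import array

# Hypothetical values for illustration; autotalk's real settings may differ.
ENERGY_THRESHOLD = 300.0
PHANTOM_PHRASES = {
    "you",
    "thank you",
    "thanks for watching",
    "thank you for watching",
}


def rms_energy(frame: bytes) -> float:
    """Root-mean-square amplitude of a frame of 16-bit signed PCM samples."""
    samples = array("h", frame)  # native byte order, 2 bytes per sample
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_loud_enough(frame: bytes, threshold: float = ENERGY_THRESHOLD) -> bool:
    """Energy gate: treat frames below the threshold as ambient noise."""
    return rms_energy(frame) >= threshold


def is_hallucination(text: str) -> bool:
    """True if the transcript is nothing but a known phantom phrase."""
    # Lowercase, drop punctuation, and collapse whitespace before comparing.
    normalized = re.sub(r"[^a-z' ]", "", text.lower())
    normalized = " ".join(normalized.split())
    return normalized in PHANTOM_PHRASES
```

With these helpers, a capture would be sent to Whisper only when `is_loud_enough` passes, and the resulting text would be injected only when `is_hallucination` returns `False`.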

Push (coming in v0.2)

Hold a hotkey to record, release to transcribe. For noisy environments or when you want explicit control.

Full Duplex Setup

You  ──mic──>  autotalk  ──text──>  Claude Code  ──text──>  voxtral-mcp  ──audio──>  You
     (speak)   (STT)      (paste)    (thinks)     (speak)    (TTS)         (listen)

Terminal A — start autotalk:

cd autotalk && ./run.sh

Terminal B — start Claude Code:

claude

1. You speak into your mic
2. autotalk transcribes and pastes into Claude Code
3. Claude Code processes your request
4. Claude Code uses voxtral-mcp's speak tool to read its response aloud
5. You hear Claude's response through your speakers

Install voxtral-mcp for the TTS half.
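
The paste step in the flow above relies on AppleScript. The sketch below shows one way such a paste could be driven from Python through `osascript`. It is an illustration under assumptions, not autotalk's implementation: the function names are hypothetical, and a real version may need a short delay between activating the target app and sending the keystroke.

```python
import subprocess
from typing import Optional


def applescript_string(text: str) -> str:
    """Quote text as an AppleScript string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_paste_script(text: str, target: Optional[str] = None) -> str:
    """Build an AppleScript that copies text and pastes it with Cmd+V."""
    lines = [f"set the clipboard to {applescript_string(text)}"]
    if target:
        # Bring the target app (e.g. "iTerm2") to the front first.
        lines.append(f"tell application {applescript_string(target)} to activate")
    lines.append(
        'tell application "System Events" to keystroke "v" using command down'
    )
    return "\n".join(lines)


def paste(text: str, target: Optional[str] = None) -> None:
    """Run the script with osascript (macOS only, needs Accessibility access)."""
    subprocess.run(["osascript", "-e", build_paste_script(text, target)], check=True)
```

Keeping the script builder separate from the `osascript` call makes the generated AppleScript easy to inspect, which is roughly what a dry run needs.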

CLI Reference

| Flag | Default | Description |
|------|---------|-------------|
| `--mode` | `paste` | Delivery method: `paste` (clipboard), `keystroke` (type), `dry-run` (print only) |
| `--device` | system default | Audio input device index (see `--list-devices`) |
| `--model` | `base.en` | Whisper model: `tiny.en`, `base.en`, `small.en`, `medium.en` |
| `--target` | frontmost app | Target app for injection (e.g., `Terminal`, `iTerm2`) |
| `--vad` | `3` | VAD aggressiveness: 0 (least) to 3 (most) |
| `--list-devices` | | List available audio devices and exit |
| `--version` | | Show version and exit |
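
For readers who want to see how these flags fit together, the following is a hedged `argparse` sketch that mirrors the table. The flag names, choices, and defaults come from the table; the function name `build_parser`, the help strings, and the version string are placeholders, and the real parser in `autotalk.py` may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser mirroring the CLI reference table (illustrative only)."""
    parser = argparse.ArgumentParser(prog="autotalk")
    parser.add_argument(
        "--mode",
        choices=["paste", "keystroke", "dry-run"],
        default="paste",
        help="delivery method",
    )
    # None stands for "use the system default input device".
    parser.add_argument("--device", type=int, default=None, help="input device index")
    parser.add_argument(
        "--model",
        choices=["tiny.en", "base.en", "small.en", "medium.en"],
        default="base.en",
        help="Whisper model",
    )
    # None stands for "inject into the frontmost app".
    parser.add_argument("--target", default=None, help="target app for injection")
    parser.add_argument(
        "--vad", type=int, choices=range(4), default=3, help="VAD aggressiveness"
    )
    parser.add_argument(
        "--list-devices", action="store_true", help="list audio devices and exit"
    )
    # The version string is a placeholder, not the project's real version.
    parser.add_argument(
        "--version", action="version", version="%(prog)s (version placeholder)"
    )
    return parser


args = build_parser().parse_args(["--mode", "dry-run", "--model", "small.en"])
print(args.mode, args.model, args.vad)
```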

How It Works

1. Mic capture: sounddevice (PortAudio) captures 16 kHz mono audio in 30 ms frames
2. Voice Activity Detection: webrtcvad (Google WebRTC) classifies each frame as speech/silence. A ring buffer triggers recording when 80% of recent frames contain speech
3. Energy gating: Per-frame RMS energy check prevents background noise from resetting the silence timer. Overall segment energy check skips quiet captures before they reach Whisper
4. Speech-to-text: faster-whisper (CTranslate2) runs Whisper locally on CPU with int8 quantization. No cloud API, no API key
5. Hallucination filter: Common Whisper phantom outputs ("You", "Thank you", "Thanks for watching") are caught and discarded
6. Terminal injection: AppleScript pastes transcribed text into the target terminal via clipboard (or keystroke simulation)
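
The ring-buffer trigger and the silence timeout described above can be sketched as a small generator. This is an approximation under stated assumptions, not the project's code: the 30 ms frame size, 80% trigger ratio, and 2 second timeout come from this README, while the window size of 10 frames and every name below are hypothetical. Energy gating is left out to keep the example short.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List

FRAME_MS = 30               # frame length stated in this README
SILENCE_TIMEOUT_MS = 2000   # 2 seconds of silence ends an utterance
TRIGGER_RATIO = 0.8         # 80% of recent frames must contain speech
RING_FRAMES = 10            # assumption: the window size is not documented


def segment_utterances(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],
    ring_frames: int = RING_FRAMES,
) -> Iterator[bytes]:
    """Yield one bytes object per detected utterance."""
    ring: deque = deque(maxlen=ring_frames)
    recording: List[bytes] = []
    triggered = False
    silent_ms = 0

    for frame in frames:
        speech = is_speech(frame)
        if not triggered:
            # Waiting: keep a sliding window of recent frames.
            ring.append((frame, speech))
            voiced = sum(1 for _, s in ring if s)
            if len(ring) == ring.maxlen and voiced >= TRIGGER_RATIO * ring.maxlen:
                # Start recording, keeping the window so the onset is not clipped.
                triggered = True
                recording = [f for f, _ in ring]
                ring.clear()
                silent_ms = 0
        else:
            # Recording: any speech frame resets the silence timer.
            recording.append(frame)
            silent_ms = 0 if speech else silent_ms + FRAME_MS
            if silent_ms >= SILENCE_TIMEOUT_MS:
                yield b"".join(recording)
                recording, triggered, silent_ms = [], False, 0

    # Flush an utterance that was still open when the stream ended.
    if triggered and recording:
        yield b"".join(recording)
```

With the real library, `is_speech` could be `lambda f: vad.is_speech(f, 16000)` for a `webrtcvad.Vad(3)` instance. Taking the classifier as a parameter keeps the segmentation logic testable without a microphone.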

Compared To

| Feature | autotalk | hns | whis | speech2type |
|---------|----------|-----|------|-------------|
| Full duplex (input + output) | Yes (with voxtral-mcp) | No | No | No |
| Local STT | Yes (Whisper) | Yes (Whisper) | Optional | No |
| Always-on mode | Yes | No | No | No |
| Terminal injection | Paste + keystroke | Clipboard | Clipboard | Clipboard |
| Claude Code integration | Native | No | No | No |
| Platform | macOS | macOS/Linux | macOS/Linux/Win | macOS |

Requirements

- macOS (AppleScript dependency; Linux support planned)
- Python 3.11+
- Microphone access (grant in System Settings > Privacy & Security > Microphone)
- Accessibility permission (grant in System Settings > Privacy & Security > Accessibility)

License

MIT
