Prevent Whisper hallucination/looping with no_context=true #116

Closed
krystophny wants to merge 5 commits into peteonrails:main from krystophny:upstream-pr-whisper-nocontext

@krystophny
Collaborator

Summary

Set no_context=true to prevent Whisper from conditioning on previous text segments, fixing phrase repetition.

Problem

Whisper would sometimes repeat phrases (e.g., "commit and push commit and push").

Solution

Disable context conditioning so each transcription is independent.

Dependencies

Depends on #113 (macOS support)

This switches the device hotplug detection in the evdev listener from
using inotify directly to using the notify crate, which abstracts
filesystem watching across platforms.

Benefits:
- Cleaner dependency (notify is already used for status --follow)
- The notify crate is better maintained and more widely used
- Prepares codebase for future cross-platform support

The notify crate uses inotify internally on Linux, so behavior is
unchanged on the primary platform.
@krystophny force-pushed the upstream-pr-whisper-nocontext branch from 0c43a49 to 824b18e on January 22, 2026 19:16
This adds full support for running voxtype on macOS:

**Hotkey capture:**
- CGEventTap-based global hotkey detection (requires Accessibility permissions)
- FN/Globe key support via SecondaryFn flag detection
- Platform-specific default: FN on macOS, SCROLLLOCK on Linux

**Text output:**
- CGEvent-based text injection for typing transcribed text
- Clipboard support via pbcopy/pbpaste

**Build configuration:**
- Conditional compilation for platform-specific code
- evdev moved to Linux-only dependency
- core-graphics and core-foundation added for macOS

**Other changes:**
- cfg guards for Linux-specific code (evdev, inotify, .init_array)
- Cross-platform test using printf instead of echo -n
- Documentation updates for macOS options
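
The platform-specific default hotkey described above can be expressed with conditional compilation. The following is a minimal sketch, not the code from this PR: the constant name `DEFAULT_HOTKEY` and the use of plain strings for key names are assumptions made for illustration.

```rust
// Hypothetical constant: the name and the string representation are
// assumptions, not taken from the voxtype source.

/// Default hotkey on macOS: the FN/Globe key.
#[cfg(target_os = "macos")]
pub const DEFAULT_HOTKEY: &str = "FN";

/// Default hotkey on every other platform (Linux in practice).
#[cfg(not(target_os = "macos"))]
pub const DEFAULT_HOTKEY: &str = "SCROLLLOCK";

/// Returns the default that was selected at compile time.
pub fn default_hotkey() -> &'static str {
    DEFAULT_HOTKEY
}
```

Only one of the two constants is compiled for a given target, so the rest of the code can refer to `DEFAULT_HOTKEY` without any runtime platform check.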
The audio_ctx optimization formula could produce values that are not
multiples of 8, causing a GGML assertion failure (nb01 % 8 == 0) on the
Metal backend.

For example, a 4-second audio clip would produce audio_ctx=266,
which is not divisible by 8, causing a crash.

Fix: Round audio_ctx up to the next multiple of 8 using integer
arithmetic: (raw_ctx + 7) / 8 * 8. With this change, 266 becomes 272.

This ensures compatibility with the Metal backend's alignment requirements
while preserving the optimization benefits for short audio clips.
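
The rounding step can be sketched in Rust as follows. This is a minimal illustration rather than the code from this PR: the function name `align_audio_ctx` and the `i32` type are assumptions, and only the `(raw_ctx + 7) / 8 * 8` expression comes from the commit message.

```rust
/// Round a raw audio context size up to the next multiple of 8.
///
/// Hypothetical helper: the name and integer type are assumptions.
/// The input is assumed to be non-negative.
pub fn align_audio_ctx(raw_ctx: i32) -> i32 {
    // Integer division truncates, so adding 7 first makes any value
    // that is not already a multiple of 8 round up instead of down.
    (raw_ctx + 7) / 8 * 8
}
```

With this helper, `align_audio_ctx(266)` yields 272, the value from the 4-second example, while an already aligned input such as 264 is returned unchanged.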
When typing characters that don't need shift, explicitly set
CGEventFlagNull to clear all modifiers. This prevents Caps Lock
or stuck modifier keys from causing random capitalization in
the transcribed output.
@krystophny force-pushed the upstream-pr-whisper-nocontext branch from 824b18e to 9a0f6ac on January 22, 2026 19:24
@peteonrails mentioned this pull request on Jan 27, 2026 (5 tasks)
@peteonrails
Owner

Superseded by #129 which includes this Whisper hallucination fix. Thank you for identifying and fixing this issue!
