Fix Metal backend crash with audio_ctx alignment #114

Closed
krystophny wants to merge 4 commits into peteonrails:main from krystophny:upstream-pr-metal-fix

@krystophny
Collaborator

Summary

Fixes a crash when using the Metal backend (macOS GPU acceleration) with short audio clips.

The audio_ctx optimization formula could produce values not aligned to 8:

  • 4-second clip → audio_ctx=266 (266 % 8 = 2, crashes)
  • 21-second clip → audio_ctx=1152 (1152 % 8 = 0, works)

The GGML Metal backend asserts nb01 % 8 == 0, so an unaligned audio_ctx triggers an assertion failure.

Fix: Round up audio_ctx to the next multiple of 8.

Dependencies

Depends on #113 (macOS support)

Test plan

  • Record short clips (~4 seconds) on macOS with Metal backend
  • Verify no crash occurs
  • Verify transcription quality is unchanged

This switches the device hotplug detection in the evdev listener from
using inotify directly to using the notify crate, which abstracts
filesystem watching across platforms.

Benefits:
- Cleaner dependency (notify is already used for status --follow)
- The notify crate is better maintained and more widely used
- Prepares codebase for future cross-platform support

The notify crate uses inotify internally on Linux, so behavior is
unchanged on the primary platform.
@krystophny force-pushed the upstream-pr-metal-fix branch from 4849d02 to 64ff39a on January 22, 2026 19:16
This adds full support for running voxtype on macOS:

**Hotkey capture:**
- CGEventTap-based global hotkey detection (requires Accessibility permissions)
- FN/Globe key support via SecondaryFn flag detection
- Platform-specific default: FN on macOS, SCROLLLOCK on Linux

**Text output:**
- CGEvent-based text injection for typing transcribed text
- Clipboard support via pbcopy/pbpaste

**Build configuration:**
- Conditional compilation for platform-specific code
- evdev moved to Linux-only dependency
- core-graphics and core-foundation added for macOS

**Other changes:**
- cfg guards for Linux-specific code (evdev, inotify, .init_array)
- Cross-platform test using printf instead of echo -n
- Documentation updates for macOS options
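
The platform-specific default hotkey described above could be expressed with `cfg` attributes roughly as follows. This is only a sketch: the function name `default_hotkey` and the string return type are hypothetical and not taken from the voxtype source, and the non-macOS branch is assumed to cover Linux.

```rust
/// Hypothetical helper: the default hotkey name on macOS.
/// Compiled only when targeting macOS.
#[cfg(target_os = "macos")]
fn default_hotkey() -> &'static str {
    "FN"
}

/// Hypothetical helper: the default hotkey name elsewhere.
/// Assumption: every non-macOS target is treated like Linux here.
#[cfg(not(target_os = "macos"))]
fn default_hotkey() -> &'static str {
    "SCROLLLOCK"
}
```

Only one of the two definitions is compiled for a given target, so callers can use `default_hotkey()` without any platform checks of their own.
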
The audio_ctx optimization formula could produce values not aligned
to 8, causing a GGML assertion failure (nb01 % 8 == 0) on the Metal backend.

For example, a 4-second audio clip would produce audio_ctx=266,
which is not divisible by 8, causing a crash.

Fix: Round up audio_ctx to the next multiple of 8 using the formula
(raw_ctx + 7) / 8 * 8

This ensures compatibility with Metal backend alignment requirements
while preserving the optimization benefits for short audio clips.
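
As a minimal sketch of the rounding formula above: the helper name `align_audio_ctx`, the constant name, and the `usize` type are assumptions for illustration, not the code from this PR.

```rust
/// Alignment taken from the PR description: the Metal backend
/// asserts `nb01 % 8 == 0`. The constant name is hypothetical.
const AUDIO_CTX_ALIGN: usize = 8;

/// Round `raw_ctx` up to the next multiple of 8 (hypothetical helper).
/// Values that are already multiples of 8 are returned unchanged.
fn align_audio_ctx(raw_ctx: usize) -> usize {
    // Integer division truncates, so adding (align - 1) first
    // turns the truncation into a round-up: (raw_ctx + 7) / 8 * 8.
    (raw_ctx + AUDIO_CTX_ALIGN - 1) / AUDIO_CTX_ALIGN * AUDIO_CTX_ALIGN
}
```

With the values from the summary, `align_audio_ctx(266)` gives 272 and `align_audio_ctx(1152)` stays at 1152, so the short-clip case gains at most 7 extra context positions.
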
@krystophny force-pushed the upstream-pr-metal-fix branch from 64ff39a to 2681146 on January 22, 2026 19:24
@peteonrails mentioned this pull request on Jan 27, 2026
5 tasks
@peteonrails
Owner

Superseded by #129 which includes this Metal backend fix. Thank you for identifying and fixing this issue!
