chore: update cactus to v1.11 with timestamp support#4537

Open
devin-ai-integration[bot] wants to merge 9 commits into main from chore/1773282643-update-cactus-1.11

@devin-ai-integration devin-ai-integration bot commented Mar 12, 2026

chore: update cactus to v1.11 with timestamp support

Summary

Updates cactus-sys dependency to v1.11 (a5acad3) and adapts the Rust wrapper and streaming transcription service to the new API:

  • Dependency: cactus-sys rev f8b714c → a5acad3
  • New types: StreamSegment { start: f32, end: f32, text: String } added to StreamResult — these are word-level timestamps (not sentence-level)
  • Removed: confirmation_threshold from TranscribeOptions (removed upstream; v1.11 uses segment comparison for confirmation)
  • Streaming refactor: Extracted segment_timing_from_result() helper. Intentionally does not use cactus segment timestamps for session timing because the FFI returns timestamps in cactus-internal cumulative time which can diverge from the session's audio_offset tracking. Falls back to buffer_duration_ms / elapsed audio time. Word-level segments are still exposed in StreamResult for downstream consumers.
  • Deadlock fix: Resolved a stream hang on CloseStream — the worker thread used blocking_recv() which never unblocked because TranscriptionSession held the original audio sender. Worker now uses try_recv() with periodic cancellation checks. Session cancels transcription tokens on StopReceivingInput and calls handle_finalize to flush pending text before sending terminal metadata.
  • Test update: test_stream_transcriber chunk count increased 10 → 30 for v1.11's segment-based confirmation
  • Batch path: Not modified — cactus batch transcription doesn't return segment-level timestamps
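
For reviewers who want the shape of the new types and the timing fallback in one place, here is a minimal sketch. Only `StreamSegment { start, end, text }`, the `segments` field, and `buffer_duration_ms` come from the summary above. The trimmed-down `StreamResult`, its `text` field, the function name `session_timing_sketch`, and the choice to return seconds are assumptions for illustration, not the code in this PR.

```rust
/// One word-level segment, with the fields listed in the summary.
#[derive(Debug, Clone, PartialEq)]
pub struct StreamSegment {
    pub start: f32,
    pub end: f32,
    pub text: String,
}

/// Hypothetical, trimmed-down result type; the real `StreamResult`
/// carries more fields than shown here.
#[derive(Debug, Clone, Default)]
pub struct StreamResult {
    pub text: String, // assumed field name
    pub buffer_duration_ms: f64,
    pub segments: Vec<StreamSegment>,
}

/// Returns `(start_sec, duration_sec)` for a result.
///
/// `result.segments` is deliberately ignored: its timestamps live in the
/// engine's cumulative clock, which may drift from the session's own
/// `audio_offset`. The duration comes from `buffer_duration_ms` when it
/// is positive, otherwise from the elapsed audio the session has tracked.
pub fn session_timing_sketch(
    result: &StreamResult,
    audio_offset_sec: f64,
    elapsed_audio_sec: f64,
) -> (f64, f64) {
    let duration_sec = if result.buffer_duration_ms > 0.0 {
        result.buffer_duration_ms / 1000.0
    } else {
        elapsed_audio_sec
    };
    (audio_offset_sec, duration_sec)
}
```

The segments stay on the result untouched, so a downstream consumer can still read word-level timestamps even though session timing never looks at them.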

Updates since last revision

  • Restored <|notimestamps|> in whisper prompt — cactus v1.11 still uses <|ns|> internally and removing it caused the model to enter degenerate token loops during streaming
  • Removed stub tauri-plugin-apple-contact / tauri-plugin-pdf entries from Cargo.lock
  • Removed dead last_segments field from ChannelState (was stored but never read)
  • Fixed segment_timing_from_result to avoid cactus segment timestamps due to coordinate system divergence (documented in code comment)
  • Fixed stream deadlock: worker blocking_recv() → cancellation-aware try_recv() loop; session now cancels tokens + calls handle_finalize on StopReceivingInput
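
The deadlock fix can be illustrated with a small, standard-library-only sketch. It uses `std::sync::mpsc` and an `AtomicBool` as stand-ins for the async channel and cancellation token in the real service, and `run_worker`, `process_chunk`, and `flush` are hypothetical names. The point is only the loop structure: a non-blocking receive plus a cancellation check, so the worker can exit even while another owner still holds the sender.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{Receiver, TryRecvError};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Polls for audio chunks until cancelled or the channel closes, then
/// flushes once. A blocking receive here would hang forever if some
/// other owner kept the sender alive after the stream was closed.
pub fn run_worker(
    audio_rx: Receiver<Vec<f32>>,
    cancelled: Arc<AtomicBool>,
    mut process_chunk: impl FnMut(&[f32]),
    mut flush: impl FnMut(),
) {
    loop {
        // Cancellation is only observed between chunks, so a chunk that
        // is already being processed always runs to completion.
        if cancelled.load(Ordering::Acquire) {
            break;
        }
        match audio_rx.try_recv() {
            Ok(chunk) => process_chunk(&chunk),
            // Nothing queued: sleep briefly instead of spinning.
            Err(TryRecvError::Empty) => thread::sleep(Duration::from_millis(5)),
            // Every sender was dropped: no more audio will arrive.
            Err(TryRecvError::Disconnected) => break,
        }
    }
    // Both exit paths reach the flush of remaining unconfirmed text.
    flush();
}
```

Because the flush sits after the loop, it runs whether the worker leaves through the cancellation check or through channel disconnection. Note that in this sketch chunks still queued at cancellation time are not drained.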

Review & Testing Checklist for Human

  • Deadlock fix edge cases: The worker now polls try_recv() with a 5ms sleep. Verify that (1) cancellation while the worker is mid-process_f32 correctly lets it finish the current chunk before exiting, and (2) the stop() call (which flushes remaining unconfirmed text) is reached when the worker exits via cancellation from the idle/recv path vs the mid-processing path.
  • handle_finalize on clean exit: NEW BEHAVIOR — when CloseStream is received, the session now flushes any pending unconfirmed text as a final Results message before the terminal Metadata message. Previously this text was silently dropped. Confirm this is the desired behavior and doesn't break clients expecting the old behavior.
  • Segment timestamps not used for timing: segment_timing_from_result deliberately ignores result.segments[*].start/end for session timing to avoid coordinate system drift. Verify this is the correct trade-off (precision vs. consistency). Segments are still exposed in StreamResult for downstream word-level timestamp consumers.
  • Downstream consumers: confirmation_threshold was removed from TranscribeOptions and StreamSegment/segments were added to StreamResult. Verify no other consumers rely on the removed field or break on the new field.
  • Test plan: Run live streaming transcription on macOS aarch64 and verify (1) transcripts are complete (including pending text at end), (2) no deadlocks when closing streams, and (3) CI e2e tests pass consistently.
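
To make the new close-time ordering easier to check against client expectations, here is a hedged sketch of the message sequence described in the `handle_finalize` item. `OutboundMessage`, its fields, and `close_stream_messages` are hypothetical names for illustration; only the ordering (a final Results message for pending text, then the terminal Metadata message) is taken from the description above.

```rust
/// Hypothetical outbound message type; the real wire format differs.
#[derive(Debug, Clone, PartialEq)]
pub enum OutboundMessage {
    Results { text: String, is_final: bool },
    Metadata,
}

/// Builds the messages sent when the stream is closed cleanly.
///
/// Pending unconfirmed text, if any, goes out as a final `Results`
/// message before the terminal `Metadata` message. Under the old
/// behavior described in the checklist, that text was dropped.
pub fn close_stream_messages(pending_text: &str) -> Vec<OutboundMessage> {
    let mut out = Vec::new();
    let trimmed = pending_text.trim();
    if !trimmed.is_empty() {
        out.push(OutboundMessage::Results {
            text: trimmed.to_string(),
            is_final: true,
        });
    }
    // Metadata is always last, so clients can treat it as end-of-stream.
    out.push(OutboundMessage::Metadata);
    out
}
```

A client that stops reading at the first Metadata message sees the same terminator as before. The difference is that one extra Results message may arrive just before it.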

Notes

  • Batch response builders (crates/transcribe-cactus/src/service/batch/response.rs) were intentionally not modified — the cactus batch FFI (cactus_transcribe) does not return segment-level text timestamps.
  • Upstream change reference: Speech-to-Text Timestamps cactus-compute/cactus#515
  • The worker polling sleep (5ms) adds minimal latency compared to model inference time but could theoretically delay CloseStream acknowledgment by up to 5ms.

Requested by: @yujonglee

netlify bot commented Mar 12, 2026

Deploy Preview for hyprnote-storybook canceled.

Name Link
🔨 Latest commit 1d1b923
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote-storybook/deploys/69b2af23983b870008269117

netlify bot commented Mar 12, 2026

Deploy Preview for hyprnote canceled.

Name Link
🔨 Latest commit 1d1b923
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/69b2af232444ed0008c5d223

cursor[bot]

This comment was marked as resolved.

cursor bot commented Mar 12, 2026

You have run out of free Bugbot PR reviews for this billing cycle. This will reset on April 10.

To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 potential issue.

View 2 additional findings in Devin Review.

devin-ai-integration bot and others added 6 commits March 12, 2026 13:05

- Update cactus-sys dependency to v1.11 (rev a5acad3)
- Add StreamSegment struct with start/end/text fields to StreamResult
- Remove confirmation_threshold from TranscribeOptions (removed upstream)
- Remove NoTimestamps whisper token (upstream now enables timestamps)
- Refactor streaming session to use real segment timestamps from cactus
  via segment_timing_from_result() helper with fallback to buffer_duration_ms
- Track last_segments in ChannelState for segment-aware timing

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>

With cactus v1.11's segment-based confirmation (replacing threshold-based),
more audio chunks are needed before segments stabilize and get confirmed.
Increase from 10 to 30 chunks to give the new algorithm enough data.

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>

Cactus v1.11 still uses <|ns|> (notimestamps) in its own prompts.
The segment-level timestamps are provided separately by the engine;
the prompt token prevents the model from generating timestamp text
tokens which causes degenerate loops during streaming.

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>

- Remove tauri-plugin-apple-contact and tauri-plugin-pdf stub entries
  from Cargo.lock (local workarounds that should not be committed)
- Remove unused last_segments field from ChannelState (stored but never read)

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>

The FFI segment timestamps use cactus-internal cumulative time which
may diverge from the session's audio_offset tracking. Keep using
buffer_duration_ms for timing consistency with existing session logic.
Add doc comment explaining the reasoning.

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>

Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>
@yujonglee yujonglee force-pushed the chore/1773282643-update-cactus-1.11 branch from 22db268 to 8c42717 Compare March 12, 2026 04:06
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>
Signed-off-by: Yujong Lee <yujonglee.dev@gmail.com>