Performance: optimize joiner output disposal loop #165
base: master
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -102,6 +102,7 @@ export class ParakeetModel { | |
| this._targetLenTensor = new ort.Tensor('int32', this._targetLenArray, [1]); | ||
| this._encoderFrameBuffer = null; // Will be allocated when we know the dimension D | ||
| this._encoderFrameTensor = null; // Will be allocated when we know D | ||
| + this._recycledOutputs = []; // Reusable array for joiner output disposal | ||
|
|
||
| // Incremental decode cache: stores decoder state at the end of the prefix | ||
| // keyed by a caller-provided cacheKey. This lets us skip decoding the | ||
|
|
@@ -323,10 +324,11 @@ export class ParakeetModel { | |
| const logits = out['outputs']; | ||
| const outputState1 = out['output_states_1']; | ||
| const outputState2 = out['output_states_2']; | ||
| - const seenOutputs = new Set(); | ||
| - for (const value of Object.values(out)) { | ||
| - if (!value || typeof value.dispose !== 'function' || seenOutputs.has(value)) continue; | ||
| - seenOutputs.add(value); | ||
| + this._recycledOutputs.length = 0; // Clear recycled array | ||
| + for (const key in out) { | ||
| + const value = out[key]; | ||
| + if (!value || typeof value.dispose !== 'function' || this._recycledOutputs.includes(value)) continue; | ||
| + this._recycledOutputs.push(value); | ||
| if (value === logits || value === outputState1 || value === outputState2) continue; | ||
| value.dispose(); | ||
| } | ||
|
|
@@ -339,12 +341,12 @@ export class ParakeetModel { | |
| const failDecoderStep = (message) => { | ||
| logits?.dispose?.(); | ||
|
|
||
| - const disposed = new Set(); | ||
| + const disposed = []; | ||
|
The PR description states that a recycled class-level array ( const disposed = this._recycledOutputs;
disposed.length = 0; |
||
| const disposeUniqueState = (state) => { | ||
| if (!state) return; | ||
| for (const tensor of [state.state1, state.state2]) { | ||
| - if (!tensor || tensor === this._combState1 || tensor === this._combState2 || disposed.has(tensor)) continue; | ||
| - disposed.add(tensor); | ||
| + if (!tensor || tensor === this._combState1 || tensor === this._combState2 || disposed.includes(tensor)) continue; | ||
| + disposed.push(tensor); | ||
| tensor.dispose?.(); | ||
| } | ||
| }; | ||
|
|
||
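The inline review comment above suggests reusing `this._recycledOutputs` in the failure path instead of allocating a fresh `[]`. A minimal sketch of that idea follows. The class `DisposalSketch`, the method `disposeStates`, and the mock tensors are hypothetical names used only for illustration; only `_recycledOutputs`, `_combState1`, `_combState2`, and the `state1`/`state2` fields come from the diff. The final clearing of the buffer is an extra assumption, not part of the PR.

```javascript
// Hypothetical minimal class; only the fields needed for the failure path exist.
class DisposalSketch {
  constructor() {
    this._recycledOutputs = []; // shared scratch buffer, safe for serial use only
    this._combState1 = null;
    this._combState2 = null;
  }

  // Dispose each state's tensors at most once, skipping the live combined state.
  disposeStates(...states) {
    const disposed = this._recycledOutputs; // reuse instead of allocating []
    disposed.length = 0;
    for (const state of states) {
      if (!state) continue;
      for (const tensor of [state.state1, state.state2]) {
        if (!tensor || tensor === this._combState1 || tensor === this._combState2 || disposed.includes(tensor)) continue;
        disposed.push(tensor);
        tensor.dispose?.();
      }
    }
    disposed.length = 0; // assumption: drop references so disposed tensors are not retained
  }
}

// Mock tensor that only counts dispose() calls (stand-in for a real tensor).
const mockTensor = () => ({ disposeCount: 0, dispose() { this.disposeCount += 1; } });

const sketch = new DisposalSketch();
const live = mockTensor();   // plays the role of the current combined state
const shared = mockTensor(); // referenced by two different states
const other = mockTensor();
sketch._combState1 = live;

sketch.disposeStates({ state1: shared, state2: live }, { state1: shared, state2: other }, null);
console.log(live.disposeCount, shared.disposeCount, other.disposeCount);
```

With only a handful of tensors per call, the linear `includes` scan stays cheap, which is the trade-off the PR relies on when it replaces the `Set`.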
🧩 Analysis chain
🏁 Script executed:
# List directories to understand project structure
ls -la
Repository: ysdede/parakeet.js
Length of output: 1513
🏁 Script executed:
# Search for any test-related files
fd -e test.js -e spec.js -e tests.js
Repository: ysdede/parakeet.js
Length of output: 44
🏁 Script executed:
# List test files
ls -la tests/
Repository: ysdede/parakeet.js
Length of output: 2067
🏁 Script executed:
# Check the full decode_loop.test.mjs to see if it validates disposal
wc -l tests/decode_loop.test.mjs
Repository: ysdede/parakeet.js
Length of output: 93
🏁 Script executed:
# Look at more of the decode_loop test file to see if disposal is checked
tail -100 tests/decode_loop.test.mjs
Repository: ysdede/parakeet.js
Length of output: 3441
Disposal logic is correct and covered by existing tests; document the serial-use constraint.

The new flow (clear → for…in → dedup via `includes` → skip kept tensors → dispose) is behaviorally equivalent to the previous `Set` + `Object.values` version. Each unique disposable in `out` is disposed at most once, and `logits`/`outputState1`/`outputState2` are preserved. The clear-and-fill is fully synchronous after the `await` on `joinerSession.run`, so `_recycledOutputs` stays internally consistent within one call.

Existing decode tests (`decode_loop.test.mjs`) do exercise the disposal path via `transcribe()` calls, which invoke `_runCombinedStep` multiple times. The basic logic is sound.

One thing worth being explicit about: `_recycledOutputs` is now a class-level mutable buffer, so concurrent invocations of `_runCombinedStep` on the same `ParakeetModel` instance would clobber each other (the same implicit constraint already applies to `_targetIdArray`/`_targetTensor`/`_combState1`/`_combState2`). That's fine for the current call sites in `transcribe`, but worth documenting if streaming/long-audio paths ever start sharing a single model across overlapping in-flight calls. Add a brief comment in the constructor or class docstring noting this serial-use assumption.
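The equivalence claim can be checked in isolation with a small sketch. The helper names (`makeMockTensor`, `disposeWithSet`, `disposeWithRecycledArray`, `buildOutputs`) and the extra `extra`/`alias`/`empty` output keys are hypothetical and exist only for this illustration; the mock objects stand in for real tensors and merely count `dispose()` calls.

```javascript
// Hypothetical stand-in for a tensor: it only counts dispose() calls.
function makeMockTensor(name) {
  return { name, disposeCount: 0, dispose() { this.disposeCount += 1; } };
}

// Previous approach: a fresh Set per call for deduplication.
function disposeWithSet(out, keep) {
  const seen = new Set();
  for (const value of Object.values(out)) {
    if (!value || typeof value.dispose !== 'function' || seen.has(value)) continue;
    seen.add(value);
    if (keep.includes(value)) continue;
    value.dispose();
  }
}

// New approach: a caller-owned array that is cleared and refilled on each call.
function disposeWithRecycledArray(out, keep, recycled) {
  recycled.length = 0;
  for (const key in out) {
    const value = out[key];
    if (!value || typeof value.dispose !== 'function' || recycled.includes(value)) continue;
    recycled.push(value);
    if (keep.includes(value)) continue;
    value.dispose();
  }
}

// Build a mock joiner output with one aliased tensor and one null entry.
function buildOutputs() {
  const logits = makeMockTensor('logits');
  const state1 = makeMockTensor('state1');
  const state2 = makeMockTensor('state2');
  const extra = makeMockTensor('extra');
  // 'alias' points at the same object as 'extra' to exercise deduplication.
  const out = { outputs: logits, output_states_1: state1, output_states_2: state2, extra, alias: extra, empty: null };
  return { out, keep: [logits, state1, state2], extra };
}

const a = buildOutputs();
disposeWithSet(a.out, a.keep);

const b = buildOutputs();
const recycled = [];
disposeWithRecycledArray(b.out, b.keep, recycled);

// Summarize dispose counts per output entry so the two strategies can be compared.
const summarize = ({ out }) =>
  Object.values(out).filter(Boolean).map((t) => `${t.name}:${t.disposeCount}`).join(',');
const setSummary = summarize(a);
const arraySummary = summarize(b);
console.log(setSummary === arraySummary, arraySummary);
```

Because `recycled` is cleared at the start of every call rather than allocated, two overlapping calls sharing the same array would overwrite each other's bookkeeping, which is the serial-use assumption the review asks to document.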