fix(generation): add timeout, progress fallback, and VRAM pre-flight … #671

Uni404x64 wants to merge 2 commits into ace-step:main
Conversation
…fixes ace-step#146

Addresses three root causes of the 50% UI freeze on mid-tier GPUs:

1. **Synchronous deadlock** (`generate_music_execute.py`)
   - Wrapped `service_generate()` in a daemon thread with a configurable wall-clock timeout (default 600 s, override via `ACESTEP_GENERATION_TIMEOUT`).
   - Safe env-var parsing: non-numeric values and values <= 0 fall back to 600 s with a warning instead of crashing at import time or timing out every generation.
   - Calls `torch.cuda.empty_cache()` before raising `TimeoutError` to recover fragmented VRAM for subsequent attempts.
   - Logs the orphaned `service-generate` thread count on timeout for operator visibility.
   - Uses `except BaseException` in the worker to ferry `SystemExit` and friends across the thread boundary without swallowing them.
   - A defensive guard on `_result["outputs"]` prevents a raw `KeyError` if the thread exits without populating either dict.

2. **Estimator cold-start** (`progress.py`)
   - `_start_diffusion_progress_estimator` no longer returns `(None, None)` when no timing history exists; it falls back to `2.5 s/step x batch_size` so the progress bar always animates on first run.
   - Self-calibrates after the first successful generation via the existing mechanism.

3. **VRAM pre-flight check** (`generate_music.py`)
   - `_vram_preflight_check()` runs before `service_generate`, using the project-standard `get_effective_free_vram_gb()` + `DIT_INFERENCE_VRAM_PER_BATCH` constants from `gpu_config.py`.
   - Detects CFG mode via `guidance_scale > 1.0` (two DiT passes = 2x activations).
   - Scales the estimate with duration (5 Hz latent rate).
   - Returns a clean, actionable error payload instead of a silent hang.
   - No-ops on non-CUDA backends; safe even if `get_effective_free_vram_gb()` raises.

All logger calls use loguru (`from loguru import logger`), consistent with the rest of the handler package.

Tested: Windows 11 / Ryzen 9 5900X / RTX 3070 Ti (8 GB). Batch 2, 3:01 tracks, 100.78 s total. DiT phase 12.8 s. No hang; progress bar animated throughout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
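The daemon-thread timeout pattern summarized above can be sketched in isolation. This is a hypothetical helper, not the project's actual handler; in the real code the timeout comes from `ACESTEP_GENERATION_TIMEOUT` and the timeout path also clears the CUDA cache:

```python
import threading

def run_with_timeout(fn, timeout_s=600.0):
    """Run fn() in a daemon thread; raise TimeoutError if it outlives timeout_s."""
    _result, _error = {}, {}

    def _worker():
        try:
            _result["outputs"] = fn()
        except BaseException as exc:  # ferry SystemExit and friends to the caller
            _error["exc"] = exc

    t = threading.Thread(target=_worker, name="service-generate", daemon=True)
    t.start()
    t.join(timeout=timeout_s)
    if t.is_alive():
        raise TimeoutError(f"generation exceeded {timeout_s}s; thread left orphaned")
    if "exc" in _error:
        raise _error["exc"]
    if "outputs" not in _result:  # defensive guard against a silently dead worker
        raise RuntimeError("worker exited without outputs or an exception")
    return _result["outputs"]
```

Because the worker is a daemon, a timed-out call does not block interpreter shutdown, but the thread keeps running in the background, which is exactly the orphaned-thread concern raised in review below.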
📝 Walkthrough

Adds GPU VRAM preflight validation, threading-based timeout protection for service execution, and a progress-estimator fallback for cold starts in the music generation flow; returns early on insufficient VRAM and enforces a configurable generation timeout with diagnostic logging.
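One piece of the walkthrough, the cold-start progress fallback, reduces to a small rule; the sketch below uses hypothetical names, with only the `2.5 s/step x batch_size` fallback taken from the PR description:

```python
def estimate_step_seconds(history, batch_size, fallback_per_step=2.5):
    """Seconds-per-step estimate: average the timing history when present,
    else use the cold-start fallback so the progress bar can animate on first run."""
    if history:  # self-calibrated after the first successful generation
        return sum(history) / len(history)
    return fallback_per_step * batch_size
```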
Sequence Diagram

sequenceDiagram
participant Client
participant GenerateMusicMixin
participant VRAMPreflight as VRAM Preflight
participant Executor as Service Executor
participant WorkerThread as Worker Thread
participant Progress as Progress Tracker
participant GPU
Client->>GenerateMusicMixin: generate_music(inputs)
GenerateMusicMixin->>VRAMPreflight: _vram_preflight_check(batch, duration, guidance)
VRAMPreflight->>GPU: get_effective_free_vram_gb()
GPU-->>VRAMPreflight: free_vram_gb
VRAMPreflight->>GenerateMusicMixin: error payload OR None
alt Insufficient VRAM
GenerateMusicMixin-->>Client: return error payload
else Sufficient VRAM
GenerateMusicMixin->>Executor: prepare and invoke service_generate
Executor->>WorkerThread: spawn daemon thread to run service
WorkerThread->>Progress: start progress estimator (fallback if needed)
WorkerThread->>Executor: produce outputs / raise exception
alt Timeout
Executor->>GPU: clear CUDA cache
Executor-->>Client: raise TimeoutError (with diagnostics)
else Completed
Executor-->>Client: return outputs
end
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🧹 Nitpick comments (2)
acestep/core/generation/handler/generate_music_execute.py (2)
156-181: Orphaned daemon threads will accumulate on repeated timeouts.

Each timed-out `service_generate` call leaves a daemon thread pinned to the CUDA context. If users retry several times in a row, these pile up and can exhaust VRAM or host memory. The logging at lines 163-165 is a good start for observability, but consider adding a guard that refuses new generation attempts when the orphaned thread count exceeds a threshold (e.g. ≥ 3), returning an error payload that recommends a server restart.

Not a blocker for this PR, but worth tracking.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/generate_music_execute.py` around lines 156 - 181, Detect excessive orphaned daemon threads created by stalled service_generate runs by reusing the existing stalled count logic (sum(1 for t in threading.enumerate() if t.name == "service-generate")) and add a guard in the generation entry point (the handler around gen_thread/service_generate in generate_music_execute.py) that refuses new generation attempts when stalled >= 3: return an error payload (or raise a specific exception) that clearly states too many orphaned service_generate threads exist and recommends restarting the server, instead of proceeding to launch another generation; keep the existing timeout/cleanup behavior but check this threshold before starting gen_thread so callers get an actionable error immediately.
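The threshold guard suggested in this comment could look roughly like the following. This is a sketch; the helper names and the limit of 3 are taken from the suggestion, not from the project's code:

```python
import threading

ORPHAN_THREAD_LIMIT = 3  # threshold suggested in the review comment

def count_orphaned_workers(name="service-generate"):
    """Count live threads left behind by timed-out service_generate calls."""
    return sum(1 for t in threading.enumerate() if t.name == name)

def orphan_guard():
    """Return an error payload when too many stalled workers remain, else None."""
    stalled = count_orphaned_workers()
    if stalled >= ORPHAN_THREAD_LIMIT:
        return {
            "error": (
                f"{stalled} orphaned service-generate thread(s) are still pinned "
                "to the CUDA context; restart the server before retrying."
            )
        }
    return None
```

Called before spawning `gen_thread`, this turns a guaranteed-to-fail launch into an immediate, actionable error.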
177-181: Ruff TRY003: inline exception messages.

Static analysis flags the long string literals passed directly to `TimeoutError` and `RuntimeError`. You could extract them into module-level constants or slim them down, but given these are user-facing diagnostic messages, the current form is readable and acceptable. Suppressing with `# noqa: TRY003` is fine if you prefer to keep them inline.

Also applies to: 191-194
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/generate_music_execute.py` around lines 177 - 181, The inline multi-line exception messages in generate_music_execute.py (the TimeoutError raised using _DEFAULT_GENERATION_TIMEOUT and the similar RuntimeError later) are flagged by Ruff TRY003; either extract those user-facing strings into module-level constants (e.g. TIMEOUT_ERROR_MSG and RUNTIME_ERROR_MSG) and reference those constants in the TimeoutError and RuntimeError raises, or keep the inline strings and add a local suppression comment "# noqa: TRY003" on those raise lines; update the raise sites that use _DEFAULT_GENERATION_TIMEOUT, TimeoutError, and RuntimeError accordingly.
    if not torch.cuda.is_available():
        return None

    duration_s = audio_duration or 60.0
    # CFG doubles forward-pass memory: two DiT evaluations per step.
    dit_key = "base" if guidance_scale > 1.0 else "turbo"
    per_batch_gb = DIT_INFERENCE_VRAM_PER_BATCH.get(dit_key, 0.6)
    # Longer audio = more latent frames (5 Hz rate) = more memory.
    duration_factor = max(1.0, duration_s / 60.0)
    needed_gb = per_batch_gb * actual_batch_size * duration_factor + VRAM_SAFETY_MARGIN_GB

    free_gb = get_effective_free_vram_gb()
    logger.info(
        "[generate_music] VRAM pre-flight: {:.2f} GB free, ~{:.2f} GB needed "
        "(batch={}, duration={:.0f}s, mode={}).",
        free_gb, needed_gb, actual_batch_size, duration_s, dit_key,
    )

    if free_gb >= needed_gb:
        return None
get_effective_free_vram_gb() returning 0.0 on internal error will falsely block generation.
When CUDA is available but get_effective_free_vram_gb() hits an unexpected internal exception, it returns 0.0 (per its own except clause). Since 0.0 < needed_gb is always true, this will surface a "not enough VRAM" error even though the GPU may have plenty of memory.
Consider guarding against this — e.g., treat free_gb <= 0.0 as "unable to query VRAM" and let the generation attempt proceed:
Proposed fix
 free_gb = get_effective_free_vram_gb()
+if free_gb <= 0.0:
+    logger.warning(
+        "[generate_music] VRAM pre-flight: unable to query free VRAM; "
+        "skipping check and allowing generation to proceed."
+    )
+    return None
+
 logger.info(
     "[generate_music] VRAM pre-flight: {:.2f} GB free, ~{:.2f} GB needed "

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acestep/core/generation/handler/generate_music.py` around lines 51 - 70, The
VRAM check in generate_music is blocking when get_effective_free_vram_gb()
returns 0.0 on error; change the logic so that if free_gb is None or free_gb <=
0.0 you treat it as "unable to query VRAM" and do not block generation: call
logger.warning (or similar) to record the failed VRAM query referencing
get_effective_free_vram_gb/free_gb/needed_gb and then return None to proceed;
otherwise keep the existing check (if free_gb >= needed_gb return None, else
raise or handle insufficient VRAM).
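For intuition, the pre-flight check discussed above is plain arithmetic. The constants below are placeholders for illustration only; the real values live in `gpu_config.py`:

```python
# Placeholder constants -- the real values come from gpu_config.py.
DIT_INFERENCE_VRAM_PER_BATCH = {"base": 1.2, "turbo": 0.6}  # GB per batch item (assumed)
VRAM_SAFETY_MARGIN_GB = 1.0  # assumed

def estimate_needed_gb(batch_size, duration_s, guidance_scale):
    """Mirror the pre-flight formula: per-batch cost x batch x duration factor + margin."""
    dit_key = "base" if guidance_scale > 1.0 else "turbo"  # CFG = two DiT passes per step
    per_batch_gb = DIT_INFERENCE_VRAM_PER_BATCH.get(dit_key, 0.6)
    duration_factor = max(1.0, duration_s / 60.0)  # 5 Hz latent rate scales with length
    return per_batch_gb * batch_size * duration_factor + VRAM_SAFETY_MARGIN_GB
```

With these assumed constants, a batch of 2 three-minute CFG tracks needs roughly `1.2 * 2 * 3.0 + 1.0 ≈ 8.2` GB, which is why the review's `free_gb <= 0.0` fallback matters: a failed query would otherwise always read as insufficient.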
Actionable comments posted: 3
🧹 Nitpick comments (1)
acestep/core/generation/handler/generate_music_execute.py (1)
105-116: Consider calling `torch.cuda.empty_cache()` before raising `TimeoutError`.

When the generation times out due to VRAM pressure, the orphaned CUDA kernels may leave fragmented memory. Clearing the cache here improves the chance that a subsequent retry succeeds; this was noted in the PR objectives but isn't implemented.
♻️ Proposed addition
 if gen_thread.is_alive():
+    orphaned = sum(
+        1 for t in threading.enumerate() if t.name == "service-generate"
+    )
     logger.error(
         f"[generate_music] service_generate exceeded {_DEFAULT_GENERATION_TIMEOUT}s "
         f"timeout (batch={actual_batch_size}, steps={inference_steps}, "
-        f"duration={audio_duration}s). The CUDA operation may still be "
-        f"running in the background."
+        f"duration={audio_duration}s). {orphaned} orphaned service-generate "
+        f"thread(s) still running."
     )
+    try:
+        import torch
+        if torch.cuda.is_available():
+            torch.cuda.empty_cache()
+    except Exception:
+        pass
     raise TimeoutError(

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/generate_music_execute.py` around lines 105 - 116, The timeout branch in the generate_music execution path should release CUDA cache before raising the TimeoutError: inside the if gen_thread.is_alive() block (where logger.error references _DEFAULT_GENERATION_TIMEOUT, actual_batch_size, inference_steps, audio_duration), call torch.cuda.empty_cache() if torch.cuda.is_available() to free fragmented VRAM, optionally log that cache was cleared (using the same logger) and then raise the existing TimeoutError; ensure you import torch where generate_music_execute.py uses CUDA cleanup.
# Maximum wall-clock seconds to wait for service_generate before declaring a hang.
# Generous default: most generations finish in 30-120s, but large batches on slow
# GPUs can take several minutes. Override via ACESTEP_GENERATION_TIMEOUT env var.
_DEFAULT_GENERATION_TIMEOUT = int(os.environ.get("ACESTEP_GENERATION_TIMEOUT", "600"))
Unsafe env-var parsing — non-numeric values crash at import time.
The PR description promises safe parsing with fallback to 600 s, but this bare int() raises ValueError on non-numeric input (e.g., ACESTEP_GENERATION_TIMEOUT="abc"), crashing the import. Zero and negative values also produce degenerate behavior (thread.join(timeout=0) returns immediately, always triggering the timeout path).
🐛 Proposed fix: safe parsing with validation
-# Maximum wall-clock seconds to wait for service_generate before declaring a hang.
-# Generous default: most generations finish in 30-120s, but large batches on slow
-# GPUs can take several minutes. Override via ACESTEP_GENERATION_TIMEOUT env var.
-_DEFAULT_GENERATION_TIMEOUT = int(os.environ.get("ACESTEP_GENERATION_TIMEOUT", "600"))
+# Maximum wall-clock seconds to wait for service_generate before declaring a hang.
+# Generous default: most generations finish in 30-120s, but large batches on slow
+# GPUs can take several minutes. Override via ACESTEP_GENERATION_TIMEOUT env var.
+_DEFAULT_GENERATION_TIMEOUT: int = 600
+
+_raw_timeout = os.environ.get("ACESTEP_GENERATION_TIMEOUT")
+if _raw_timeout is not None:
+    try:
+        _parsed = int(_raw_timeout)
+        if _parsed > 0:
+            _DEFAULT_GENERATION_TIMEOUT = _parsed
+        else:
+            logger.warning(
+                f"ACESTEP_GENERATION_TIMEOUT={_raw_timeout!r} is <= 0; "
+                f"falling back to {_DEFAULT_GENERATION_TIMEOUT}s."
+            )
+    except ValueError:
+        logger.warning(
+            f"ACESTEP_GENERATION_TIMEOUT={_raw_timeout!r} is not a valid integer; "
+            f"falling back to {_DEFAULT_GENERATION_TIMEOUT}s."
+        )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acestep/core/generation/handler/generate_music_execute.py` around lines 9 -
12, The module-level _DEFAULT_GENERATION_TIMEOUT is currently set with bare
int(os.environ.get(...)) which will raise ValueError on non-numeric input and
allows zero/negative values; change the initialization to parse
ACESTEP_GENERATION_TIMEOUT safely: read the env var, attempt to convert to int
inside a try/except, fall back to 600 on any exception or if the parsed value is
<= 0 (optionally clamp to a minimum like 1), and use that validated value for
_DEFAULT_GENERATION_TIMEOUT; reference the symbol _DEFAULT_GENERATION_TIMEOUT
and the env var name ACESTEP_GENERATION_TIMEOUT when making the change and add a
debug/warn log if parsing fails.
    except Exception as exc:
        _error["exc"] = exc
except Exception won't catch BaseException subclasses in the worker thread.
If SystemExit (or another BaseException subclass) is raised inside the worker, neither _result nor _error gets populated. The thread exits silently, gen_thread.is_alive() returns False, the error check on line 117 passes, and line 126 crashes with KeyError: 'outputs'.
Use except BaseException here to ferry all exceptions across the thread boundary, matching the stated intent in the PR description. This also addresses the Ruff BLE001 hint — in this specific pattern (worker thread → re-raise on caller) broad catching is intentional.
🐛 Proposed fix
-    except Exception as exc:
+    except BaseException as exc:
         _error["exc"] = exc

🧰 Tools
🪛 Ruff (0.15.1)
[warning] 83-83: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acestep/core/generation/handler/generate_music_execute.py` around lines 83 -
84, Replace the narrow except with a BaseException catcher so worker-thread
fatal exceptions (e.g. SystemExit, KeyboardInterrupt) are captured and
transported back to the caller: in the exception handling block that assigns to
_error (the try/except around the worker that currently does "except Exception
as exc" and sets _error["exc"] = exc), change it to catch BaseException and
store the exception object so the caller (the code that checks
gen_thread.is_alive() and later inspects _error/_result) can re-raise or handle
it; keep the same _error key and behavior but use BaseException to match the
intended cross-thread re-raise pattern used by this module
(generate_music_execute.py, the _error dict and the exc variable).
        progress_thread.join(timeout=1.0)
        return {"outputs": outputs, "infer_steps_for_progress": infer_steps_for_progress}

    return {"outputs": _result["outputs"], "infer_steps_for_progress": infer_steps_for_progress}
Missing defensive guard on _result["outputs"] — potential KeyError.
The PR description mentions a defensive guard here, but the code directly accesses _result["outputs"] which will raise KeyError if the worker thread exits without populating it (e.g., via an uncaught BaseException). Even with the except BaseException fix above, a guard here provides belt-and-suspenders safety.
🐛 Proposed fix
-    return {"outputs": _result["outputs"], "infer_steps_for_progress": infer_steps_for_progress}
+    if "outputs" not in _result:
+        raise RuntimeError(
+            "service_generate thread exited without producing outputs or raising an exception."
+        )
+    return {"outputs": _result["outputs"], "infer_steps_for_progress": infer_steps_for_progress}

🤖 Prompt for AI Agents
+ return {"outputs": _result["outputs"], "infer_steps_for_progress": infer_steps_for_progress}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@acestep/core/generation/handler/generate_music_execute.py` at line 126, The
return unconditionally indexing _result["outputs"] can raise KeyError if the
worker never populated outputs; update the return to safely read outputs (e.g.,
use _result.get("outputs", []) or a similar defensive check) before returning
from the function/method in generate_music_execute.py (look for the return that
references _result["outputs"] in the generate_music_execute handler), so the
returned dict always contains a valid outputs value even when the worker thread
failed to populate it.
please solve comments |