
fix: replace 3 bare excepts with specific exception types#699

Closed
haosenwang1018 wants to merge 1 commit into ace-step:main from haosenwang1018:fix/bare-excepts

Conversation


@haosenwang1018 haosenwang1018 commented Feb 25, 2026

Summary

Replace 3 bare except: clauses with specific exception types.

Changes

  • acestep/llm_inference.py (2 sites): `except:` → `except (ValueError, TypeError):` for int() conversions of the bpm and duration metadata fields
  • scripts/lora_data_prepare/gemini_caption.py (1 site): `except:` → `except (json.JSONDecodeError, ValueError):` for json.loads() parsing
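
The two patterns can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: the function names and the metadata shape are assumptions, and `extract_json_from_text` is stood in for by a plain `None` fallback.

```python
import json

def parse_bpm(metadata: dict):
    """Illustrative: convert a metadata field, tolerating bad values."""
    try:
        # int("fast") raises ValueError; int(None) raises TypeError.
        return int(metadata.get("bpm"))
    except (ValueError, TypeError):
        return None

def parse_caption(result: str):
    """Illustrative: parse model output as JSON, with a fallback."""
    try:
        return json.loads(result)
    except (json.JSONDecodeError, ValueError):
        # In the PR this falls back to extract_json_from_text(result).
        return None
```

Both handlers now name exactly the failures that `int()` and `json.loads()` can raise, rather than trapping everything.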

Why

Bare except: catches KeyboardInterrupt and SystemExit, which can mask real errors and prevent clean process termination. Using specific exception types is safer, clearer, and follows Python best practices (PEP 8).
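
A minimal illustration (not from the PR) of the difference: `KeyboardInterrupt` derives from `BaseException`, not `Exception`, so a bare `except:` swallows it while `except Exception:` lets it propagate.

```python
def swallow_everything():
    """A bare except catches any BaseException subclass."""
    try:
        raise KeyboardInterrupt
    except:
        return "swallowed"

def swallow_errors_only():
    """except Exception lets KeyboardInterrupt/SystemExit propagate."""
    try:
        raise KeyboardInterrupt
    except Exception:
        return "swallowed"
```

Calling `swallow_everything()` returns "swallowed"; calling `swallow_errors_only()` re-raises the `KeyboardInterrupt`, so Ctrl-C can still terminate the process.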

Summary by CodeRabbit

Release Notes

  • Refactor
    • Improved error handling throughout the application for enhanced robustness and stability.

- llm_inference.py: catch (ValueError, TypeError) for int() conversions
  of bpm and duration metadata fields
- gemini_caption.py: catch (json.JSONDecodeError, ValueError) for
  json.loads() parsing

Bare `except:` catches KeyboardInterrupt and SystemExit, which can mask
real errors. Using specific exceptions is safer and clearer.

coderabbitai bot commented Feb 25, 2026

📝 Walkthrough

Two files refine their exception handling by replacing broad catch-all exceptions with specific exception types, improving error handling precision in BPM/duration parsing and JSON decoding operations without changing functional behavior.

Changes

Cohort: Exception Handling Refinement
File(s): acestep/llm_inference.py, scripts/lora_data_prepare/gemini_caption.py
Summary: Tightened exception catching in two methods: save_current_field() now explicitly catches ValueError and TypeError for BPM/duration parsing; analysis_audio_by_gemini() catches json.JSONDecodeError and ValueError, with a fallback to manual JSON extraction.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A rabbit hops through code so fine,
Where broad exceptions now align,
Each error caught with careful grace,
Not missing one in error's race! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 66.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately summarizes the main change: replacing bare except clauses with specific exception types across the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
acestep/llm_inference.py (1)

1-1: Module far exceeds the 200 LOC hard cap (~3,977 lines)

LLMHandler has grown to nearly 4,000 lines. Natural split points exist: MLX backend methods (Lines 2659–3832, ~1,173 lines), vLLM backend, PyTorch backend, and the prompt-building / parsing utilities. Extracting each backend into its own module (e.g., llm_backends/mlx.py, llm_backends/vllm.py, llm_backends/pt.py) and keeping orchestration in llm_inference.py would bring each file within the cap.

As per coding guidelines: **/*.py — "hard cap 200 LOC … when a file exceeds 200 LOC, suggest splitting into smaller modules."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@acestep/llm_inference.py` at line 1, LLMHandler is far too large (~3,977
LOC); split the heavy backend and utility code into smaller modules: extract the
MLX backend methods (currently around lines 2659–3832) into a new
llm_backends/mlx.py, move the vLLM backend code into llm_backends/vllm.py, the
PyTorch backend into llm_backends/pt.py, and move prompt-building/parsing
utilities into llm_backends/prompts.py (or similar); keep LLMHandler in
llm_inference.py as the orchestrator that imports these backend modules, update
all references to moved functions/classes (e.g., methods on LLMHandler that call
MLX/vLLM/PT implementations) to use the new module-qualified names, adjust
relative imports and package init files, and run/update any unit tests or type
annotations to reflect the new module structure.
scripts/lora_data_prepare/gemini_caption.py (1)

572-572: Module exceeds the 200 LOC hard cap (572 lines)

This file is 572 lines, well past the hard cap of 200 LOC. Consider splitting GeminiService and its helper utilities into a dedicated module (e.g., gemini_service.py) and keeping only the analysis_audio_* / process_folder pipeline functions here.

As per coding guidelines: **/*.py — "hard cap 200 LOC … when a file exceeds 200 LOC, suggest splitting into smaller modules to improve readability and maintainability."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/lora_data_prepare/gemini_caption.py` at line 572, The file is too
large—extract the GeminiService class and its helper functions into a new module
(e.g., gemini_service.py) and keep only the pipeline functions
(analysis_audio_*, process_folder, and return/output orchestration) in this
file; move all methods, utility functions, constants and imports that are only
used by GeminiService into the new module, update imports in gemini_caption.py
to `from gemini_service import GeminiService, <helpers>` and adjust any
references (e.g., method names on GeminiService, helper functions) accordingly,
ensure module-level variables used by both remain or are exported, and
run/adjust any tests or import paths so return values like output_paths and the
behavior of analysis_audio_* and process_folder remain unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50d8ecb and ee069c0.

📒 Files selected for processing (2)
  • acestep/llm_inference.py
  • scripts/lora_data_prepare/gemini_caption.py

Comment on lines 474 to 477

     try:
         json_result = json.loads(result)
-    except:
+    except (json.JSONDecodeError, ValueError):
         json_result = extract_json_from_text(result)

⚠️ Potential issue | 🔴 Critical

TypeError regression: result=None now causes an unhandled exception

_gemini_service.analyze_audio() returns None on several code paths (API failure, extraction failure, etc.). When result is None, json.loads(None) raises `TypeError: the JSON object must be str, bytes or bytearray, not NoneType`, which is not caught by the new except (json.JSONDecodeError, ValueError) clause. The bare except: this replaces would have caught it and fallen through to extract_json_from_text(result) → None → the proper raise Exception(...) at line 479.

🐛 Proposed fix — add `TypeError` and guard against `None`
+    if result is None:
+        raise Exception("无法解析json, 无响应结果")
     try:
         json_result = json.loads(result)
-    except (json.JSONDecodeError, ValueError):
+    except (ValueError, TypeError):
         json_result = extract_json_from_text(result)

Note: json.JSONDecodeError is a subclass of ValueError, so ValueError alone covers both. TypeError handles the None/non-string case.
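
Both claims in that note are easy to verify; a quick sketch (illustrative, with a hypothetical `safe_loads` helper standing in for the PR's try/except block):

```python
import json

# json.JSONDecodeError subclasses ValueError, so catching ValueError
# already covers malformed-JSON failures.
assert issubclass(json.JSONDecodeError, ValueError)

def safe_loads(result):
    """Illustrative: ValueError covers bad JSON; TypeError covers None
    and other non-string inputs."""
    try:
        return json.loads(result)
    except (ValueError, TypeError):
        return None
```

With this pair of exception types, `safe_loads(None)` falls through to the fallback instead of raising an unhandled TypeError.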

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/lora_data_prepare/gemini_caption.py` around lines 474 - 477, The
try/except around json.loads(result) must handle result==None and non-string
inputs from _gemini_service.analyze_audio; update the block in gemini_caption.py
so you either pre-guard (if result is None or not isinstance(result, (str,
bytes, bytearray)): json_result = extract_json_from_text(result)) or expand the
except to include TypeError (or ValueError and TypeError) so json.loads won't
raise an unhandled TypeError; keep using extract_json_from_text(result) as the
fallback and preserve the later raise Exception(...) behavior.


ChuxiJ commented Feb 25, 2026

@haosenwang1018 please solve the comments

@haosenwang1018 (Author)

Closing per backlog policy; can revisit with narrower scope.
