fix: reduce retry log noise during concurrent chunk processing #380
Addresses issue #362 - users were seeing hundreds of ERROR/WARNING logs when processing large documents due to SurrealDB v2 transaction conflicts during concurrent chunk embedding operations.

Changes:
- Upgraded to surreal-commands v1.3.0 which includes retry_log_level feature
- Increased retry attempts from 5 to 15 with max wait time 120s (from 30s) to handle deep queues during concurrent processing
- Set retry_log_level to "debug" in embed_chunk and process_source commands
- Changed repository.py RuntimeError logging from ERROR to DEBUG level
- Updated command exception handlers to log retries at DEBUG level
- Updated documentation to reflect retry strategy

This is a temporary workaround for SurrealDB v2.x transaction conflict issues with SEARCH indexes. Settings can be reduced after migrating to SurrealDB v3 which fixes the underlying concurrency issue.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Claude finished @lfnovo's task

PR Review: Reduce Retry Log Noise During Concurrent Chunk Processing

This PR addresses a legitimate user experience issue where hundreds of ERROR/WARNING logs were appearing during large document processing, making it seem like something was seriously wrong when the retry mechanism was actually working correctly.

✅ Strengths

Well-Targeted Solution
The three-layer fix is well thought out:

Appropriate Retry Adjustments
The increased retry resilience makes sense for the identified problem:

Good Documentation
Excellent inline documentation explaining:

Proper Exception Handling
Well-structured exception handling that distinguishes:
1 issue found across 7 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="commands/CLAUDE.md">
<violation number="1" location="commands/CLAUDE.md:18">
P2: Documentation inconsistency: This line documents '15 attempts' but the Key Components section (line 7) still says 'max 5×' for `process_source_command`. Consider updating line 7 to reflect the new retry count for consistency.</violation>
</file>
- **Pydantic I/O**: All commands use `CommandInput`/`CommandOutput` subclasses for type safety and serialization.
- **Error handling**: Permanent errors return failure output; `RuntimeError` exceptions auto-retry via surreal-commands.
- **Retry configuration**: Aggressive retry settings (15 attempts, 1-120s backoff, DEBUG log level) are a temporary workaround for SurrealDB v2.x transaction conflicts with SEARCH indexes. These can be reduced after migrating to SurrealDB v3.
P2: Documentation inconsistency: This line documents '15 attempts' but the Key Components section (line 7) still says 'max 5×' for process_source_command. Consider updating line 7 to reflect the new retry count for consistency.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At commands/CLAUDE.md, line 18:
<comment>Documentation inconsistency: This line documents '15 attempts' but the Key Components section (line 7) still says 'max 5×' for `process_source_command`. Consider updating line 7 to reflect the new retry count for consistency.</comment>
<file context>
@@ -15,8 +15,9 @@
- **Pydantic I/O**: All commands use `CommandInput`/`CommandOutput` subclasses for type safety and serialization.
- **Error handling**: Permanent errors return failure output; `RuntimeError` exceptions auto-retry via surreal-commands.
+- **Retry configuration**: Aggressive retry settings (15 attempts, 1-120s backoff, DEBUG log level) are a temporary workaround for SurrealDB v2.x transaction conflicts with SEARCH indexes. These can be reduced after migrating to SurrealDB v3.
- **Model dumping**: Recursive `full_model_dump()` utility converts Pydantic models → dicts for DB/API responses.
-- **Logging**: Uses `loguru.logger` throughout; logs execution start/end and key metrics (processing time, counts).
</file context>
fix: reduce retry log noise during concurrent chunk processing
Fixes #362
Problem
Users were seeing hundreds of ERROR/WARNING logs when processing large documents due to SurrealDB v2 transaction conflicts during concurrent chunk embedding operations. This made it appear that something was seriously wrong, even though the retry mechanism was working correctly.
Related upstream issue: surrealdb/surrealdb#6681
Solution
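Three-layer fix to eliminate retry log noise:
- `repository.py`: Changed RuntimeError logging from ERROR → DEBUG
- `embed_chunk` and `process_source` commands: set `retry_log_level: "debug"`
- Command exception handlers: log retries with `logger.debug()`

Additionally increased retry resilience to handle deep queues:
- Retry attempts: 5 → 15
- Max wait time: 30s → 120s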
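For readers unfamiliar with the pattern, here is a minimal, standard-library-only sketch of a retry loop whose per-retry log level is configurable. It is not the surreal-commands implementation: `run_with_retry` and its parameters are hypothetical names chosen for illustration, and the exponential backoff shape is an assumption (the PR only states the 1-120s range and 15 attempts).

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("retry_sketch")  # hypothetical logger name


def run_with_retry(
    func: Callable[[], T],
    *,
    max_attempts: int = 15,            # value taken from the PR description
    wait_min: float = 1.0,             # lower bound of the 1-120s backoff
    wait_max: float = 120.0,           # upper bound of the 1-120s backoff
    retry_log_level: int = logging.DEBUG,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call func, retrying on RuntimeError with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RuntimeError as exc:
            if attempt == max_attempts:
                # Only the final, unrecoverable failure is logged loudly.
                logger.error("Giving up after %d attempts: %s", attempt, exc)
                raise
            # Assumed backoff shape: doubles each attempt, capped at wait_max.
            wait = min(wait_max, wait_min * 2 ** (attempt - 1))
            # Intermediate retries use the configurable (quiet) level.
            logger.log(
                retry_log_level,
                "Attempt %d failed (%s); retrying in %.0fs",
                attempt, exc, wait,
            )
            sleep(wait)
    raise AssertionError("unreachable")
```

With the root logger at INFO, the intermediate retry messages are suppressed and only a final failure would surface, which is the behaviour the PR aims for.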
Testing
✅ Verified with document upload containing 200+ chunks
✅ Clean logs at INFO level - no scary ERROR messages
✅ Transaction conflicts only visible with `--log-level DEBUG`
Files Changed
- `commands/embedding_commands.py` - embed_chunk retry config
- `commands/source_commands.py` - process_source retry config
- `open_notebook/database/repository.py` - RuntimeError logging level
- `pyproject.toml` - surreal-commands >= 1.3.0
Notes
This is a temporary workaround for SurrealDB v2.x transaction conflict issues with SEARCH indexes. Retry settings can be reduced after migrating to SurrealDB v3, which fixes the underlying concurrency issue (see upstream issue link above).
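To get a feel for what the more aggressive settings cost in the worst case, the sketch below computes a backoff schedule for the old and new limits. It assumes a doubling backoff starting at 1s and capped at the maximum wait; the PR states only the attempt counts and the wait bounds, so the doubling shape and the old 1s minimum are assumptions, and `backoff_schedule` is a hypothetical helper.

```python
def backoff_schedule(max_attempts: int, wait_min: float, wait_max: float) -> list[float]:
    """Waits between attempts, assuming doubling backoff capped at wait_max."""
    # There is one wait between each pair of consecutive attempts.
    return [min(wait_max, wait_min * 2 ** n) for n in range(max_attempts - 1)]


# New settings from the PR: 15 attempts, 1-120s backoff.
new_waits = backoff_schedule(15, 1.0, 120.0)
# Previous settings: 5 attempts, max wait 30s (1s minimum is assumed).
old_waits = backoff_schedule(5, 1.0, 30.0)

print("new worst-case total wait (s):", sum(new_waits))
print("old worst-case total wait (s):", sum(old_waits))
```

Under these assumptions a command that exhausts every retry waits far longer than before, which is one reason to reduce the settings once the underlying conflict is fixed.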