fix: harden error recovery in validators, chatbot retry, and atomic_translate#118
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix
This PR fixes four error recovery bugs and increases the tokenizer safety margin.
1. Validators crash on None response content
When the LLM API returns an empty response,
get_content()returnsNone. All four validators (ChunkedTranslateValidator,AtomicTranslateValidator,ProofreaderValidator,ContextReviewerValidateValidator) pass this directly tore.search()ordetect_language_of(), which raisesTypeError: expected string or bytes-like object. This was observed on CI.Added
if not generated_content: return Falseguard to all four validators.2. ClaudeBot mutates caller's message list
ClaudeBot._create_chat()callsmessages.pop(0)to extract the system message, which modifies the caller's list in place. If the retry loop re-enters or if the same message list is reused, the system message is already gone.GeminiBotalready avoids this withdeepcopy(messages).Added
messages = list(messages)shallow copy before thepop.3.
_create_chatsilently returns unvalidated responseWhen
output_checkerfails on every retry attempt, the retry loop exits normally. Since the API call itself succeeded,responseis notNone, so theif not responseguard does not trigger. The method returns a response that never passed format validation, and the caller has no way to know.Added a
validatedflag that is set toTrueonly whenoutput_checkerpasses. After the loop, ifnot validated, a warning is logged. This makes the failure visible in logs while preserving the existing best-effort behavior (returning the last response rather than raising).4.
atomic_translateusesassertfor error handlingassertis stripped inpython -Omode. As the last fallback in the translation pipeline,atomic_translateshould always report failures reliably.Replaced
assert len(translated) == len(texts)withif ... raise ChatBotException(...).5. Tokenizer safety margin increased from 5% to 10%
_compute_max_tokens()and the three token budget calculations inContextReviewerAgentuse a safety margin to account for the difference between tiktoken's token estimates and the actual model's tokenizer. CI logs showed divergences of 3-3.5% on non-GPT models (Gemini via OpenRouter), which is close enough to 5% to causeLengthExceedExceptionon large chunks. Increased to 10% to provide sufficient headroom.