feat(testing): add FaultInjector framework for StubLlm#1233
feat(testing): add FaultInjector framework for StubLlm#1233
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the testing capabilities for resilience patterns by introducing a sophisticated fault injection framework. It allows developers to simulate various failure conditions, delays, and success scenarios with granular control, which is crucial for verifying the behavior of retry, failover, and circuit breaker logic without altering the core LLM implementation. This addition provides a powerful tool for creating more realistic and reproducible test environments. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a well-designed fault injection framework for StubLlm, which is a great addition for testing resilience features like retries and circuit breakers. The implementation is solid, with good test coverage. I've identified a couple of areas for improvement: one is a minor code simplification in the random fault generation, and the other is an opportunity to reduce code duplication in StubLlm by extracting the new fault handling logic into a shared helper method.
src/testing/fault_injection.rs
Outdated
| *state ^= *state << 17; | ||
| (*state as f64) / (u64::MAX as f64) | ||
| }; | ||
| if random_val.abs() < *error_rate { |
There was a problem hiding this comment.
There was a problem hiding this comment.
Agreed. The .abs() was redundant since the xorshift output is always non-negative. Removed in ed804a2, confirmed still absent after rebase.
src/testing/mod.rs
Outdated
| // Fault injector takes precedence over should_fail. | ||
| if let Some(ref injector) = self.fault_injector { | ||
| match injector.next_action() { | ||
| fault_injection::FaultAction::Fail(fault) => { | ||
| return Err(fault.to_llm_error(&self.model_name)); | ||
| } | ||
| fault_injection::FaultAction::Delay(duration) => { | ||
| tokio::time::sleep(duration).await; | ||
| } | ||
| fault_injection::FaultAction::Succeed => {} | ||
| } | ||
| } else if self.should_fail.load(Ordering::Relaxed) { | ||
| return Err(self.make_error()); | ||
| } |
There was a problem hiding this comment.
This block of fault injection logic is identical to the one in the complete method (lines 190-202). To adhere to the DRY (Don't Repeat Yourself) principle and improve maintainability, this logic should be extracted into a private helper method. For example, you could create a method like async fn apply_faults(&self) -> Result<(), LlmError> and call it at the beginning of both complete and complete_with_tools.
References
- Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.
There was a problem hiding this comment.
Done. Extracted check_faults() helper method that both complete() and complete_with_tools() call. Originally addressed in ed804a2, confirmed still present after rebase.
|
Both Gemini findings were addressed in ed804a2: removed redundant |
ilblackdragon
left a comment
There was a problem hiding this comment.
Review: feat(testing): add FaultInjector framework for StubLlm
Good addition — the three-mode design (SequenceOnce / SequenceLoop / Random) covers the major testing scenarios for retry, failover, and circuit breaker validation, and the integration into StubLlm via check_faults() is clean. A few issues to address before merging:
1. Bug: xorshift64 seed=0 is a fixed point (always triggers faults)
The xorshift64 PRNG in next_action() has a well-known absorbing state at zero. When state = 0:
*state ^= *state << 13; // 0 ^= 0 = 0
*state ^= *state >> 7; // 0 ^= 0 = 0
*state ^= *state << 17; // 0 ^= 0 = 0The state never escapes zero. random_val is always 0.0 / u64::MAX = 0.0, so 0.0 < error_rate is true for any positive rate. This means FaultInjector::random(0.3, fault, 0) fires the fault on every call, not 30% of calls.
This also silently affects SequenceOnce and SequenceLoop constructors which initialize rng_state: Mutex::new(0) — not currently a problem since those modes don't read the RNG, but it would bite anyone who adds a mode that does.
Fix: Guard the seed in the constructor:
pub fn random(error_rate: f64, fault: FaultType, seed: u64) -> Self {
let seed = if seed == 0 { 1 } else { seed };
Self {
// ...
rng_state: Mutex::new(seed),
}
}Add a regression test:
#[test]
fn random_seed_zero_does_not_always_fail() {
let injector = FaultInjector::random(0.5, FaultType::RequestFailed, 0);
let failures = (0..100)
.filter(|_| matches!(injector.next_action(), FaultAction::Fail(_)))
.count();
// With 50% rate, should not be 100/100
assert!(failures < 100, "seed=0 must not produce stuck RNG");
}2. Missing integration test: StubLlm::complete() with FaultInjector
All five unit tests exercise FaultInjector in isolation (calling next_action() directly). None of them test the actual integration path: constructing a StubLlm with with_fault_injector(), calling complete(), and verifying the error/success sequence through the LlmProvider trait.
The check_faults() method in StubLlm has real logic — it dispatches on three FaultAction variants, converts FaultType to LlmError, and handles Delay with tokio::time::sleep. That path is untested.
Suggested test (in src/testing/mod.rs tests section or a new integration test):
#[tokio::test]
async fn stub_llm_with_fault_injector_produces_errors() {
use crate::testing::fault_injection::*;
let injector = Arc::new(FaultInjector::sequence([
FaultAction::Fail(FaultType::RateLimited { retry_after: None }),
FaultAction::Succeed,
]));
let llm = StubLlm::new("test response".to_string())
.with_fault_injector(injector);
let request = CompletionRequest { /* minimal fields */ };
// First call: should fail with RateLimited
let result = llm.complete(request.clone()).await;
assert!(matches!(result, Err(LlmError::RateLimited { .. })));
// Second call: should succeed
let result = llm.complete(request).await;
assert!(result.is_ok());
assert_eq!(llm.call_count(), 2);
}3. Dead field: seed in FaultMode::Random is stored but never read
FaultMode::Random { error_rate, fault, seed } stores the seed, but next_action() destructures it as FaultMode::Random { error_rate, fault, .. } — the seed field is ignored. The actual RNG state lives in self.rng_state: Mutex<u64>.
Either:
- Remove it from the enum variant (the constructor parameter is sufficient), or
- Keep it and add a
reset()method that re-initializesrng_statefrom the stored seed, which would be useful for test reproducibility.
I'd lean toward option (b) since reset() is a natural companion to a seeded PRNG, but option (a) is fine if you don't want the extra API surface.
4. Style: pub mod fault_injection; splits the std::sync import block
In mod.rs, the module declaration lands between two std::sync imports:
use std::sync::Arc;
use std::sync::Mutex;
pub mod fault_injection; // <-- splits the block
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};Per project convention, module declarations (pub mod ...) should be grouped at the top with the other pub mod declarations (next to pub mod credentials;). This also avoids cargo fmt fighting with manual placement.
5. Nice-to-have: FaultInjector should derive Debug
FaultType, FaultAction, and FaultMode all derive Debug, but FaultInjector itself does not. This makes it harder to inspect in test failure output and log messages. AtomicU32 and Mutex<u64> both implement Debug, so #[derive(Debug)] should work out of the box.
6. Nice-to-have: empty sequence edge case test
SequenceOnce with an empty action list silently succeeds on every call (correct behavior), and SequenceLoop with an empty list does the same. A quick test documenting this contract would be useful:
#[test]
fn empty_sequence_always_succeeds() {
let injector = FaultInjector::sequence([]);
assert!(matches!(injector.next_action(), FaultAction::Succeed));
}Overall this is a well-designed addition. The type mapping from FaultType to LlmError is thorough (7 of 10 variants, covering all the ones relevant to retry/failover testing), the thread-safety model is correct, and the builder API on StubLlm is backward-compatible. Items 1 and 2 should be fixed before merge; the rest are improvements.
|
Thanks for the thorough review, @ilblackdragon! Seed=0 fixed point (bug) — Good catch. The xorshift64 zero-state issue is a real bug; StubLlm integration test — Agreed, we need coverage for Dead Import ordering — Will fix. Nice-to-haves — Will add @gemini's suggestions — Agree on removing the redundant Will push a follow-up commit addressing all of the above. |
|
All 6 review items addressed:
7 fault_injection tests + 29 testing::tests all pass, zero clippy warnings. |
f805b55 to
41e62af
Compare
|
@ilblackdragon -- All 6 review items addressed and rebased onto latest staging to fix the CI regression test check failure. Summary:
CI fix: Rebased onto latest staging (63a2355) to resolve the regression test enforcement failure. All 38 testing module tests + 7 fault_injection tests pass, zero clippy warnings. |
41e62af to
5c3c890
Compare
- Store seed in FaultMode::Random so reset() can re-init the RNG - Add reset() method for test reproducibility (re-seeds RNG, zeros counter) - Strengthen seed=0 regression test to 100 iterations with stricter assertion - Add reset_restores_random_rng_from_stored_seed test - Debug impl and empty_sequence test were already present from prior commit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review feedback addressedAll items from @ilblackdragon's review have been resolved:
All checks pass: fmt clean, clippy zero warnings, all tests green. |
Adds a configurable fault injection framework for testing retry, failover, and circuit breaker behavior. The FaultInjector attaches to StubLlm and provides per-call control over failure type, timing, and sequencing. Components: - FaultType: maps to LlmError variants (RequestFailed, RateLimited, AuthFailed, InvalidResponse, IoError, ContextLengthExceeded, SessionExpired) - FaultAction: Succeed, Fail(FaultType), Delay(Duration) - FaultMode: SequenceOnce (play then succeed), SequenceLoop (repeat forever), Random (seeded xorshift64 PRNG for reproducibility) - FaultInjector: thread-safe (AtomicU32 counter + Mutex RNG) Integration: - StubLlm gains optional fault_injector field via with_fault_injector() - When set, takes precedence over should_fail/error_kind - Backward compatible: existing StubLlm usage unchanged Closes #1220 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove redundant .abs() in random fault comparison - Extract check_faults() helper to DRY up StubLlm methods - Guard xorshift seed=0 (fixed point) by mapping to 1 - Add StubLlm integration test (stub_llm_fault_injector_sequence) - Remove dead seed field from FaultMode::Random - Move pub mod fault_injection to top of mod.rs - Add Debug impl for FaultInjector - Add empty_sequence_always_succeeds test - Add random_seed_zero_does_not_always_fail test
- Store seed in FaultMode::Random so reset() can re-init the RNG - Add reset() method for test reproducibility (re-seeds RNG, zeros counter) - Strengthen seed=0 regression test to 100 iterations with stricter assertion - Add reset_restores_random_rng_from_stored_seed test - Debug impl and empty_sequence test were already present from prior commit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f3b3ed0 to
b006772
Compare
Summary
Adds a configurable fault injection framework for testing retry, failover, and circuit breaker behavior (#1220).
The existing
StubLlmonly supports a booleanshould_failtoggle with two error kinds. This is too coarse to test the retry/failover/circuit-breaker stack, which needs per-call control over failure type, timing, and sequencing.New types in
src/testing/fault_injection.rs:FaultType-- maps 1:1 toLlmErrorvariants:RequestFailed,RateLimited,AuthFailed,InvalidResponse,IoError,ContextLengthExceeded,SessionExpiredFaultAction--Succeed,Fail(FaultType),Delay(Duration)FaultMode--SequenceOnce(play then succeed),SequenceLoop(repeat forever),Random { error_rate, fault, seed }(deterministic via xorshift64)FaultInjector-- thread-safe (AtomicU32+Mutex<u64>for RNG)Integration with StubLlm:
fault_injector: Option<Arc<FaultInjector>>fieldwith_fault_injector()builder methodshould_failTest plan
sequence_once_plays_then_succeeds-- verifies sequence exhaustion + implicit succeedsequence_loop_repeats-- verifies cyclic behaviorrandom_mode_is_deterministic_with_seed-- same seed = same resultsfault_type_produces_correct_llm_errors-- all 7 fault types map correctlydelay_action_exists-- delay variant workscargo clippy --all --all-features-- zero warningsCloses #1220