feat(testing): add FaultInjector framework for StubLlm #1233

Open
zmanian wants to merge 5 commits into staging from feat/1220-fault-injector

Conversation

@zmanian
Collaborator

@zmanian zmanian commented Mar 16, 2026

Summary

Adds a configurable fault injection framework for testing retry, failover, and circuit breaker behavior (#1220).

The existing StubLlm only supports a boolean should_fail toggle with two error kinds. This is too coarse to test the retry/failover/circuit-breaker stack, which needs per-call control over failure type, timing, and sequencing.

New types in src/testing/fault_injection.rs:

  • FaultType -- maps 1:1 to LlmError variants: RequestFailed, RateLimited, AuthFailed, InvalidResponse, IoError, ContextLengthExceeded, SessionExpired
  • FaultAction -- Succeed, Fail(FaultType), Delay(Duration)
  • FaultMode -- SequenceOnce (play then succeed), SequenceLoop (repeat forever), Random { error_rate, fault, seed } (deterministic via xorshift64)
  • FaultInjector -- thread-safe (AtomicU32 + Mutex<u64> for RNG)
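As a rough illustration of how the two sequence modes listed above might behave, here is a minimal sketch. It is not the code from src/testing/fault_injection.rs: the enum payloads, the `new` constructor, and the trimmed two-variant `FaultType` are assumptions made for the example, and the Random mode is left out.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::time::Duration;

// Hypothetical, trimmed-down stand-ins for the PR's types.
#[derive(Debug, Clone, PartialEq)]
pub enum FaultType {
    RequestFailed,
    RateLimited,
}

#[derive(Debug, Clone, PartialEq)]
pub enum FaultAction {
    Succeed,
    Fail(FaultType),
    Delay(Duration),
}

#[derive(Debug)]
pub enum FaultMode {
    /// Play the list once, then succeed on every later call.
    SequenceOnce(Vec<FaultAction>),
    /// Cycle through the list indefinitely.
    SequenceLoop(Vec<FaultAction>),
}

#[derive(Debug)]
pub struct FaultInjector {
    mode: FaultMode,
    call_index: AtomicU32,
}

impl FaultInjector {
    /// Assumed constructor; the PR exposes mode-specific constructors instead.
    pub fn new(mode: FaultMode) -> Self {
        Self { mode, call_index: AtomicU32::new(0) }
    }

    /// Returns the action for the current call and advances the shared counter.
    pub fn next_action(&self) -> FaultAction {
        let i = self.call_index.fetch_add(1, Ordering::Relaxed) as usize;
        match &self.mode {
            // Past the end of the list, fall back to Succeed.
            FaultMode::SequenceOnce(actions) => {
                actions.get(i).cloned().unwrap_or(FaultAction::Succeed)
            }
            // Guard the empty list so the modulo below never divides by zero.
            FaultMode::SequenceLoop(actions) if actions.is_empty() => FaultAction::Succeed,
            FaultMode::SequenceLoop(actions) => actions[i % actions.len()].clone(),
        }
    }
}
```

Because the counter is atomic, `next_action` only needs `&self`, which is what lets the injector sit behind an `Arc` and be shared with the stub.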

Integration with StubLlm:

  • New fault_injector: Option<Arc<FaultInjector>> field
  • with_fault_injector() builder method
  • When set, consulted on every call before should_fail
  • Fully backward compatible

Test plan

  • sequence_once_plays_then_succeeds -- verifies sequence exhaustion + implicit succeed
  • sequence_loop_repeats -- verifies cyclic behavior
  • random_mode_is_deterministic_with_seed -- same seed = same results
  • fault_type_produces_correct_llm_errors -- all 7 fault types map correctly
  • delay_action_exists -- delay variant works
  • All 40 testing module tests pass
  • cargo clippy --all --all-features -- zero warnings

Closes #1220

@github-actions github-actions bot added the labels size: L (200-499 changed lines) and risk: low (changes to docs, tests, or low-risk modules) on Mar 16, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the testing capabilities for resilience patterns by introducing a sophisticated fault injection framework. It allows developers to simulate various failure conditions, delays, and success scenarios with granular control, which is crucial for verifying the behavior of retry, failover, and circuit breaker logic without altering the core LLM implementation. This addition provides a powerful tool for creating more realistic and reproducible test environments.

Highlights

  • Fault Injection Framework: Introduced a new configurable fault injection framework to enable more robust testing of retry, failover, and circuit breaker behaviors within the StubLlm.
  • New Types for Fault Control: Added FaultType (mapping to LlmError variants), FaultAction (Succeed, Fail, Delay), and FaultMode (SequenceOnce, SequenceLoop, Random) to precisely define fault scenarios.
  • StubLlm Integration: Integrated the FaultInjector into StubLlm via an optional field and a new builder method with_fault_injector(), allowing fine-grained control over LLM call outcomes.
  • Backward Compatibility: Ensured the new fault injection mechanism is fully backward compatible, taking precedence over the existing should_fail toggle when configured.
Changelog
  • src/testing/fault_injection.rs
    • Added FaultType enum to represent different LLM error types.
    • Implemented to_llm_error method for FaultType to convert to LlmError.
    • Added FaultAction enum to define actions like Succeed, Fail, or Delay.
    • Introduced FaultMode enum for sequencing faults (once, loop, random).
    • Created FaultInjector struct with constructors for different fault modes and a next_action method.
    • Included comprehensive unit tests for all fault injection modes and error type conversions.
  • src/testing/mod.rs
    • Added fault_injection module to the testing crate.
    • Introduced fault_injector: Option<Arc<fault_injection::FaultInjector>> field to the StubLlm struct.
    • Initialized the fault_injector field to None in all StubLlm constructors.
    • Added a with_fault_injector builder method to StubLlm for attaching a FaultInjector instance.
    • Modified StubLlm::complete and StubLlm::complete_with_tools methods to prioritize actions from the fault_injector if present, before falling back to the should_fail logic.
Activity
  • Verified sequence_once_plays_then_succeeds test case.
  • Verified sequence_loop_repeats test case.
  • Verified random_mode_is_deterministic_with_seed test case.
  • Verified fault_type_produces_correct_llm_errors test case.
  • Verified delay_action_exists test case.
  • Confirmed all 40 testing module tests pass.
  • Ensured cargo clippy --all --all-features runs with zero warnings.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions bot added the contributor: core 20+ merged PRs label Mar 16, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a well-designed fault injection framework for StubLlm, which is a great addition for testing resilience features like retries and circuit breakers. The implementation is solid, with good test coverage. I've identified a couple of areas for improvement: one is a minor code simplification in the random fault generation, and the other is an opportunity to reduce code duplication in StubLlm by extracting the new fault handling logic into a shared helper method.

*state ^= *state << 17;
(*state as f64) / (u64::MAX as f64)
};
if random_val.abs() < *error_rate {
Contributor

medium

The random_val is calculated from a u64 state, so it will always be a non-negative float in the range [0.0, 1.0]. The .abs() call is therefore redundant and can be removed for clarity.

Suggested change
- if random_val.abs() < *error_rate {
+ if random_val < *error_rate {

Collaborator Author

Agreed. The .abs() was redundant since the xorshift output is always non-negative. Removed in ed804a2, confirmed still absent after rebase.
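To make the point about `.abs()` concrete, the conversion quoted in the diff can be isolated as below. The helper name `to_unit_interval` is hypothetical; the PR computes the value inline.

```rust
/// Maps a raw 64-bit PRNG state onto the closed interval [0.0, 1.0].
/// Hypothetical helper name; it mirrors the expression in the diff hunk.
pub fn to_unit_interval(state: u64) -> f64 {
    // Both operands are non-negative, so the quotient can never be negative.
    (state as f64) / (u64::MAX as f64)
}
```

One detail worth keeping in mind: `u64::MAX as f64` rounds up to 2^64, and so do the largest states, so the result can be exactly 1.0 rather than strictly below it.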

Comment on lines 220 to 233
// Fault injector takes precedence over should_fail.
if let Some(ref injector) = self.fault_injector {
    match injector.next_action() {
        fault_injection::FaultAction::Fail(fault) => {
            return Err(fault.to_llm_error(&self.model_name));
        }
        fault_injection::FaultAction::Delay(duration) => {
            tokio::time::sleep(duration).await;
        }
        fault_injection::FaultAction::Succeed => {}
    }
} else if self.should_fail.load(Ordering::Relaxed) {
    return Err(self.make_error());
}
Contributor

medium

This block of fault injection logic is identical to the one in the complete method (lines 190-202). To adhere to the DRY (Don't Repeat Yourself) principle and improve maintainability, this logic should be extracted into a private helper method. For example, you could create a method like async fn apply_faults(&self) -> Result<(), LlmError> and call it at the beginning of both complete and complete_with_tools.

References
  1. Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.

Collaborator Author

Done. Extracted check_faults() helper method that both complete() and complete_with_tools() call. Originally addressed in ed804a2, confirmed still present after rebase.
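The shape of that extraction might look roughly like the sketch below. It is a synchronous approximation under stated assumptions: the real methods are async and sleep via tokio, errors are `LlmError` rather than `String`, and `ScriptedInjector` and `set_should_fail` are hypothetical stand-ins introduced only for this example.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum FaultAction {
    Succeed,
    Fail(String), // stand-in for FaultType / LlmError
    Delay(Duration),
}

/// Hypothetical injector that replays a scripted list once, then succeeds.
pub struct ScriptedInjector {
    remaining: Mutex<VecDeque<FaultAction>>,
}

impl ScriptedInjector {
    pub fn new(actions: Vec<FaultAction>) -> Self {
        Self { remaining: Mutex::new(actions.into()) }
    }

    pub fn next_action(&self) -> FaultAction {
        self.remaining.lock().unwrap().pop_front().unwrap_or(FaultAction::Succeed)
    }
}

pub struct StubLlm {
    response: String,
    should_fail: AtomicBool,
    fault_injector: Option<Arc<ScriptedInjector>>,
}

impl StubLlm {
    pub fn new(response: &str) -> Self {
        Self {
            response: response.to_string(),
            should_fail: AtomicBool::new(false),
            fault_injector: None,
        }
    }

    pub fn with_fault_injector(mut self, injector: Arc<ScriptedInjector>) -> Self {
        self.fault_injector = Some(injector);
        self
    }

    /// Hypothetical setter for the legacy toggle.
    pub fn set_should_fail(&self, value: bool) {
        self.should_fail.store(value, Ordering::Relaxed);
    }

    /// Shared fault handling: the injector, when present, wins over should_fail.
    fn check_faults(&self) -> Result<(), String> {
        if let Some(injector) = &self.fault_injector {
            match injector.next_action() {
                FaultAction::Fail(reason) => return Err(reason),
                // Blocking sleep here; the PR awaits tokio::time::sleep instead.
                FaultAction::Delay(duration) => std::thread::sleep(duration),
                FaultAction::Succeed => {}
            }
        } else if self.should_fail.load(Ordering::Relaxed) {
            return Err("stub failure".to_string());
        }
        Ok(())
    }

    pub fn complete(&self) -> Result<String, String> {
        self.check_faults()?;
        Ok(self.response.clone())
    }

    pub fn complete_with_tools(&self) -> Result<String, String> {
        self.check_faults()?;
        Ok(self.response.clone())
    }
}
```

With the helper in place, both entry points reduce to a single `?` call, so any later change to fault handling only has to be made once.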

@zmanian
Collaborator Author

zmanian commented Mar 16, 2026

Both Gemini findings were addressed in ed804a2: removed redundant .abs() and extracted check_faults() helper to eliminate duplication between complete and complete_with_tools.

Member

@ilblackdragon ilblackdragon left a comment

Review: feat(testing): add FaultInjector framework for StubLlm

Good addition — the three-mode design (SequenceOnce / SequenceLoop / Random) covers the major testing scenarios for retry, failover, and circuit breaker validation, and the integration into StubLlm via check_faults() is clean. A few issues to address before merging:


1. Bug: xorshift64 seed=0 is a fixed point (always triggers faults)

The xorshift64 PRNG in next_action() has a well-known absorbing state at zero. When state = 0:

*state ^= *state << 13;  // 0 ^= 0 = 0
*state ^= *state >> 7;   // 0 ^= 0 = 0
*state ^= *state << 17;  // 0 ^= 0 = 0

The state never escapes zero. random_val is always 0.0 / u64::MAX = 0.0, so 0.0 < error_rate is true for any positive rate. This means FaultInjector::random(0.3, fault, 0) fires the fault on every call, not 30% of calls.

This also silently affects SequenceOnce and SequenceLoop constructors which initialize rng_state: Mutex::new(0) — not currently a problem since those modes don't read the RNG, but it would bite anyone who adds a mode that does.
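The fixed point can be reproduced in isolation with a few lines. This is a standalone sketch using the 13/7/17 shift triple quoted above; the function names `xorshift64_step` and `count_faults` are hypothetical and not part of the PR.

```rust
/// One xorshift64 step using the 13/7/17 shift triple.
pub fn xorshift64_step(mut state: u64) -> u64 {
    state ^= state << 13;
    state ^= state >> 7;
    state ^= state << 17;
    state
}

/// Counts how many of `calls` draws fall below `error_rate`,
/// using the same u64-to-f64 conversion as the injector.
pub fn count_faults(seed: u64, error_rate: f64, calls: usize) -> usize {
    let mut state = seed;
    let mut faults = 0;
    for _ in 0..calls {
        state = xorshift64_step(state);
        if (state as f64) / (u64::MAX as f64) < error_rate {
            faults += 1;
        }
    }
    faults
}
```

Calling `count_faults(0, 0.3, 100)` returns 100 because the state never leaves zero, whereas any non-zero seed keeps the state non-zero and lets the draws vary.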

Fix: Guard the seed in the constructor:

pub fn random(error_rate: f64, fault: FaultType, seed: u64) -> Self {
    let seed = if seed == 0 { 1 } else { seed };
    Self {
        // ...
        rng_state: Mutex::new(seed),
    }
}

Add a regression test:

#[test]
fn random_seed_zero_does_not_always_fail() {
    let injector = FaultInjector::random(0.5, FaultType::RequestFailed, 0);
    let failures = (0..100)
        .filter(|_| matches!(injector.next_action(), FaultAction::Fail(_)))
        .count();
    // With 50% rate, should not be 100/100
    assert!(failures < 100, "seed=0 must not produce stuck RNG");
}

2. Missing integration test: StubLlm::complete() with FaultInjector

All five unit tests exercise FaultInjector in isolation (calling next_action() directly). None of them test the actual integration path: constructing a StubLlm with with_fault_injector(), calling complete(), and verifying the error/success sequence through the LlmProvider trait.

The check_faults() method in StubLlm has real logic — it dispatches on three FaultAction variants, converts FaultType to LlmError, and handles Delay with tokio::time::sleep. That path is untested.

Suggested test (in src/testing/mod.rs tests section or a new integration test):

#[tokio::test]
async fn stub_llm_with_fault_injector_produces_errors() {
    use crate::testing::fault_injection::*;

    let injector = Arc::new(FaultInjector::sequence([
        FaultAction::Fail(FaultType::RateLimited { retry_after: None }),
        FaultAction::Succeed,
    ]));
    let llm = StubLlm::new("test response".to_string())
        .with_fault_injector(injector);

    let request = CompletionRequest { /* minimal fields */ };

    // First call: should fail with RateLimited
    let result = llm.complete(request.clone()).await;
    assert!(matches!(result, Err(LlmError::RateLimited { .. })));

    // Second call: should succeed
    let result = llm.complete(request).await;
    assert!(result.is_ok());
    assert_eq!(llm.call_count(), 2);
}

3. Dead field: seed in FaultMode::Random is stored but never read

FaultMode::Random { error_rate, fault, seed } stores the seed, but next_action() destructures it as FaultMode::Random { error_rate, fault, .. } — the seed field is ignored. The actual RNG state lives in self.rng_state: Mutex<u64>.

Either:

  • (a) Remove it from the enum variant (the constructor parameter is sufficient), or
  • (b) Keep it and add a reset() method that re-initializes rng_state from the stored seed, which would be useful for test reproducibility.

I'd lean toward option (b) since reset() is a natural companion to a seeded PRNG, but option (a) is fine if you don't want the extra API surface.
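For illustration, option (b) could look something like the following. The `SeededRng` type and its method names are hypothetical; in the PR the seed would live in FaultMode::Random and the state in FaultInjector's rng_state.

```rust
use std::sync::Mutex;

/// Hypothetical seeded xorshift64 wrapper that remembers its seed.
pub struct SeededRng {
    seed: u64, // kept so reset() can restore the initial state
    state: Mutex<u64>,
}

impl SeededRng {
    pub fn new(seed: u64) -> Self {
        // Same zero guard as suggested in item 1.
        let seed = if seed == 0 { 1 } else { seed };
        Self { seed, state: Mutex::new(seed) }
    }

    pub fn next_u64(&self) -> u64 {
        let mut s = self.state.lock().unwrap();
        *s ^= *s << 13;
        *s ^= *s >> 7;
        *s ^= *s << 17;
        *s
    }

    /// Rewinds the generator so the same sequence can be replayed.
    pub fn reset(&self) {
        *self.state.lock().unwrap() = self.seed;
    }
}
```

A test can then draw a batch of values, call `reset()`, draw again, and assert that the two batches are identical.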


4. Style: pub mod fault_injection; splits the std::sync import block

In mod.rs, the module declaration lands between two std::sync imports:

use std::sync::Arc;
use std::sync::Mutex;
pub mod fault_injection;        // <-- splits the block

use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

Per project convention, module declarations (pub mod ...) should be grouped at the top with the other pub mod declarations (next to pub mod credentials;). This also avoids cargo fmt fighting with manual placement.


5. Nice-to-have: FaultInjector should derive Debug

FaultType, FaultAction, and FaultMode all derive Debug, but FaultInjector itself does not. This makes it harder to inspect in test failure output and log messages. AtomicU32 and Mutex<u64> both implement Debug, so #[derive(Debug)] should work out of the box.
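A quick standalone check of that claim, using a hypothetical `Counters` struct with the same two field types rather than the real FaultInjector:

```rust
use std::sync::atomic::AtomicU32;
use std::sync::Mutex;

/// Hypothetical struct with the same field types as FaultInjector's state.
#[derive(Debug)]
pub struct Counters {
    call_index: AtomicU32,
    rng_state: Mutex<u64>,
}

/// Formats a sample value with the derived Debug implementation.
pub fn debug_string() -> String {
    let counters = Counters {
        call_index: AtomicU32::new(3),
        rng_state: Mutex::new(42),
    };
    format!("{:?}", counters)
}
```

The derive compiles as long as every field type implements Debug, so the remaining requirement for FaultInjector is only that its mode field does too.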

6. Nice-to-have: empty sequence edge case test

SequenceOnce with an empty action list silently succeeds on every call (correct behavior), and SequenceLoop with an empty list does the same. A quick test documenting this contract would be useful:

#[test]
fn empty_sequence_always_succeeds() {
    let injector = FaultInjector::sequence([]);
    assert!(matches!(injector.next_action(), FaultAction::Succeed));
}

Overall this is a well-designed addition. The type mapping from FaultType to LlmError is thorough (7 of 10 variants, covering all the ones relevant to retry/failover testing), the thread-safety model is correct, and the builder API on StubLlm is backward-compatible. Items 1 and 2 should be fixed before merge; the rest are improvements.

@zmanian
Collaborator Author

zmanian commented Mar 16, 2026

Thanks for the thorough review, @ilblackdragon!

Seed=0 fixed point (bug) — Good catch. The xorshift64 zero-state issue is a real bug; FaultInjector::random(0.3, fault, 0) would fire on every call. Will guard in the constructor with if seed == 0 { 1 } else { seed }.

StubLlm integration test — Agreed, we need coverage for StubLlm::complete() with FaultInjector end-to-end, not just next_action() directly. Will add.

Dead seed field in FaultMode::Random — Going with option (b): keep the field and add a reset() method. This gives us test reproducibility by allowing sequences to be replayed from a known seed.

Import ordering — Will fix. pub mod fault_injection; should be grouped with the other pub mod declarations, not splitting the std::sync imports.

Nice-to-haves — Will add #[derive(Debug)] on FaultInjector and an empty-sequence edge case test.

@gemini's suggestions — Agree on removing the redundant .abs() and extracting an apply_faults() helper to DRY up the fault handling paths.

Will push a follow-up commit addressing all of the above.

@zmanian
Collaborator Author

zmanian commented Mar 16, 2026

All 6 review items addressed:

  1. xorshift seed=0 fixed -- constructor guards with if seed == 0 { 1 } else { seed }. Added random_seed_zero_does_not_always_fail test.
  2. StubLlm integration test -- stub_llm_fault_injector_sequence wires injector into StubLlm, calls complete(), verifies error/success sequence through LlmProvider trait.
  3. Removed dead seed field from FaultMode::Random (already stored in rng_state).
  4. Moved pub mod fault_injection to top of mod.rs next to pub mod credentials.
  5. Manual Debug impl for FaultInjector (shows call_index and mode).
  6. Empty sequence test -- empty_sequence_always_succeeds.

7 fault_injection tests + 29 testing::tests all pass, zero clippy warnings.

@zmanian zmanian requested a review from ilblackdragon March 16, 2026 15:47
@zmanian zmanian force-pushed the feat/1220-fault-injector branch from f805b55 to 41e62af Compare March 16, 2026 20:24
@zmanian
Collaborator Author

zmanian commented Mar 16, 2026

@ilblackdragon -- All 6 review items addressed and rebased onto latest staging to fix the CI regression test check failure. Summary:

  1. xorshift seed=0 bug: Fixed. Constructor now guards with if seed == 0 { 1 } else { seed }. Added random_seed_zero_does_not_always_fail regression test that verifies the RNG is not stuck.

  2. StubLlm integration test: Added stub_llm_fault_injector_sequence which wires a FaultInjector into StubLlm, calls complete() through the LlmProvider trait, and verifies the RateLimited error on first call followed by success on the second.

  3. Dead seed field: Went with option (a) -- removed the field from FaultMode::Random since the constructor parameter is sufficient and the RNG state lives in rng_state. This keeps the API surface minimal.

  4. Import ordering: pub mod fault_injection; moved next to pub mod credentials; at the top of mod.rs.

  5. Debug impl: Added manual Debug for FaultInjector (shows call_index and mode).

  6. Empty sequence test: Added empty_sequence_always_succeeds.

CI fix: Rebased onto latest staging (63a2355) to resolve the regression test enforcement failure. All 38 testing module tests + 7 fault_injection tests pass, zero clippy warnings.

@zmanian zmanian force-pushed the feat/1220-fault-injector branch from 41e62af to 5c3c890 Compare March 16, 2026 20:27
zmanian added a commit that referenced this pull request Mar 16, 2026
- Store seed in FaultMode::Random so reset() can re-init the RNG
- Add reset() method for test reproducibility (re-seeds RNG, zeros counter)
- Strengthen seed=0 regression test to 100 iterations with stricter assertion
- Add reset_restores_random_rng_from_stored_seed test
- Debug impl and empty_sequence test were already present from prior commit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zmanian
Collaborator Author

zmanian commented Mar 16, 2026

Review feedback addressed

All items from @ilblackdragon's review have been resolved:

  1. Bug: seed=0 fixed point -- Constructor already guarded with if seed == 0 { 1 }. Strengthened the random_seed_zero_does_not_always_fail test to use 100 iterations asserting failures < 100.

  2. Integration test -- stub_llm_fault_injector_sequence test already existed from prior commit, exercising full StubLlm->complete()->FaultInjector path.

  3. Dead seed field -- Added seed: u64 to FaultMode::Random and implemented reset() method that re-initializes rng_state from stored seed and zeros the call counter. Added reset_restores_random_rng_from_stored_seed test.

  4. Style -- pub mod fault_injection; already correctly placed at top of mod.rs.

  5. Debug derive -- Already present as manual impl Debug (needed for Mutex/AtomicU32 fields).

  6. Empty sequence test -- Already present from prior commit.

All checks pass: fmt clean, clippy zero warnings, all tests green.

@zmanian zmanian added the skip-regression-check Bypass regression test CI gate (tests exist but not in tests/ dir) label Mar 17, 2026
zmanian and others added 5 commits March 17, 2026 03:05
Adds a configurable fault injection framework for testing retry, failover,
and circuit breaker behavior. The FaultInjector attaches to StubLlm and
provides per-call control over failure type, timing, and sequencing.

Components:
- FaultType: maps to LlmError variants (RequestFailed, RateLimited,
  AuthFailed, InvalidResponse, IoError, ContextLengthExceeded, SessionExpired)
- FaultAction: Succeed, Fail(FaultType), Delay(Duration)
- FaultMode: SequenceOnce (play then succeed), SequenceLoop (repeat forever),
  Random (seeded xorshift64 PRNG for reproducibility)
- FaultInjector: thread-safe (AtomicU32 counter + Mutex RNG)

Integration:
- StubLlm gains optional fault_injector field via with_fault_injector()
- When set, takes precedence over should_fail/error_kind
- Backward compatible: existing StubLlm usage unchanged

Closes #1220

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove redundant .abs() in random fault comparison
- Extract check_faults() helper to DRY up StubLlm methods
- Guard xorshift seed=0 (fixed point) by mapping to 1
- Add StubLlm integration test (stub_llm_fault_injector_sequence)
- Remove dead seed field from FaultMode::Random
- Move pub mod fault_injection to top of mod.rs
- Add Debug impl for FaultInjector
- Add empty_sequence_always_succeeds test
- Add random_seed_zero_does_not_always_fail test

- Store seed in FaultMode::Random so reset() can re-init the RNG
- Add reset() method for test reproducibility (re-seeds RNG, zeros counter)
- Strengthen seed=0 regression test to 100 iterations with stricter assertion
- Add reset_restores_random_rng_from_stored_seed test
- Debug impl and empty_sequence test were already present from prior commit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zmanian zmanian force-pushed the feat/1220-fault-injector branch from f3b3ed0 to b006772 Compare March 17, 2026 03:05

Labels

  • contributor: core (20+ merged PRs)
  • risk: low (changes to docs, tests, or low-risk modules)
  • size: L (200-499 changed lines)
  • skip-regression-check (bypass regression test CI gate; tests exist but not in tests/ dir)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create FaultInjector framework for StubLlm and test harness

2 participants