Fail fast on batched feature intervention inputs #89

Open
Sohailm25 wants to merge 1 commit into decoderesearch:main from Sohailm25:codex/validate-single-sequence-interventions
Summary

  • add explicit single-sequence validation for setup_attribution, setup_intervention_with_freeze, feature_intervention, and feature_intervention_generate in both ReplacementModel backends
  • raise clear ValueError messages for batched tensor inputs and batched prompt-list inputs instead of failing later inside the intervention code
  • add CPU regression tests covering both batched tensors and batched string lists on a dummy ReplacementModel
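A minimal sketch of what such a single-sequence check could look like. The helper name `ensure_single_sequence`, the `FakeTokens` stand-in, the error wording, and the choice to accept a length-1 batch are all assumptions made for illustration; the actual contents of `_validation.py` are not shown in this description and may differ.

```python
from typing import Any


class FakeTokens:
    """Hypothetical tensor stand-in exposing only `.shape` (keeps the sketch dependency-free)."""

    def __init__(self, *shape: int) -> None:
        self.shape = shape


def ensure_single_sequence(inputs: Any, caller: str) -> None:
    """Raise ValueError unless `inputs` describes exactly one sequence.

    Assumed rules: a bare string is one prompt, a list or tuple is a prompt
    list and must have length 1, and a tensor-like object must have shape
    [seq] or [1, seq].
    """
    if isinstance(inputs, str):
        return
    if isinstance(inputs, (list, tuple)):
        if len(inputs) != 1:
            raise ValueError(
                f"{caller} expects a single prompt, got a list of {len(inputs)} prompts"
            )
        return
    shape = getattr(inputs, "shape", None)
    if shape is None:
        raise TypeError(f"{caller} got unsupported input type {type(inputs).__name__}")
    shape = tuple(shape)
    if len(shape) == 1 or (len(shape) == 2 and shape[0] == 1):
        return
    raise ValueError(
        f"{caller} expects a single sequence of shape [seq] or [1, seq], got shape {shape}"
    )


# Single-sequence inputs pass silently.
ensure_single_sequence("The capital of France is", "feature_intervention")
ensure_single_sequence(FakeTokens(1, 7), "feature_intervention")

# Batched inputs are rejected with a message that names the entry point.
try:
    ensure_single_sequence(FakeTokens(2, 7), "feature_intervention")
except ValueError as err:
    batched_tensor_message = str(err)
```

Passing the caller name into the helper lets one check serve all four entry points while keeping the error message specific to the method the user actually called.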

Motivation

While following up on #67, I found that the intervention paths are still fundamentally single-sequence internally, but batched inputs are not rejected cleanly. For example, batched tensor input to feature_intervention(...) currently falls through into the delta code and raises a cryptic internal error like ValueError: too many values to unpack (expected 2).
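To illustrate how a message of this kind can arise, the sketch below unpacks a shape into two names, which fails as soon as an extra batch dimension is present. The function `unpack_delta_shape` and the shapes used are hypothetical; the real delta code is not quoted in this description, so this only shows the general pattern, not the library's implementation.

```python
def unpack_delta_shape(shape):
    # Hypothetical stand-in for internal code that assumes a 2-D
    # (positions, features) layout with no batch dimension.
    n_pos, n_features = shape
    return n_pos, n_features


# A single-sequence shape unpacks cleanly.
single = unpack_delta_shape((5, 8))

# A batched shape has three dimensions, so the two-name unpacking fails
# with a message that says nothing about batching.
try:
    unpack_delta_shape((2, 5, 8))
except ValueError as err:
    message = str(err)
```

The resulting message starts with "too many values to unpack"; the exact suffix can vary between Python versions. Nothing in it points the user toward the real cause, which is the batch dimension.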

This PR does not claim to add batch intervention support. Instead, it makes the current contract explicit and fails fast with a clear message, which is safer for users and easier to debug.

Testing

  • python -m ruff check circuit_tracer/replacement_model/_validation.py circuit_tracer/replacement_model/replacement_model_transformerlens.py circuit_tracer/replacement_model/replacement_model_nnsight.py tests/test_replacement_model_input_validation.py
  • python -m pyright circuit_tracer/replacement_model/_validation.py tests/test_replacement_model_input_validation.py
  • python -m pytest -q tests/test_replacement_model_input_validation.py tests/test_attributions_llama.py::test_small_llama_model tests/test_attributions_llama.py::test_large_llama_model
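For readers who want a sense of what a CPU-only regression test of this kind might check, here is a small sketch using the standard library's `unittest`. `DummyReplacementModel`, its `reached_internals` flag, the simplified `feature_intervention(inputs, interventions)` signature, and the error wording are all assumptions for illustration; they are not the tests in `tests/test_replacement_model_input_validation.py`.

```python
import unittest


class DummyReplacementModel:
    """Hypothetical stand-in; not the real ReplacementModel."""

    def __init__(self):
        self.reached_internals = False

    def feature_intervention(self, inputs, interventions):
        # Validate first so nothing downstream runs on batched input.
        if isinstance(inputs, (list, tuple)) and len(inputs) != 1:
            raise ValueError(
                f"feature_intervention expects a single prompt, got {len(inputs)}"
            )
        # Placeholder for the real intervention work.
        self.reached_internals = True
        return inputs


class BatchedInputTests(unittest.TestCase):
    def test_batched_prompt_list_is_rejected_early(self):
        model = DummyReplacementModel()
        with self.assertRaisesRegex(ValueError, "single prompt"):
            model.feature_intervention(["a", "b"], interventions=[])
        # Fail fast: the internal code path was never entered.
        self.assertFalse(model.reached_internals)

    def test_single_prompt_passes(self):
        model = DummyReplacementModel()
        model.feature_intervention("a", interventions=[])
        self.assertTrue(model.reached_internals)
```

Asserting that the internal path was never reached, in addition to checking the exception type and message, is what distinguishes a fail-fast check from an error that merely happens to surface later.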

Related to #67
