Track upstream transformers async-load materialization crash workaround #205

@speediedan

Summary

We currently carry a Windows-only workaround in src/interpretune/adapters/transformer_lens.py that sets HF_DEACTIVATE_ASYNC_LOAD=1 during Hugging Face model loading.

The workaround exists because transformers v5's asynchronous, thread-based tensor materialization appears to trigger an upstream crash during from_pretrained() on our Windows CI path.

Current local workaround

On Windows, when the environment variable is not already set, BaseITLensModule.hf_configured_model_init() now does the following:

  • set HF_DEACTIVATE_ASYNC_LOAD=1
  • call model_cls.from_pretrained(...)
  • restore the previous environment afterward

Why track this

  • The workaround is intentionally narrow, but it is still a behavioral fork from upstream defaults.
  • Silently overriding an upstream loading path makes failures harder to diagnose if the env var's semantics change.
  • We should reduce it to a documented compatibility shim, not a permanent hidden behavior.
  • Once upstream behavior is fixed or clarified, we should remove the workaround and add a regression test covering the original failure mode.

Follow-up

  • Capture the exact upstream exception/stack trace from the Windows CI failure if it is not already archived.
  • Open or link the upstream transformers issue once we have a minimized repro.
  • Remove the workaround after verifying the upstream fix in CI.
