Describe the bug
The te_llama.py example in docs/examples/te_llama/ fails with an AttributeError when used with recent versions of HuggingFace transformers (tested with 4.57.3). TELlamaDecoderLayer.forward() receives hidden_states as a tuple instead of a tensor, so the call fails when TransformerLayer.forward() tries to call .contiguous() on it. This appears to be a compatibility issue between the example code and newer versions of the transformers library.
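For reference, the wrapper that the traceback points at looks roughly like this (paraphrased from te_llama.py; see the file for the exact code). Note that it returns its output wrapped in a one-element tuple, which older transformers releases unpacked via layer_outputs[0]:

```python
# Paraphrased sketch of the wrapper in docs/examples/te_llama/te_llama.py;
# details may differ slightly from the file in the repo.
from transformer_engine.pytorch import TransformerLayer

class TELlamaDecoderLayer(TransformerLayer):
    def forward(self, hidden_states, *args, attention_mask=None, **kwargs):
        # TransformerLayer.forward() starts by calling
        # hidden_states.contiguous(), which raises the AttributeError
        # shown below if hidden_states is a tuple rather than a tensor.
        return (super().forward(hidden_states, attention_mask=attention_mask),)
```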
Steps/Code to reproduce bug
- Start the NVIDIA PyTorch container:
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:25.08-py3
- Clone TransformerEngine and navigate to the example:
git clone https://github.com/NVIDIA/TransformerEngine.git
cd TransformerEngine/docs/examples/te_llama
- Install required dependencies:
pip install accelerate datasets peft
- Run the tutorial_accelerate_hf_llama_with_te.ipynb notebook, specifically the cells that call init_te_llama_model() (a minimal script form of these steps is sketched below).
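For convenience, here is a sketch of the same reproduction as a plain script (run from docs/examples/te_llama/). It assumes the TELlamaForCausalLM.from_pretrained_local helper defined in te_llama.py and a locally downloaded Llama checkpoint; the checkpoint path and the exact loader signature may differ in your setup:

```python
# Hypothetical script-form reproduction; adjust the weights path.
import torch
from transformers import AutoConfig, AutoTokenizer
from te_llama import TELlamaForCausalLM

weights_path = "/path/to/llama-2-7b-hf"  # assumption: local HF checkpoint
config = AutoConfig.from_pretrained(weights_path)

model = TELlamaForCausalLM.from_pretrained_local(
    weights_path, config=config, torch_dtype=torch.bfloat16
).cuda()

tokenizer = AutoTokenizer.from_pretrained(weights_path)
inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")

# With transformers 4.57.3 this forward pass raises:
# AttributeError: 'tuple' object has no attribute 'contiguous'
outputs = model(**inputs)
```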
Error message
File "/workspace/TransformerEngine/docs/examples/te_llama/te_llama.py", line 76, in forward
super().forward(
File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/transformer.py", line 700, in forward
hidden_states = hidden_states.contiguous()
^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'contiguous'
Expected behavior
The TELlamaDecoderLayer should handle the hidden_states input correctly and complete the forward pass without errors.
Environment overview
nvcr.io/nvidia/pytorch:25.08-py3
Environment details
| Component | Version |
|---|---|
| Transformer Engine | 2.5.0+f05f12c |
| transformers (HuggingFace) | 4.57.3 |
| Python | 3.12 |
Additional context
- The error suggests that hidden_states arrives as a tuple rather than a tensor in TELlamaDecoderLayer.forward().
- This may be related to changes in how HuggingFace transformers handles decoder layer outputs internally in newer versions: recent releases appear to consume a decoder layer's return value directly as the next layer's hidden_states instead of unpacking layer_outputs[0], so the one-element tuple the wrapper returns would be fed straight back in. A possible workaround is sketched below.
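If the tuple is indeed the previous layer's packed output being fed back in, one possible local workaround for transformers 4.57.x (an untested sketch, not an official fix) is to make the wrapper tolerant of both conventions: unwrap a tuple on the way in and return a bare tensor, which newer transformers releases consume directly:

```python
# Hypothetical patch to TELlamaDecoderLayer in te_llama.py (untested sketch).
from transformer_engine.pytorch import TransformerLayer

class TELlamaDecoderLayer(TransformerLayer):
    def forward(self, hidden_states, *args, attention_mask=None, **kwargs):
        # Newer transformers may pass the previous layer's tuple output
        # straight back in as hidden_states, so unwrap it first.
        if isinstance(hidden_states, tuple):
            hidden_states = hidden_states[0]
        # Return a bare tensor; recent transformers use the layer's return
        # value directly instead of indexing layer_outputs[0].
        return super().forward(hidden_states, attention_mask=attention_mask)
```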