Description
System Info
- transformers version: 4.55.4
- Platform: Linux-6.1.123+-x86_64-with-glibc2.35
- Python version: 3.12.11
- Huggingface_hub version: 0.34.4
- Safetensors version: 0.6.2
- Accelerate version: 1.10.1
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.8.0+cu126 (CUDA)
- Tensorflow version (GPU?): 2.19.0 (True)
- Flax version (CPU?/GPU?/TPU?): 0.10.6 (gpu)
- Jax version: 0.5.3
- JaxLib version: 0.5.3
- Using distributed or parallel set-up in script?: No
- Using GPU in script?: No
- GPU type: NVIDIA L4
Who can help?
@ArthurZucker @younesbelkada @amyeroberts (model loading)
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
desc/info
Original discussion: https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary/discussions/22
After several back-and-forth investigations and attempted fixes, there is an issue with loading and running inference with long-t5 models in transformers that used to work. The simplest way to characterize the issue today is that weights are not loaded correctly from .safetensors files, which results in garbage output.
repro
Run inference with any long-t5 model in safetensors format:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# This uses safetensors by default and produces incorrect behavior
model_id = "pszemraj/long-t5-tglobal-base-sci-simplify"
model = AutoModelForSeq2SeqLM.from_pretrained(
model_id,
)
tokenizer = AutoTokenizer.from_pretrained(
model_id,
)
# Test generation - model produces garbage output
text = "Summarize: The quick brown fox jumps over the lazy dog. " * 50
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(**inputs, max_length=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# thea: and to of
# Output is corrupted/nonsensical
Worth calling out that when the model is loaded, it displays a warning telling you that some weights were newly initialized instead of loaded from the checkpoint:
Some weights of LongT5ForConditionalGeneration were not initialized from the model checkpoint at test_model and are newly initialized: ['decoder.embed_tokens.weight', 'encoder.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
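To see what the safetensors checkpoint actually contains, the tensor names stored in the file can be listed directly. This is a minimal sketch on my side; it assumes the repo ships a single model.safetensors file (adjust the filename for sharded checkpoints), and the guess that LongT5 keeps its embeddings under shared.weight is mine, not confirmed here:
from huggingface_hub import hf_hub_download
from safetensors import safe_open
model_id = "pszemraj/long-t5-tglobal-base-sci-simplify"
path = hf_hub_download(model_id, "model.safetensors")
# List every tensor key stored in the file and filter for embedding-related names
with safe_open(path, framework="pt") as f:
    keys = list(f.keys())
print([k for k in keys if "embed" in k or "shared" in k])
# If only a shared embedding tensor is stored and the encoder/decoder embed_tokens
# keys are absent, the warning above would be consistent with the loader no longer
# tying/copying that tensor when loading from safetensors.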
Note: I've put replication and expected functionality in this colab notebook.
Expected behavior
The model should load correctly and produce coherent summaries. This works when explicitly loading from the PyTorch checkpoint:
model = AutoModelForSeq2SeqLM.from_pretrained(
model_id,
use_safetensors=False # Force PyTorch checkpoint (works)
)
In this case:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Forcing the PyTorch (.bin) checkpoint produces correct behavior
model_id = "pszemraj/long-t5-tglobal-base-sci-simplify"
model = AutoModelForSeq2SeqLM.from_pretrained(
model_id,
use_safetensors=False # Force PyTorch checkpoint (works)
)
tokenizer = AutoTokenizer.from_pretrained(
model_id,
)
text = "Summarize: The quick brown fox jumps over the lazy dog. " * 50
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# SUMMARZE: The Quick Brown Fox Jumps Over The Lazy Dog.
# Imperfect (model size/quality) but sensical
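To quantify the discrepancy between the two load paths, a quick sketch (assuming the repo contains both the .safetensors and .bin files) is to compare the input embedding matrices directly:
import torch
from transformers import AutoModelForSeq2SeqLM
model_id = "pszemraj/long-t5-tglobal-base-sci-simplify"
m_sft = AutoModelForSeq2SeqLM.from_pretrained(model_id)  # safetensors (default, garbage output)
m_bin = AutoModelForSeq2SeqLM.from_pretrained(model_id, use_safetensors=False)  # .bin (works)
# If the safetensors path re-initializes embed_tokens, the two should not match
print(torch.allclose(m_sft.get_input_embeddings().weight,
                     m_bin.get_input_embeddings().weight))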
In some cases, for later-created models (ex: this checkpoint), the repo does not have a .bin file at all, so forcing use_safetensors=False is not an option; I'm unsure whether those are dead in the water or if there is a workaround.
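For safetensors-only repos, a possible workaround sketch (untested; it assumes the only tensors being dropped are the embeddings, that the checkpoint stores them under shared.weight, and uses the earlier model_id only as a stand-in for the safetensors-only checkpoint):
import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
from transformers import AutoModelForSeq2SeqLM
model_id = "pszemraj/long-t5-tglobal-base-sci-simplify"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
# Load the raw safetensors state dict and copy the shared embedding into the
# slots the warning reports as newly initialized
sd = load_file(hf_hub_download(model_id, "model.safetensors"))
if "shared.weight" in sd:
    with torch.no_grad():
        model.shared.weight.copy_(sd["shared.weight"])
        model.encoder.embed_tokens.weight.copy_(sd["shared.weight"])
        model.decoder.embed_tokens.weight.copy_(sd["shared.weight"])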