RuntimeError: Error(s) in loading state_dict for Siglip2VisionModel:
size mismatch for vision_model.embeddings.patch_embedding.weight: copying a param with shape torch.Size([768, 3, 16, 16]) from checkpoint, the shape in current model is torch.Size([768, 768]).
size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([196, 768]) from checkpoint, the shape in current model is torch.Size([256, 768]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
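One possible reading of the two size mismatches, offered as a sketch and not a confirmed diagnosis: the checkpoint's 4-D patch weight has the shape of a convolution kernel over 3-channel 16x16 patches, while the instantiated model's 2-D weight has the shape of a linear layer over flattened patches. The shapes below are copied from the error message. The image size of 224 is an assumption read off the "-224" suffix of the checkpoint name, and the variable names are illustrative only.

```python
import math

# Shapes copied verbatim from the error message above.
CKPT_PATCH_WEIGHT = (768, 3, 16, 16)  # checkpoint: patch_embedding.weight
MODEL_PATCH_WEIGHT = (768, 768)       # instantiated Siglip2VisionModel
CKPT_POS_WEIGHT = (196, 768)          # checkpoint: position_embedding.weight
MODEL_POS_WEIGHT = (256, 768)         # instantiated Siglip2VisionModel

# Assumption: taken from the "-224" suffix of the checkpoint name.
IMAGE_SIZE = 224

out_dim, channels, patch_h, patch_w = CKPT_PATCH_WEIGHT

# Input size a linear layer would need if each patch were flattened first.
flattened_patch_dim = channels * patch_h * patch_w

# Number of non-overlapping patches in a square IMAGE_SIZE x IMAGE_SIZE image.
patches_per_side = IMAGE_SIZE // patch_h
num_patches = patches_per_side ** 2

# Side length of the square grid implied by the model's position table.
model_grid_side = math.isqrt(MODEL_POS_WEIGHT[0])

print("flattened patch dim:", flattened_patch_dim,
      "| model linear input dim:", MODEL_PATCH_WEIGHT[1])
print("patches in the image:", num_patches,
      "| checkpoint positions:", CKPT_POS_WEIGHT[0])
print("model positions:", MODEL_POS_WEIGHT[0],
      "| as a square grid:", model_grid_side, "x", model_grid_side)
```

If this reading holds, the checkpoint and the instantiated class disagree on the layout of the embedding layers, not only on tensor sizes, so `ignore_mismatched_sizes=True` would presumably leave those layers randomly initialized instead of fixing the load.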
How to reproduce:
pip install git+https://github.com/huggingface/[email protected]
from PIL import Image
import requests
from transformers import AutoProcessor, Siglip2VisionModel
model = Siglip2VisionModel.from_pretrained("google/siglip2-base-patch16-224")
processor = AutoProcessor.from_pretrained("google/siglip2-base-patch16-224")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
last_hidden_state = outputs.last_hidden_state
pooled_output = outputs.pooler_output # pooled features
The same error occurs with google/siglip-so400m-patch14-384.
Who can help?
Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
Same steps as listed under "How to reproduce:" above.
Expected behavior
from_pretrained loads the checkpoint weights without size-mismatch errors.