Issues: huggingface/transformers
Issues list
Implement Maximal Update Parametrization (muP)
WIP
Label your PR/Issue with WIP for some long outstanding Issues/PRs that are work in progress
#16157
opened Mar 14, 2022 by
thegregyang
Siamese Multi-depth Transformer-based Hierarchical Encoder
Feature request
Request for a new feature
New model
#9526
opened Jan 11, 2021 by
lalitpagaria
3 tasks done
Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to HuggingFace
Feature request
trainer
#32861
opened Aug 17, 2024 by
JasonZhu1313
Extend Chat Template Tokenization for Training/Finetuning
Feature request
#27609
opened Nov 20, 2023 by
siddk
Add Deepseek AI's Janus model
Good Difficult Issue
New model
#35928
opened Jan 28, 2025 by
ArthurZucker
2 tasks done
Add cosmos from Nvidia
Good Difficult Issue
New model
#35565
opened Jan 8, 2025 by
ArthurZucker
2 tasks done
BERT and SpanBERT for Coreference Resolution
New model
#6497
opened Aug 15, 2020 by
sayanb-7c6
3 tasks done
[Community Event] Doc Tests Sprint
Good First Issue
#16292
opened Mar 21, 2022 by
patrickvonplaten
100+
SDPA is_causal=False has no effect due to LlamaModel._prepare_4d_causal_attention_mask_with_cache_position
bug
#36150
opened Feb 12, 2025 by
ringohoffman
4 tasks
The same situation as #31377 occurred when using Qwen/Qwen2-VL-7B-Instruct
bug
Cache
Multimodal
#33399
opened Sep 10, 2024 by
toondata
3 of 4 tasks
Supporting Selective Activation Checkpointing and CPU Offloading Option.
Feature request
#29648
opened Mar 14, 2024 by
SeunghyunSEO
Qwen2VL exhibits significant performance differences under different attention implementations.
bug
#35749
opened Jan 17, 2025 by
masn1310
2 of 4 tasks
How to run Trainer + DeepSpeed + Zero3 + PEFT
WIP
#26412
opened Sep 26, 2023 by
BramVanroy
1 of 4 tasks
[Tensor Parallelism] Megatron-LM to transformers
Tensor Parallel
WIP
#10321
opened Feb 21, 2021 by
stas00
Initializing via AutoImageProcessor before AutoProcessor is imported causes AttributeError
bug
WIP
#34307
opened Oct 22, 2024 by
alex-jw-brooks
1 of 4 tasks
Helsinki-NLP/opus-mt-it-en isn't on HuggingFace Hub
New model
#26382
opened Sep 25, 2023 by
KickItLikeShika
1 of 2 tasks
Add time progress bar to track the group_by_length computation for bigger datasets on Trainer
Feature request
#28069
opened Dec 15, 2023 by
T-Almeida
Evidentiality-guided Generator - Retrieval model
New model
#15387
opened Jan 28, 2022 by
patrickvonplaten
3 tasks done
Implement Cross Attention in LLAMA Model
Feature request
#27285
opened Nov 4, 2023 by
eitamar-saraf