Pull requests: NVIDIA/TransformerEngine
Pull requests list
[JAX] Re-use RHT matrix constant
#2386
opened Nov 14, 2025 by
jberchtold-nvidia
Draft
8 of 13 tasks
Set RPATH for cuda libraries from python package
#2381
opened Nov 14, 2025 by
take-cheeze
Draft
4 of 13 tasks
Add num_splits support for FA3 backend
2.10.0
#2380
opened Nov 14, 2025 by
cyanguwa
8 of 13 tasks
[Pytorch] Fix backward_dw cuda graph order
#2376
opened Nov 13, 2025 by
Wohox
1 of 13 tasks
FSDP2 Allgather Perf improvement and support for FusedAdam with FSDP2
2.10.0
#2370
opened Nov 12, 2025 by
vthumbe1503
2 of 13 tasks
[PyTorch] Enable reference Current Scaling recipe
#2368
opened Nov 11, 2025 by
negvet
13 tasks
[JAX] cuBlasMp integration for CollectiveGemm custom op
2.10.0
#2361
opened Nov 7, 2025 by
denera
5 of 13 tasks
Add device-Initiated Grouped GEMM supporting m_splits on device
#2360
opened Nov 7, 2025 by
QiZhangNV
1 of 13 tasks
[PyTorch][NVFP4][MOE] NVFP4 Grouped Hadamard Amax Kernel
#2351
opened Nov 6, 2025 by
zhongbozhu
4 of 17 tasks
[Core] Fix inconsistent logic in C++ tensor class
#2330
opened Nov 1, 2025 by
timmoon10
7 of 13 tasks
[Common] Added an optimized gated rowwise MXFP8 SwiGLU kernel
#2328
opened Oct 31, 2025 by
Oleg-Goncharov
5 of 13 tasks
[Pytorch] change fused cross entropy backward grad to fp32 and reduce one read/…
#2325
opened Oct 31, 2025 by
RandMist
8 of 13 tasks
[PyTorch] Implement Selective Activation Checkpointing for LayerNormMLP with checkpoint flag
#2311
opened Oct 28, 2025 by
jaimec00
7 of 13 tasks
[JAX] Make test_layer.py tolerances stricter
#2306
opened Oct 27, 2025 by
jberchtold-nvidia
8 of 13 tasks