Pull requests: pytorch/FBGEMM
Pull requests list
Fix faketensor error when dev_weights is undefined.
cla signed
fb-exported
#3755
opened Mar 3, 2025 by
spcyppt
Enable multi-processing in CPU TBE micro-benchmarks
cla signed
fb-exported
#3753
opened Feb 28, 2025 by
excelle08
Tuned fp8/bf16 GroupGEMM for MOE_17B
cla signed
fb-exported
#3752
opened Feb 28, 2025 by
zjing14
Use GEMM kernel for KleidiAI to accelerate FP32Benchmark
cla signed
#3751
opened Feb 28, 2025 by
milpuz01
Back out "Move `execute_backward_adagrad` into a class"
cla signed
fb-exported
#3749
opened Feb 28, 2025 by
q10
fix volatile synchronization with acquire/relax
cla signed
fb-exported
#3728
opened Feb 24, 2025 by
xw285cornell
Force determinism by unswizzle
cla signed
fb-exported
#3727
opened Feb 24, 2025 by
xw285cornell
Enable preshuffled mixed dtype Cutlass Gemm
cla signed
fb-exported
#3722
opened Feb 21, 2025 by
jwfromm
Implement generate_vbe_metadata cpu
cla signed
fb-exported
#3715
opened Feb 19, 2025 by
spcyppt
[fbgemm_gpu] Add benchmark workflows
cla signed
module: rocm
#3713
opened Feb 19, 2025 by
q10
update to tune for small `m`s and quantized gemv
cla signed
fb-exported
#3712
opened Feb 19, 2025 by
YUNQIUGUO
Unifying TBE API using List (Frontend)
cla signed
fb-exported
#3711
opened Feb 19, 2025 by
spcyppt
Add fp_rowwise_gemm configurations that can invoke the Ping Pong Scheduler on AMD
cla signed
fb-exported
#3703
opened Feb 18, 2025 by
njriasan
Refactor stacked version of FP8 Grouped Gemm for reduced overhead
cla signed
fb-exported
#3699
opened Feb 17, 2025 by
jwfromm
Add D_folded support for jagged_to_padded_dense_backward meta function
cla signed
fb-exported
#3670
opened Feb 8, 2025 by
brad-mengchi
Adding Missing includes and explicitly declaring Tensor in aten namespace.
cla signed
fb-exported
#3638
opened Jan 30, 2025 by
pradeepfn
Partial revert of D66986498 (Optimized backward pass for ROCm devices, pt 1), 2nd attempt
ciflow/rocm
cla signed
fb-exported
module: rocm
#3637
opened Jan 29, 2025 by
q10
avoid using warning tensor in cpu tbe op
cla signed
fb-exported
#3631
opened Jan 29, 2025 by
842974287
Update bf16i4 gemm with new cutlass version
cla signed
fb-exported
#3630
opened Jan 29, 2025 by
jwfromm
finish #1808 cherry-pick, adjust interface
cla signed
fb-exported
#3627
opened Jan 28, 2025 by
coconutruben