Feature(MInference): add triton-based decoding in case flash_attn is … #10
| Job | Run time |
|---|---|
| | 4s |
| | 11m 38s |
| | 11m 53s |
| | 11m 44s |
| | 12m 0s |
| | 11m 49s |
| | 12m 10s |
| | 12m 11s |
| | 11m 38s |
| | 12m 9s |
| | 11m 45s |
| | 11m 51s |
| | 12m 16s |
| | 12m 22s |
| | 12m 19s |
| | 11m 36s |
| | 12m 10s |
| | 11m 36s |
| | 12m 17s |
| | 11m 57s |
| | 11m 35s |
| | 11m 58s |
| | 11m 39s |
| | 11m 39s |
| | 11m 42s |
| | 12m 12s |
| | 11m 57s |
| | 13m 2s |
| | 12m 32s |
| | 10m 0s |
| Total | 5h 45m 41s |