issue/843: Add per_channel_quant_int8 operator for QY and NVIDIA #855

Open
xgqdut2016 wants to merge 3 commits into main from issue/843

xgqdut2016 (Collaborator) commented Dec 26, 2025

To test w8a8, run: xmake clean && xmake f --nv-gpu=true --cuda_arch=sm_90a --cutlass=true -cv && xmake build && xmake install && python test/infiniop/w8a8int8.py --nvidia

xgqdut2016 changed the base branch from main to dev January 5, 2026 01:54
xgqdut2016 changed the base branch from dev to main January 7, 2026 02:22
xgqdut2016 requested a review from whjthu January 7, 2026 07:28
@@ -0,0 +1,40 @@
#ifndef __QUANT_H__
Contributor

Too simple; this does not follow the usual convention for include-guard macros in header files.
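
For context on the convention being asked for: identifiers containing a double underscore are reserved for the implementation in C++, and a name as short as `__QUANT_H__` could collide with any other `quant.h`. A minimal sketch of a more specific guard follows. The macro name `INFINIOP_QUANT_PER_CHANNEL_INT8_H` and the helper `quantWorkspaceBytes` are hypothetical, chosen only for illustration; neither comes from this PR.

```cpp
// Hypothetical guard name: project prefix + operator + file role.
// No leading underscore and no double underscore (both are reserved).
#ifndef INFINIOP_QUANT_PER_CHANNEL_INT8_H
#define INFINIOP_QUANT_PER_CHANNEL_INT8_H

#include <cstddef>

// Placeholder declaration so the guard has something to protect.
// Assumes one float scale and one float zero point per row.
inline std::size_t quantWorkspaceBytes(std::size_t rows) {
    return rows * 2 * sizeof(float);
}

#endif // INFINIOP_QUANT_PER_CHANNEL_INT8_H
```

Including such a header twice in one translation unit is harmless, because the second inclusion sees the macro already defined and skips the body.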

x_zero_desc, \
x_desc);
switch (handle->device) {
#ifdef ENABLE_NVIDIA_API
Contributor

This looks a bit odd: why is an #ifdef still needed when there is already a switch?
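
One plausible reason for the pattern, offered as an assumption rather than the author's stated intent: the `switch` selects a backend at run time, while the `#ifdef` removes at compile time any `case` whose backend was not built, since that branch would otherwise reference symbols missing from the build. A minimal sketch, in which `Device`, `Status`, `quantNvidia`, and `dispatchQuant` are hypothetical names (only `ENABLE_NVIDIA_API` and the `switch` on the device appear in the diff):

```cpp
// Hypothetical stand-ins for the library's device and status enums.
enum class Device { CPU, NVIDIA };
enum class Status { Success, DeviceNotSupported };

#ifdef ENABLE_NVIDIA_API
// Declared (and linked) only when the NVIDIA backend is built.
Status quantNvidia();
#endif

// Run-time dispatch over the device, with compile-time pruning of cases.
inline Status dispatchQuant(Device device) {
    switch (device) {
#ifdef ENABLE_NVIDIA_API
    case Device::NVIDIA:
        return quantNvidia(); // would not link without the backend
#endif
    default:
        // Any backend not compiled in falls through to here.
        return Status::DeviceNotSupported;
    }
}
```

In a build without `ENABLE_NVIDIA_API`, `dispatchQuant(Device::NVIDIA)` returns `Status::DeviceNotSupported` instead of failing to link.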

blockPerChannelQuantI8<Tdata, BLOCK_SIZE>
<<<M, BLOCK_SIZE, 0, stream>>>(x_packed, x_scale, x_zero, x, M, K);
}

Contributor

Stray blank line.
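
For readers following the kernel launch above: it uses `M` blocks of `BLOCK_SIZE` threads, which suggests each block quantizes one row of `K` elements and writes one scale and one zero point for that row. The CPU reference below sketches that idea in plain C++. The quantization formula is a common asymmetric convention assumed here, not taken from the PR's kernel, and `perChannelQuantI8Ref` is a hypothetical name.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference: one scale and one zero point per row of an M x K matrix.
// The formula is an assumed convention, not copied from the PR's kernel.
// Requires K > 0.
inline void perChannelQuantI8Ref(std::vector<std::int8_t> &x_packed,
                                 std::vector<float> &x_scale,
                                 std::vector<float> &x_zero,
                                 const std::vector<float> &x,
                                 int M, int K) {
    x_packed.assign(static_cast<std::size_t>(M) * K, 0);
    x_scale.assign(M, 1.0f);
    x_zero.assign(M, 0.0f);
    for (int m = 0; m < M; ++m) { // corresponds to one CUDA block per row
        const std::size_t base = static_cast<std::size_t>(m) * K;
        const float *row = x.data() + base;
        const float lo = *std::min_element(row, row + K);
        const float hi = *std::max_element(row, row + K);
        // Spread [lo, hi] over the 255 steps between -128 and 127.
        float scale = (hi - lo) / 255.0f;
        if (scale == 0.0f) {
            scale = 1.0f; // avoid division by zero on constant rows
        }
        const float zero = std::round(-128.0f - lo / scale);
        for (int k = 0; k < K; ++k) {
            float q = std::round(row[k] / scale) + zero;
            q = std::max(-128.0f, std::min(127.0f, q)); // clamp to int8 range
            x_packed[base + k] = static_cast<std::int8_t>(q);
        }
        x_scale[m] = scale;
        x_zero[m] = zero;
    }
}
```

Under this convention a value is recovered approximately as `(q - zero) * scale`. A reference of this kind can be useful for checking a GPU kernel's output in a test, provided the kernel follows the same formula.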


3 participants