diff --git a/challenges/medium/81_int4_matmul/challenge.html b/challenges/medium/81_int4_matmul/challenge.html
index 1cd5e43..6df9ade 100644
--- a/challenges/medium/81_int4_matmul/challenge.html
+++ b/challenges/medium/81_int4_matmul/challenge.html
@@ -6,132 +6,6 @@ W is the dequantized float16 weight matrix of shape N × K.

- [Removed SVG diagram; recoverable labels only]
- STEP 1: UNPACK. Byte w_q[n, i] splits into hi nibble (bits 7:4) and lo nibble (bits 3:0); subtracting 8 gives signed int4 in [−8, 7] (example: 9 and 10 become +1 and +2).
- STEP 2: DEQUANTIZE (example: one row n, K=8, group_size=4). Group 0 int4 values +1, +2, −1, +3 × scale[n, 0] give W[n, 0..3] float16; group 1 int4 values 0, −3, +7, −2 × scale[n, 1] give W[n, 4..7] float16. W[n, k] = (nibble − 8) × scales[n, k // group_size]
- STEP 3: MATMUL. x [M×K] float16 × Wᵀ [K×N] float16 (dequantized) = y [M×N] float16
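The three steps labelled in the removed diagram can be sketched in plain Python as a reference for checking a kernel by hand. This is an illustration, not the challenge's reference solution: the function names `unpack_int4`, `dequantize_row`, and `int4_matmul` are hypothetical, the scales 0.5 and 0.25 are made-up example values, Python floats stand in for float16, and the low nibble is assumed to hold the weight that follows the high nibble's.

```python
def unpack_int4(byte):
    """Split one packed byte into two signed int4 values, high nibble first."""
    hi = (byte >> 4) & 0xF  # bits 7:4
    lo = byte & 0xF         # bits 3:0
    # Subtracting 8 maps the unsigned range [0, 15] onto [-8, 7].
    return hi - 8, lo - 8


def dequantize_row(packed_row, scales_row, group_size):
    """Return one row of W: (nibble - 8) * scales[n, k // group_size]."""
    weights = []
    for byte in packed_row:
        weights.extend(unpack_int4(byte))
    return [w * scales_row[k // group_size] for k, w in enumerate(weights)]


def int4_matmul(x, w_q, scales, group_size):
    """Compute y = x @ W^T, where W is dequantized row by row from w_q."""
    W = [dequantize_row(row, s, group_size) for row, s in zip(w_q, scales)]
    return [
        [sum(xk * wk for xk, wk in zip(x_row, w_row)) for w_row in W]
        for x_row in x
    ]


# The diagram's example row: int4 values +1, +2, -1, +3 | 0, -3, +7, -2,
# stored as nibbles 9, 10, 7, 11 | 8, 5, 15, 6.
packed = [0x9A, 0x7B, 0x85, 0xF6]
row_scales = [0.5, 0.25]  # hypothetical scale[n, 0] and scale[n, 1]
print(dequantize_row(packed, row_scales, group_size=4))
print(int4_matmul([[1.0] * 8], [packed], [row_scales], group_size=4))
```

A real kernel would fuse the unpack and scale steps into the inner product rather than materializing W; the sketch keeps them separate so that each stage matches one step of the diagram.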

Packing format: Each byte of w_q stores two INT4 weights. The high nibble (bits 7–4) holds weight w[n, 2i] and the low nibble (bits