Skip to content

unprivileged/integrated-matrix: Align C tile layout with authoritativ…#40

Merged
joseemoreira merged 1 commit intomainfrom
ptomsich/c-tile-row-major-conformance
May 2, 2026
Merged

unprivileged/integrated-matrix: Align C tile layout with authoritativ…#40
joseemoreira merged 1 commit intomainfrom
ptomsich/c-tile-row-major-conformance

Conversation

@ptomsich
Copy link
Copy Markdown
Collaborator

@ptomsich ptomsich commented May 1, 2026

…e row-major prose

The geometry-section prose at lines 88-90 establishes that the C tile is row-major in vector registers. Several artefacts elsewhere in the spec treated C as column-major in registers, contradicting that authoritative position:

  • Per-instruction descriptions (11 places across vmmacc.vv, vwmmacc.vv, vqwmmacc.vv, v8wmmacc.vv, vfmmacc.vv, vfwmmacc.vv, vfqwmmacc.vv, vf8wmmacc.vv, vfwimmacc.vv, vfqwimmacc.vv, vf8wimmacc.vv) all said "vd as an M x N accumulator tile (column-major register layout)". Changed to "row-major register layout".

  • Instruction-selection table at lines 1322-1327 paired the order-preserving load/store (vmtl.v / vmts.v) with column-major C and the transposing variants with row-major C. Swapped so order- preserving maps row-major-in-registers to row-major-in-memory and the transposing variants are used for column-major-in-memory C.

  • GEMM example at lines 1336-1380 stored C column-major in memory using the order-preserving vmts.v. Rewritten to store C row-major in memory, with the index expression and stride-comment updated accordingly (i*ldc + j; ldc = N_total).

  • SAIL mat_C_idx already used i * M + j with comment "C is stored row-major" - numerically correct (C is square so M = N), but conceptually misleading. Renamed the third parameter to N (the row stride) and updated all four callers (int_gemm, fp_gemm, fp_scaled_gemm, int_scaled_gemm) to pass g.N instead of g.M. No behaviour change.

The out-of-tree QEMU helper, the JIT path, and the example test references in examples/src/test-*.cxx remain non-conforming and are tracked separately.

…e row-major prose

The geometry-section prose at lines 88-90 establishes that the C tile is
row-major in vector registers.  Several artefacts elsewhere in the spec
treated C as column-major in registers, contradicting that authoritative
position:

 - Per-instruction descriptions (11 places across vmmacc.vv, vwmmacc.vv,
   vqwmmacc.vv, v8wmmacc.vv, vfmmacc.vv, vfwmmacc.vv, vfqwmmacc.vv,
   vf8wmmacc.vv, vfwimmacc.vv, vfqwimmacc.vv, vf8wimmacc.vv) all said
   "_vd_ as an M x N accumulator tile (column-major register layout)".
   Changed to "row-major register layout".

 - Instruction-selection table at lines 1322-1327 paired the
   order-preserving load/store (vmtl.v / vmts.v) with column-major C and
   the transposing variants with row-major C.  Swapped so order-
   preserving maps row-major-in-registers to row-major-in-memory and the
   transposing variants are used for column-major-in-memory C.

 - GEMM example at lines 1336-1380 stored C column-major in memory using
   the order-preserving vmts.v.  Rewritten to store C row-major in
   memory, with the index expression and stride-comment updated
   accordingly (i*ldc + j; ldc = N_total).

 - SAIL mat_C_idx already used i * M + j with comment "C is stored
   row-major" - numerically correct (C is square so M = N), but
   conceptually misleading.  Renamed the third parameter to N (the row
   stride) and updated all four callers (int_gemm, fp_gemm,
   fp_scaled_gemm, int_scaled_gemm) to pass g.N instead of g.M.  No
   behaviour change.

The out-of-tree QEMU helper, the JIT path, and the example test
references in examples/src/test-*.cxx remain non-conforming and are
tracked separately.
Copy link
Copy Markdown
Collaborator

@joseemoreira joseemoreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty extensive but regular changes.

I am a bit surprise that tests passed, but I guess the confusion was "consistent" across the load/store and computational instructions. It would have been caught in an application-level test.

@joseemoreira joseemoreira merged commit e9ce69a into main May 2, 2026
6 checks passed
@ptomsich
Copy link
Copy Markdown
Collaborator Author

ptomsich commented May 2, 2026

In fact, the application-level tests assumed the wrong layout (given that the insns specifie it this way).
To match this, we are rewriting the tests against the new specification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants