VXRM randomization not supported for now #8

Open
kv-sc opened this issue Mar 27, 2024 · 0 comments
Labels
enhancement New feature or request

Collaborator

kv-sc commented Mar 27, 2024

llvm-snippy command line (reproduces on release 1.0):

$ ./llvm-snippy -march=riscv64-linux-gnu -mattr="+v" -num-instrs=100 -seed=0 -model-plugin=None ./layout-vxrm.yaml

Use the following config, layout-vxrm.yaml:

sections:
  - no:        0
    VMA:       0x200000
    SIZE:      0x10000
    LMA:       0x200000
    ACCESS:    r
  - no:        1
    VMA:       0x210000
    SIZE:      0x100000
    LMA:       0x210000
    ACCESS:    rx
  - no:        2
    VMA:       0x100000
    SIZE:      0x100000
    LMA:       0x100000
    ACCESS:    rw

riscv-vector-unit:
  mode-distribution:
    VM:
      - [all_ones, 2.0]
    VL:
      - [vlmax, 2.0]
      - [any_legal, 1.0]
    VXRM:
      rnu: 1.0
      rne: 1.0
      rdn: 1.0
      ron: 1.0
    VXSAT:
      on: 1.0
      off: 1.0
    VTYPE:
      SEW:
        sew_8: 1.0
        sew_16: 1.0
        sew_32: 1.0
        sew_64: 1.0
      LMUL:
        m1: 1.0
        m2: 1.0
        m4: 1.0
        m8: 1.0
        mf2: 1.0
        mf4: 1.0
        mf8: 1.0
      VMA:
        mu: 1.0
        ma: 1.0
      VTA:
        tu: 1.0
        ta: 1.0

histogram:
    - [VSETVLI, 1.0]
    - [VAADDU_VX, 1.0]
    - [VAADDU_VV, 1.0]
    - [VAADD_VX, 1.0]
    - [VAADD_VV, 1.0]
    - [VASUBU_VX, 1.0]
    - [VASUBU_VV, 1.0]
    - [VASUB_VX, 1.0]
    - [VASUB_VV, 1.0]
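
To make the intent of the VXRM block concrete, here is a minimal Python sketch of what the four equal weights would mean if they are normalized into probabilities. That normalization is an assumption about how llvm-snippy treats `mode-distribution` weights, not something confirmed here. The encodings 0 to 3 follow the RVV specification's `vxrm` field; the spec names mode 3 `rod`, so mapping the config's `ron` key to 3 is also an assumption. `normalize` and `vxrm_switch` are hypothetical helper names, not part of llvm-snippy.

```python
# RVV fixed-point rounding modes and their 2-bit vxrm encodings.
# The RVV spec names mode 0b11 "rod" (round-to-odd); "ron" is the spelling
# used in the config above, and mapping it to 0b11 is an assumption.
VXRM_ENCODING = {"rnu": 0b00, "rne": 0b01, "rdn": 0b10, "ron": 0b11}


def normalize(weights: dict[str, float]) -> dict[str, float]:
    """Turn relative weights into probabilities that sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {mode: w / total for mode, w in weights.items()}


def vxrm_switch(mode: str) -> str:
    """Hypothetical helper: one possible instruction that selects `mode`."""
    return f"csrwi vxrm, {VXRM_ENCODING[mode]}"


# Weights copied from the VXRM block of layout-vxrm.yaml.
vxrm_weights = {"rnu": 1.0, "rne": 1.0, "rdn": 1.0, "ron": 1.0}
probabilities = normalize(vxrm_weights)
```

Under these assumptions each rounding mode would be expected in roughly a quarter of the mode switches, and a switch to `rdn` could appear in the generated code as `csrwi vxrm, 2`.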

Now disassemble the result:

$SC_GCC_PATH/bin/riscv64-unknown-linux-gnu-objdump -d layout-vxrm.yaml.elf > layout-vxrm.yaml.dis

The disassembled output does not contain any VXRM mode switches:

...
   0: 00c00c93            li  s9,12
   4: 049cffd7            vsetvli t6,s9,e16,m2,ta,mu
   8: 7e002057            vmset.m v0
   c: 2f2d2757            vasub.vv  v14,v18,v26
  10: 2caa2a57            vasub.vv  v20,v10,v20,v0.t
  14: 2ec86e57            vasub.vx  v28,v12,a6
  18: 27e72f57            vaadd.vv  v30,v30,v14
  1c: 28ebea57            vasubu.vx v20,v14,s7,v0.t
  20: 29a82557            vasubu.vv v10,v26,v16,v0.t
  24: 28a06557            vasubu.vx v10,v10,zero,v0.t
  28: 294e2657            vasubu.vv v12,v20,v28,v0.t
  2c: 20472457            vaaddu.vv v8,v4,v14,v0.t
  30: 2f232257            vasub.vv  v4,v18,v6
  34: 28e56e57            vasubu.vx v28,v14,a0,v0.t
  38: 2da32d57            vasub.vv  v26,v26,v6,v0.t
  3c: 2689e157            vaadd.vx  v2,v8,s3
  40: 00400b13            li  s6,4
  44: 04fb70d7            vsetvli ra,s6,e16,mf2,ta,mu
...
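
The manual inspection above can be automated. The following is a minimal Python sketch that scans objdump-style text for CSR writes naming `vxrm` or `vcsr` (which, per the RVV specification, mirrors `vxrm` in bits 2:1). The function name `find_vxrm_writes` is hypothetical, and the sketch assumes objdump prints the CSR by name rather than by number.

```python
import re

# CSR-writing mnemonics followed, later on the same line, by vxrm or vcsr.
# Plain reads such as "csrr a0,vxrm" are deliberately not matched.
VXRM_WRITE = re.compile(
    r"\b(csrw|csrwi|csrs|csrsi|csrc|csrci"
    r"|csrrw|csrrwi|csrrs|csrrsi|csrrc|csrrci)\b"
    r".*\b(vxrm|vcsr)\b",
    re.IGNORECASE,
)


def find_vxrm_writes(disassembly: str) -> list[str]:
    """Return the disassembly lines that write vxrm (directly or via vcsr)."""
    return [
        line.strip()
        for line in disassembly.splitlines()
        if VXRM_WRITE.search(line)
    ]
```

For example, passing the contents of layout-vxrm.yaml.dis to `find_vxrm_writes` would return an empty list for the listing shown in this issue, while a line such as `csrwi vxrm,2` would be reported.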

We need this support in llvm-snippy.

@kv-sc kv-sc added the enhancement New feature or request label Mar 28, 2024
asi-sc pushed a commit to asi-sc/snippy that referenced this issue May 23, 2024
… smstart/smstop. (#78294)

This patch introduces a 'COALESCER_BARRIER' which is a pseudo node that
expands to a 'nop', but which stops the register allocator from coalescing
a COPY node when its use/def crosses a SMSTART or SMSTOP instruction.

For example:

    %0:fpr64 = COPY killed $d0
    undef %2.dsub:zpr = COPY %0       // <- Do not coalesce this COPY
    ADJCALLSTACKDOWN 0, 0
    MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0
    $d0 = COPY killed %0
    BL @use_f64, csr_aarch64_aapcs

If the COPY would be coalesced, that would lead to:

    $d0 = COPY killed %0

being replaced by:

    $d0 = COPY killed %2.dsub

which means the whole ZPR reg would be live up to the call, causing the
MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register:

    str     q0, [sp]   // 16-byte Folded Spill
    smstop  sm
    ldr     z0, [sp]   // 16-byte Folded Reload
    bl      use_f64

which would be incorrect for two reasons:
1. The program may load more data than it has allocated.
2. If there are other SVE objects on the stack, the compiler might use
   the 'mul vl' addressing modes to access the spill location.

By disabling the coalescing, we get the desired results:

    str     d0, [sp, #8]  // 8-byte Folded Spill
    smstop  sm
    ldr     d0, [sp, #8]  // 8-byte Folded Reload
    bl      use_f64

(cherry picked from commit dd73666)