1 change: 1 addition & 0 deletions src/colophon.adoc
@@ -42,6 +42,7 @@ h|Extension h|Version h|Status
|*Zawrs* |*1.01* |*Ratified*
|*Zacas* |*1.0* |*Ratified*
|*Zabha* |*1.0* |*Ratified*
|*Zalasr* |*1.0* |*Ratified*
|*RVWMO* |*2.0* |*Ratified*
|*Ztso* |*1.0* |*Ratified*
|*CMO* |*1.0* |*Ratified*
1 change: 1 addition & 0 deletions src/riscv-unprivileged.adoc
@@ -171,6 +171,7 @@ include::a-st-ext.adoc[]
include::zawrs.adoc[]
include::zacas.adoc[]
include::zabha.adoc[]
include::zalasr.adoc[]
include::rvwmo.adoc[]
include::ztso-st-ext.adoc[]
include::cmo.adoc[]
135 changes: 135 additions & 0 deletions src/zalasr.adoc
@@ -0,0 +1,135 @@
== "Zalasr" Atomic Load-Acquire and Store-Release Instructions, Version 1.0

The Zalasr (Load-Acquire and Store-Release) extension provides load-acquire and store-release instructions in RISC-V.
These instructions can be important for high-performance designs: each acts as a unidirectional fence, enabling finer-grained synchronisation than is possible with fences alone.
Load-acquire and store-release are widely used in language-level memory models:
both the Java and {cpp} memory models make use of acquire-release semantics, and {cpp}'s `atomic` provides primitives that are meant to map directly to load-acquire and store-release instructions.
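
The acquire-release pattern referred to above can be illustrated with a minimal {cpp} sketch. The `Mailbox` type and the `publish` and `consume` helpers are hypothetical names introduced only for this illustration. Whether a given compiler lowers these operations to Zalasr instructions depends on the toolchain and target options; this text does not specify that mapping.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared state, for illustration only.
struct Mailbox {
    int payload = 0;                 // ordinary (non-atomic) data
    std::atomic<bool> ready{false};  // publication flag
};

// Producer side: write the payload, then publish it with a store-release.
// Earlier writes may not be reordered after the release store.
inline void publish(Mailbox& m, int value) {
    m.payload = value;
    m.ready.store(true, std::memory_order_release);
}

// Consumer side: spin on a load-acquire until the flag is observed.
// Later reads may not be reordered before the acquire load.
inline int consume(Mailbox& m) {
    while (!m.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return m.payload;
}
```

In practice `publish` and `consume` would run on different threads. The acquire load that observes `true` synchronizes with the release store, so the consumer is guaranteed to read the published payload.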

The Zalasr extension builds on the atomic support provided by the Zaamo (Atomic Memory Operations), Zalrsc (Load-Reserved and Store-Conditional), and Zabha (Byte and Halfword Atomic Memory Operations) extensions by providing additional atomic operations (although it can be implemented independently of them).
All of the AMO operations in Zaamo (and Zabha) are read-modify-write operations that both load and store.
The Zalrsc extension provides operations that are only loads or stores.
However, since it is designed to perform an atomic operation on a single memory word or doubleword, the loads and stores are designed to be paired.
The load-reserved implies that a future store-conditional will follow, while the store-conditional requires a previous load-reserved with no other intervening loads or stores.
Therefore, the Zalrsc extension does not provide a general atomic and ordered load or store.

Zalasr fills this gap by offering truly standalone atomic and ordered loads and stores.
The Zalasr instructions are atomic loads and stores that support ordering annotations.
With the combination of Zaamo, Zabha, and Zalasr, all {cpp} atomic operations can be supported with single instructions.

=== Load-Acquire and Store-Release Instructions

The Zalasr instructions always sign-extend the value placed in _rd_ and ignore the upper bits of the value of _rs2_.
The instructions in the Zalasr extension require that the address held in _rs1_ be naturally aligned to the size in bytes (2^width^) of the operand.
If the address is not naturally aligned, an address-misaligned exception or an access-fault exception will be generated.
The access-fault exception can be generated for a memory access that would otherwise be able to complete except for the misalignment, if the misaligned access should not be emulated.
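
The sign-extension and truncation rules above can be modelled in a few lines of {cpp}. This is a sketch that assumes XLEN=64; the function names `sign_extend_loaded` and `stored_bits` are hypothetical, and `width` stands for the value of the _funct3_ field (0 to 3).

```cpp
#include <cstdint>

// Models the value written to rd (XLEN=64 assumed) by a load-acquire of
// 2^width bytes. `raw` carries the loaded bytes in its low bits.
inline std::int64_t sign_extend_loaded(std::uint64_t raw, unsigned width) {
    const unsigned bits = 8u << width;  // access size in bits: 2^(width+3)
    if (bits >= 64) return static_cast<std::int64_t>(raw);
    const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
    const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
    const std::uint64_t value = raw & mask;
    // (value ^ sign) - sign replicates the top bit of the narrow value.
    return static_cast<std::int64_t>((value ^ sign) - sign);
}

// Models the bits of rs2 that a store-release of 2^width bytes writes to
// memory; the upper bits of rs2 are ignored.
inline std::uint64_t stored_bits(std::uint64_t rs2, unsigned width) {
    const unsigned bits = 8u << width;
    if (bits >= 64) return rs2;
    return rs2 & ((std::uint64_t{1} << bits) - 1);
}
```

For example, under these assumptions a byte load of `0x80` yields -128 in _rd_, and a halfword store of `0x1234567890ABCDEF` writes only `0xCDEF`.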

The misaligned atomicity granule PMA, defined in Volume II of this manual, optionally relaxes this alignment requirement.
If all accessed bytes lie within the same misaligned atomicity granule, the instruction will not raise an exception for reasons of address alignment, and the instruction will give rise to only one memory operation for the purposes of RVWMO—i.e., it will execute atomically.
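
The two alignment conditions above can be sketched as simple address checks. The helper names are hypothetical, and the sketch assumes that a misaligned atomicity granule is a naturally aligned, power-of-two-sized region of `granule_bytes` bytes; Volume II remains the authoritative definition.

```cpp
#include <cstdint>

// True if `addr` is naturally aligned for a 2^width-byte access.
inline bool is_naturally_aligned(std::uint64_t addr, unsigned width) {
    const std::uint64_t size = std::uint64_t{1} << width;
    return (addr & (size - 1)) == 0;
}

// True if every byte of a 2^width-byte access at `addr` falls inside one
// granule. Assumption: granules are naturally aligned and `granule_bytes`
// is a non-zero power of two.
inline bool within_one_granule(std::uint64_t addr, unsigned width,
                               std::uint64_t granule_bytes) {
    const std::uint64_t size = std::uint64_t{1} << width;
    const std::uint64_t first = addr;
    const std::uint64_t last = addr + size - 1;
    return (first / granule_bytes) == (last / granule_bytes);
}
```

With a 16-byte granule, a word access at `0x1002` is misaligned but stays within one granule, whereas a word access at `0x100E` crosses a granule boundary.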

<<<

[#insns-ldatomic,reftext="Load Acquire"]
=== Load Acquire

Synopsis::
The load-acquire instruction atomically loads a 2^width^-byte value from the address in _rs1_ and places the sign-extended value into the register _rd_, subject to the ordering annotations specified in the instruction.

Mnemonic::
====
lb.{aq,aqrl} _rd_, (_rs1_)

lh.{aq,aqrl} _rd_, (_rs1_)

lw.{aq,aqrl} _rd_, (_rs1_)

ld.{aq,aqrl} _rd_, (_rs1_)
====
Encoding::
[wavedrom, ,svg]
....
{reg: [
{bits: 7, name: 'opcode', attr: ['7', 'AMO'], type: 8},
{bits: 5, name: 'rd', attr: ['5', 'dest'], type: 2},
{bits: 3, name: 'funct3', attr: ['3', 'width'], type: 8},
{bits: 5, name: 'rs1', attr: ['5', 'addr'], type: 4},
{bits: 5, name: 'rs2', attr: ['5', '0'], type: 4},
{bits: 1, name: 'rl', attr: ['1', 'ring'], type: 8},
{bits: 1, name: 'aq', attr: ['1', 'orde', '1'], type: 8},
{bits: 5, name: 'funct5', attr: ['5', 'Load Acquire', '00110'], type: 8},
]}
....

Description::

This instruction atomically loads 2^width^ bytes from the memory address in _rs1_ and writes the result into _rd_.
If the size of the loaded value in bits (2^width+3^) is less than XLEN, the value is sign-extended to fill the destination register.
This load must have the ordering annotation _aq_ and may have ordering annotation _rl_ encoded in the instruction.
The instruction always has an "acquire-RCsc" annotation, and if the bit _rl_ is set the instruction has a "release-RCsc" annotation.
+
The versions without the _aq_ bit set are RESERVED.
LD.{AQ, AQRL} is RV64-only.


[NOTE]
====
The _aq_ bit is mandatory because the two encodings that would result from clearing it are not seen as useful at this time.
The version with neither the _aq_ nor the _rl_ bit set would correspond to a load with no ordering annotations that was guaranteed to be performed atomically.
This can be achieved with ordinary load instructions by suitably aligning pointers.
The version with only the _rl_ bit would correspond to load-release.
Load-release has theoretical applications in seqlocks, but is not supported in language-level memory models and so is not included.
====
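
The load-acquire encoding above can be assembled field by field. The following sketch uses the bit layout from the encoding diagram and the AMO major opcode `0101111`; the function name `encode_load_acquire` and its `rv64` parameter are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

constexpr std::uint32_t kOpcodeAmo = 0x2F;          // 0b0101111, AMO major opcode
constexpr std::uint32_t kFunct5LoadAcquire = 0x06;  // 0b00110

// Encodes l{b,h,w,d}.aq (rl == false) or l{b,h,w,d}.aqrl (rl == true).
// Returns false if a field is out of range or the width is not available.
inline bool encode_load_acquire(unsigned rd, unsigned rs1, unsigned width,
                                bool rl, bool rv64, std::uint32_t& out) {
    if (rd > 31 || rs1 > 31 || width > 3) return false;
    if (width == 3 && !rv64) return false;  // ld.{aq,aqrl} is RV64-only
    out = (kFunct5LoadAcquire << 27)
        | (1u << 26)                                 // aq is always set
        | (static_cast<std::uint32_t>(rl) << 25)     // optional rl
        | (0u << 20)                                 // rs2 field is zero
        | (static_cast<std::uint32_t>(rs1) << 15)
        | (static_cast<std::uint32_t>(width) << 12)  // funct3 = width
        | (static_cast<std::uint32_t>(rd) << 7)
        | kOpcodeAmo;
    return true;
}
```

Under these assumptions, `lw.aq x10, (x11)` (that is, `rd = 10`, `rs1 = 11`, `width = 2`, `rl = false`) encodes to `0x3405A52F`.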

<<<

[#insns-sdatomic,reftext="Store Release"]
=== Store Release

Synopsis::
The store-release instruction atomically stores the 2^width^-byte value from the low bits of register _rs2_ to the address in _rs1_, subject to the ordering annotations specified in the instruction.

Mnemonic::
====
sb.{rl,aqrl} _rs2_, (_rs1_)

sh.{rl,aqrl} _rs2_, (_rs1_)

sw.{rl,aqrl} _rs2_, (_rs1_)

sd.{rl,aqrl} _rs2_, (_rs1_)
====

Encoding::
[wavedrom, ,svg]
....
{reg: [
{bits: 7, name: 'opcode', attr: ['7', 'AMO'], type: 8},
{bits: 5, name: 'rd', attr: ['5', '0'], type: 2},
{bits: 3, name: 'funct3', attr: ['3', 'width'], type: 8},
{bits: 5, name: 'rs1', attr: ['5', 'addr'], type: 4},
{bits: 5, name: 'rs2', attr: ['5', 'src'], type: 4},
{bits: 1, name: 'rl', attr: ['1', 'ring', '1'], type: 8},
{bits: 1, name: 'aq', attr: ['1', 'orde'], type: 8},
{bits: 5, name: 'funct5', attr: ['5', 'Store Release', '00111'], type: 8},
]}
....

Description::

This instruction atomically stores the low 2^width^ bytes of _rs2_ to the memory address in _rs1_.
This store must have ordering annotation _rl_ and may have ordering annotation _aq_ encoded in the instruction.
The instruction always has a "release-RCsc" annotation, and if the bit _aq_ is set the instruction has an "acquire-RCsc" annotation.
+
The versions without the _rl_ bit set are RESERVED.
SD.{RL, AQRL} is RV64-only.


[NOTE]
====
The _rl_ bit is mandatory because the two encodings that would result from clearing it are not seen as useful at this time.
The version with neither the _aq_ nor the _rl_ bit set would correspond to a store with no ordering annotations that was guaranteed to be performed atomically.
This can be achieved with ordinary store instructions by suitably aligning pointers.
The version with only the _aq_ bit would correspond to store-acquire.
Store-acquire has theoretical applications in seqlocks, but is not supported in language-level memory models and so is not included.
====
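
The reserved-encoding rules for both instructions can be sketched as a small classifier over a 32-bit instruction word. The `ZalasrKind` enumeration and `classify_zalasr` function are hypothetical names. The sketch inspects only the _opcode_, _funct5_, _aq_, and _rl_ fields; it does not validate _funct3_ or the zero-valued register fields, whose treatment is not described in this text.

```cpp
#include <cstdint>

enum class ZalasrKind { NotZalasr, LoadAcquire, StoreRelease, Reserved };

// Classifies an instruction word using the field positions from the
// encoding diagrams: opcode[6:0], rl[25], aq[26], funct5[31:27].
inline ZalasrKind classify_zalasr(std::uint32_t insn) {
    const std::uint32_t opcode = insn & 0x7Fu;
    const std::uint32_t funct5 = insn >> 27;
    const bool rl = ((insn >> 25) & 1u) != 0;
    const bool aq = ((insn >> 26) & 1u) != 0;

    if (opcode != 0x2Fu) return ZalasrKind::NotZalasr;  // not the AMO opcode
    if (funct5 == 0x06u) {
        // Load forms without the aq bit are reserved.
        return aq ? ZalasrKind::LoadAcquire : ZalasrKind::Reserved;
    }
    if (funct5 == 0x07u) {
        // Store forms without the rl bit are reserved.
        return rl ? ZalasrKind::StoreRelease : ZalasrKind::Reserved;
    }
    return ZalasrKind::NotZalasr;  // some other encoding under the AMO opcode
}
```

For instance, the word `0x3AA5A02F` (`sw.rl x10, (x11)` under the layout above) classifies as a store-release, while the same word with _rl_ cleared, `0x38A5A02F`, falls into the reserved space.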

<<<