3 changes: 2 additions & 1 deletion cfgs/qc_iu.yaml
@@ -5,7 +5,7 @@ kind: architecture configuration
type: fully configured
name: qc_iu
arch_overlay: qc_iu
-description: Configuration with the Xqci, Xqccmp, and Xqccmt custom extensions.
+description: Configuration with the Xqci, Xqccmp, Xqccmt, and Xqccmi custom extensions.
implemented_extensions:
- { name: Sm, version: "= 1.13" }
- { name: Smrnmi, version: "= 1.0" }
@@ -23,6 +23,7 @@ implemented_extensions:
- { name: Zihpm, version: "= 2.0" }
- { name: Xqccmp, version: "= 0.3" }
- { name: Xqccmt, version: "= 0.1.0" }
+- { name: Xqccmi, version: "= 0.1.0" }
- { name: Xqci, version: "= 0.13" }
- { name: Xqcia, version: "= 0.7.0" }
- { name: Xqciac, version: "= 0.3.0" }
17 changes: 17 additions & 0 deletions spec/custom/isa/qc_iu/csr/Smrnmi/mnepc.yaml
@@ -0,0 +1,17 @@
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

---
# Overlay: allow bit 0 of mnepc.PC to be written and read back unchanged.
#
# When Xqccmi is present, qc.cm.ilut sets bit 0 of mepc/mnepc to indicate
# whether an exception originated from the first (bit 0 = 0) or second
# (bit 0 = 1) packed instruction in an ILUT entry. The standard mnepc
# definition forces bit 0 to 0 on write and masks it on read; both
# behaviours are removed here so that bit 0 is preserved faithfully.
fields:
  PC:
    sw_write(csr_value): |
      return csr_value.PC;
    sw_read(): |
      return CSR[mnepc].PC;
45 changes: 45 additions & 0 deletions spec/custom/isa/qc_iu/csr/Xqccmi/qc.itba.yaml
@@ -0,0 +1,45 @@
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

# yaml-language-server: $schema=../../../../../schemas/csr_schema.json

$schema: csr_schema.json#
kind: csr
name: qc.itba
long_name: Instruction Table Base Address Register
address: 0x800
priv_mode: U
length: MXLEN
writable: true
description: |
  The `qc.itba` register holds the base address of the Instruction Lookup Table (ILUT)
  used by the `qc.cm.ilut` instruction.
  The BASE field specifies bits[XLEN-1:6] of the ILUT base address. The lower 6 bits
  are implicitly zero, so the ILUT base address is always 64-byte aligned.
  The memory region pointed to by `qc.itba.base` is treated as instruction memory for
  the purpose of executing ILUT instructions, requiring execute (X) access permission.
  Read (R) permission is not required.
  `qc.itba` adds architectural state to the system software context and must be
  saved and restored on context switches.
definedBy:
  extension:
    name: Xqccmi
fields:
  base:
    location_rv32: 31-6
    location_rv64: 63-6
    type: RW
    reset_value: UNDEFINED_LEGAL
    description: |
      Bits[XLEN-1:6] of the ILUT base address. The full base address is formed by
      appending 6 implicit zero bits: `base_addr = {base, 6'b000000}`.
      The base address must be 64-byte aligned.
  mode:
    location: 5-0
    type: RO
    reset_value: 0
    description: |
      Reserved for future use. Always reads as zero. Writes are ignored.
39 changes: 39 additions & 0 deletions spec/custom/isa/qc_iu/csr/Xqccmi/qc.itdec.yaml
@@ -0,0 +1,39 @@
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

# yaml-language-server: $schema=../../../../../schemas/csr_schema.json

$schema: csr_schema.json#
kind: csr
name: qc.itdec
long_name: Instruction Table Double Entry Count Register
address: 0x801
priv_mode: U
length: MXLEN
writable: true
description: |
  The `qc.itdec` register specifies the number of leading 64-bit double entries
  in the Instruction Lookup Table (ILUT) used by the `qc.cm.ilut` instruction.

  The DEC field holds an 11-bit count. The first DEC logical ILUT entries
  (indices 0 through DEC−1) are 64-bit double entries. The remaining entries
  (indices DEC through 2047) are 32-bit single entries.

  The byte offset from the ILUT base address for logical index i is:
  - If i < DEC: offset = i × 8
  - If i ≥ DEC: offset = DEC × 8 + (i − DEC) × 4

  The reset value of 0 selects backward-compatible mode in which all entries
  are 32-bit single entries.
definedBy:
  extension:
    name: Xqccmi
fields:
  dec:
    location: 13-3
    type: RW
    reset_value: 0
    description: |
      11-bit double entry count. Specifies the number of leading 64-bit entries
      in the ILUT. Legal range: 0..2047. A value of 0 means all entries are
      32-bit (backward-compatible mode).
16 changes: 16 additions & 0 deletions spec/custom/isa/qc_iu/csr/mepc.yaml
@@ -0,0 +1,16 @@
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

---
# Overlay: allow bit 0 of mepc.PC to be written and read back unchanged.
#
# When Xqccmi is present, qc.cm.ilut sets bit 0 of mepc to indicate whether
# an exception originated from the first (bit 0 = 0) or second (bit 0 = 1)
# packed instruction in an ILUT entry. The standard mepc definition forces
# bit 0 to 0 on write and masks it on read; both behaviours are removed here.
fields:
  PC:
    sw_write(csr_value): |
      return csr_value.PC;
    sw_read(): |
      return CSR[mepc].PC;
219 changes: 219 additions & 0 deletions spec/custom/isa/qc_iu/ext/Xqccmi.yaml
@@ -0,0 +1,219 @@
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause-Clear

# yaml-language-server: $schema=../../../../schemas/ext_schema.json

$schema: ext_schema.json#
kind: extension
name: Xqccmi
long_name: Qualcomm Custom Compressed Instruction Lookup Table
description: |
== Xqccmi: Qualcomm Custom Compressed Instruction Lookup Table

The Xqccmi extension adds the `qc.cm.ilut` instruction, a 16-bit compressed
instruction that fetches and executes one or two packed instructions from an
Instruction Lookup Table (ILUT) stored in memory. The extension also defines
two new user-level CSRs: `qc.itba` (ILUT Base Address) and `qc.itdec` (ILUT
Double Entry Count).

NOTE: Xqccmi uses the same encoding space as the `c.fld` instruction defined
by the Zcd extension (bits[1:0]=00, bits[15:13]=001). Therefore, Xqccmi and
Zcd are mutually exclusive and cannot be implemented together.

=== Instruction Lookup Table (ILUT)

`qc.cm.ilut` is a 16-bit instruction that takes an 11-bit immediate index
(range 0..2047) into the ILUT. The ILUT base address is held in the `qc.itba`
CSR (address 0x800). The hardware computes the byte offset of the addressed
entry from the base address, fetches the entry, parses the packed instructions
within it, and executes them in order.

ILUT memory is treated as instruction memory. It requires execute (X)
permission and does NOT require read (R) permission. A `fence.i` instruction
must be executed after any write to ILUT memory to guarantee that subsequent
`qc.cm.ilut` executions observe the updated contents.

=== qc.itba CSR (Address 0x800)

The `qc.itba` CSR holds the base address of the ILUT.

[cols="^1,^1,^3,^3",options="header"]
|===
| Bits | Name | Access | Description
| 31:6 | base | WARL (RW) | 64-byte aligned base address of the ILUT. Bits[5:0] of the physical base address are implicitly zero.
| 5:0 | mode | Read-only 0 | Reserved for future use. Always reads as zero; writes are ignored.
|===

`qc.itba` is a user-level read/write CSR. Software must save and restore
`qc.itba` on context switches.
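
The field layout above can be sketched in a few lines of Python. This is only an illustration of the masking implied by the table: the function names are hypothetical, and the sketch assumes every value of `base` is legal (the field is WARL, so an implementation may restrict it further).

```python
ITBA_MODE_MASK = 0x3F  # bits[5:0] (mode): read-only zero, writes ignored


def itba_write(value: int, xlen: int = 32) -> int:
    """Value held in qc.itba after software writes `value`.

    Assumption: all `base` values are legal, so only the mode bits are dropped.
    """
    return value & ((1 << xlen) - 1) & ~ITBA_MODE_MASK


def ilut_base_address(itba: int) -> int:
    """ILUT base address: {base, 6'b000000}, hence always 64-byte aligned."""
    return itba & ~ITBA_MODE_MASK
```

For example, writing `0x80001234` under these assumptions leaves `0x80001200` in the CSR, which is also the ILUT base address.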

=== qc.itdec CSR (Address 0x801) — Double Entry Count

The `qc.itdec` CSR specifies how many of the leading ILUT entries are 64-bit
double entries. All remaining entries are 32-bit.

[cols="^1,^1,^3,^3",options="header"]
|===
| Bits | Name | Access | Description
| 31:14 | — | Read-only 0 | Reserved. Always reads as zero; writes are ignored.
| 13:3 | dec | WARL (RW) | 11-bit double entry count. Legal range: 0..2047. Reset value: 0.
| 2:0 | — | Read-only 0 | Reserved. Always reads as zero; writes are ignored.
|===

`qc.itdec` is a user-level read/write CSR. The default reset value of 0
selects backward-compatible mode in which all entries are 32-bit.
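
As a rough illustration of the `qc.itdec` layout, the following Python sketch packs and extracts the `dec` field. The helper names are hypothetical and are not part of the specification; the sketch assumes every count in 0..2047 is accepted on write.

```python
DEC_SHIFT = 3      # dec occupies bits[13:3]
DEC_MASK = 0x7FF   # 11-bit field


def itdec_pack(dec: int) -> int:
    """Build a qc.itdec value from a double entry count."""
    if not 0 <= dec <= DEC_MASK:
        raise ValueError("dec must be in 0..2047")
    return dec << DEC_SHIFT


def itdec_dec(csr: int) -> int:
    """Extract the double entry count from a qc.itdec value."""
    return (csr >> DEC_SHIFT) & DEC_MASK


def itdec_write(value: int) -> int:
    """Value held after a write: reserved bits[31:14] and bits[2:0] stay zero."""
    return value & (DEC_MASK << DEC_SHIFT)
```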

=== Table Layout

The first DEC logical entries (indices 0 through DEC−1) are 64-bit double
entries. The remaining entries (indices DEC through 2047) are 32-bit entries.

The byte offset from the ILUT base address for logical index _i_ is:

* If _i_ < DEC: offset = _i_ × 8
* If _i_ ≥ DEC: offset = DEC × 8 + (_i_ − DEC) × 4

A table populated with all 2048 entries occupies DEC×8 + (2048−DEC)×4 bytes.
This ranges from 8192 bytes (8 KiB) for DEC=0, where all entries are 32-bit,
to 16380 bytes (just under 16 KiB) for DEC=2047, where all but the last entry
are 64-bit. Individual entries require no alignment beyond the 64-byte base
alignment imposed by `qc.itba`.
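
The offset rule and the table size can be checked with a short Python sketch. The function names are illustrative only; `dec` stands for the value of the `qc.itdec.dec` field.

```python
ILUT_ENTRIES = 2048  # 11-bit index


def ilut_entry_offset(index: int, dec: int) -> int:
    """Byte offset of logical entry `index` from the ILUT base address."""
    if not 0 <= index < ILUT_ENTRIES:
        raise ValueError("index must be in 0..2047")
    if index < dec:
        return index * 8                 # 64-bit double entry
    return dec * 8 + (index - dec) * 4   # 32-bit single entry


def ilut_table_size(dec: int) -> int:
    """Bytes occupied by a table with all 2048 entries populated."""
    return dec * 8 + (ILUT_ENTRIES - dec) * 4
```

With `dec = 4`, entries 0 to 3 sit at offsets 0, 8, 16, and 24, and entry 4 (the first 32-bit entry) sits at offset 32.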

=== Dual-Instruction Packing

Each ILUT entry may contain one or two instructions packed together. The
hardware determines instruction sizes by parsing bits[1:0] of each
sub-instruction after fetching the full entry.

==== 32-bit Entry Valid Combinations

[cols="^1,^3",options="header"]
|===
| Combination | Description
| One 32-bit instruction | bits[1:0] == 2'b11; the entire 32-bit entry is a single RVI instruction.
| Two 16-bit instructions | bits[1:0] != 2'b11; the first 16-bit RVC instruction occupies bits[15:0] and the second occupies bits[31:16].
|===

==== 64-bit Entry Valid Combinations

[cols="^1,^3",options="header"]
|===
| Combination | Description
| 16 + 32 bits | First instruction is 16-bit (bits[1:0] != 2'b11); second instruction is 32-bit.
| 16 + 48 bits | First instruction is 16-bit; second instruction is 48-bit.
| 32 + 16 bits | First instruction is 32-bit (bits[1:0] == 2'b11); second instruction is 16-bit occupying bits[47:32].
| 32 + 32 bits | Both instructions are 32-bit.
| Single 48-bit instruction | A single 48-bit instruction occupying bits[47:0]; bits[63:48] must be padded with `c.nop` (16'h0001).
|===

A single 32-bit instruction placed in a 64-bit entry is valid as the 32+16
case: `c.nop` (16'h0001) serves as the second instruction in bits[47:32], and
bits[63:48] are padded with `c.nop`.

Unused bits in any entry MUST be padded with `c.nop` (16'h0001). Instruction
combinations that do not fit within the entry size cause an illegal instruction
exception.
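
The packing rules can be modelled with a small Python sketch. Several points here are assumptions and not statements from this specification: the text only mentions bits[1:0], so the sketch borrows the standard RISC-V length encoding (bits[4:2] != 111 for 32-bit, bits[5:0] == 011111 for 48-bit) to tell 32-bit from 48-bit instructions; it treats only the combinations listed in the tables as legal; and it uses `ValueError` to stand in for the illegal instruction exception. All names are hypothetical.

```python
C_NOP = 0x0001

# Legal length combinations per entry size, taken from the tables above.
VALID = {
    32: {(32,), (16, 16)},
    64: {(16, 32), (16, 48), (32, 16), (32, 32), (48, 16)},
}


def insn_length(low16: int) -> int:
    """Instruction length in bits (assumed standard RISC-V length encoding)."""
    if low16 & 0b11 != 0b11:
        return 16
    if low16 & 0b11100 != 0b11100:
        return 32
    if low16 & 0b111111 == 0b011111:
        return 48
    raise ValueError("unsupported instruction length")


def split_entry(entry: int, entry_bits: int) -> list:
    """Split an ILUT entry into at most two (length_bits, encoding) pairs."""
    insns, pos = [], 0
    while pos < entry_bits and len(insns) < 2:
        length = insn_length((entry >> pos) & 0xFFFF)
        if pos + length > entry_bits:
            raise ValueError("instruction does not fit in entry")
        insns.append((length, (entry >> pos) & ((1 << length) - 1)))
        pos += length
    while pos < entry_bits:  # unused halfwords must be c.nop padding
        if (entry >> pos) & 0xFFFF != C_NOP:
            raise ValueError("unused bits must be c.nop")
        pos += 16
    lengths = tuple(length for length, _ in insns)
    if lengths not in VALID[entry_bits]:
        raise ValueError("combination not listed as valid")
    if lengths == (48, 16) and insns[1][1] != C_NOP:
        raise ValueError("bits[63:48] after a 48-bit instruction must be c.nop")
    return insns
```

Under these assumptions, a 32-bit entry holding `c.nop` followed by the first half of a 32-bit instruction is rejected, because the second instruction would extend past the entry.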

=== Instruction Restrictions

PC-relative instructions are NOT permitted in the ILUT. This includes:

* `auipc`
* All branch instructions (beq, bne, blt, bge, bltu, bgeu, c.beqz, c.bnez, etc.)
* All jump instructions (jal, jalr, c.j, c.jal, c.jr, c.jalr, etc.)

If a PC-relative instruction is fetched from the ILUT, an illegal instruction
exception is raised. 16-bit compressed instructions (bits[1:0] != 2'b11) are
otherwise valid in ILUT entries.
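
A minimal Python sketch of the check follows, assuming the standard RV32 base and C encodings for the instructions named in the list. It covers only those named instructions: anything hidden behind the "etc." in the list (for example, custom branch or jump instructions) and 48-bit instructions are not classified. The function name is hypothetical.

```python
# Major opcodes (bits[6:0]) of the 32-bit instructions named above.
FORBIDDEN_OPCODES_32 = {
    0x17,  # auipc
    0x63,  # beq, bne, blt, bge, bltu, bgeu
    0x67,  # jalr
    0x6F,  # jal
}


def is_forbidden_in_ilut(insn: int, length: int) -> bool:
    """True if `insn` is one of the named branch/jump/auipc instructions."""
    if length == 32:
        return (insn & 0x7F) in FORBIDDEN_OPCODES_32
    if length == 16:
        op = insn & 0b11
        funct3 = (insn >> 13) & 0b111
        if op == 0b01:
            # c.jal (RV32 only), c.j, c.beqz, c.bnez
            return funct3 in {0b001, 0b101, 0b110, 0b111}
        if op == 0b10 and funct3 == 0b100:
            rs1 = (insn >> 7) & 0x1F
            rs2 = (insn >> 2) & 0x1F
            # c.jr / c.jalr: rs2 == 0 and rs1 != 0 (rs1 == 0 is c.ebreak or reserved)
            return rs2 == 0 and rs1 != 0
        return False
    return False  # 48-bit instructions are not classified in this sketch
```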

=== Execution Model

Logically, the PC points to the `qc.cm.ilut` instruction throughout the
execution of all instructions contained within the fetched entry. The
instructions within an entry execute as if they were a single atomic unit from
the perspective of interrupt handling:

* Interrupts cannot be taken between two instructions of the same entry once
the first instruction has committed.
* If the first instruction has not yet committed when an interrupt arrives, the
interrupt is taken with mepc set to the PC of `qc.cm.ilut`.
* If the first instruction has already committed when an interrupt arrives, the
interrupt is serviced after the second instruction commits, with mepc pointing
to the next instruction after `qc.cm.ilut`.
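
The interrupt rules above reduce to a single decision, sketched here in Python. The function name is hypothetical, and the `+ 2` is derived from `qc.cm.ilut` being a 16-bit instruction; the text itself only says "the next instruction after `qc.cm.ilut`".

```python
ILUT_INSN_BYTES = 2  # qc.cm.ilut is a 16-bit instruction


def interrupt_mepc(ilut_pc: int, first_committed: bool) -> int:
    """mepc recorded when an interrupt arrives during a qc.cm.ilut.

    If the first packed instruction has not committed, the whole entry is
    restarted; otherwise the entry completes and execution resumes after it.
    """
    if first_committed:
        return ilut_pc + ILUT_INSN_BYTES
    return ilut_pc
```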

=== Exception Handling

On any exception caused by an instruction fetched from the ILUT:

* mepc (or mnepc for a double-trap) is set to the PC of `qc.cm.ilut`.
* Bit 0 of mepc/mnepc is set to 0 if the first instruction in the entry caused
the exception.
* Bit 0 of mepc/mnepc is set to 1 if the second instruction in the entry caused
the exception.

Bit 0 of mepc/mnepc is writable to 0 only; software cannot set it to 1.
Return-from-trap instructions (`mret`, `mnret`, `qc.c.mret`, `qc.c.mnret`,
`qc.c.mileaveret`) leave bit 0 of mepc/mnepc unchanged when returning.

NOTE: This is an architectural exception to the standard RISC-V requirement
that *epc bit 0 is always zero. The non-zero bit 0 encodes which instruction
within the ILUT entry caused the exception, enabling precise exception restart.
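
A trap handler can separate the two pieces of information carried in mepc/mnepc as in the following Python sketch. The function names are illustrative only.

```python
def trap_epc(ilut_pc: int, faulting_slot: int) -> int:
    """mepc/mnepc value for an exception raised from an ILUT entry.

    faulting_slot is 0 for the first packed instruction and 1 for the second.
    """
    if faulting_slot not in (0, 1):
        raise ValueError("faulting_slot must be 0 or 1")
    return (ilut_pc & ~1) | faulting_slot


def decode_trap_epc(epc: int) -> tuple:
    """Split mepc/mnepc into (PC of qc.cm.ilut, faulting slot)."""
    return epc & ~1, epc & 1
```

For a `qc.cm.ilut` at `0x1000` whose second packed instruction faults, the recorded value is `0x1001`, which decodes back to the pair `(0x1000, 1)`.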

=== Encoding

`qc.cm.ilut` uses the same encoding as `c.fld` (bits[1:0]=00,
bits[15:13]=001). Xqccmi is therefore mutually exclusive with the Zcd
extension.

[cols="^2,^5,^2",options="header"]
|===
| Bits[15:13] | Bits[12:2] | Bits[1:0]
| 001 | ilut_index[10:0] | 00
|===

The 11-bit `ilut_index` field (bits[12:2]) encodes the logical index into the
ILUT, selecting one of up to 2048 entries.
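
The encoding table translates directly into bit operations. The Python sketch below uses hypothetical helper names and is meant only to illustrate the field positions.

```python
ILUT_MATCH = 0x2000  # bits[15:13] = 001, bits[1:0] = 00
ILUT_MASK = 0xE003   # selects bits[15:13] and bits[1:0]


def encode_qc_cm_ilut(index: int) -> int:
    """16-bit encoding of `qc.cm.ilut index`."""
    if not 0 <= index <= 0x7FF:
        raise ValueError("index must be in 0..2047")
    return ILUT_MATCH | (index << 2)


def decode_qc_cm_ilut(insn: int):
    """Return ilut_index if `insn` is a qc.cm.ilut encoding, else None."""
    if insn & ILUT_MASK != ILUT_MATCH:
        return None
    return (insn >> 2) & 0x7FF
```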

=== Instruction Summary

[%header,cols="^1,^1,4,8"]
|===
|RV32
|RV64
|Mnemonic
|Instruction

|yes
|no
|qc.cm.ilut _index_
|<<#insns-qc_cm_ilut>>

|===
type: unprivileged
versions:
- version: "0.1.0"
state: development
contributors:
- name: Albert Yosher
company: Qualcomm Technologies, Inc.
email: ayosher@qti.qualcomm.com
- name: Derek Hower
company: Qualcomm Technologies, Inc.
email: dhower@qti.qualcomm.com
- name: Gil Zukerman
company: Qualcomm Technologies, Inc.
email: gzukerma@qti.qualcomm.com
changes:
- Initial version. Custom instruction lookup table extension with dual-instruction
packing and 64-bit double entries. Supports 16-bit, 32-bit, and 48-bit instructions
in the ILUT. Introduces qc.itba and qc.itdec CSRs.
requirements:
allOf:
- extension:
name: Zca
version: ">= 1.0.0"
- extension:
name: Zicsr
version: ">= 2.0.0"
- not:
extension:
name: Zcd