csr.bus: redesign Multiplexer shadow registers. #41

Merged on Aug 4, 2023 (1 commit)

@jfng (Member) commented Jul 24, 2023

Before this commit, csr.Multiplexer had a separate shadow for every element in its memory map. Each shadow was shared between read and write accesses to its Element, so a combined read/write transaction was impossible even though the CSR interface allows it.

After this commit, csr.Multiplexer has separate shadows for read and write accesses, and each of them is shared by every Element that uses it. For multiplexers with many elements, this approach also yields significant resource savings.
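
To make the change concrete, here is a minimal behavioral sketch in plain Python. It is not Amaranth HDL and not the code from this PR. It assumes the usual CSR chunking semantics: reading the lowest-address chunk of an element captures the whole element into the read shadow, and writing the highest-address chunk commits the write shadow to the element. The class SharedShadowModel and its method names are hypothetical and exist only for illustration.

```python
class SharedShadowModel:
    """Toy model: one read shadow and one write shadow shared by all elements."""

    def __init__(self, data_width=8):
        self.data_width = data_width
        self.elements = {}   # name -> [width_in_bits, current_value]
        self.r_shadow = 0    # shared by every readable element
        self.w_shadow = 0    # shared by every writable element

    def add(self, name, width, value=0):
        self.elements[name] = [width, value]

    def _chunks(self, name):
        width = self.elements[name][0]
        return -(-width // self.data_width)  # ceiling division

    def read(self, name, chunk):
        mask = (1 << self.data_width) - 1
        if chunk == 0:
            # First chunk (assumed semantics): capture the whole element.
            self.r_shadow = self.elements[name][1]
        return (self.r_shadow >> (chunk * self.data_width)) & mask

    def write(self, name, chunk, data):
        mask = (1 << self.data_width) - 1
        shift = chunk * self.data_width
        self.w_shadow = (self.w_shadow & ~(mask << shift)) | ((data & mask) << shift)
        if chunk == self._chunks(name) - 1:
            # Last chunk (assumed semantics): commit the whole element.
            width = self.elements[name][0]
            self.elements[name][1] = self.w_shadow & ((1 << width) - 1)


model = SharedShadowModel(data_width=8)
model.add("a0", width=32, value=0x11223344)

first = model.read("a0", 0)                       # captures the whole element
model.write("a0", 0, 0xEF)                        # a write starts mid-read
rest = [model.read("a0", i) for i in (1, 2, 3)]   # the read is unaffected
for i, byte in enumerate((0xBE, 0xAD, 0xDE), start=1):
    model.write("a0", i, byte)                    # chunk 3 commits the element
```

Because the read and write shadows are separate, the interleaved write does not disturb the read in progress. With a single shadow per element serving both directions, as in the old design, the two accesses would compete for the same storage.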

Example

To measure the resource savings, let's consider this example:

from amaranth import *
from amaranth.back import rtlil
from amaranth_soc import csr

# addr_width=1 is only a starting point: extend=True below lets the
# memory map grow as elements are added.
mux = csr.Multiplexer(addr_width=1, data_width=8, alignment=2)

# Add 8 groups of four elements with mixed widths and access modes.
for i in range(8):
    a = csr.Element(32, "rw", name=f"a{i}")
    mux.add(a, extend=True)
    b = csr.Element( 8, "r",  name=f"b{i}")
    mux.add(b, extend=True)
    c = csr.Element( 8, "r",  name=f"c{i}")
    mux.add(c, extend=True)
    d = csr.Element(16, "w",  name=f"d{i}")
    mux.add(d, extend=True)

# Expose the bus signals and every element's signals as top-level ports.
ports = [
    mux.bus.addr,
    mux.bus.r_stb,
    mux.bus.r_data,
    mux.bus.w_stb,
    mux.bus.w_data,
]
for e, _, _ in mux.bus.memory_map.resources():
    if e.access.readable():
        ports += [e.r_stb, e.r_data]
    if e.access.writable():
        ports += [e.w_stb, e.w_data]

# Emit RTLIL for synthesis with Yosys.
print(rtlil.convert(mux, ports=ports))

Statistics

  • Before this commit, yosys -p "synth_ecp5 -nowidelut" would report this:
=== top ===

   Number of wires:                602
   Number of wire bits:           2203
   Number of public wires:         602
   Number of public wire bits:    2203
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:               1276
     LUT4                          700
     TRELLIS_FF                    576
  • After this commit, with the default shadow_overlaps=None, yosys -p "synth_ecp5 -nowidelut" reports this:
=== top ===

   Number of wires:                218
   Number of wire bits:           1254
   Number of public wires:         218
   Number of public wire bits:    1254
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:                437
     LUT4                          353
     TRELLIS_FF                     84

The new design uses about 2x fewer LUT4s (353 vs. 700) and about 7x fewer FFs (84 vs. 576)!

  • With shadow_overlaps=3, yosys -p "synth_ecp5 -nowidelut" reports this:
=== top ===

   Number of wires:                335
   Number of wire bits:           1773
   Number of public wires:         335
   Number of public wire bits:    1773
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:                662
     LUT4                          442
     TRELLIS_FF                    220

The shadow_overlaps parameter is exposed to users in case it proves useful, although in this example the LUT packing benefits are outweighed by the added address decoding logic (442 LUT4s and 220 FFs with shadow_overlaps=3, versus 353 and 84 with None). None seems to be a good default, at least for FPGAs.

In all three cases, running nextpnr-ecp5 --out-of-context (or --pack-only) afterwards reports the same number of LUTs/FFs.
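
The ratios quoted above can be checked with a short script. This is a sketch, not part of the PR: the helper parse_cells is hypothetical, and it assumes the cell breakdown follows the "Number of cells:" line in the layout shown in the reports above. Only the cell lines of each report are reproduced here.

```python
import re

def parse_cells(report: str) -> dict:
    """Return {cell_type: count} from the cell breakdown of a Yosys stat report."""
    cells = {}
    in_cells = False
    for line in report.splitlines():
        if line.strip().startswith("Number of cells:"):
            in_cells = True
            continue
        if in_cells:
            m = re.fullmatch(r"\s*(\S+)\s+(\d+)\s*", line)
            if m:
                cells[m.group(1)] = int(m.group(2))
    return cells

# Cell counts copied from the three reports in this PR description.
BEFORE = """
   Number of cells:               1276
     LUT4                          700
     TRELLIS_FF                    576
"""
AFTER_DEFAULT = """
   Number of cells:                437
     LUT4                          353
     TRELLIS_FF                     84
"""
AFTER_OVERLAPS_3 = """
   Number of cells:                662
     LUT4                          442
     TRELLIS_FF                    220
"""

before = parse_cells(BEFORE)
for label, report in [("shadow_overlaps=None", AFTER_DEFAULT),
                      ("shadow_overlaps=3", AFTER_OVERLAPS_3)]:
    after = parse_cells(report)
    lut_ratio = before["LUT4"] / after["LUT4"]
    ff_ratio = before["TRELLIS_FF"] / after["TRELLIS_FF"]
    print(f"{label}: {lut_ratio:.2f}x fewer LUT4s, {ff_ratio:.2f}x fewer FFs")
```

The same helper could be pointed at the full text of a report saved from Yosys, as long as the cell breakdown keeps this layout.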

@jfng requested a review from whitequark July 24, 2023 10:34
@jfng force-pushed the fix-csr-mux-shadow branch 2 times, most recently from 3846a9b to d778b3d July 24, 2023 11:17
@whitequark (Member) commented

Quoting the description: "The new design uses 2x less LUT4s and 7x less FFs !"

That's amazing! I expected the savings to be good but I didn't expect them to be this good.

Review thread on amaranth_soc/csr/bus.py (outdated, resolved)

@whitequark (Member) left a comment

This is very nicely and cleanly implemented. Great job!

@jfng added this pull request to the merge queue Aug 4, 2023
Merged via the queue into amaranth-lang:main with commit bc3f0f3 Aug 4, 2023
2 checks passed
@jfng deleted the fix-csr-mux-shadow branch August 4, 2023 14:52
@jfng (Member, Author) commented Aug 4, 2023

Merged, at last! Thank you @whitequark for providing the idea behind this redesign, and for reviewing it.

@whitequark (Member) commented

Happy to have brought this to completion! This will be extremely useful.
