svd2ir: rework handling of names with placeholders by datdenkikniet · Pull Request #136 · embassy-rs/chiptool

datdenkikniet · 2026-04-19T08:25:06Z

To summarize what this PR does:

Fieldsets with non-array placeholders are supported
As a side-effect, non-array placeholder fieldsets that were supported previously by pretending that they were arrays (such as field{A, B}set turning into fieldset(n: usize)) are now no longer treated as arrays, unless they actually are arrays (field{A,B}set becomes fieldAset and fieldBset)

For a table overview, see: #136 (comment)

To explain the real life use case that this solves:

Some parts of Renesas' SVDs have the following pattern:

<register>
    <dim>2</dim>
    <dimIncrement>0x4</dimIncrement>
    <dimIndex>P0B,P2B,P3B,P4B</dimIndex>
    <name>BUSSCNT%s</name>
</register>
<register>
    <dim>4</dim>
    <dimIncrement>0x4</dimIncrement>
    <dimIndex>MBIU,RAM0</dimIndex>
    <name>BUSSCNT%s</name>
</register>

Currently, chiptool crashes while generating fieldsets for such patterns, as it normalizes both BUSSCNT%s to BUSSCNT and fails to find unique names for the individual fieldsets.

In order to improve that behaviour, we add an extra level of mapping:

We create a fieldset with the name stripped of its placeholder (adding a number at the end if there are duplicates)
We map the full register name to this fieldset name
We use the full register name to generate the register fieldset access, unless the fieldset is an actual array. Then we use the numeric access pattern (fieldset_name(n: usize)) to get fieldsets at offsets.

This has the pleasant side effect of resolving the fact that convert_periperhal converts input like this:

<register>
    <dim>2</dim>
    <dimIncrement>0x4</dimIncrement>
    <dimIndex>A,B</dimIndex>
    <name>DLLCR%s</name>
</register>
<register>
    <dim>2</dim>
    <dimIncrement>0x4</dimIncrement>
    <dimIndex>MAIN,BACKUP</dimIndex>
    <name>INST%sREG0</name>
</register>

into

impl Peripheral {
    fn dlccr(&self, n: usize) -> Reg<Dlccr>;
    fn instreg0(&self, n: usize) -> Reg<_>;
}

instead, we now get this (identical fieldsets are returned, but at different offsets):

impl Peripheral {
    fn dlccra(&self) -> Reg<Dlccr>;
    fn dlccrb(&self) -> Reg<Dlccr>;
    fn instmainreg0(&self) -> Reg<_>;
    fn instbackupreg0(&self) -> REg<_>;
}

We can probably do the same for fields, but this PR doesn't do that yet.

This fixes #90 and closes #92

inferiorhumanorgans · 2026-04-19T20:33:13Z

This works with the SVDs for:

RA2A1
RA4M1

This does not work for these SVDs:

RA4L1

Error: Failed to find unique name for ["CANFD", "CFDRMDF_11"]. n: ["CANFD", "CFDRMDF_11"]

RA6M5

Err: on rename Fieldset "CANFD::regs::CFDTMDF1__01", new name "canfd::regs::Cfdtmdf101" already exist
Err: on rename Fieldset "CANFD::regs::CFDTMDF1__11", new name "canfd::regs::Cfdtmdf111" already exist

Also, it would be nice to keep the indices noted somewhere (appended to the description, as a replacement for %s in the description, or as a separate ir field).

datdenkikniet · 2026-04-19T20:56:41Z

Added an ugly-hack workaround for now. It seems to handle the RA4L1 SVD now, but not sure it'll also hold up to the transforms. There is a better way to do it but I cannot come up with it right now :P

inferiorhumanorgans · 2026-04-19T21:03:36Z

Yeah, there are lots of ugly hacks to deal with these SVDs unfortunately (and the less said about the other metadata I'm ingesting the better 😂).

FWIW the MCUs I've got on hand are the 2A1, 4L1, 4M1, 6M5, and 8M1 so that's what I'm checking. The 8M1's SVD causes the svd-parser validation to choke so I'm skipping that for now.

I'm testing these with the ra-pac workflow (just extract-peripherals ra4m1). I can walk you through setting that up if you'd like but that also depends on some transforms that aren't in the main branch so it may not be worth the time.

datdenkikniet · 2026-04-20T06:13:46Z

Did some mucking about and have come to the conclusion that having an underscore suffix, and just checking if the new name has been used at all, is easiest and is most likely to resolve most issues. There is probably a better way to find a unique name beyond "iterate until we've found one that doesn't exist yet" but I don't want to figure that one out just yet

inferiorhumanorgans · 2026-04-20T23:37:00Z

I think the more elegant solution would be to include the indicies in the name, but that has the potential for unreasonably long fieldset names. What I landed on was to keep a hashmap of seen fieldset names to keep track of the counts.

datdenkikniet · 2026-04-21T06:14:34Z

to include the indicies in the name

Like you say, this could cause really long names. Additionally, it doesn't necessarily improve the naming: inserting the dimension indices at the end of the name is equally "bad" IMO, and inserting them at the placeholder name doesn't really make it any better either.

Take, for example, CFDRMDF%s_1 with indicies 1-7, CFDRMDF%s_1 with indicies 8-15, and CFDRMDF%s_11 with indicies 8-15 (picked from the RA4L1 SVD). They're not arrays (by the standards implemented in this PR at least) because the placeholder is not at the end. We'd end up with:

Approach	CFDRMDF%s_1,`1-7`	CFDRMDF%s_1,`8-15`	CFDRMDF%s_11, `8-15`
`main`	CFDRMDF_1	CFDRMDF_1 (causes collision)	CFDRMDF_11
This PR	CFDRMDF_1	CFDRMDF_1_1	CFDRMDF_11
With indices at placeholder	CFDRMDF1234567_1	CFDRMDF_89101112131415_1	CFDRMDF89101112131415_11
With indices at end	CFDRMDF_1_1234567	CFDRMDF_1_89101112131415	CFDRMDF_11_89101112131415

I don't really see including the indices helping much of anything. We could/should include them in the description, but then they become stale as soon as anyone merges or deletes anything, and having some sort of two-way binding seems really heavy.

inferiorhumanorgans · 2026-04-21T22:08:52Z

In the case of CFDRMDF%s_1, the field itself is CFDRMDF1 or "RX Message Buffer Data Field 1 Registers". For the indicies I would collapse contiguous ranges to m_n with an end result along the lines of CFDRMDF_1_m_n. I've not actually read any of the docs on the CANFD peripheral but I'd expect the indicies are individual mailboxes.

That breaks down with PFS where you get constructs like P00%sPFS where the intended index is actually 00 or 000. PmnnPFS breaks down to m = index of the port peripheral, nn = index of the pin on the port.

Ultimately I'm not so concerned with the generated form as with PFS I rewrote the whole thing by hand because the fieldsets were so useless, and I may do the same with CANFD.

datdenkikniet · 2026-04-22T06:30:19Z

I would collapse contiguous ranges to m_n with an end result along the lines of CFDRMDF_1_m_n

I may have misunderstood the intent of your comment, but to make sure we're on the same page: I agree that this as an end result would be great, but it also sounds like something that is way out of scope for svd2ir. Merging fieldsets based on compatibility (or assuming that they are compatible based on names) sounds like a transform-level concern, not an svd2ir-level concern. svd2ir should mirror the SVD as closely as possible, which it would do with this change. This approach doesn't merge anything, but just uses what the SVD tells us (names containing placeholders refer to the same thing by definition) to generate a slightly more concise IR.

An alternative would be that we always expand all placeholders (except for arrays, [%s]), e.g. generate 8 fieldsets for CFDRMDF{0,1,2,3,4,5,6,7}_1 and leave it up to the user to re-merge them, but that seems unnecessary since the IR is made to deduplicate identical fieldsets, and we know that their definitions are identical.

Since we'll have to do manual transforms to get it into a good shape anyways, I don't think the naming scheme matters beyond that it is mostly recognizable as belonging to certain inner blocks, which it is with the current naming scheme.

Ultimately I'm not so concerned with the generated form

OK, let's conclude the naming discussion then: the name of the fieldset doesn't really matter so long as it is unique and can be associated with the inner block functions that use it, because whatever it becomes, it is likely to get renamed anyways. The current approach does that, while preserving the names we could already generate, and doesn't generate comically long names.

Opinions? I don't think bikeshedding the naming thing any more will help.

inferiorhumanorgans · 2026-04-23T00:19:27Z

Opinions? I don't think bikeshedding the naming thing any more will help.

Agreed.

However as of 163fc34 chiptool still chokes on the CANFD peripheral in the 4L1 and 6M5.

4L1:

thread 'main' (20068057) panicked at src/transform/mod.rs:84:60:
called `Result::unwrap()` on an `Err` value:
Err: on rename Fieldset "CANFD::regs::CFDRMDF_1_1", new name "canfd::regs::Cfdrmdf11" already exist
Err: on rename Fieldset "CANFD::regs::CFDRMDF_1_3", new name "canfd::regs::Cfdrmdf13" already exist
Err: on rename Fieldset "CANFD::regs::CFDRMDF_1_2", new name "canfd::regs::Cfdrmdf12" already exist

6M5:

thread 'main' (20068572) panicked at src/transform/mod.rs:84:60:
called `Result::unwrap()` on an `Err` value:
Err: on rename Fieldset "CANFD::regs::CFDTMDF1__1_1", new name "canfd::regs::Cfdtmdf111" already exist
Err: on rename Fieldset "CANFD::regs::CFDTMDF1__0_1", new name "canfd::regs::Cfdtmdf101" already exist

The minimal set of transforms I'm applying is:

Sanitize
DeleteUselessEnums
Sort

datdenkikniet · 2026-04-23T05:44:37Z

Ahh, okay: Sanitize is removing underscores from fieldset names, but underscores are the thing we use for separating... I'll look into what we can do about that

datdenkikniet · 2026-04-23T17:26:13Z

OK, after some thinking, I think the way to go here is:

Introduce a RenameFieldsets transform
Merge this PR as-is, and force the user to choose better names before calling Sanitize.

I've listed some other alternatives in #141, but I don't think any of them are better than this.

ETA: Have opened #141

With RenameFieldsets, I end up with the following:

  - !RenameFieldsets
    from: CANFD::regs::CFDRMDF_(\d+)_(\d+)
    to: CANFD::regs::CFDRMDF${1}copy${2}
  - !Sanitize
  - !MergeFieldsets
    from: canfd::regs::Cfdrmdf(\d+)(copy(\d+))?
    to: canfd::regs::Cfdrmdf

but of course, in this case, one could:

  - !MergeFieldsets
    from: CANFD::regs::CFDRMDF_(\d+)(_(\d+))?
    to: CANFD::regs::CFDRMDf
  - !Sanitize

instead, even without RenameFieldsets, because the register can actually be merged.

inferiorhumanorgans · 2026-04-28T10:10:32Z

Another alternative might be to have the counts zero padded. so that 11,1 → 011_001 and 1,1 → 001_001 don't collide. It would be nice (but is not a deal breaker for me) to not have to pre-sanitize input.

datdenkikniet · 2026-04-29T20:43:50Z

The only way to truly avoid it is to change the name-generation strategy based on how Sanitize does its renames, but that would mean that changes to Sanitize would directly mean that there have to be changes to svd2ir, which I don't like.

After all, we could decide that Sanitize should strip leading zeros following an underscore, and we'll be right back in the same situation. I know that it's easy to come up with hypotheticals like this, but I don't think doing special cases "just" to be able to Sanitize immediately is the right approach.

Let's see if dirbaio finds time to review this and/or has general comments.

inferiorhumanorgans · 2026-05-03T13:19:02Z

BTW this construct is no longer turning into an array (from the 8M1 SVD):

<register>
    <dim>2</dim>
    <dimIncrement>0x2</dimIncrement>
    <dimIndex>1,2</dimIndex>
    <name>PVD%sCR1</name>
    <description>Voltage Monitor %s Circuit Control Register 1</description>

This SVD has the added distinction of crashing Helix.

datdenkikniet · 2026-05-03T13:39:41Z

BTW this construct is no longer turning into an array (from the 8M1 SVD):

Yes, that's intentional (but changing it is trivial): only maybe-arrays that have a [%s] placeholder or end in a %s placeholder that is purely numeric are turned into array access.

The reasoning is that only [%s]-placeholders are "actually" arrays (going by the SVD spec, "A list of cluster names can be build using the placeholder %s. Use the placeholder [%s] at the end of the identifier to generate arrays in the header file."), and things ending in %s with only numerical dimensions still "read correctly" if array access is used (but, going purely by the SVD spec, it's not correct to do that).

Registers with placeholders in the middle, such as your example (also note that placeholders can also be non-numeric, see for example BUSMCNT%s in R7FA4M1AB.svd) aren't arrays at all, and treating them as such creates a pretty terrible disconnect between the actual register name (pvd2cr1) and the way it is accessed through the PAC .pvdcr1(1). Also note the "off-by-one" array accessor (pvd2cr1 is at index 1, not 2). In the BUSMCNT case: busmcntm4i is accessed through busmcnt(0), which simply makes no sense. It also messes up assumptions/guarantees about the uniqueness of a name.

Name	Placeholders	Old access	Notes	New access
BUS%sCNTR	1-2	`buscntr({0,1})`	Not unique: separate `BUS%sCNTR` with different dims or `BUSCNTR%s` may exist. Access offset does not correspond to actual name	`bus1cntr()`,`bus2cntr()`
BUS%sCNTR	3-4	N/A	`svd2ir` fails	`bus3cntr()`, `bus4cntr()`
BUS%sCNTR	SPI,USB	`buscntr({0,1})`	Not unique: separate `BUS%sCNTR` with different dims or `BUSCNTR%s` may exist.	`busspicntr()`, `bususbcntr()`
BUS%sCNTR	I2C,MDIO	N/A	`svd2ir` fails	`busi2ccntr()`, `busmdiocntr()`
BUSCNTR[%s]	0-1	`buscntr({0,1})`	-	`buscntr({0,1})`
BUSCNTR%s	1-2	`buscntr({0,1})`	Not unique: separate `BUSCNTR%s` with different dims or `BUS%sCNTR` may exist. Access offset does not correspond to actual name	`buscntr1()`, `buscntr2()` (as of `febf739`)
BUSCNTR%s	3-4	N/A	`svd2ir` fails	`buscntr3()`, `buscntr4()`
BUSCNTR%s	SPI, USB	`buscntr({0,1})`	Not unique: separate `BUSCNTR%s` with different dims or `BUS%sCNTR` may exist.	`buscntrspi()`, `buscntrusb()`
BUSCNTR%s	I2C,MDIO	N/A	`svd2ir` fails	`buscntri2c()`, `buscntrmdio()`

This SVD has the added distinction of crashing Helix.

Lol, that's not great

inferiorhumanorgans · 2026-05-03T23:22:07Z

Registers with placeholders in the middle, such as your example (also note that placeholders can also be non-numeric, see for example BUSMCNT%s in R7FA4M1AB.svd) aren't arrays at all,

It's certainly easy enough to apply a transform after, but those registers are intended to be arrays and the trailing number is not intended to be an array index. In this case PVD%sCR1 means that there's an array of control register 1 where there might be multiple control registers each with different sets fields for each voltage monitor. So you'd want arrays of voltage monitor registers each with CR0 and CR1.

datdenkikniet · 2026-05-04T18:11:04Z

In this case PVD%sCR1 means that there's an array of control register 1 where there might be multiple control registers each with different sets fields for each voltage monitor.

I don't understand what you are trying to say here. The SVD only tells us: this register exists more than once, with this amount of bytes between them. svd2ir cannot/should not try to figure out that there are additional registers that happen to have the same placeholders and that overlap so that they can form a block. There is nothing in the SVD that can tell us this, only guesswork and happy coincidences that we should really leave up to those writing transforms.

The SVD doesn't even put these registers in a cluster, so there's nothing telling us that they should somehow be grouped together beyond the fact that their dimensions happen to be equal, and that the expanded version of these overlaps. The only reason we want them to be an array is because you (or I) can deduce it from the SVD, but the SVD itself doesn't tell us directly (which is what svd2ir's concern is)

but those registers are intended to be arrays

I disagree. The SVD doesn't contain any information that directly tells us that they are arrays (if they had been, the [%s] array placeholder should have been used). It only tells us that there are multiple instances of the same register at some offset from each other.

The register you gave as an example has a non-array placeholder, which is the only thing that is causing the change you see.

The effect of the change is that the access pattern in the PAC changes like this:

Register	Old Access	New access
`SYS.PVD1CR0`	`SYSC.PVDCR0(0)`	`SYSC.PVD1CR0()`
`SYS.PVD2CR0`	`SYSC.PVDCR0(1)`	`SYSC.PVD2CR0()`
`SYS.PVD1CR1`	`SYSC.PVDCR1(0)`	`SYSC.PVD1CR1()`
`SYS.PVD2CR1`	`SYSC.PVDCR1(1)`	`SYSC.PVD2CR1()`

Alternatively, going back to BUSMCNT%s, which has dimensions M4I,M4D,SYS,DMA:

Register	Old Access	New access
`BUSMCNTM4I`	`BUSMCNT(0)`	`BUSMCNTM4I()`
`BUSMCNTM4D`	`BUSMCNT(1)`	`BUSMCNTM4D()`
`BUSMCNTSYS`	`BUSMCNT(2)`	`BUSMCNTSYS()`
`BUSMCNTDMA`	`BUSMCNT(3)`	`BUSMCNTDMA()`

The old access is just wrong: it disconnects the actual placeholder from the thing you're accessing entirely, with the only benefit being that we can access it in an array-like manner (which doesn't help: how on earth does this random number map to an actual register offset/name that we care about?)

datdenkikniet force-pushed the clusters branch from 67d5374 to ab57951 Compare April 19, 2026 08:27

datdenkikniet mentioned this pull request Apr 19, 2026

Handle non-numeric array indices for registers #92

Open

datdenkikniet force-pushed the clusters branch from ab57951 to 732d7fd Compare April 19, 2026 12:00

datdenkikniet changed the title ~~Support converting SVDs with (duplicate) clustered names to IR~~ svd2ir: Support converting SVDs with (duplicate) clustered names to IR Apr 19, 2026

datdenkikniet changed the title ~~svd2ir: Support converting SVDs with (duplicate) clustered names to IR~~ svd2ir: Support converting SVDs with (duplicate) clustered names Apr 19, 2026

datdenkikniet force-pushed the clusters branch 2 times, most recently from 4f94476 to 9a9d61d Compare April 19, 2026 18:27

datdenkikniet marked this pull request as draft April 19, 2026 20:56

datdenkikniet force-pushed the clusters branch 2 times, most recently from e145d1e to 81c3d1a Compare April 20, 2026 06:11

datdenkikniet marked this pull request as ready for review April 20, 2026 06:12

datdenkikniet force-pushed the clusters branch from 81c3d1a to b4f7e2d Compare April 20, 2026 06:14

datdenkikniet added 3 commits April 21, 2026 22:39

reduce significant rightward drift

9b70ab5

replace_suffix -> remove_placeholder

be2d5a8

Split out create_block_item

7019219

datdenkikniet force-pushed the clusters branch from b4f7e2d to 163fc34 Compare April 21, 2026 20:56

datdenkikniet marked this pull request as draft April 21, 2026 20:58

datdenkikniet changed the title ~~svd2ir: Support converting SVDs with (duplicate) clustered names~~ svd2ir: rework handling of names with placeholders Apr 22, 2026

Support loading names with placeholders

25c59bd

datdenkikniet mentioned this pull request Apr 23, 2026

Add RenameFieldsets transform #141

Closed

datdenkikniet force-pushed the clusters branch from 163fc34 to 25c59bd Compare April 23, 2026 19:04

datdenkikniet marked this pull request as ready for review April 29, 2026 20:43

don't treat clusters ending in numbers as arrays

febf739

Conversation

datdenkikniet commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

inferiorhumanorgans commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

datdenkikniet commented Apr 19, 2026

Uh oh!

inferiorhumanorgans commented Apr 19, 2026

Uh oh!

datdenkikniet commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

inferiorhumanorgans commented Apr 20, 2026

Uh oh!

datdenkikniet commented Apr 21, 2026

Uh oh!

inferiorhumanorgans commented Apr 21, 2026

Uh oh!

datdenkikniet commented Apr 22, 2026

Uh oh!

inferiorhumanorgans commented Apr 23, 2026

Uh oh!

datdenkikniet commented Apr 23, 2026

Uh oh!

datdenkikniet commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

inferiorhumanorgans commented Apr 28, 2026

Uh oh!

datdenkikniet commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

inferiorhumanorgans commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

datdenkikniet commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

inferiorhumanorgans commented May 3, 2026

Uh oh!

datdenkikniet commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

datdenkikniet commented Apr 19, 2026 •

edited

Loading

inferiorhumanorgans commented Apr 19, 2026 •

edited

Loading

datdenkikniet commented Apr 20, 2026 •

edited

Loading

datdenkikniet commented Apr 23, 2026 •

edited

Loading

datdenkikniet commented Apr 29, 2026 •

edited

Loading

inferiorhumanorgans commented May 3, 2026 •

edited

Loading

datdenkikniet commented May 3, 2026 •

edited

Loading

datdenkikniet commented May 4, 2026 •

edited

Loading