Invalid virtual addresses in CHERI-RISC-V #84

Open
andresag01 opened this issue Aug 3, 2023 · 8 comments

Comments

@andresag01

In RISC-V, valid virtual addresses are sign-extended. For example, if RV64 supports Sv39 (i.e. bits [38:0] are the "useful" address), valid addresses must have every bit in [63:39] equal to bit 38; all other addresses are invalid. Attempting to load from, store to, or execute from an invalid address raises an exception.
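
To make the Sv39 rule above concrete, here is a minimal sketch of the validity check. The names `is_valid_sv39`, `MASK64` and `VA_BITS` are made up for illustration and do not come from any specification.

```python
MASK64 = (1 << 64) - 1
VA_BITS = 39  # Sv39: bits [38:0] carry the address


def is_valid_sv39(addr: int) -> bool:
    """Return True if bits [63:39] all equal bit 38."""
    top = (addr & MASK64) >> (VA_BITS - 1)    # bits [63:38], 26 bits wide
    all_ones = (1 << (64 - VA_BITS + 1)) - 1  # 26 set bits
    return top == 0 or top == all_ones


print(is_valid_sv39(0x0000_003F_FFFF_FFFF))  # highest address of the lower half
print(is_valid_sv39(0xFFFF_FFC0_0000_0000))  # lowest address of the upper half
print(is_valid_sv39(0x0000_0040_0000_0000))  # bit 38 set, bits [63:39] clear
```

The first two addresses pass the check and the third does not, because bit 38 is set while the bits above it are clear.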

RISC-V also does not require that CSRs (like Xtvec, Xepc, etc.) hold all invalid addresses, presumably because implementations can optimize away the "unused" VA bits. However, what happens when writing a tagged capability with an invalid address (e.g. an almighty cap with an invalid address) to, say, Xepc in CHERI-RISC-V? Surely this must result in an untagged capability being written to the CSR; otherwise we might end up with a tagged capability with an unexpected address (since writing invalid addresses to Xepc may be lossy) that breaks the metadata encoding.

A related issue is that using a tagged cap with an invalid address should ideally result in the expected standard RISC-V exceptions as opposed to a CHERI exception for the sake of backwards compatibility.

Has there been any work done on these issues?

It would be good to specify the behaviour of CHERI-RISC-V without breaking compatibility with RISC-V. A proposal would be to:

  • Clear the tag of capabilities with invalid addresses when these are written to locations that cannot represent all invalid addresses, like pc, Xepc, etc.
  • Check whether the address in a capability is invalid before performing any other CHERI check. This implies that (e.g.) a cjalr through an out-of-bounds capability, or through a capability with its tag cleared, that also has an invalid address will raise a RISC-V exception due to the invalid address; it will not raise the regular CHERI tag or length exceptions.
@jrtc27
Member

jrtc27 commented Aug 3, 2023

I disagree, this kind of shadow state is a bad idea. Moreover, how would you preserve it across context switch? The behaviour you get is that of a CSetAddr of the legalised address, and that’s the only sensible specification I can see.

@jrtc27
Member

jrtc27 commented Aug 3, 2023

(And if you want to reuse the tag for that purpose, that’s a strange and unusual thing to do when the result is representable, IMO)

@jrtc27
Member

jrtc27 commented Aug 3, 2023

Other than being consistent with the rest of the instruction set, this has the advantage that, if xepcc is an almighty, unsealed, integer-mode capability, all the CHERI operations and checks on it are inherently no-ops, providing your backwards compatibility guarantee. Only once you start to use CHERI do the checks ever fire.

@jonwoodruff
Contributor

After a bit more discussion with Peter, this is the understanding I've arrived at:
The addresses held in Xepc (and in other CSRs that hold addresses) are a subset of all the addresses that must be legalised.
Xepc, for example, will only hold PCs, which presumably already needed to be legalised.
So for now, let's think only about how we would legalise PC(C).

Where we can trap on a jump, we can check that an address can become a new PC by the following:

  1. Legalise the address (potentially mangling high bits)
  2. Assert this address is within the bounds of PCC
  3. SetAddress this address to PCC (including representability check)
    (Order of 2 and 3 is strange, but deliberate...)
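
The three steps above can be sketched as follows. This is a toy model under loud assumptions: `Cap`, `legalise`, `set_address`, `jump_target`, `CheriBoundsFault` and the `SLACK` window standing in for the representable region are all hypothetical, and the legalisation shown (keep bits [38:0], force bit 39 to differ from bit 38, zero the rest) is just one possible implementation-defined choice.

```python
from dataclasses import dataclass, replace

SLACK = 1 << 12  # hypothetical stand-in for the representable region's slack


@dataclass(frozen=True)
class Cap:
    tag: bool
    base: int
    top: int       # exclusive upper bound
    address: int


class CheriBoundsFault(Exception):
    """Stand-in for a CHERI bounds (length) exception."""


def is_valid_sv39(addr: int) -> bool:
    top = addr >> 38                       # bits [63:38]
    return top == 0 or top == (1 << 26) - 1


def legalise(addr: int) -> int:
    """Assumed legalisation: invalid addresses keep bits [38:0], get
    bit 39 forced to differ from bit 38, and lose all higher bits."""
    if is_valid_sv39(addr):
        return addr
    low = addr & ((1 << 39) - 1)
    bit38 = (addr >> 38) & 1
    return ((bit38 ^ 1) << 39) | low


def set_address(cap: Cap, addr: int) -> Cap:
    """SetAddress with a toy representability check that clears the tag."""
    representable = cap.base - SLACK <= addr < cap.top + SLACK
    return replace(cap, address=addr, tag=cap.tag and representable)


def jump_target(pcc: Cap, target: int) -> Cap:
    addr = legalise(target)                # step 1: legalise
    if not (pcc.base <= addr < pcc.top):   # step 2: bounds check against PCC
        raise CheriBoundsFault(hex(addr))
    return set_address(pcc, addr)          # step 3: SetAddress

almighty = Cap(tag=True, base=0, top=1 << 64, address=0x1000)
new_pcc = jump_target(almighty, 1 << 38)   # invalid Sv39 target
print(new_pcc.tag, is_valid_sv39(new_pcc.address))
```

With the almighty capability the jump itself succeeds and the tag survives, so the invalid address would be caught by the baseline RISC-V fetch check. With a tightly bounded PCC, the same target fails the bounds check in step 2 instead.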

As every address that is in bounds is also representable, the representability check in step 3 is not actually necessary microarchitecturally. With this being the case, we preserve baseline invalid address behaviour where capability bounds are generous enough to allow the invalid transformation (e.g. almighty cap). (The particular transformation to an invalid address seems to be implementation defined, so it's possible that implementations may differ in some cases in which fault they throw.)

If PCC is thus legalised, then Xepc populated from an exception would always be pre-legalised.

The path of reading and writing legalised address CSRs would also perform new_cap := SetAddress(cap,legalised(cap.address)), including the representability check, to implement WARL.
As this is the slow path, and SetAddress is required for a number of capability CSR writes that only write the address, hopefully this is not an extra overhead.
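
A rough sketch of that CSR write path, again as a toy model: `Cap`, `write_address_csr`, the `SLACK` representability window and the zero-extending `legalise` are assumptions for illustration only, and sealed capabilities are not modelled here.

```python
from dataclasses import dataclass, replace

SLACK = 1 << 12  # hypothetical stand-in for the representable region's slack


@dataclass(frozen=True)
class Cap:
    tag: bool
    base: int
    top: int       # exclusive upper bound
    address: int


def is_valid_sv39(addr: int) -> bool:
    top = addr >> 38                       # bits [63:38]
    return top == 0 or top == (1 << 26) - 1


def legalise(addr: int) -> int:
    """Assumed lossy legalisation: invalid addresses keep bits [38:0], get
    bit 39 forced to differ from bit 38, and are zero-extended."""
    if is_valid_sv39(addr):
        return addr
    low = addr & ((1 << 39) - 1)
    bit38 = (addr >> 38) & 1
    return ((bit38 ^ 1) << 39) | low


def set_address(cap: Cap, addr: int) -> Cap:
    """SetAddress: the tag is cleared if the new address is unrepresentable."""
    representable = cap.base - SLACK <= addr < cap.top + SLACK
    return replace(cap, address=addr, tag=cap.tag and representable)


def write_address_csr(cap: Cap) -> Cap:
    """Value that ends up in a WARL address CSR such as Xepc."""
    return set_address(cap, legalise(cap.address))


# Almighty capability: the address is mangled but stays representable.
almighty = Cap(tag=True, base=0, top=1 << 64, address=1 << 63)
print(write_address_csr(almighty))

# Tight bounds just below the upper half: legalisation moves the address
# far away from the bounds, so the representability check clears the tag.
tight = Cap(tag=True, base=0xFFFF_FFBF_FFFF_0000,
            top=0xFFFF_FFBF_FFFF_1000, address=0xFFFF_FFBF_FFFF_0000)
print(write_address_csr(tight))
```

In this model the tag is never cleared merely because the address is invalid. It is cleared only when the legalised address can no longer be represented against the capability's bounds.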

So to summarise, related to original comment, we're not thinking that we would immediately clear the tag for illegal addresses, but logically perform a representability check first (except in the fiddly case of a sealed capability, in which case it should always clear the tag). This representability check isn't actually necessary for PC, thankfully, and should be endurable for CSR writes.

@jonwoodruff
Contributor

jonwoodruff commented Aug 17, 2023

One additional comment:
There may be some utility in choosing your illegal address representation carefully.
Specifically, if the legalisation can preserve the address interpretation for "small" steps into illegal space, then the representability check can be avoided in some important cases.

If you simply preserve the 40th bit (for sv39) such that invalidAddress == (addr[39]!=addr[38]), and then sign extend from bit 40 to produce any full address, this has the advantage of actually accurately preserving addresses just under the system range and just above the user range so that representability checks are not required when it is statically known that you can't have wandered too far from a legal address.

Examples that would fall into this category are executing up into illegal addresses, or branching (with a 12-bit immediate) into illegal addresses.

@tariqkurd-repo

I agree with the first of @jonwoodruff's comments - I think that nicely summarises what we discussed.
For the second point about sign extending from addr[40] - yes, this is certainly possible, so the legalisation for invalid addresses (any bit in addr[63:39] not matching addr[38]) would be:

addr[63:41] - truncated
addr[40] - unchanged
addr[39]!=addr[38]
addr[38:0] - unchanged

and then when expanding back to 64-bits sign extend from addr[40].
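
As a sketch of this format, the snippet below stores 41 bits and expands by sign extension from addr[40]. The function names `legalise` and `expand` are hypothetical, and reading the addr[39] line as "force addr[39] to differ from addr[38]" is an interpretation of the list above.

```python
MASK64 = (1 << 64) - 1
STORED_MASK = (1 << 41) - 1  # bits [40:0] are kept


def is_valid_sv39(addr: int) -> bool:
    top = addr >> 38                       # bits [63:38]
    return top == 0 or top == (1 << 26) - 1


def legalise(addr: int) -> int:
    """Compress a 64-bit address to the 41-bit stored form."""
    if is_valid_sv39(addr):
        return addr & STORED_MASK          # bits 40, 39 and 38 already agree
    low = addr & ((1 << 39) - 1)           # addr[38:0] unchanged
    bit38 = (addr >> 38) & 1
    bit40 = (addr >> 40) & 1               # addr[40] unchanged
    return (bit40 << 40) | ((bit38 ^ 1) << 39) | low


def expand(stored: int) -> int:
    """Sign-extend the stored form from bit 40 back to 64 bits."""
    if stored & (1 << 40):
        return stored | (MASK64 & ~STORED_MASK)
    return stored


just_above_user = 0x0000_0040_0000_0000    # first address above the lower half
just_below_sys = 0xFFFF_FFBF_FFFF_FFFF     # last address below the upper half
far_away = 0x8000_0000_0000_0000           # deep inside the invalid range

for a in (just_above_user, just_below_sys, far_away):
    print(hex(a), "->", hex(expand(legalise(a))))
```

The two addresses adjacent to the valid ranges round-trip unchanged, while the far-away address is mangled but remains invalid, since the stored addr[39] still differs from addr[38].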

I think that for the CHERI spec itself this part should be a recommendation, so the exact legalisation format and expansion back to 64-bits is not dictated by the spec. Storing addr[40] and sign extending from addr[40] to addr[63] is more expensive than removing addr[40] and zero extending, for example.

@PeterRugg
Contributor

@tariqkurd-repo Agreed. All the spec needs to say is that the legalisation is always conceptually a setAddress, which of course comes with a representability check if it's part of a capability. Presumably the spec should also point out (in italics?) that certain legalisations allow the representability check to always pass for cases where it would be expensive, and recommend such a legalisation.

@tariqkurd-repo

Yes - I think leaving some of it loose is helpful, so that it doesn't constrain the implementation too tightly whilst maintaining the integrity of what CHERI is trying to achieve.
