Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StrictProvenance backend support, take 2022.05 #537

Merged
merged 14 commits into from
Jun 9, 2022

Conversation

nwf-msr
Copy link
Contributor

@nwf-msr nwf-msr commented May 31, 2022

Here we go again. This gets easier every time, at least. Especially after all the recent backend refactoring, I would almost dare suggest it was straightforward. This PR should be reviewable commit-by-commit. It has a prefix of preparatory NFC commits that don't (well, aren't supposed to) change anything and then the rest that actually do something.

This series re-introduces capptr::Arena and friends and pushes that through the backend ranges. The LargeBuddyRange-s always deal with Arena-bounded pointers in both directions; their consumers -- the backend chunk_alloc-ator and the SmallBuddyRange -- know how to tighten those down to Chunk-bounded pointers. For chunks that are being handed out to the frontend as such, the backend stashes the Arena-bounded authority inside the SlabMetadata using a wrapper type (for non-StrictProvenance architectures, there is a similar wrapper type in the same place that provides the usual sleight of hand). For chunks used by the SmallBuddyRange, it relies on a ProvenanceCaptureRange to hold the Arena-bounded authority within the Pagemap itself, rather like the LargeBuddyRange, while communicating using Chunk-bounded pointers with its child range(s).

This is missing the increased care around zeroing out internal pointers present in #510, so for that reason, if no other, is probably not ready to merge.

@nwf-msr nwf-msr requested review from mjp41, davidchisnall and rmn30 May 31, 2022 15:23
@nwf nwf force-pushed the 202205-cheri branch 3 times, most recently from 2e4ab61 to e87a1b3 Compare May 31, 2022 16:43
@nwf-msr
Copy link
Contributor Author

nwf-msr commented May 31, 2022

FWIW, I also have a version of this sitting atop #535.

@mjp41
Copy link
Member

mjp41 commented Jun 1, 2022

@nwf-msr so I think this is good. But I want to understand the precise design and threat model

  • We need to bound any allocation that we provide to the user.
  • We need to be able to consolidate allocations that have been given to the user back into larger allocations.
  • CHERI does not provide a consolidation mechanism, so we have to stash the original mmap permission and use that when we combine things.

I believe these are the core design constraints. These force the expanding the SlabMetadata and stashing the original capability in it.

Now there is an additional constraint that I am unsure if we want or need

  • Meta data should have reduced bounds.

Now, if we don't require meta-data to have reduced bounds, then we do not actually need to modify SmallBuddy or introduce the ProvenanceCaptureRange. The backend and meta data are all bounded to the original mmap range, and only the allocations going out to the user are bounded.

So as a principle of least required priviledge, bounding seems good, but it is unclear to me if it actually provides any additional safety guarantees. For instance, bounding the FrontendMetadata to its size, would mean that it doesn't give access to anything else. However, it contains a field with a permission to a lot of memory. This field is used infrequently, so could be "sealed", where as the metadata pointer is used on a fast path, so you might not want to seal it.

What is the principle you are driving for? Only infrequently accessed pointers have a large capability?

It seems the smallest change to get CHERI working is less than this PR?

@nwf-msr
Copy link
Contributor Author

nwf-msr commented Jun 4, 2022

@nwf-msr so I think this is good. But I want to understand the precise design and threat model

  • We need to bound any allocation that we provide to the user.
  • We need to be able to consolidate allocations that have been given to the user back into larger allocations.
  • CHERI does not provide a consolidation mechanism, so we have to stash the original mmap permission and use that when we combine things.

I believe these are the core design constraints.

I think that's right.

These force the expanding the SlabMetadata and stashing the original capability in it.

Of the varieties of amplification we've tried, I think the "larger SlabMetadata" is the least bad we've come up with, but I'd not characterize that particular data structure decision as forced, exactly. (I keep hoping for a less-redundant encoding, but that's probably at odds with keeping the number of cache-lines loaded per free down. So it goes.)

Now there is an additional constraint that I am unsure if we want or need

  • Meta data should have reduced bounds.

Now, if we don't require meta-data to have reduced bounds, then we do not actually need to modify SmallBuddy or introduce the ProvenanceCaptureRange. The backend and meta data are all bounded to the original mmap range, and only the allocations going out to the user are bounded.

If metadata were confined to the (eventual) allocator compartment, that's quite plausible. Right now, though, metadata includes not only the SlabMetadata but also the Allocator objects that the user comes to hold (

Config::Backend::template alloc_meta_data<T>(nullptr, sizeof(T));
). So, I think, in fact, this PR doesn't quite go far enough at the moment, in that those are not bounded to their size.

However, since we envision the compartmentalized design, in which Allocator pointers held by the user are sealed, that might argue that we want to bound metadata not in alloc_meta_data or the small-buddy range itself but only in callers that actually ship objects out to the user. We could have the small-buddy range understand two different kinds of dealloc based on the bounds of the object being given back. That might more-tightly couple SmallBuddyRange and ProvenanceCaptureRange, but we might be able to keep amplification as a (potentially) more generic call to the parent range that only some offer? Needs more thought, I think.

@nwf nwf force-pushed the 202205-cheri branch 2 times, most recently from a44bab3 to a1cea58 Compare June 6, 2022 23:25
@davidchisnall
Copy link
Collaborator

At the moment, malicious code can just grab the PageMap capability from the captable, so I don't think bounding the metadata actually buys us anything. Once we have per-library captables, domain transitions between libraries, and separate stacks, the fact that the allocator object is stored in TLS and can reach the unbounded metadata capabilities becomes concerning, but at that point we can solve it by sealing the capability in TLS, so I'm not sure if there's any point at which bounding the metadata capabilities actually buys us any security.

Copy link
Member

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with this in its current state. When this lands can we enable CHERI CI?

@nwf-msr
Copy link
Contributor Author

nwf-msr commented Jun 7, 2022

There's still some breakage in the morello-llvm compiler (https://git.morello-project.org/morello/llvm-project/-/issues/51) that means we can't generate .sos, but we can work around that and still run most of the tests, yes. (Self-hosting is, anyway, challenging right now because our build pipeline is a mixture of CHERI and non-CHERI executables.)

Speaking of, I have the start of a CHERI-specific test I should also add to this PR.

It'd also be good to check that none of this runs afoul of @rmn30's work with CHERI+MTE; ideally it should just be a bit of spelling changes to use this for his work, but if it turns out I've made something harder or impossible by mistake I wouldn't be hugely surprised.

@rmn30
Copy link
Contributor

rmn30 commented Jun 7, 2022

It'd also be good to check that none of this runs afoul of @rmn30's work with CHERI+MTE; ideally it should just be a bit of spelling changes to use this for his work, but if it turns out I've made something harder or impossible by mistake I wouldn't be hugely surprised.

All looks good so far. I'll find out once I try a rebase / merge.

@nwf-msr
Copy link
Contributor Author

nwf-msr commented Jun 7, 2022

Rebased and added two things to the test harness (the last two commits): a new bit in func/malloc to check for VMEM and a new func/cheri test for some narrow, CHERI-specific behaviors (verifying that we zero large objects by mmap correctly, implying that we internally amplified correctly) and that the pooled, metadata CoreAlloc objects are indeed bounded and lack VMEM authority.

@nwf-msr nwf-msr marked this pull request as ready for review June 7, 2022 17:06
Comment on lines 90 to 94
void arena_set(capptr::Arena<void> a)
{
arena = a;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this exist, given that it updates a field that has the same visibility as the set method?

Edit: I see that it's necessary to match the interface shared with the Lax version. Can we have a concept that encapsulates this and some comments explaining why the method exists here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the arena field could be private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a class, so it's all private, and the Backend is friend-ed to allow access. ("Ah, C++.")

I've moved this out to src/snmalloc/backend_helpers/cheri_slabmetadata_mixin.h and added a concept. See bdb64a2 (which replaces 72b47b2).

@davidchisnall
Copy link
Collaborator

It would be easier to review if the change from the CapPtr constructor to unsafe_from could be a separate NFC PR. There's a lot of code churn in this that hides the important bits.

@nwf-msr
Copy link
Contributor Author

nwf-msr commented Jun 8, 2022

It would be easier to review if the change from the CapPtr constructor to unsafe_from could be a separate NFC PR. There's a lot of code churn in this that hides the important bits.

I'm happy to split out any subset of the commits, sure, and I should perhaps have noted up front that this PR is structured to be reviewed commit-by-commit rather than all at once.

Comment on lines 90 to 94
void arena_set(capptr::Arena<void> a)
{
arena = a;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the arena field could be private?

nwf-msr added 7 commits June 8, 2022 11:49
Most ranges just deal with whatever kinds of ranges their parent deal with, but
that might be Chunk- or (soon) Arena-bounded.  This commit does not yet
introduce nuance, but just sets the stage.
Do not hard-code FrontendSlabMetadata, but rather take it as a template argment.
We're going to plumb other types through for StrictProvenance.
Expose a static CapPtr<T,B>::unsafe_from() and use that everywhere instead
(though continue to allow implicit and explicit construction of CapPtr from
nullptr).
Make these generic, with the SmallBuddyRange taking its cue from the parent
Range, since we're about to change them anyway and might want to vary them again
in the future.
This allows us to have a single Pipe-line of ranges where we can, nevertheless,
jump over the small buddy allocator when making large allocations.  This, in
turn, will let us differentiate the types coming from the small end and the
large "tap" on this Pipe-line.
nwf-msr added 7 commits June 8, 2022 11:49
Now that we've split the range Pipe-line externally, the small-buddy ranges
should never be seeing large requests.
Update the backend concept so that metadata allocations are Arena-bounded.
Wrap the FrontendSlabMetadata with a struct that holds the Arena-bounded
authority for Chunks that the Backend ships out to the Frontend or, for
non-StrictProvenance architecture, encapsulates the sleight of hand that turns
Chunk-bounded CapPtr-s to Arena-bounded ones.
These pieces of metadata (specifically, the Allocator structures) are never
deallocated at the moment, so we need not consider how we might amplify these
bounded pointers back to higher authority.
@nwf-msr
Copy link
Contributor Author

nwf-msr commented Jun 8, 2022

Bad timing on my part; sorry @rmn30!

The changes since what you reviewed are mostly confined to bdb64a2 but, in full, are https://github.com/microsoft/snmalloc/compare/af2adf3..ed88792

@nwf-msr nwf-msr requested review from davidchisnall, mjp41 and rmn30 June 8, 2022 10:59
Copy link
Member

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nwf-msr nwf-msr merged commit da19291 into microsoft:main Jun 9, 2022
@nwf nwf deleted the 202205-cheri branch June 9, 2022 00:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants