Updated naming conventions #84
…cn-tutorial into naming_conventions
This PR includes the original naming-conventions PR plus a bunch of extra tidying, polishing, and reworking in light of discussions. I believe everything is now consistent and the conventions are working pretty well. It is ready to merge IMHO. |
@cp526 We are waiting for your confirmation before merging this, whenever you get back to work... |
That looks reasonable, but – without wanting to drag this revision out more – one point I'm not completely convinced about is requiring CN identifiers to be uppercase. I see there's maybe some value in syntactically distinguishing C and CN variables, to emphasise that C variables refer to (pure snapshots of) mutable objects whereas CN variables are immutable. On the other hand, CN specifications should (as much as is reasonable) look natural to C programmers, who may want to follow the same naming conventions in their CN development as in their C code (possibly with project/organisation-specific naming conventions for identifiers). A second concern is entangled with specification language design. Rust, Haskell, and OCaml all require lowercase variable names and uppercase constructor names. A benefit is that variables and constructors are syntactically distinct, so when the user misspells a constructor name in a pattern-match expression the compiler can detect this rather than misreading it as a catch-all variable pattern (which would be a difficult-to-diagnose bug). CN is currently fine in that regard even without an enforced restriction on the capitalisation of variable and constructor names, because CN constructors always take a parenthesis-enclosed list of arguments ( Either way, ignoring variable names, I see no benefit in requiring CN identifiers for types, record field names, or constructor argument names to be capitalised (though I may have missed some discussion about this), and would think that lowercase names for these would better fit typical C code. |
I'm pretty persuaded that CN variables should be uppercase -- the tutorial
examples were unbelievably full of inconsistencies and confusions arising
from different possible answers to the question, "When I Own a bit of heap
structure from a C variable called x, what is the resulting CN value
called?" I really want there to be exactly one standard answer (and X
seems like the best one).
I care less about types, record fields, function names, etc.
|
On Thu, 5 Sept 2024 at 20:05, Benjamin Pierce ***@***.***> wrote:
I'm pretty persuaded that CN variables should be uppercase -- the tutorial
examples were unbelievably full of inconsistencies and confusions arising
from different possible answers to the question, "When I Own a bit of heap
structure from a C variable called x, what is the resulting CN value
called?" I really want there to be exactly one standard answer (and X
seems like the best one).
Another thought about this: in the larger examples there will tend to be
quite a lot of pure CN code, expressing the pure functional behaviour
part of the spec - eg for pKVM the CN analogue of all of the
executable-in-C spec here:
https://github.com/rems-project/linux/blob/pkvm-verif-6.4/arch/arm64/kvm/hyp/nvhe/ghost/ghost_spec.c
Requiring all those CN term variables to be upper case seems
quite noisy and unnatural to me. It also reminds me of the initial
HOL choice to make in-the-logic identifiers upper case to be
simply distinct from the SML identifiers, which I think they later
regretted.
p
|
What's the next step here? Continue with this PR as is, and try adopting the naming scheme @bcpierce00 proposed? Or is there an alternate proposal? I expect we'll make more changes to the syntax over the next year, so there will be future opportunities to change the naming scheme as well, as painful as that is. |
Renaming everything in the whole tutorial is a somewhat painful process,
and it will get worse as the tutorial and reference materials get bigger,
so it would be much better to reach agreement now if we can.
I think the proposals on the table are (1) go ahead with the PR as-is, or
(2) try an experiment with a different scheme that "makes more things
lowercase." Obviously, (2) would give us more information to guide
people's preferences. But we are lacking an explicit proposal for
precisely what the general rules should be and what guiding principles to
use to settle edge cases.
|
@@ -0,0 +1,8 @@
/*@
datatype Dll {
Should this be `Dllist` to mirror the C type `struct dllist`?
Yes, that would be better.
Ok, that makes sense. Before we think about considering other schemes, let me recap to make sure I'm understanding correctly. The main goal in adopting the uppercase naming scheme is to make a clear, explicit connection between CN-level things and C-level things, e.g.
The drawbacks are
Does that sound right? |
I suppose it makes sense to list non-goals as well. Here's what we're not trying to accomplish with the uppercase naming convention:
|
1. Avoiding top-level name conflicts between C and CN. Library
developers will already have a convention, and it would make sense to use
the same convention for CN identifiers. E.g. the pthreads library prefixes
pthreads_ to top-level identifiers, so CN top-level identifiers should
use the same prefix (but maybe capitalized 🙂).
2. Syntactically distinguishing C vs. CN identifiers. If we're worried
about developers getting confused by the type/provenance of an identifier,
there are other approaches we should consider, like go-to-definition code
navigation, showing the identifier type on hover in the IDE, etc.
In my mind (2) has been a "soft goal" of the naming conventions -- not to
enforce a rigid separation, but to maintain a general convention that
usually gives a clear hint whether someone is looking at a C thing or a CN
thing.
But, indeed, I could give up on this if others disagree. The thing that
seems crucial is offering a convention for naming analogous (heap) values
in the C and CN worlds.
|
From my perspective, maintaining a syntactic distinction between C things and CN things should be a non-goal. Indeed, in the implementation of the CN frontend we've put some effort into integrating C things more seamlessly into CN specifications (e.g. one can mention C variables without a quotation mechanism and "dereference" owned pointers in specifications). It's not completely clear where exactly the line should be drawn anyway: in |
From my understanding of earlier comments, this part of the scheme is one of the main motivations for using uppercase letters for CN identifiers -- so one can mechanically derive a good name for a CN variable introduced by a resource A different option would be to use The text above is from the section 'For new code', but for code not written by us (the more interesting case), a deterministic naming scheme is trickier.
|
It would be an option if we needed it, because pattern matching can only ever introduce new CN variables. If CN restricted variables to lower-case names and constructors to upper-case names, an uppercase string in a pattern (in a pattern-matching expression) could reliably be interpreted as a constructor name since it could not be introducing any new C variables.
|
I was just going to suggest something similar, e.g.
FWIW this is how I found myself naturally writing CN specs before @bcpierce00 started on the style guide. I think it's also worth being more explicit somewhere in the tutorial about this style of verification, where one builds a separate (logical) representation of correct behavior and then relates the logical representation to the C code – calling that out lets us make the point that one should pick some naming scheme for the logical representation that reflects the corresponding C implementation, even if it's not the one we recommend. |
I agree that the I remain uneasy about mashing the C and CN worlds together -- to me, |
I better understand this concern now, but I wonder whether this problem, in fact, arises from how we are using the name of the pointer (here Would it not be sufficient, and make specifications clearer, if the user could more freely pick a suitable name for the resource output -- perhaps based on its type, as previously suggested -- and we advise them to uniformly use the For completeness, rems-project/cerberus#312 discusses replacing the |
Hm. Maybe I'm confused in the same way we fear our user might be 🙂 . In my mind, the important distinction is made by the types involved. Our verification strategy, broadly speaking, is to develop a CN model of correct computation and relate it to the C implementation. The model happens to use richer abstractions, like mathematical integers, to more easily represent programmer intentions. In that sense, the If that's more or less accurate, and the types and the location of their definitions (in C vs in CN) convey the distinction, then I'd be in favor of relying on the tools for showing types and definitions in the IDE rather than pushing that information into naming conventions. |
Your `let v = Owned<int>(p)` example is telling -- this really captures the
inherent confusion that has been nagging at me for some time, and that has
made me give up on the idea of really, fully separating the two worlds. I
agree that the ‘embedding C in a larger world’ perspective is the right one.
|
Adds a style guide for CN style and naming conventions to distinguish between CN identifiers and C identifiers. There will be changes to the style guide shortly. See the PR for more details: #84
Co-authored-by: Liz Austell <[email protected]>