Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stdlib #423

Merged
merged 12 commits into from
Dec 18, 2024
Merged

Stdlib #423

merged 12 commits into from
Dec 18, 2024

Conversation

mrjazzybread
Copy link
Contributor

@mrjazzybread mrjazzybread commented Oct 20, 2024

Here is my pull request for the updated standard library. The goals of
this update are to:

  • Give a precise specification to the defintions within the library so
    that there is no ambiguity as to their meaning

  • Gear this library to the definition of mathematical and logical
    formulae that will be used within specifications. This means removing
    elements such as references and exceptions which do not make sense
    in purely logical contexts

In order to achieve these two goals, we have removed or modified some
of the definitions within this library. I will now go over them and
justify them (note that some of these definitions are still in the
standard library but only so that the tests still pass. In the
future, these should be modified so as to not rely on these
definitions):

Removed Definitions

References

Self explanatory: it does not make sense to talk about references
within purely logical formulae.

Bitwise Operations

Although one could imagine a specification where one uses bitwise
operations, these operations are out of place in a mathematical
library. Additionally, these definitions are already present in the
OCaml standard library and it does not make sense to duplicate them
here. A better alternative would be to give those functions a gospel
specification and allow users to use them if necessary in their
specifications.

Operations on Pairs

Similar justification to bitwise operations.

Sequence Module

In the sequence module we remove the functions filter_map,
fold_right, and renamed fold_left to fold In the case of filter_map
and fold_right, these are just compositions of previously defined
functions; in the case of the former filter and map, and in the case
of the later fold and rev. The renaming of fold_left felt natural
since there is only one fold function. I don't feel very strongly
about these changes and if enough of a need arises, we could add them
again.

List Module

The entire List module was removed from the standard library. The main
reason is that the list type serves a similar purpose to sequence
within specifications and it feels unnatural to have two very similar
ways of reasoning over an ordered sequence of elements. Currently, to
reason over lists in Gospel specifications, we just use a coercion
that transforms them into sequences; in the future when we add some
form of representation predicates to Gospel, we could have it so that
the logical model of the type list is sequence.

Array Module

As i said with regards to references, it does not make sense to talk
about arrays in a logical context

Bag Module

The occurrences function has been renamed to multiplicity. This also
implied changing some of the tests.

The is_empty function has been removed since one could simple state
b = empty as opposed to is_empty b.

The choose and choose_opt functions have also been removed. It is
unclear why one would use these in a specification.

The map has also been removed. This is because it is very difficult to
define it in the case that the parameterized function is not
injective.

The fold function has also been removed. This is because it is
impossible to define this function without also defining a canonical
ordering of the elements within the bag.

The _exists and for_all predicates have also been removed as well as
the partition function. To be honest, I do not remember the reasons
why.

The conversion function to a sequence has also been removed since this
would also imply a canonical ordering. The conversion from a sequence
has also been removed, but I think that might have been a mistake.

Set Module

The compare function has been removed, no idea what kind of
specification one could give it or why one would use it in a
specification.

The remaining removed functions are similar to those from the Bag
module and therefore I don't think there is a need to explain them
again.

Order module

Since this module is about mathematical definitions it feels out of
place to define a class of valid ordering functions for OCaml
code. Although it makes sense to have this definition somewhere in
Gospel, we should place it somewhere else.

System Variables and Exceptions

Similar Reasoning as to references.

Added Definitions

I don't think anything that we have added merits in depth explanation:
mostly it was just defining all the functions we could and
axiomatizing the rest. The majority of these definitions have been
formalized and verified in Coq with a few exceptions these beings some
axioms regarding cardinality and a few recently added axioms regarding
subsequences.

Future Work

One glaring omission is the absence of axioms regarding arithmetic operations. These are absent since they are very cumbersome although fairly straight forward.

I would like have better notation for all of these Gospel data types:
notably it would be pleasant to have pretty notation for the init and mem functions.

Naturally, if these changes are accepted, the tests should be changed
so as to not rely on deprecated definitions. Personally, I think we
should leave them as is and tackle them once we have ways of talking
about ownership of OCaml values. This is because some of these
conflicts, notably with references and arrays, are due to the fact
that Gospel does not have a good way of talking about them in
specifications. I believe that once we have added some form of
representation predicates, that these tests can be rewritten quite
naturally.

Finally, at some point we should separate these modules into distinct
files. For now, since the standard library is still quite small, it is
still manageable, however, as we add more things to it, it can very
quickly become a big mess.

@mrjazzybread mrjazzybread changed the title HStdlib Stdlib Oct 20, 2024
@n-osborne n-osborne self-requested a review October 31, 2024 15:26
Copy link
Contributor

@n-osborne n-osborne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I suggest you run git rebase -i --exec "dune runtest --auto-promote; dune fmt" main. This will put the PR in a cleaner state, render the last commit unnecessary and move the ocaml-ci to green.

On the topic of CI, this PR definitely deserve a changelog!

I see you didn't include the Coq/Rocq proof. I would have been at least curious, and I think it could make sense to include it in the repo.

Regarding commit messages, I suggest we follow these good practices.
They are pretty common and already followed by a non-negligible subset of the commits of this repo.

Now, regarding Gospel formatting, I believe

In commit 44ec14c, you introduce a white space in the signature of integer_of_int at the level of the argument, this is not consistent with the previous style, nor it is a general modification of style. This also has no logical connection with the topic of the commit.

As a general remark about formatting, I feel we lack consistency. This is not just an aesthetic consideration, a consistent formatting will help the reader.

For example, we have three different formatting styles here:

  (*@ axiom append_elems_right :
     forall s s' i.
      length s <= i < length s + length s' ->
      (s ++ s')[i] = s'[i - length s] *)

  (*@ axiom subseq_l :
       forall s i. 0 <= i < length s -> s[i ..] = s[i .. length s] *)

  (*@ axiom subseq : forall s i i1 i2.
    0 <= i1 <= i < i2 <= length s -> s[i] = (s[i1 .. i2])[i-i1] *)

My personal preference goes to the first with:

  • quantification on the first line
  • one antecedent per line (I personally like the arrow at the beginning of the line)
  • conclusion on the last line

Some functions are just declared and their semantic is given in the form of axioms, and other are implemented.
I guess it is to minimize the amount of Coq/Rocq proof?
Still, it is a bit surprising for the reader, letting them wonder about whether this is an interface, an implementation file, something else?
The reader also has to switch between reading logical axiom and implementation while going through the standard library, this is not the most pleasant experience.
Some module documentation with some explanation about the purpose of this standard library and how the semantic of the symbols is defined could do a lot of good.

In order to homogenize how the semantic of functions are defined, maybe we could just declare the one that are, at the moment of this review, implemented, and use the implementation term in an axiom? Would this generate too much rework on the Coq/Rocq side?

For example:

  (*@ function singleton (x: 'a) : 'a t = init 1 (fun _ -> x) *)

Would become:

  (*@ function singleton (x: 'a) : 'a t *)

  (*@ axiom singletion_definition :
       forall x.
       singletion x = init 1 (fun _ -> x) *)

This would also have the benefit of letting tools needing an implementation (i.e Ortac) free of choosing an efficient one.

For example, the following implementation of Sequence.mem is clearly not what we want in the context of dynamic verification:

  (*@ predicate mem (x: 'a) (s: 'a t) = multiplicity x s > 0 *)

I'm still not convinced about using the arrow type for maps. But I guess this is a design question and it has been already discussed elsewhere.

On the topic of the removal of fold_right, I think we should keep it. fold_left and fold_right are pretty common combinators. I suspect the user discovering a fold function would wonder which it is. Also, again regarding dynamic verification (i.e Ortac), using a combination of fold(_left) and rev is not what we want for performance reasons. Finally, I think it will render specifications needing fold_right a bit cumbersome and heavy. (Note that if you want to get the fold_right signature that OCaml programmer's are used to, you'll have to flip the function argument -- and we don't have Fun.flip).

The same line of argument applies to the removal of filter_map.

@mrjazzybread mrjazzybread force-pushed the stdlib branch 3 times, most recently from ce369d0 to 4c425c0 Compare November 5, 2024 22:29
@mrjazzybread
Copy link
Contributor Author

I agree with most of your points, I normalized the formatting as you requested. I also made it so each function is abstract and re-added fold_left fold_right. With regards to the Coq proof, I didn't add it since I only want to put it in the repo once we have a stable version of the library. If there isn't anything too objectionable with what we have here, I will work on adding these changes to the Coq proof and adding it to the PR.

@mrjazzybread mrjazzybread force-pushed the stdlib branch 5 times, most recently from 5169df2 to 751ea17 Compare November 10, 2024 10:55
@mrjazzybread
Copy link
Contributor Author

I have added the Coq proof, as well as fixing some small mistakes. There are two Coq files for the proof, the first is the gospelstdlib_mli.v which is the automatically generated file which contains a module type Stdlib with all of the required definitions and lemmas. The second is the gospelstdlib_verfied.v which contains a module Gospelstdlib of type Stdlib. This means that this project will only compile if all of the definitions in the Stdlib module type are present. I have also added a _CoqProject file as well as a Coq Makefile. Important to note that this proof requires TLC.

In the future, once we integrate my changes into Gospel, we could have an extra test which checks if the gospelstdlib_verified file is still congruent with the current standard library by translating it into Coq and checking if the project still compiles.

Copy link
Contributor

@n-osborne n-osborne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I went through the Sequence module again and made some inlined suggestions (feel free to squash them at the right place afterward).
I also have two more general remarks:

  1. Not all functions and predicates have an informal documentation
  2. I wonder about the order. Some axioms at the beginning of the module make use of functions that are defined later (append_* axioms mostly)

@n-osborne
Copy link
Contributor

Since we have partial functions in Gospel, I am wondering what is the semantic of second order functions when one of the function argument is partial.

For example, the Sequence.init function is defined as:

(*@ function init (n: integer) (f: integer -> 'a) : 'a t *)

My guess would be that init n f is not defined when f is not defined for an x in 0..n-1].

Do we need to document that? And if we do, how do we do it?

@mrjazzybread mrjazzybread force-pushed the stdlib branch 3 times, most recently from 135730b to db86738 Compare December 9, 2024 12:17
@mrjazzybread
Copy link
Contributor Author

Hello, I fixed all of the things you noted. I also updated the change
log. With regards to what you noted about higher order functions, I
don't understand the problem. When you say init is undefined when f is
not defined in the given range, wouldn't this problem arise regardless
of whether or not the function was partially applied?

In any case, I would say that if Gospel assumes functional
extensionality then I would say that we do not need any special
specifications for partially applied functions since we can leverage
the fact that forall f g x y. f x = g <-> f x y = g y.

@n-osborne
Copy link
Contributor

Thanks!

The more I review this PR, the more I think this is a very nice new standard library.

I haven't reviewed the Coq/Rocq proof, as I am certainly not an expert.
Am I correct in understanding that this provides a Coq/Rocq implementation of the Gospel standard library proven correct w.r.t the Gospel axioms?

I have just a few remaining small remarks:

  • line 347: Creates -> creates
  • move line 428 to 422, attaching the informal documentation to its item
  • line 466: predicate finite doesn't have any informal documentation (even a trivial one)
  • line 510: maybe we can remove it already?
  • line 666: predicate finite doesn't have any informal documentation (again, even a trivial one)
  • line 673: maybe we can remove it already?

With regards to what you noted about higher order functions, I don't understand the problem.

There is no problem :-), I was just wondering. In retrospect, this is something that should be documented in the language definition rather than in the standard library documentation.

How do you want to proceed about the three modules at the end to be removed? You want to do it in this PR or another one? I believe this should be done before a release with the new standard library.

@n-osborne
Copy link
Contributor

n-osborne commented Dec 10, 2024

The same line of argument applies to the removal of filter_map.

I believe you didn't add back Sequence.filter_map. We do use it in a test in Ortac here and there.

@mrjazzybread
Copy link
Contributor Author

I am going to remove as many deprecated definitions as I can in this PR. I am thinking I will do these changes in a separate commit, unless you prefer it get squashed into e2e183b

@n-osborne
Copy link
Contributor

It makes sense to remove them in another commit as it will contain an update of the tests.

Thanks for adding back Sequence.filter_map! (If you want to be acribic about your git history, you remove the definition in the first commit (here) before adding it back in the third one (here), with a better documentation though.
But it is totally ok as it is 😃 )

@mrjazzybread
Copy link
Contributor Author

mrjazzybread commented Dec 15, 2024

I removed the deprecated modules and changed the tests accordingly,
which consisted of coercing values of type list and array into
sequence or something similarly simple.

A few other changes I made:

  • Since the Order module was used by several tests, instead of copying
    it into each file, I added an Order.mli file with the is_pre_order
    predicate.
  • Related to the previous point, I fixed a bug where Gospel could not
    open a module contained in a file that starts with a capital
    letter.
  • I removed the syntax for array access since we want users to use
    sequences instead of reasoning over the array directly.
  • I re-added fold in the Set module since one of the Vocal examples
    used it and I didn't want to change it too much. I did however,
    change its signature: fold now takes two functions, one that maps
    elements into some other type 'b and a function that combines two
    values of type 'b. In the specification for fold, the latter
    function must be a commutative monoid.
  • To simplify the specification of fold I also added a definition for
    a monoid and a commutative monoid in the top level of the library.
  • I changed the history so that filter_map isn't removed

@n-osborne
Copy link
Contributor

Excellent: Thank you :-)

I believe 41162f0 could have deserved its own PR. But is is ok.

Once the conflict on the changelog has been resolved, this PR can be merged if you are ok with that.

@n-osborne n-osborne self-requested a review December 16, 2024 11:26
@mrjazzybread
Copy link
Contributor Author

mrjazzybread commented Dec 16, 2024

I thought of making it its own PR, but since this one is almost done, I figured why not. Besides the fact that the arithmetic operators don't have specifications, I believe that the standard library is in a pretty good state and can be added into Gospel.

@n-osborne n-osborne merged commit 3e83970 into ocaml-gospel:main Dec 18, 2024
6 checks passed
@n-osborne n-osborne mentioned this pull request Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants