Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Stdlib instead of deprecated Pervasives for 5.00.0+trunk #5

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

shakthimaan
Copy link

The Pervasives module has been deprecated in OCaml 5.00.0+trunk, and the recommendation is to use the Stdlib module instead.

Comment on lines +50 to +70
module State : sig
type t
val make : int -> int -> t
val copy : t -> t
val length : t -> int
val get : t -> int -> int
val set : t -> int -> int -> unit
val fold_left : ('a -> int -> 'a) -> 'a -> t -> 'a
val iteri : (int -> int -> unit) -> t -> unit
val as_array : t -> int array
end = struct
type t = int array
let copy a = Obj.(obj (with_tag abstract_tag (repr a)))
let make len elt = copy (Array.make len elt)
let length = Array.length
let get = Array.get
let set = Array.set
let fold_left = Array.fold_left
let iteri = Array.iteri
let as_array x = x
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reasoning behind this? Using Obj.magic below seems highly dangerous

@@ -18,7 +18,6 @@ open Options
open Ast
open Types
open Atom
open Pervasives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the compiler tell you there was no use of Stdlib here? If not removing an open is dangerous as it shadows comparision functions such as compare that might be used somewhere

@@ -18,7 +18,6 @@ open Util
open Ast
open Types
open Atom
open Pervasives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here

@mattiasdrp
Copy link

Hi, thanks for this PR. Could you split your commits in two, one for the Pervasives -> Stdlib and one for the redefinition of State with Obj and provide some explanation as to why this change?

@kit-ty-kate
Copy link
Contributor

@mattiasdrp done in #6

@lthls
Copy link

lthls commented Feb 16, 2022

A warning about this PR: I wrote this code a few months ago to allow cubicle to compile with Flambda 2 (except for the Pervasives -> Stdlib parts). I think it's not in a mergeable state right now: there's an Obj.magic call remaining, and the hash function will always return 0 for any state.

The main idea for the State module is to use from the start a representation of states that the GC doesn't need to scan. There are various other alternatives (some that don't even use Obj), but the complexity and impact on performance can vary.
I think that if letting the GC scan the state array is not too much of a problem, you should just merge #6 and forget this.
If you want to keep the same performance and be compatible with 5.0, then I would suggest discussing the options before making any decision.

@gasche
Copy link

gasche commented Feb 18, 2022

I ended up by clicking at stuff randomly. Apologies!

Two comments:

  • It is surprising to learn that a tricky part of a commit by @shakthimaan was in fact written by @lthls months ago. I would recommend using at least Co-authored-by: metadata, and preferably splitting commits to keep the original authors of well-identified changes.
  • If I understand correctly, Cubicle heavily relies on its "state" type (it's a model checker so there are lots of states :-) implemented as int array, and it is using an ugly set_tag hack to mark the arrays with Obj.no_scan_tag to avoid having the GC waste time traversing all states. A proposal: could states be represented using Bytes instead, using the Bytes.get_int64_ne and Bytes.set_int64_ne primitives for accessing elements?

(Another idea is Bigarray, but it's unclear to me what the performance impact would be, whereas Bytes sound like they perform comparably to normal arrays.)

@lthls
Copy link

lthls commented Feb 18, 2022

A proposal: could states be represented using Bytes instead, using the Bytes.get_int64_ne and Bytes.set_int64_ne primitives for accessing elements?

Pros: Efficient memory layout (only one more word compared to int arrays)
Cons: Needs conversions back and forth between Int64.t and int, slow accesses on non-x86 hardware

(Another idea is Bigarray, but it's unclear to me what the performance impact would be, whereas Bytes sound like they perform comparably to normal arrays.)

Pros: Support for int bigarrays available, reliable performance
Cons: Extra indirection for accesses compared to arrays

Both approaches also require patching the client code, which I was too lazy to do in my PR (but that shouldn't be a problem when implementing a robust solution).

@gasche
Copy link

gasche commented Feb 18, 2022

Needs conversions back and forth between Int64.t and int.

let get arr i = Int64.to_int (Bytes.get_int64_ne arr (i*8))
$ ocamlopt -dcmm -c test.ml
[...]
(function{test.ml:1,8-59} camlTest__get_81 (arr/83: val i/84: val)
 (+
   (<<
     (let (prim/224 (+ (<< i/84 3) -7)
           index/225 (>>s prim/224 1))
       (checkbound [...])
       (load_mut int (+ arr/83 index/225)))
     1)
   1))

It looks like the compiler removes the coercions here. (I had checked before making the suggestion.)

@lthls
Copy link

lthls commented Feb 18, 2022

It looks like the compiler removes the coercions here. (I had checked before making the suggestion.)

You'll notice that little (+ (<< (...) 1) 1) remaining. It's not a big deal, but you don't get that with int array.

@gasche
Copy link

gasche commented Feb 18, 2022

Ah, right. Meh.

@lthls
Copy link

lthls commented Feb 18, 2022

To be honest, the problem is that the coercions are only removed with the native code compiler for a 64-bit x86 backend. Try the same code with the bytecode compiler, or on a 32-bit architecture, or on a 64-bit Mac M1, and you'll get different code.

@gasche
Copy link

gasche commented Feb 18, 2022

I don't know if anyone runs model checkers on 32bit architectures anymore. Regarding arm64, I think that the situation could be improved in this case, are we are indexing with obviously-aligned indices (i*8). This being said, this was a suggestion in passing, for a solution somewhere nice in the axis between safety and performance. (I'm also a bit worried about the verbosity of bound checks with String/Bytes.) I suspect that your solution using Obj.with_tag is actually reasonable, in line with the ugly hack the code was previously using. One could use Custom instead of Abstract if hashing proved to be an issue.

@lthls
Copy link

lthls commented Feb 18, 2022

I don't know if anyone runs model checkers on 32bit architectures anymore.

Wait until people compile their model checkers to WebAssembly and run them in their browsers :)

One could use Custom instead of Abstract if hashing proved to be an issue.

Using a Custom tag implies writing a C stub (even the Obj API doesn't allow you to create well-formed custom blocks); if we're fine with that it's a decent solution. Redefining the hash function manually isn't that complicated either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants