-
Notifications
You must be signed in to change notification settings - Fork 44
Support for HEVM equivalence of Vyper Contracts and multiple source files. #205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/Act/CLI.hs
Outdated
foldM (\cmap spec'@(Contract cnstr _) -> do | ||
|
||
-- Creates maps of storage layout modes and bytecodes, for all contracts contained in the given Act specification | ||
createContractMap :: [Contract] -> Map (Maybe Id) (LayoutMode, ByteString, ByteString) -> IO (Map Id (Contract, BS.ByteString, BS.ByteString), Map Id LayoutMode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just one map with type ap Id (Contract, BS.ByteString, BS.ByteString, LayoutMode)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally checkContract
uses the layout mode and the bytecodes in different places, and the mapping from Ids to bytecodes is aliased as CodeMap
in HEVM.hs
. I was trying to keep the alias pure and also minimize changes to preexisting code. Looking at it again though, it does indeed seem more sensible to have the separation happen after createContractMap
. I will modify accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not too bad having it separate from the beginning. I would only change it if merging the two maps leads to cleaner code.
src/Act/CLI.hs
Outdated
(_, initcode'', runtimecode') = fromJust $ Map.lookup Nothing inputsMap | ||
pure (Map.singleton cid (spec', initcode'', runtimecode'), Map.singleton cid VyperLayout) | ||
_ -> do | ||
render (yellow "Warning: " <> text "Multiple contracts in spec mapped to a single Vyper file" <> line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also just throw an error here. It's very unlikely that the Act source will have multiple specs that match the same bytecode.
src/Act/HEVM.hs
Outdated
cmap' <- applyUpdates cmap cmap upds | ||
let acmap = abstractCmap initAddr cmap' | ||
pure (EVM.Success (preconds' <> caseconds' <> cdataprops <> symAddrCnstr cmap') mempty ret' (M.map fst cmap'), acmap) | ||
bounds <- storageBounds cmap $ nub $ concatMap locsFromExp preconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only the storage bounds from the preconditions and not the ones that appear on the updates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explanation is the same as the one in the comment in translateConstructor
. All locations that are active in a behaviour are present in preconds
since they have all had integer bounds generated there. I will add a reference to the comment in said function.
src/Act/HEVM.hs
Outdated
layout <- getLayout | ||
let (slot, off, size) = getPosition layout cid name | ||
pure (EVM.Lit (fromIntegral slot), EVM.Lit $ fromIntegral off, size) | ||
let (slot, off, size, layoutMode) = getPosition layout cid name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing whitespace
cmap' <- applyUpdates initmap initmap upds | ||
-- After `addBounds`, `preconds` contains all integer locations that have been constrained in the Act spec. | ||
-- All must be enforced again to avoid discrepancies. Necessary for Vyper. | ||
bounds <- storageBounds cmap $ nub $ concatMap locsFromExp preconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as with behaviors. Is it because the location of the preconditions are a superset of the locations appearing in the creates/updates as we have already added bounds? Even if this is true we might want to be more precise as the preconditions may contain locations that the creates/update never access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some complications with being more precise with the storage bounds, by looking only at creates/updates. The first is that the contract may possibly execute arithmetic using a location, thus requiring the corresponding bounds to be present, but not store the result of the expression in some storage location, instead e.g. only using it for control flow. Said location would not appear in the updates. Secondly, even if some storage bound was not actually necessary, it would still be present as a precondition in the Expr of the Act translation, so absence on the bytecode's branches side would lead to a discrepancy in the allowed input spaces. Lastly, if we were to include both the locations from the updates and the preconditions for the reasons described, then as you said, all locations in updates would be contained in preconds
, so there is no reason to actually look at the updates again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! This deserves a comment in the code.
src/Act/HEVM.hs
Outdated
layout <- getLayout | ||
let (slot, off, size) = getPosition layout cid name | ||
pure (EVM.Lit (fromIntegral slot), EVM.Lit $ fromIntegral off, size) | ||
let (slot, off, size, layoutMode) = getPosition layout cid name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trailing whitespace
storage | ||
x => n | ||
|
||
constructor of A2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
@@ -0,0 +1,214 @@ | |||
constructor of Token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!
@@ -0,0 +1,214 @@ | |||
constructor of Token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's come up with some structure for the tests directory so that we don't have to copy specs that are the same for Solidity and Vyper.
For example, can we have a directory where we put one Act spec, and two source files (Solidity and Vyper) and we are checking against both.
, vy :: w ::: Maybe String <?> "Path to .vy" | ||
, code :: w ::: Maybe ByteString <?> "Runtime code" | ||
, initcode :: w ::: Maybe ByteString <?> "Initial code" | ||
, sources :: w ::: Maybe String <?> "Path to sources .json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard would it be to use this json file to allow writing Act contracts that span multiple files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not too hard. If I am interpreting the code correctly there are some requirements of uniqueness in the names of pointers across contracts (?), but if so this can be dealt with by merging the parser's result into a single Act
instance before typechecking. This would mean that as far as the type checker is concerned, all contracts may have come from the same file, which seems fine for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at it more closely, although handling multiple spec files is as simple as concatenating them when the specs are valid, error handling becomes problematic with the current structure. Type checking requires access to all contracts simultaneously. If that is implemented, then each error inside the Err
validation would need to also hold the file in which the position refers to.
@@ -0,0 +1,18 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a multi-contract version of AMM where Tokens are in Vyper and the AMM contract in Solidity?
It doesn't have to be added in this iteration (though it wouldn't hurt if it's not to much time), but it would be nice if:
- The AMM contract used the "real" ERC20 specification (not the simple on this uses now). Ideally, the ERC20 would live in a different act file (see comment about putting Act specs in multiple files)
- The AMM contract could be verified when using Tokens implemented in a different source language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
The PR is in really good shape and very close to being ready to merge. I just have a few comments I’d like us to go over before merging.
This PR adds support for Vyper contracts and proof of their equivalence through HEVM. To facilitate this, since Vyper allows for a single contract per file, support for multiple source files is added. Also added support for the Bitwuzla SMT theory solver, which demonstrated better performance and compatibility with the queries produced by HEVM during equivalence checking.
HEVM equivalence additions
Supporting contracts produced by Vyper requires supporting Vyper's storage layout. There are 2 main differences with Solidity's layout that need to be accounted for.
Each storage element, whether a variable, a mapping slot or a mapping's element (or in future an array element), occupies a whole slot, regardless of type. This allows for the avoidance of bit masking techniques to extract a value from a slot. Instead, the whole slot is assumed to hold the value of the corresponding element.
For slots that hold integer values, this creates the requirement to satisfy the integer bounds of the corresponding integer type. This requirement is not enforced on the pre-state on a per-transition basis by the compiler, but holds only as an invariant across transitions. Thus, to prove equivalence, it is enough to assume the requirement for the pre-state, and given that, equivalence of behavior and input spaces ensures that the bounds hold on the post-state (as arithmetic in Vyper is checked).
For behaviors, the locations that have bounds applied to them must be drawn from all cases, and then applied to each case of the behavior. This may lead to locations that are not actually used in a specific case having their bounds constraint present in said cases' preconditions. This is necessary because the input space of a behavior is checked for equivalence against the input space of the corresponding result from HEVM's symbolic execution, which cannot differentiate between cases beforehand, and thus must have all constraints applied.
Note that, in Solidity, assuming the integer bounds is not required, as the number of bits used in masking depends on each variable's size, and thus the bounds hold trivially. For now, it is still done as a sanity check.
When computing a mapping's element's slot, before application of the Keccak hash, the order of the concatenation of the mapping's slot index and the element's padded value, is reversed compared to Solidity's.
Results
An implementation of ERC20 in Vyper has been added to the test suite, and is successfully proven equivalent to the specification used for the existing Solidity test. Use of Bitwuzla is recommended, as cvc5 presents compatibility issues, and Z3 presents performance issues.
Multiple source files
When working with a single contract, Vyper compatibility is enabled by passing the single Vyper file to the new
--vy
argument instead of--sol
.To work with multiple source files, the
--sources
argument is added. It receives a .json file with the following structure:specifications
list which contains all the relevant Act files' paths. Note that the file paths are relative to the json file's path.sources
object which contains a key for each relevant source file's path. The paths map to objects that contain alanguage
key to specify between Solidity and Vyper (and compiled Bytecode in the future).contracts
object which contains a key for each contract in the Act specification. The contracts map to objects that contain asource
key, which maps to the file path of the source file that implements said contract.The following is an example with 2 files and 2 contracts:
Note that in the future, error messages when providing multiple Act specifications need to be improved.
Miscellaneous