Skip to content

Conversation

lefterislazar
Copy link
Collaborator

@lefterislazar lefterislazar commented Sep 29, 2025

This PR adds support for Vyper contracts and proof of their equivalence through HEVM. To facilitate this, since Vyper allows for a single contract per file, support for multiple source files is added. Also added support for the Bitwuzla SMT theory solver, which demonstrated better performance and compatibility with the queries produced by HEVM during equivalence checking.

HEVM equivalence additions

Supporting contracts produced by Vyper requires supporting Vyper's storage layout. There are 2 main differences with Solidity's layout that need to be accounted for.

  1. Each storage element, whether a variable, a mapping slot or a mapping's element (or in future an array element), occupies a whole slot, regardless of type. This allows for the avoidance of bit masking techniques to extract a value from a slot. Instead, the whole slot is assumed to hold the value of the corresponding element.
    For slots that hold integer values, this creates the requirement to satisfy the integer bounds of the corresponding integer type. This requirement is not enforced on the pre-state on a per-transition basis by the compiler, but holds only as an invariant across transitions. Thus, to prove equivalence, it is enough to assume the requirement for the pre-state, and given that, equivalence of behavior and input spaces ensures that the bounds hold on the post-state (as arithmetic in Vyper is checked).
    For behaviors, the locations that have bounds applied to them must be drawn from all cases, and then applied to each case of the behavior. This may lead to locations that are not actually used in a specific case having their bounds constraint present in said cases' preconditions. This is necessary because the input space of a behavior is checked for equivalence against the input space of the corresponding result from HEVM's symbolic execution, which cannot differentiate between cases beforehand, and thus must have all constraints applied.
    Note that, in Solidity, assuming the integer bounds is not required, as the number of bits used in masking depends on each variable's size, and thus the bounds hold trivially. For now, it is still done as a sanity check.

  2. When computing a mapping's element's slot, before application of the Keccak hash, the order of the concatenation of the mapping's slot index and the element's padded value, is reversed compared to Solidity's.

Results

An implementation of ERC20 in Vyper has been added to the test suite, and is successfully proven equivalent to the specification used for the existing Solidity test. Use of Bitwuzla is recommended, as cvc5 presents compatibility issues, and Z3 presents performance issues.

Multiple source files

When working with a single contract, Vyper compatibility is enabled by passing the single Vyper file to the new --vy argument instead of --sol.

To work with multiple source files, the --sources argument is added. It receives a .json file with the following structure:

  • A specifications list which contains all the relevant Act files' paths. Note that the file paths are relative to the json file's path.
  • A sources object which contains a key for each relevant source file's path. The paths map to objects that contain a language key to specify between Solidity and Vyper (and compiled Bytecode in the future).
  • A contracts object which contains a key for each contract in the Act specification. The contracts map to objects that contain a source key, which maps to the file path of the source file that implements said contract.

The following is an example with 2 files and 2 contracts:

{
  "specifications": ["multi1.act"],
  "sources": {
    "file1.sol" : {
      "language" : "Solidity"
    },
    "file2.vy" : {
      "language" : "Vyper"
    }
  },
  "contracts" : {
    "A1" : {
      "source" : "file2.vy"
    },
    "A2" : {
      "source" : "file1.sol"
    }
  }
}

Note that in the future, error messages when providing multiple Act specifications need to be improved.

Miscellaneous

  • Small corrections in the integer bounds application logic.
  • Regeneration of typing test results due to above changes.
  • Addition of tests for Vyper contracts and mixed Vyper-Solidity multi-contract specifications.
  • Set Bitwuzla as default solver for HEVM CI tests, due to better performance.
  • Changed the AMM test to utilize the ERC20 specification and implementation
  • Restructured test/hevm folder to allow multiple runs to utilize the same specifications and implementations. For reference, see ERC20 and AMM.

@lefterislazar lefterislazar requested a review from zoep September 29, 2025 02:29
src/Act/CLI.hs Outdated
foldM (\cmap spec'@(Contract cnstr _) -> do

-- Creates maps of storage layout modes and bytecodes, for all contracts contained in the given Act specification
createContractMap :: [Contract] -> Map (Maybe Id) (LayoutMode, ByteString, ByteString) -> IO (Map Id (Contract, BS.ByteString, BS.ByteString), Map Id LayoutMode)
Copy link
Collaborator

@zoep zoep Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just one map with type ap Id (Contract, BS.ByteString, BS.ByteString, LayoutMode)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally checkContract uses the layout mode and the bytecodes in different places, and the mapping from Ids to bytecodes is aliased as CodeMap in HEVM.hs. I was trying to keep the alias pure and also minimize changes to preexisting code. Looking at it again though, it does indeed seem more sensible to have the separation happen after createContractMap. I will modify accordingly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not too bad having it separate from the beginning. I would only change it if merging the two maps leads to cleaner code.

src/Act/CLI.hs Outdated
(_, initcode'', runtimecode') = fromJust $ Map.lookup Nothing inputsMap
pure (Map.singleton cid (spec', initcode'', runtimecode'), Map.singleton cid VyperLayout)
_ -> do
render (yellow "Warning: " <> text "Multiple contracts in spec mapped to a single Vyper file" <> line)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just throw an error here. It's very unlikely that the Act source will have multiple specs that match the same bytecode.

src/Act/HEVM.hs Outdated
cmap' <- applyUpdates cmap cmap upds
let acmap = abstractCmap initAddr cmap'
pure (EVM.Success (preconds' <> caseconds' <> cdataprops <> symAddrCnstr cmap') mempty ret' (M.map fst cmap'), acmap)
bounds <- storageBounds cmap $ nub $ concatMap locsFromExp preconds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only the storage bounds from the preconditions and not the ones that appear on the updates?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanation is the same as the one in the comment in translateConstructor. All locations that are active in a behaviour are present in preconds since they have all had integer bounds generated there. I will add a reference to the comment in said function.

src/Act/HEVM.hs Outdated
layout <- getLayout
let (slot, off, size) = getPosition layout cid name
pure (EVM.Lit (fromIntegral slot), EVM.Lit $ fromIntegral off, size)
let (slot, off, size, layoutMode) = getPosition layout cid name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trailing whitespace

cmap' <- applyUpdates initmap initmap upds
-- After `addBounds`, `preconds` contains all integer locations that have been constrained in the Act spec.
-- All must be enforced again to avoid discrepancies. Necessary for Vyper.
bounds <- storageBounds cmap $ nub $ concatMap locsFromExp preconds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as with behaviors. Is it because the location of the preconditions are a superset of the locations appearing in the creates/updates as we have already added bounds? Even if this is true we might want to be more precise as the preconditions may contain locations that the creates/update never access.

Copy link
Collaborator Author

@lefterislazar lefterislazar Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some complications with being more precise with the storage bounds, by looking only at creates/updates. The first is that the contract may possibly execute arithmetic using a location, thus requiring the corresponding bounds to be present, but not store the result of the expression in some storage location, instead e.g. only using it for control flow. Said location would not appear in the updates. Secondly, even if some storage bound was not actually necessary, it would still be present as a precondition in the Expr of the Act translation, so absence on the bytecode's branches side would lead to a discrepancy in the allowed input spaces. Lastly, if we were to include both the locations from the updates and the preconditions for the reasons described, then as you said, all locations in updates would be contained in preconds, so there is no reason to actually look at the updates again.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! This deserves a comment in the code.

src/Act/HEVM.hs Outdated
layout <- getLayout
let (slot, off, size) = getPosition layout cid name
pure (EVM.Lit (fromIntegral slot), EVM.Lit $ fromIntegral off, size)
let (slot, off, size, layoutMode) = getPosition layout cid name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing whitespace

storage
x => n

constructor of A2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

@@ -0,0 +1,214 @@
constructor of Token
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@@ -0,0 +1,214 @@
constructor of Token
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's come up with some structure for the tests directory so that we don't have to copy specs that are the same for Solidity and Vyper.

For example, can we have a directory where we put one Act spec, and two source files (Solidity and Vyper) and we are checking against both.

, vy :: w ::: Maybe String <?> "Path to .vy"
, code :: w ::: Maybe ByteString <?> "Runtime code"
, initcode :: w ::: Maybe ByteString <?> "Initial code"
, sources :: w ::: Maybe String <?> "Path to sources .json"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard would it be to use this json file to allow writing Act contracts that span multiple files?

Copy link
Collaborator Author

@lefterislazar lefterislazar Sep 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not too hard. If I am interpreting the code correctly there are some requirements of uniqueness in the names of pointers across contracts (?), but if so this can be dealt with by merging the parser's result into a single Act instance before typechecking. This would mean that as far as the type checker is concerned, all contracts may have come from the same file, which seems fine for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at it more closely, although handling multiple spec files is as simple as concatenating them when the specs are valid, error handling becomes problematic with the current structure. Type checking requires access to all contracts simultaneously. If that is implemented, then each error inside the Err validation would need to also hold the file in which the position refers to.

@@ -0,0 +1,18 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a multi-contract version of AMM where Tokens are in Vyper and the AMM contract in Solidity?

It doesn't have to be added in this iteration (though it wouldn't hurt if it's not to much time), but it would be nice if:

  • The AMM contract used the "real" ERC20 specification (not the simple on this uses now). Ideally, the ERC20 would live in a different act file (see comment about putting Act specs in multiple files)
  • The AMM contract could be verified when using Tokens implemented in a different source language.

Copy link
Collaborator

@zoep zoep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

The PR is in really good shape and very close to being ready to merge. I just have a few comments I’d like us to go over before merging.

@lefterislazar lefterislazar changed the base branch from vyper to main September 29, 2025 15:06
@lefterislazar lefterislazar requested a review from zoep October 2, 2025 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants