
Multithreaded Compilation #166

Open

brockelmore opened this issue Nov 22, 2021 · 14 comments

Labels: C-forge Command: forge · Cmd-forge-build Command: forge build · P-high Priority: high · T-feature Type: feature

@brockelmore
Member

brockelmore commented Nov 22, 2021

Requested Feature

For sufficiently large repos, solc compilation can be the slowest part of testing. In those cases we should multithread compilation as much as possible. Here is a rough sketch of one potential way to achieve this.

Suggested solution

  1. Determine a threshold for multithreading compilation (e.g. the number of .sol files in the repo) above which the parallelism offsets its overhead
  2. Spin up one thread per file to determine the dependency graph
  3. Hash the trees/dependency graph and remove duplicate trees
  4. Compile the needed contracts in parallel, using the existing threads

There could be issues here that I am unaware of, but I'm documenting my thoughts anyway; a rough sketch of the idea is below.
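As an illustration of steps 1 and 2, here is a minimal, self-contained Rust sketch (not Foundry code): a naive import scanner stands in for real Solidity parsing, the threshold constant is an arbitrary placeholder, and steps 3 and 4 are elided.

```rust
use std::collections::HashMap;
use std::thread;

// Step 1: placeholder threshold below which parallelism likely costs more
// than it saves; the right value would have to be measured.
const MIN_FILES_FOR_PARALLEL: usize = 16;

// Naive stand-in for real Solidity parsing: grab the quoted path from
// lines like `import "./A.sol";`.
fn parse_imports(source: &str) -> Vec<String> {
    source
        .lines()
        .filter(|l| l.trim_start().starts_with("import"))
        .filter_map(|l| l.split('"').nth(1).map(str::to_string))
        .collect()
}

// Step 2: build the file-level dependency graph, one thread per file once
// past the threshold.
fn dependency_graph(files: &[(String, String)]) -> HashMap<String, Vec<String>> {
    if files.len() < MIN_FILES_FOR_PARALLEL {
        return files.iter().map(|(n, s)| (n.clone(), parse_imports(s))).collect();
    }
    thread::scope(|scope| {
        let handles: Vec<_> = files
            .iter()
            .map(|(name, src)| scope.spawn(move || (name.clone(), parse_imports(src))))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let files = vec![
        ("A.sol".to_string(), "contract A {}".to_string()),
        ("B.sol".to_string(), r#"import "./A.sol"; contract B {}"#.to_string()),
    ];
    // Steps 3 and 4 (deduping shared subtrees, invoking solc per subset)
    // would consume this graph.
    println!("{:?}", dependency_graph(&files));
}
```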

@maxsam4
Contributor

maxsam4 commented Nov 30, 2021

I feel like the overhead will not be worth it in most cases. Individual contracts are usually so small that invoking solc is going to consume more time than solc compiling them.

This feature would make more sense in solc itself, but I wouldn't put it on the priority list even there.

@lucas-manuel

I tend to agree that it would be cool but could be more trouble than it is worth, especially if we have compilation caching etc.

@mattsse
Member

mattsse commented Dec 27, 2021

this is actually already in use for workspaces that require different solc versions
gakonst/ethers-rs#652

@onbjerg onbjerg added T-feature Type: feature C-forge Command: forge labels Jan 19, 2022
@gakonst
Member

gakonst commented Feb 18, 2022

@mattsse bumping this, seems like next up in our priority list along with #769?

@onbjerg onbjerg added Cmd-forge-build Command: forge build P-high Priority: high labels Feb 19, 2022
@onbjerg
Member

onbjerg commented Feb 19, 2022

Marking this and #769 as high prio if I understand your comment correctly

@mattsse mattsse self-assigned this Mar 8, 2022
@gakonst gakonst changed the title Multithreaded Compilation s Mar 17, 2022
@onbjerg onbjerg changed the title s Multithreaded Compilation Mar 21, 2022
@lucas-manuel

Close-able? @mds1

@mds1
Collaborator

mds1 commented May 18, 2023

@mattsse wdyt here? Slow compilation times are a big pain point these days so this could be valuable, and per #166 (comment) it sounds like it might not be too hard to implement

@gakonst
Member

gakonst commented May 22, 2023

I believe this is not tractable: gakonst/ethers-rs#943 (comment). What makes compilation slow these days? Repos got bigger? Via-ir? Forge-std?

(Closing)

@gakonst gakonst closed this as not planned May 22, 2023
@brockelmore
Member Author

brockelmore commented May 22, 2023

> I believe this is not tractable: gakonst/ethers-rs#943 (comment). What makes compilation slow these days? Repos got bigger? Via-ir? Forge-std?
>
> (Closing)

Fwiw, I do think for some repos it may be worthwhile, depending on the number of roots in the project, where a root is a contract that no other contract inherits or imports (the diagram below shows how a project could be split into two roots based on its dependency graph). I generally believe this is tractable, but the number of projects that need it may be few and far between: if I import my complex contract A and use it as an interface elsewhere, I still have to compile the contract in its entirety (though if the project uses interface contracts, it is probably worth splitting up). This is a different kind of parallelization than trying to do any crazy linking, and it is much more straightforward. I think it would only have a benefit where via-ir is on (also, all of this should in theory be done at the compiler level, not in Foundry...)

```mermaid
graph TD;
    subgraph root_2
         A-->B;
         A-->C;
         C-->G;
         B-->H;
    end
    subgraph root_1
        D-->E;
        D-->F;
    end
```
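To make the grouping concrete, here is a small self-contained sketch (illustrative only, not Foundry code): a union-find over the import edges from the diagram recovers the two independent roots.

```rust
use std::collections::HashMap;

// Find the representative of `x`, compressing paths as we go.
fn find(parent: &mut HashMap<&'static str, &'static str>, x: &'static str) -> &'static str {
    let p = parent[x];
    if p == x {
        return x;
    }
    let root = find(parent, p);
    parent.insert(x, root);
    root
}

fn main() {
    // Import/inheritance edges from the diagram above.
    let edges = [("A", "B"), ("A", "C"), ("C", "G"), ("B", "H"), ("D", "E"), ("D", "F")];
    let mut parent: HashMap<&'static str, &'static str> = HashMap::new();
    for (a, b) in edges {
        parent.entry(a).or_insert(a);
        parent.entry(b).or_insert(b);
    }
    // Union the endpoints of every edge.
    for (a, b) in edges {
        let (ra, rb) = (find(&mut parent, a), find(&mut parent, b));
        if ra != rb {
            parent.insert(ra, rb);
        }
    }
    // Each connected component is one independently compilable root.
    let mut groups: HashMap<&str, Vec<&str>> = HashMap::new();
    let names: Vec<&'static str> = parent.keys().copied().collect();
    for n in names {
        let root = find(&mut parent, n);
        groups.entry(root).or_default().push(n);
    }
    // Prints two groups: {A, B, C, G, H} and {D, E, F}.
    for members in groups.values() {
        println!("compilation unit: {members:?}");
    }
}
```

In practice the edges would come from the resolved import graph rather than a hard-coded list.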

@0xLightt

0xLightt commented Nov 29, 2023

Hi all,

I'm revisiting the idea of parallel compilation in Foundry. After a discussion in one of the Solidity team's weekly meetings, it was indicated that they don't currently plan to implement parallel compilation in solc. However, they suggested an alternative approach of running solc for different files in parallel, rather than for the whole repo. With the shift towards via-ir potentially becoming the default mode for Solidity compilation, I believe exploring parallel compilation approaches could become even more crucial for optimizing build times, especially for larger projects.

Since this issue was previously closed, have there been any new developments or considerations about integrating parallel compilation in Foundry? This could involve strategies like the one suggested by the Solidity team or other methods to effectively parallelize the build process. I recognize the complexity of this feature but am interested in whether there's room for further discussion or contributions.

Thanks for your work on Foundry!

@mattsse
Member

mattsse commented Nov 29, 2023

no plans atm because the complexity overhead is quite huge.

but I can see that this would still be a great feature to have so reopening to not lose it

@mattsse mattsse reopened this Nov 29, 2023
@k06a

k06a commented Feb 29, 2024

@gakonst an option to compile files in parallel could be a great and relatively easy solution.

@dezzeus

dezzeus commented Mar 13, 2024

Any marginal (even experimental, opt-in) improvement would be much appreciated, as my team and I are working on a major project with around 250 Solidity files, and it takes almost an hour to compile with via_ir enabled…

@cameel

cameel commented May 27, 2024

Hey everyone! I've been tasked with exploring this topic in the context of improving the speed of compilation via IR. We (Solidity) wanted to know what the obstacles for this kind of parallelized compilation are and if we can do anything to make it easier for frameworks.

Theoretically, the parallelization is very simple already: take the Standard JSON input containing all the sources and split it into a series of inputs, where each one uses settings.outputSelection to request output for only a single contract. The compiler will perform compilation and optimization only for the one you selected. It will still analyze all the sources, but the later stages of the pipeline are orders of magnitude slower than analysis, so that should not matter much.
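A minimal sketch of that splitting, assuming serde_json and a precomputed list of (source file, contract name) pairs; the field names follow solc's Standard JSON interface, and only outputSelection changes between the clones:

```rust
use serde_json::{json, Value};

// Produce one Standard JSON input per contract. Each clone keeps every
// source (solc still needs them all for analysis) but narrows
// `settings.outputSelection` so only the selected contract is compiled
// and optimized.
fn split_per_contract(full_input: &Value, contracts: &[(String, String)]) -> Vec<Value> {
    contracts
        .iter()
        .map(|(file, contract)| {
            let mut single = full_input.clone();
            single["settings"]["outputSelection"] = json!({});
            single["settings"]["outputSelection"][file.as_str()][contract.as_str()] =
                json!(["abi", "evm.bytecode", "evm.deployedBytecode"]);
            // Each `single` can now go to its own `solc --standard-json`
            // process; the per-contract outputs are merged afterwards.
            single
        })
        .collect()
}
```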

To benchmark it, I created a proof of concept script (parasolc) that can be passed in place of a solc binary to forge --use. Here's also the full report with my findings: The parasolc experiment.

Now, as some here already suspected, the overhead of doing it this way is just staggering. The projects I benchmarked require 3-4 times as much work compared to sequential compilation. The report attempts to explain where all that time is going, but the short of it is that this kind of parallelization makes the compiler repeat the same work multiple times in several ways (which may differ between projects): bytecode dependencies (contracts deployed with new) can no longer be reused for contracts that depend on them, and the same sources are analyzed multiple times.

(Charts for Uniswap and OpenZeppelin omitted; see the linked report.)

Still, while expensive, this method does provide an actual improvement in wall-clock time spent on compilation. Here are the numbers I got on an 8-core machine.

| Benchmark | Real time | CPU time |
| --- | --- | --- |
| OpenZeppelin (sequential) | 39 s | 39 s |
| OpenZeppelin (parallelized) | 27 s | 144 s |
| Uniswap v4 (sequential) | 166 s | 165 s |
| Uniswap v4 (parallelized) | 76 s | 499 s |

It appears that with enough cores you can still come out ahead despite the overhead. While this is far from what I was hoping to present here, and does not seem like a good choice for the default compilation mode, it's still a trade-off that may make sense in some situations. It may work better for some projects than others, depending on how they are structured and how interdependent their contracts are. The method is simple enough that it might make sense as an optional feature.

Can we do better?

I explored workarounds, like grouping bytecode dependencies together to improve reuse or culling of sources irrelevant to the contract being compiled. That's also in the report. Both unfortunately come with significant downsides and/or don't improve the situation as much as one would hope.

There's the idea described by @brockelmore above (and already used by e.g. Hardhat) of identifying independent clusters of contracts and compiling each cluster separately. It's largely orthogonal to what I explored here - it just shifts the problem inside the groups - and it's not effective against projects with tightly interconnected sources. Still, I think it's worthwhile when applicable, and we'd like to see more tools using it. It is not technically complicated and all the information is already out there, but it does require parsing the AST to identify the imports. If that's an obstacle, the one thing we could do to make it more straightforward would be to make the dependency graph between sources available as a separate output. Or even just outright assign each contract to a group based on it. Would Foundry make use of such an output?
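For what it's worth, the AST parsing mentioned above is fairly shallow: with `ast` requested per source in outputSelection, the import edges can be read off the ImportDirective nodes. A sketch, assuming serde_json and solc's documented AST output shape:

```rust
use serde_json::Value;
use std::collections::HashMap;

// Build the source-level dependency graph from solc's Standard JSON output,
// assuming the input requested the AST per file via
// `"outputSelection": { "*": { "": ["ast"] } }`.
fn import_graph(output: &Value) -> HashMap<String, Vec<String>> {
    let mut graph = HashMap::new();
    if let Some(sources) = output["sources"].as_object() {
        for (path, source) in sources {
            let imports = source["ast"]["nodes"]
                .as_array()
                .into_iter()
                .flatten()
                .filter(|node| node["nodeType"] == "ImportDirective")
                .filter_map(|node| node["absolutePath"].as_str().map(str::to_string))
                .collect();
            graph.insert(path.clone(), imports);
        }
    }
    graph
}
```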

Another idea, that would only work for IR though, would be to do unoptimized IR generation sequentially (i.e. request the ir output for all contracts) and then only do optimization in parallel by compiling the produced Yul. Code generation is slower than analysis, but it's the optimization and Yul->EVM transform that are the real bottlenecks so it should provide some savings. The downside here is that it would be a little more complex than the naive method and only viable after we fix ethereum/solidity#15062. It also only addresses the analysis overhead and would do nothing for bytecode dependencies.
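Sketched again in Standard JSON terms (field names follow the documented interface; the orchestration around the solc invocations, and the fix above, are assumed), the two phases might look like:

```rust
use serde_json::{json, Value};

// Phase 1 (sequential): one input requesting unoptimized IR for everything.
// Analysis and code generation happen once, in a single solc process.
fn ir_generation_input(sources: Value) -> Value {
    json!({
        "language": "Solidity",
        "sources": sources,
        "settings": { "outputSelection": { "*": { "*": ["ir"] } } }
    })
}

// Phase 2 (parallel): one input per emitted Yul object; each can be
// optimized and lowered to EVM bytecode in its own solc process.
fn yul_optimization_input(yul_source: &str) -> Value {
    json!({
        "language": "Yul",
        "sources": { "ir.yul": { "content": yul_source } },
        "settings": {
            "optimizer": { "enabled": true },
            "outputSelection": { "*": { "*": ["evm.bytecode"] } }
        }
    })
}
```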

In the longer term, there are things we could improve in the compiler, but I can't really tell if/when they will happen:

  • Reuse of analysis info: You can avoid overhead by just not doing analysis in parallel. Do it sequentially and then compile each contract in parallel by importing (rather than recalculating) all the info. This requires a way to reuse it - the current AST export does not cut it. The import redoes a lot of analysis.
  • Lazy analysis: Avoid analyzing definitions that are not relevant. Currently the analysis works in distinct phases, where each has to analyze the whole input, because it makes no assumptions about which parts the subsequent phases will need.
  • Bytecode reuse: A way to serialize and reuse already compiled bytecode for contracts that depend on it.
  • Multithreaded compilation: TBH we consider this one too complex and disruptive to introduce in the current state of the compiler, so while it could happen some day, it won't be any time soon. We'd rather explore the ideas based on multiprocessing first.
