Reducing reliance on IFD #41

TeofilC · 2023-12-19T13:48:43Z

I'm trying to use this with a project that has approx 500 transitive dependencies. This means that initial builds are quite slow, since IFD causes things to happen in sequence. In particular, the all-cabal-hashes-component-<package> derivations seem to take quite a while. Each one is cheap, but since they don't run in parallel, and Nix checks whether each one exists on the caches, the total adds up. I'm also a bit worried that this makes caching dependencies tricky.

I'm wondering if it would be possible to reduce our reliance on IFD. I think even if we could have a couple of bigger IFD derivations rather than lots of small ones, that would be good. Do you have any thoughts/plans about this?

cdepillabout · 2023-12-27T05:06:57Z

I agree this is quite annoying. I'm also using stacklock2nix on a project that has a large number of transitive deps, and the initial builds are really slow. The only good thing is that once you have done at least one build, subsequent builds are fast.

I think there are sort of two main problems here:

1. The all-cabal-hashes-component-<package> derivations are done one-at-a-time, and there are many of them.
2. Forced usage of IFD.

These are both closely related.

stacklock2nix uses the haskellPackages.callCabal2nix and haskellPackages.callHackage functions from Nixpkgs, for instance in the following places:

stacklock2nix/nix/build-support/stacklock2nix/default.nix

Lines 376 to 399 in 909a362

    
    extraGitDep =
      let
        srcName = haskPkgLock.name + "-git-repo";
        rawSrc = builtins.fetchGit {
          url = haskPkgLock.git;
          name = srcName;
          rev = haskPkgLock.commit;
          allRefs = true;
        };
        src =
          if haskPkgLock ? "subdir" then
            runCommand (srcName + "-get-subdir-" + haskPkgLock.subdir) {} ''
              cp -r "${rawSrc}/${haskPkgLock.subdir}" "$out"
            ''
          else
            rawSrc;
      in {
        name = haskPkgLock.name;
        value =
          hfinal.callCabal2nix
            haskPkgLock.name
            src
            (getAdditionalCabal2nixArgs haskPkgLock.name haskPkgLock.version);
      };

stacklock2nix/nix/build-support/stacklock2nix/default.nix

Lines 401 to 424 in 909a362

    
    extraUrlDep =
      let
        srcName = haskPkgLock.name + "-url";
        rawSrc = builtins.fetchurl {
          name = srcName;
          url = haskPkgLock.url;
          sha256 = haskPkgLock.sha256;
        };
        src =
          runCommand (srcName + "-unpacked" + lib.optionalString (haskPkgLock ? "subdir") ("-get-subdir-" + haskPkgLock.subdir)) {} ''
            # We are assuming the input file is a tarball.
            # TODO: Is it okay to always assume this??
            mkdir ./raw-input-source
            tar -xf "${rawSrc}" -C ./raw-input-source --strip-components=1
            cp -r "./raw-input-source/${haskPkgLock.subdir or ""}" "$out"
          '';
      in {
        name = haskPkgLock.name;
        value =
          hfinal.callCabal2nix
            haskPkgLock.name
            src
            (getAdditionalCabal2nixArgs haskPkgLock.name haskPkgLock.version);
      };

stacklock2nix/nix/build-support/stacklock2nix/default.nix

Lines 679 to 690 in 909a362

    
    stackYamlLocalPkgsOverlay = hfinal: hprev:
      let
        localPkgToOverlayAttr = { pkgName, pkgPath }: {
          name = pkgName;
          value =
            let
              additionalArgs = getAdditionalCabal2nixArgs pkgName null;
            in
            hfinal.callCabal2nix pkgName pkgPath additionalArgs;
        };
      in
      builtins.listToAttrs (map localPkgToOverlayAttr localPkgs);

stacklock2nix/nix/build-support/stacklock2nix/default.nix

Lines 307 to 315 in 909a362

    
    pkgHackageInfoToNixHaskPkg = pkgHackageInfo: hfinal:
      let
        additionalArgs = getAdditionalCabal2nixArgs pkgHackageInfo.name pkgHackageInfo.version;
      in
      overrideCabalFileRevision
        pkgHackageInfo.name
        pkgHackageInfo.version
        pkgHackageInfo.cabalFileHash
        (hfinal.callHackage pkgHackageInfo.name pkgHackageInfo.version additionalArgs);

These functions use IFD.
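For anyone unfamiliar with the term: IFD (import-from-derivation) means evaluation has to pause and build a derivation before it can continue, because the Nix code being imported is itself a build output. A minimal sketch, not taken from stacklock2nix or Nixpkgs (the names `generated` and `answer` are made up for illustration):

```nix
let
  pkgs = import <nixpkgs> { };

  # A derivation whose output contains a Nix file.
  generated = pkgs.runCommand "generated-nix" { } ''
    mkdir -p "$out"
    echo '{ answer = 42; }' > "$out/default.nix"
  '';
in
# Importing a path inside the output forces Nix to build `generated`
# in the middle of evaluation. This is the IFD.
(import "${generated}/default.nix").answer
```

With the `allow-import-from-derivation` setting turned off, evaluation fails at the `import`. callCabal2nix and callHackage follow the same pattern, with cabal2nix producing the file that gets imported.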

The haskellPackages.callHackage function is the source of the all-cabal-hashes-component-<package> derivations, as you can see if you trace through Nixpkgs:

https://github.com/NixOS/nixpkgs/blob/f930306a698f1ae7045cf3265693b7ebc9512f23/pkgs/development/haskell-modules/make-package-set.nix#L191

https://github.com/NixOS/nixpkgs/blob/f930306a698f1ae7045cf3265693b7ebc9512f23/pkgs/development/haskell-modules/make-package-set.nix#L147-L163

stacklock2nix also does a little work on its own to make sure you're using the correct Hackage revision of a given package, as you can see in https://github.com/cdepillabout/stacklock2nix/blob/909a362869fab3b9276f0016af1df050044a4ea0/nix/build-support/stacklock2nix/fetchCabalFileRevision.nix.

A straightforward way to speed up stacklock2nix would be to reduce calls to haskellPackages.callHackage.

As you can see with the all-cabal-hashes-component-<package> derivations, cabal2nix is called independently many times through the haskellPackages.callHackage function, once for each transitive dependency, in a bunch of independent derivations.

An alternative might be to create a single derivation, where cabal2nix is called multiple times (once for each transitive dependency). I feel like I've heard that https://horizon-haskell.net/ takes this approach, although I'm not familiar enough with the code-base to figure out where this would be taking place. This is also at least somewhat similar to how hackage2nix works, and how stack2nix works.
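A rough sketch of what such a single derivation could look like, assuming the unpacked package sources are already available as an attribute set from package name to source directory. The name `cabal2nix-many` and the `srcs` argument are hypothetical. The `cabal2nix` call is modeled loosely on what Nixpkgs does for a single package, leaving out the compiler, system, hash, and locale settings for brevity:

```nix
{ lib, runCommand, cabal2nix-unwrapped }:

# srcs: attribute set mapping a package name to a directory that
# contains that package's .cabal file (hypothetical input).
srcs:

runCommand "cabal2nix-many" {
  nativeBuildInputs = [ cabal2nix-unwrapped ];
} ''
  # Give cabal2nix a writable HOME, as Nixpkgs does.
  export HOME="$TMPDIR"
  mkdir -p "$out"
  # One cabal2nix call per package, all inside the same build.
  ${lib.concatStringsSep "\n" (lib.mapAttrsToList (name: src: ''
    cabal2nix "${src}" > "$out/${name}.nix"
  '') srcs)}
''
```

Importing files from the output is then one IFD step instead of one per package. The trade-off is that changing any one source rebuilds the whole derivation.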

There would still be a problem of figuring out what to do with fetchCabalFileRevision.nix, but we could always turn that into a separate issue. (edit: This has been addressed by @TeofilC in #42)

The forced use of IFD in stacklock2nix is also a problem.

Since writing stacklock2nix, I've started thinking that the following architecture would be better:

1. stacklock2nix creates a Nixpkgs-compatible Haskell package set, and writes it out as a .json or .nix file in a derivation.
2. Users can either check that .nix file into their repo in order to avoid IFD, or just import it directly in Nix code (making use of IFD).
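To make the two options concrete, here is a sketch of how a consumer might pick between them. Nothing here exists in stacklock2nix today: the file name `haskell-packages.nix`, the `generatedPkgSet` derivation, the `useIFD` flag, and the assumption that the generated file is a Haskell package overlay are all made up for illustration.

```nix
{ haskellPackages, generatedPkgSet, useIFD ? false }:

let
  overlay =
    if useIFD then
      # Import straight from the derivation output (IFD): nothing to
      # check in, but evaluation blocks until generatedPkgSet is built.
      import "${generatedPkgSet}/haskell-packages.nix"
    else
      # Import a copy committed to the repo: no IFD, but the file has
      # to be regenerated whenever the lock file changes.
      import ./haskell-packages.nix;
in
haskellPackages.extend overlay
```

Either way the generated file is the same, so the only thing the user chooses is where it lives.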

This architecture is similar to the thinking behind dream2nix, as well as stack2nix. This is also more-or-less what hackage2nix does for Nixpkgs.

The downside of this approach is that we likely wouldn't be able to reuse functions like haskellPackages.callHackage and haskellPackages.callCabal2nix from Nixpkgs. stacklock2nix might get even more complicated (which I'd like to avoid if possible).

I haven't started working on this at all, but personally I'd like to try to create a project like cabalsolver2nix, which would be like stacklock2nix but for users of cabal-install. My plan is to figure out how to do something like this for cabalsolver2nix, and then port the solution here to stacklock2nix. But if anyone else wants to try to solve this, feel free to just send a PR fixing this directly in stacklock2nix.

isomorpheme · 2024-01-04T10:59:14Z

This blog post might be relevant, I found it when I also ran into the serial builds and wondered how it could be fixed: https://jade.fyi/blog/nix-evaluation-blocking/

The approach of some-cabal-hashes there seems to be similar to the idea of creating a single derivation that does all the cabal2nix-ing.

TeofilC · 2024-01-04T14:52:57Z

Thanks @cdepillabout for your comprehensive response!

If I can find some time, I'll try to look at the callHackage thing you mention, and if I have a bit more time, I'll give the generating json/nix stuff a go. I was thinking along similar lines as well, so I'm glad to hear we are on the same page.

Your cabalsolver2nix idea sounds great! I hope you get a chance to implement it.

TeofilC · 2024-01-04T14:54:19Z

Thanks for that link @isomorpheme! It looks super helpful

cdepillabout · 2024-01-07T12:39:22Z

@isomorpheme Thanks for the link! That does sound exactly like what we need here :-)

Let me ping @lf- in case she has any direct suggestions for us.

lf- · 2024-01-08T08:21:25Z

I seem to recall I was contemplating putting some-cabal-hashes in nixpkgs. But yes I would suggest using my code for this ;)

cdepillabout mentioned this issue Aug 17, 2024

init dream2nix integration templates + tests #8

Closed
