Reducing reliance on IFD #41

Open
TeofilC opened this issue Dec 19, 2023 · 6 comments


@TeofilC
Contributor

TeofilC commented Dec 19, 2023

I'm trying to use this with a project that has approximately 500 transitive dependencies. This means that initial builds are quite slow, since IFD forces things to happen in sequence. In particular, the all-cabal-hashes-component-<package> derivations seem to take quite a while. Each is quite cheap, but since they don't run in parallel, and Nix wants to check whether each one exists in the caches, they are slow in aggregate. I'm also a bit worried that this makes caching dependencies tricky.

I'm wondering if it would be possible to reduce our reliance on IFD. I think even if we could have a couple of bigger IFD derivations rather than lots of small ones, that would be good. Do you have any thoughts/plans about this?

@cdepillabout
Owner

cdepillabout commented Dec 27, 2023

I agree this is quite annoying. I'm also using stacklock2nix on a project that has a large number of transitive deps, and the initial builds are really slow. The only good thing is that once you have done at least one build, subsequent builds are fast.

I think there are sort of two main problems here:

  1. The all-cabal-hashes-component-<package> derivations are done one-at-a-time, and there are many of them.
  2. Forced usage of IFD

These are both closely related.

stacklock2nix uses the haskellPackages.callCabal2nix and haskellPackages.callHackage functions from Nixpkgs, for instance in the following places:

extraGitDep =
  let
    srcName = haskPkgLock.name + "-git-repo";
    rawSrc = builtins.fetchGit {
      url = haskPkgLock.git;
      name = srcName;
      rev = haskPkgLock.commit;
      allRefs = true;
    };
    src =
      if haskPkgLock ? "subdir" then
        runCommand (srcName + "-get-subdir-" + haskPkgLock.subdir) {} ''
          cp -r "${rawSrc}/${haskPkgLock.subdir}" "$out"
        ''
      else
        rawSrc;
  in {
    name = haskPkgLock.name;
    value =
      hfinal.callCabal2nix
        haskPkgLock.name
        src
        (getAdditionalCabal2nixArgs haskPkgLock.name haskPkgLock.version);
  };

extraUrlDep =
  let
    srcName = haskPkgLock.name + "-url";
    rawSrc = builtins.fetchurl {
      name = srcName;
      url = haskPkgLock.url;
      sha256 = haskPkgLock.sha256;
    };
    src =
      runCommand (srcName + "-unpacked" + lib.optionalString (haskPkgLock ? "subdir") ("-get-subdir-" + haskPkgLock.subdir)) {} ''
        # We are assuming the input file is a tarball.
        # TODO: Is it okay to always assume this??
        mkdir ./raw-input-source
        tar -xf "${rawSrc}" -C ./raw-input-source --strip-components=1
        cp -r "./raw-input-source/${haskPkgLock.subdir or ""}" "$out"
      '';
  in {
    name = haskPkgLock.name;
    value =
      hfinal.callCabal2nix
        haskPkgLock.name
        src
        (getAdditionalCabal2nixArgs haskPkgLock.name haskPkgLock.version);
  };

stackYamlLocalPkgsOverlay = hfinal: hprev:
  let
    localPkgToOverlayAttr = { pkgName, pkgPath }: {
      name = pkgName;
      value =
        let
          additionalArgs = getAdditionalCabal2nixArgs pkgName null;
        in
        hfinal.callCabal2nix pkgName pkgPath additionalArgs;
    };
  in
  builtins.listToAttrs (map localPkgToOverlayAttr localPkgs);

pkgHackageInfoToNixHaskPkg = pkgHackageInfo: hfinal:
  let
    additionalArgs = getAdditionalCabal2nixArgs pkgHackageInfo.name pkgHackageInfo.version;
  in
  overrideCabalFileRevision
    pkgHackageInfo.name
    pkgHackageInfo.version
    pkgHackageInfo.cabalFileHash
    (hfinal.callHackage pkgHackageInfo.name pkgHackageInfo.version additionalArgs);

These functions use IFD.
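
For readers less familiar with the term, here is a minimal sketch of what import-from-derivation (IFD) looks like in general. It is not taken from stacklock2nix or Nixpkgs; the derivation name `generated-expr` and the generated attribute are made up for illustration, and `<nixpkgs>` is assumed to be on the Nix path.

```nix
# Illustration only: a toy example of IFD, not code from stacklock2nix.
{ pkgs ? import <nixpkgs> {} }:
let
  # A derivation whose output is itself a Nix expression.
  generated = pkgs.runCommand "generated-expr" {} ''
    mkdir -p "$out"
    echo '{ answer = 42; }' > "$out/default.nix"
  '';
in
  # Importing the output forces Nix to build `generated` in the middle of
  # evaluation. Evaluation blocks until that build (or cache lookup) finishes.
  (import generated).answer
```

With hundreds of such imports, each one is discovered and built only when the evaluator reaches it, which would explain why the builds appear to run one at a time.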

The haskellPackages.callHackage function is the source of the all-cabal-hashes-component-<package> derivations, as you can see if you trace through Nixpkgs:

https://github.com/NixOS/nixpkgs/blob/f930306a698f1ae7045cf3265693b7ebc9512f23/pkgs/development/haskell-modules/make-package-set.nix#L191

https://github.com/NixOS/nixpkgs/blob/f930306a698f1ae7045cf3265693b7ebc9512f23/pkgs/development/haskell-modules/make-package-set.nix#L147-L163

stacklock2nix also does a little work on its own to make sure you're using the correct Hackage revision of a given package, as you can see in https://github.com/cdepillabout/stacklock2nix/blob/909a362869fab3b9276f0016af1df050044a4ea0/nix/build-support/stacklock2nix/fetchCabalFileRevision.nix.


A straightforward way to speed up stacklock2nix would be to reduce calls to haskellPackages.callHackage.

As you can see with the all-cabal-hashes-component-<package> derivations, cabal2nix is called independently many times through the haskellPackages.callHackage function, once for each transitive dependency, in a bunch of independent derivations.

An alternative might be to create a single derivation, where cabal2nix is called multiple times (once for each transitive dependency). I feel like I've heard that https://horizon-haskell.net/ takes this approach, although I'm not familiar enough with the code-base to figure out where this would be taking place. This is also at least somewhat similar to how hackage2nix works, and how stack2nix works.
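
As a rough sketch of the single-derivation idea, something like the following might work. This is not how stacklock2nix, horizon-haskell, or hackage2nix actually do it: the function shape, the derivation name `batched-cabal2nix`, and the `pkgSrcs` argument (an attribute set mapping package names to unpacked source directories) are all assumptions made for illustration.

```nix
# Hypothetical sketch, not stacklock2nix's actual code.
# `pkgSrcs` is an assumed attribute set such as { aeson = <source dir>; ... }.
{ lib, runCommand, cabal2nix }:
pkgSrcs:
runCommand "batched-cabal2nix" { nativeBuildInputs = [ cabal2nix ]; } ''
  # Nixpkgs sets HOME in a similar way when it runs cabal2nix.
  export HOME="$TMPDIR"
  mkdir -p "$out"
  # One cabal2nix invocation per package, all inside this single build.
  ${lib.concatStringsSep "\n" (lib.mapAttrsToList (name: src: ''
    cabal2nix "${src}" > "$out/${name}.nix"
  '') pkgSrcs)}
''
```

Importing the generated files would still be IFD, but the evaluator would only have to wait on one derivation rather than one per transitive dependency. The trade-off is that any change to a single package's source rebuilds the whole batch.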

There would still be a problem of figuring out what to do with fetchCabalFileRevision.nix, but we could always turn that into a separate issue. (edit: This has been addressed by @TeofilC in #42)


The forced use of IFD in stacklock2nix is also a problem.

Since writing stacklock2nix, I've started thinking that the following architecture would be better:

  1. stacklock2nix creates a Nixpkgs-compatible Haskell package set, and writes it out as a .json or .nix file in a derivation.
  2. Users can either check that .nix file into their repo in order to avoid IFD, or just import it directly in Nix code (making use of IFD).
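
To make the two modes concrete, here is a hedged sketch of how a consumer might choose between them. Everything named here is hypothetical: `generatedPkgSet` stands for a derivation whose output is a single Nix file containing a Haskell overlay, `./generated-haskell-overlay.nix` stands for a copy of that file checked into the repo, and `useCheckedInCopy` is an invented flag.

```nix
# Hypothetical sketch of the proposed architecture, not existing stacklock2nix API.
{ generatedPkgSet, useCheckedInCopy ? true }:
if useCheckedInCopy then
  # Plain file in the repository: evaluated directly, no IFD involved.
  import ./generated-haskell-overlay.nix
else
  # Derivation output: Nix must build it during evaluation (IFD).
  import generatedPkgSet
```

Either branch is expected to yield the same overlay (`hfinal: hprev: { ... }`), so the rest of the Nix code would not need to know which mode is in use.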

This architecture is similar to the thinking behind dream2nix, as well as stack2nix. This is also more-or-less what hackage2nix does for Nixpkgs.

The downside of this approach is that we likely wouldn't be able to reuse functions like haskellPackages.callHackage and haskellPackages.callCabal2nix from Nixpkgs. stacklock2nix might get even more complicated (which I'd like to avoid if possible).

I haven't started working on this at all, but personally I'd like to try to create a project like cabalsolver2nix, which would be like stacklock2nix but for users of cabal-install. My plan is to figure out how to do something like this for cabalsolver2nix, and then port the solution here to stacklock2nix. But if anyone else wants to try to solve this, feel free to just send a PR fixing this directly in stacklock2nix.

@isomorpheme
Contributor

This blog post might be relevant. I found it when I also ran into the serial builds and wondered how they could be fixed: https://jade.fyi/blog/nix-evaluation-blocking/

The approach of some-cabal-hashes there seems to be similar to the idea of creating a single derivation that does all the cabal2nix-ing.

@TeofilC
Contributor Author

TeofilC commented Jan 4, 2024

Thanks @cdepillabout for your comprehensive response!

If I can find some time, I'll try to look at the callHackage thing you mention, and if I have a bit more time, I'll give the generating json/nix stuff a go. I was thinking along similar lines as well, so I'm glad to hear we are on the same page.

Your cabalsolver2nix idea sounds great! I hope you get a chance to implement it.

@TeofilC
Contributor Author

TeofilC commented Jan 4, 2024

Thanks for that link @isomorpheme! It looks super helpful

@cdepillabout
Owner

@isomorpheme Thanks for the link! That does sound exactly like what we need here :-)

Let me ping @lf- in case she has any direct suggestions for us.

@lf-

lf- commented Jan 8, 2024

I seem to recall I was contemplating putting some-cabal-hashes in nixpkgs. But yes I would suggest using my code for this ;)


4 participants