Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtins.mergeMapAttrs (or builtins.concatMapAttrs) #11887

Open
roberth opened this issue Nov 15, 2024 · 4 comments
Open

builtins.mergeMapAttrs (or builtins.concatMapAttrs) #11887

roberth opened this issue Nov 15, 2024 · 4 comments
Labels
feature Feature request or proposal idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. language The Nix expression language; parser, interpreter, primops, evaluation, etc performance

Comments

@roberth
Copy link
Member

roberth commented Nov 15, 2024

Is your feature request related to a problem? Please describe.

Nixpkgs lib defines concatMapAttrs f attrs, which is a very versatile function. (To those it may concern: by virtue of being almost the monadic bind operation on attrsets. See here. Almost => no associativity due to attribute name collisions iirc)

Describe the solution you'd like

Add to builtins a primop that behaves like Nixpkgs' lib.concatMapAttrs. It could be builtins.concatMapAttrs or a different name, such as flatMap attrs or mergeMapAttrs. Its implementation will be slightly more efficient than that in lib, and lib may reuse it, so that it becomes a polyfill.
The added efficiency comes from having no intermediate Nix lists, and may build the attrset in one go.

The current implementation in lib has the following behaviors, which could be optimized away (in order of increasing implementation work):

  1. performs more function calls (significant, since we're an interpreter, and our calls perform allocations)
  2. creates and sorts intermediate attrsets after each f call (except the final one which is the return value)
  3. creates and sorts intermediate attrsets returned by f
  • (1) and (2) a trivially solved by implementing this in C++.

  • (3) may feel unavoidable, but can be improved with Prototype: accumulate attrset updates, perform k-way merge #11290

    • or a variation that streams an insertion sort, "accumulating" individual attrs, rather than a bunch of attrsets.
    • this is a nice to have, because (1) and (2) are sufficient reason to implement the builtin.

If we pick a new name, we could make attribute name collisions an error, which iirc would make it a lawful monad in all the successful cases. Errors are better than buggy behavior somewhere deep down in some expression, because how would you find the bug.

Describe alternatives you've considered

  • Perhaps make it flatMapAttrs or mergeMapAttrs. concat kinda works by extrapolating from lists, but I'm not super happy about that.

  • mergeMapAttrsWith (name: values: merge name values) f attrs to handle the collisions.
    This is like zipAttrsWith (name: values: if length values == 1 then head values else m name values) but without allocating a whole bunch of singleton lists.

    • I feel like this isn't sufficiently distinct to warrant a separate builtin
    • => describe the connection in the docs instead.

Additional context

Priorities

Add 👍 to issues you find important.

@roberth roberth added feature Feature request or proposal language The Nix expression language; parser, interpreter, primops, evaluation, etc performance labels Nov 15, 2024
@hsjobeki
Copy link
Contributor

Could you give some example how a function call would look like?

Same as concatMapAttrs ?

mergeMapAttrs 
  (name: value: { ${name + "-mapped"} = value * 2 ; } )
  { foo = 1; bar =2; nested.foo = 3; }
=>
{
  { foo-mapped = 2; bar-mapped = 4; nested-mapped.foo = 6; }
}

Would it make sense to accept an optional merge function.

mergeMapAttrs {
  mapFn =  (name: value: { ${name + "-mapped"} = value * 2 ; } );
  mergeFn = (name: values: merge name values) # defaults to an error function? 
  } 
  { foo = 1; bar =2; nested.foo = 3; }
=>
{
  { foo-mapped = 2; bar-mapped = 4; nested-mapped.foo = 6; }
}

would taking an attribute set as arguments add some benefit? in this case: default behavior that can be customized.

@roberth
Copy link
Member Author

roberth commented Nov 17, 2024

@hsjobeki

Same as concatMapAttrs?

Yes. I've updated the description.

accept an optional merge function.

Perhaps, but it has a slight cost, and not much, but non-zero benefit. See the alternatives section.

would taking an attribute set as arguments add some benefit? in this case: default behavior that can be customized.

In the current evaluator implementation, "argument attrsets" aren't special or optimized, whereas curried function application of positional arguments to primops is optimized. I'd like this to make no difference, but optimizing those calls is non-trivial, because we don't have static knowledge of the function, given an application Expr.
So an attrset based call would have a slight overhead compared to entirely positional arguments.

@hsjobeki
Copy link
Contributor

hsjobeki commented Nov 18, 2024

I have to admit i am a big fan of this Idea overall.

@roberth roberth added this to Nix team Nov 18, 2024
@github-project-automation github-project-automation bot moved this to To triage in Nix team Nov 18, 2024
@edolstra edolstra added the idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. label Nov 20, 2024
@edolstra
Copy link
Member

Team meeting notes:

  • Idea approved in principle, but needs implementation + benchmarking to see if the Nixpkgs speedup is worth it.

@edolstra edolstra removed this from Nix team Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. language The Nix expression language; parser, interpreter, primops, evaluation, etc performance
Projects
None yet
Development

No branches or pull requests

3 participants