Proposal: Anonymous function literals with function signature inference #4170

Open
Rocknest opened this issue Jan 13, 2020 · 7 comments

Rocknest commented Jan 13, 2020

Rationale

The discussion of #1717 ('function expressions') contains many comments that touch on function signature inference:
#1717 (comment) #1717 (comment) #1717 (comment) #1717 (comment). This is a natural use case for anonymous function expressions (the classic example is an argument to 'sort'). However, as stated by @hryx in #1717 (comment), it is not a goal of that proposal, so the use case still has to be solved separately, which is why I'm creating this proposal.

This proposal is compatible with #1717 (which is 'accepted' at the time of writing) but in substance makes it somewhat redundant, leaving the controversial [1] [2] 'syntactic consistency among all statements which bind something to an identifier' as the main difference.

Closures are a non-goal.

The proposal

Add the ability to declare a function inside an expression. The parameter types of the function should be inferred from the context; the same applies to the return type.

Possible syntax:

.|a, b| {
    // function body
}

The opening dot is a convention established by anonymous enum literals, struct literals, etc.
Parameters enclosed by | | instead of parentheses are also already present in the language (e.g. loop captures).

Such expressions should be coercible to the function type that is expected in the declaring scope.

const f: fn (i32) bool = .|a| {
    return (a < 4);
};

var f2: fn (i32) bool = if (condition) .|x| {
    return (x < 4);
} else .|x| {
    return (x == 54);
};

 
Ambiguous expressions should probably be compile errors:

const a = .|| { return error.Unlucky; };
// @TypeOf(a) == fn () !void

const lessThan = .|a, b| {
    return a < b;
};
// error: ambiguous
// OR
// @TypeOf(lessThan) == fn (var, var) bool

fn foo() void {
    const incr = .|x| {
        return x + 1;
    };

    warn("woah {}\n", .{ incr(4) }); // Ok?
}

 
Some examples:

pub fn sort(comptime T: type, arr: []T, f: fn (T, T) bool) void {
    // ...
}

pub fn main() void {
    var letters = [_]u8{ 'g', 'e', 'r', 'm', 'a', 'n', 'i', 'u', 'm' };

    sort(u8, &letters, .|a, b| {
        return a < b;
    });
}

(.|| {
    std.debug.warn("almost js", .{});
    // what will @This() return?
})();
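
For comparison, here is a sketch of how the 'sort' use case is commonly written in status-quo Zig, by wrapping the function in an anonymous struct and taking its address by name. The insertion sort below is only a hypothetical stand-in for the elided `sort` body, and the parameter name `less` is an assumption; the comparator is passed as a `comptime` parameter, which recent compilers require for function body types.

```zig
const std = @import("std");

// Hypothetical stand-in for the proposal's `sort`: a plain insertion sort.
// The comparator is comptime-known, so no function pointer is involved.
fn sort(comptime T: type, arr: []T, comptime less: fn (T, T) bool) void {
    var i: usize = 1;
    while (i < arr.len) : (i += 1) {
        var j = i;
        // Move arr[j] left while it compares less than its left neighbour.
        while (j > 0 and less(arr[j], arr[j - 1])) : (j -= 1) {
            std.mem.swap(T, &arr[j], &arr[j - 1]);
        }
    }
}

pub fn main() void {
    var letters = [_]u8{ 'g', 'e', 'r', 'm', 'a', 'n', 'i', 'u', 'm' };

    // Status quo: the "anonymous" function lives in an anonymous struct and
    // its parameter and return types must be written out in full.
    sort(u8, &letters, struct {
        fn lessThan(a: u8, b: u8) bool {
            return a < b;
        }
    }.lessThan);

    std.debug.print("{s}\n", .{&letters});
}
```

The difference from the proposed `.|a, b| { ... }` form is that nothing is inferred here: the struct wrapper, the function name, and both `u8` annotations are all spelled out at the call site.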

Expression expressions

Naming these gets complicated:

sort(u8, letters, .|a, b| => (a < b));

Basically a shortcut for one-line functions that return an expression. These are not part of this proposal.

mikdusan added the proposal label Jan 14, 2020
andrewrk added this to the 0.7.0 milestone Jan 26, 2020
andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 27, 2020
andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021
andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 23, 2021

iacore commented Dec 30, 2021

We should make the syntax easy to use for refactoring code. Inspiration: https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md#code-refactoring

Step 1

/// inline refactoring

const std = @import("std");

const V = struct {
    i: i32,
    u: i32,
};

pub fn main() !void {
    const v = V{.i=42, .u=35};
    std.log.debug("{}", .{v.i});
}

Step 2

pub fn main() !void {
    const v = V{.i=42, .u=35};
    {
        std.log.debug("{}", .{v.i});
    }
}

Step 3 (proposed syntax for block with restricted access to outer scope)

pub fn main() !void {
    const v = V{.i=42, .u=35};
    bind (v) {
        std.log.debug("{}", .{v.i});
    } ();
}

I'm not sure what this syntax should look like. One orthogonal syntax I can think of (refer to #10458) is too complicated.

Step 4 (call inline anonymous function)

pub fn main() !void {
    const v = V{.i=42, .u=35};
    fn (v: V) void {
        std.log.debug("{}", .{v.i});
    } (v);
}

Step 5 (extract function)

const foo = fn (v: V) void {
    std.log.debug("{}", .{v.i});
};

pub fn main() !void {
    const v = V{.i=42, .u=35};
    foo(v);
}


iacore commented Dec 30, 2021

I dislike the weird syntax of .|x|. Edit: I changed my mind. See below.
A better solution is to write function definitions as

const foo = fn () void {};

This aligns with const foo = struct {};, and makes a function definition a "function literal". Just as Struct{} is a constructor of the type Struct, fn () void is a type and fn () void {} is a constructor of that type. With this syntax, a nested function is as simple as

const foo = fn () void {
  const bar = fn () void {};
};

To be honest, I want to deprecate fn functionName() {}, because it makes anonymous functions less discoverable. Language users then have to read the manual carefully to discover that function types are a feature. Someone else has the same criticism: https://www.duskborn.com/posts/2021-aoc-zig/#the-bad


rohlem commented Jan 10, 2022

@locriacyber see #1717

andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022

iacore commented Dec 3, 2022

I've tried to look into zig/parser.zig to add this syntax myself, but to no avail. Is there any other effort trying to implement this?

Also, I'm not sure what the syntax should be.

Closest to current syntax:

pub const foo = fn (a: i32) void {};

Separate type and implementation (like struct literal):

pub const foo = fn (i32) void |a| {};

pub const foo1: fn (i32) void = .|a| {};

// this syntax is useful for naming callback types
const A: type =  fn (i32) void;
pub const foo2 = A |a| {};
pub const foo3: A = .|a| {};

// maybe it's more like this?
pub const foo3_stage2: *A = .|a| {};


Vexu commented Dec 4, 2022

> Is there any other effort trying to implement this?

No effort has been made since this proposal has not been accepted and any work on it would likely end up being rejected.

andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023
andrewrk modified the milestones: 0.13.0, 0.12.0 Jul 9, 2023

Pyrolistical commented Apr 18, 2024

I think this proposal is a bit broad, but a lower-powered version would greatly ease a common refactoring problem.

pub fn main() void {
  const a = A.init();
  defer a.deinit();
  const b = B.init();
  defer b.deinit();
  ...
  const z = Z.init();
  defer z.deinit();

  const e1 = a.foo() + b.foo() + ... + z.foo() + 10;
  const e2 = a.foo() + b.foo() + ... + z.foo() + 100;
}

Status quo, extract a struct

const ExtraStruct = struct {
  a: A,
  b: B,
  ...
  z: Z,

  fn extracted(self: ExtraStruct, param: usize) usize {
    return self.a.foo() + self.b.foo() + ... + self.z.foo() + param;
  }
};

pub fn main() void {
  const a = A.init();
  defer a.deinit();
  const b = B.init();
  defer b.deinit();
  ...
  const z = Z.init();
  defer z.deinit();

  const extra_struct = ExtraStruct{
    .a = a,
    .b = b,
    ...
    .z = z,
  };

  const e1 = extra_struct.extracted(10);
  const e2 = extra_struct.extracted(100);
}

Extracting a struct is very annoying.

This proposal can help but we don't need its full power. Instead of an anonymous function, all that is needed here is an inlined parameterized block.

Proposed, inline parameterized block

pub fn main() void {
  const a = A.init();
  defer a.deinit();
  const b = B.init();
  defer b.deinit();
  ...
  const z = Z.init();
  defer z.deinit();

  const extracted = inline blk: |param| {
    break :blk a.foo() + b.foo() + ... + z.foo() + param;
  };

  const e1 = extracted(10);
  const e2 = extracted(100);
}

The idea is that, since extracted is inlined at comptime, it is allowed access to lexically scoped variables.

Note that since extracted is a block and not a function, return would return from the outer function, just like in normal blocks.

However, since it is a lexically scoped block, I would make it a compile error if it is passed like a closure.

fn foreach(self: @This(), closure: anytype) void {
  for (self.buckets) |bucket| {
    var current = bucket.first;
    while (current) |node| : (current = node.next) {
      closure(&node.key, &node.value);
    }
  }
}

fn countValues(self: @This(), value: Value) usize {
    var count: usize = 0;
    const closure = inline |_, v| {
        if (v.* == value) count += 1;
    };
    self.foreach(closure);  // compile error: `closure` not allowed to escape
    return count;
}

This means it wouldn't work for #6965, where this example is adapted from. If it were allowed to escape, it would violate Zig's principle of "no hidden control flow" if the block returned.
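
For comparison, a sketch of how this kind of counting callback can be written in status-quo Zig: the state a closure would capture is spelled out as fields of a context struct, and a pointer to that struct is passed to `foreach`. The names `Values`, `Counter`, `visit`, and `needle` are hypothetical, and a plain slice stands in for the bucket list of the example above.

```zig
const std = @import("std");

// Hypothetical container; a slice stands in for the buckets and nodes.
const Values = struct {
    items: []const u32,

    // `visitor` is assumed to be a pointer to a type with a
    // `visit(*Self, *const u32) void` method.
    fn foreach(self: Values, visitor: anytype) void {
        for (self.items) |*item| {
            visitor.visit(item);
        }
    }

    fn countValues(self: Values, value: u32) usize {
        // What a closure would capture implicitly is explicit state here.
        const Counter = struct {
            needle: u32,
            count: usize = 0,

            fn visit(ctx: *@This(), v: *const u32) void {
                if (v.* == ctx.needle) ctx.count += 1;
            }
        };

        var counter = Counter{ .needle = value };
        self.foreach(&counter);
        return counter.count;
    }
};

pub fn main() void {
    const vals = Values{ .items = &[_]u32{ 1, 2, 2, 3, 2 } };
    std.debug.print("{d}\n", .{vals.countValues(2)});
}
```

Because `visit` is an ordinary function, a `return` inside it only returns from `visit`, so there is no hidden control flow; the cost is the boilerplate of declaring `Counter` and copying `value` into it.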


Skehmatics commented Dec 26, 2024

@Pyrolistical's simplified version of this proposal alone would be handy, but with a minor tweak this might sensibly solve the open question of how Zig can wrangle the semantics of C's function-like macros: allow passing these to inline functions only.

An example that has personally burned me a few times (Pipewire's POD macros) is a procedural-style tree builder pattern used in C. The idea is to make a builder for the structure that appends new nodes to a pre-allocated buffer with a stack to keep track of hierarchy, but then abstract away the push and pop to a macro that takes arbitrary expressions as an argument. The most valuable thing about this pattern is not the abstraction, but that it puts the hierarchy you're creating front-and-center, which can vastly improve readability when that information is key. Compare these two, roughly modeled after Clay (nicbarker/clay#3):

OpenContainer(containerConfig); // Make container element and push it to a hierarchy stack.
PlainChild(config);             // Make other elements, which automatically parent 
PlainChild(config);             // to the container at the top of the stack.
OpenContainer(containerConfig); // Start 2nd level, also parented to the previous top of the stack.
CompositeChild(config);         // Hoisted a section of the tree to a function, now we have a new building block!
CloseContainer();               // Finalize 2nd level and pop it from the stack.
for (int i = 0; i < tailLen; i++) { 
  PlainChild(tail[i]);
}
CloseContainer();               // and again for the 1st.

#define CONTAINER(config, children) \
    OpenContainer(config);          \
    children                        \
    CloseContainer();
...
CONTAINER(containerConfig, {
  PlainChild(config);
  PlainChild(config);
  CONTAINER(containerConfig, {
    CompositeChild(config);
  });
  for (int i = 0; i < tailLen; i++) { 
    PlainChild(tail[i]); 
  }
});

This type of abstraction is unrepresentable in status-quo Zig as far as I can tell. I'd argue that it fits very well with the ethos of making the most correct and performant approach the easiest, and that the degree it hides control flow is no less than that of async functions (not a good omen, to be fair) or labeled breaks.
The example above under this modified proposal (type signature and definition syntax notwithstanding) would be:

inline fn CONTAINER(config: ContainerConfig, children: inline fn (void) void) void {
    OpenContainer(config);
    children();
    CloseContainer();
}
...
CONTAINER(containerConfig, inline {
  PlainChild(config);
  PlainChild(config);
  CONTAINER(containerConfig, inline {
    CompositeChild(config);
  });
  for (tail) |tailConfig| { 
    PlainChild(tailConfig); 
  }
});
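
For comparison, a rough status-quo approximation of the same shape, passing struct-wrapped functions as `comptime` parameters. This is only a sketch under assumptions: the global counters and the lower-case `openContainer`, `closeContainer`, `plainChild`, and `container` are hypothetical stand-ins for a real builder, and the configs are dropped entirely.

```zig
const std = @import("std");

// Hypothetical builder state: current nesting depth, deepest level reached,
// and the number of elements created.
var depth: usize = 0;
var max_depth: usize = 0;
var elements: usize = 0;

fn openContainer() void {
    depth += 1;
    if (depth > max_depth) max_depth = depth;
    elements += 1;
}

fn closeContainer() void {
    depth -= 1;
}

fn plainChild() void {
    elements += 1;
}

// Push, run the children, pop. `children` must be comptime-known.
fn container(comptime children: fn () void) void {
    openContainer();
    defer closeContainer();
    children();
}

fn buildTree() void {
    container(struct {
        fn outer() void {
            plainChild();
            plainChild();
            container(struct {
                fn inner() void {
                    plainChild();
                }
            }.inner);
        }
    }.outer);
}

pub fn main() void {
    buildTree();
    std.debug.print("elements={d} max_depth={d}\n", .{ elements, max_depth });
}
```

The hierarchy is visible, but the bodies are ordinary functions: they cannot refer to locals of `buildTree` (such as `config` or `tail` in the example above), which is the gap the inline-block form is meant to close.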

Alternatives

These come with a pretty big caveat: none of these can be done by translate-c, as the semantics of calling must change to break through a layer of abstraction, so manual wrapping of the C library is required.

Struct-based approach

Despite seeming like the most straightforward way to do this, it is not a zero-cost abstraction. Previously, items were arranged merely by the sequential nature of execution, but now values must be returned so that items can be accounted for and recombined in order. Additionally, this complicates the inclusion of control flow, as you are now manipulating data rather than just writing statements.

CONTAINER(containerConfig, blk: { 
  var children: []Element = .{
    PlainChild(config),
    PlainChild(config),
    CONTAINER(containerConfig, .{
      CompositeChild(config), // This must now only add a single element or wrap them in a container!
    }),
  };
  for (tail) |tailConfig| { 
    children = children ++ .{ PlainChild(tailConfig) };
  }
  break :blk children;
});

Defer and blocks

This is probably the biggest argument against this justification, as it is very explicit about the execution without being unreadable, but is less so about the intent. However, if the push and pop/finalize are non-trivial, this can become untenable without adding extra abstraction elsewhere.

{ 
  OpenContainer(containerConfig); defer CloseContainer();
  PlainChild(config);
  PlainChild(config);
  {
    OpenContainer(containerConfig); defer CloseContainer();
    CompositeChild(config);
  }
  for (tail) |tailConfig| { 
    PlainChild(tailConfig); 
  }
}
