Lambdas, Closures and everything in between #1048

Closed
BraedonWooding opened this issue Jun 4, 2018 · 48 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Comments

@BraedonWooding
Contributor

BraedonWooding commented Jun 4, 2018

I've been thinking about this topic for a few weeks now, and after quite a bit of research I think I have a solution that fits Zig's needs :).

This builds on #229 (and similar issues). Since this one is specifically about implementation and that issue is old, I felt it was worth creating a new one.

Step 1: Make all functions anonymous, i.e. you can define a function like const Foo = (bar: i32) i32 => { ... }; though of course we could keep supporting fn Foo(bar: i32) i32 { ... } as the short form. This way you can define functions inline; these aren't closures, and when compiled they will of course be placed outside the enclosing scope.

// i.e. to sort
sort(array, (a: i32, b: i32) bool => { return a < b; });

Step 2: Lambdas. Building on anonymous functions, you can define 'inline' lambda functions, which are just a single expression whose value is returned, like:

sort(array, ($) => $0 < $1);

The $ just acts as a wildcard, effectively matching whatever types the input requires. If the input is a 'var' type then it makes a function def that is fn X(a: var, b: var) var, perhaps? Or maybe that is just a compile error; I'm not sold either way.

Step 3: Closures. In the case where you actually want a closure, you would define it like any other function but declare the captured values among its inputs:

var x = someRuntimeValue();
// 'Long form'
where(array, (a: i32, x = x) bool => { return a < x; });
// And as a lambda (implicit 'var' return)
where(array, ($, x = x) => $0 < x);
// The above cases are by value, if you wanted by reference you would just do
where(array, ($, x = &x) => $0 < x);

The above is equivalent to the following Zig code if we allow some kind of implicit cast:

const Closure = struct {
   x: i32, // note: if capturing by reference it would be `x: *i32`
   f: fn(i32) bool,
};
where(array, Closure { .x = x, .f = (a: i32, closure: *const Closure) bool => { return a < closure.x; } });

HOWEVER, this is where the problem occurs: the function definition has to take this pointer, so we need some way to get around passing it at the call site. It's been suggested in the past that you could pass around some kind of 'closure' type that you call like a function but that is really just this struct. Personally I think this hides information from the coder and goes against Zig's core; furthermore, would you allow the above 'closure' type to be passed into a function expecting (a: i32) bool?

Instead I propose that we can use LLVM Trampolining in quite a few cases to 'excise' a parameter from the call, which would be the closure information and the call would rather become something like;

const Closure = struct {
   x: i32, // note: if capturing by reference it would be `x: *i32`
   // Note: no f
};
fn Foo(env: *const Closure, a: i32) bool {
    return a < env.x;
}

var x = runtimeValue();
const c = Closure { .x = x };
where(array, @trampoline(Foo, c));

Note: of course in this case I'm using trampoline as if it were a builtin. I'm not actually sure we want it as a builtin; in reality it is more like generating the LLVM code that trampolines the function. A trampoline (said that word too many times now) basically stores the value on the stack in advance of the call. This would be much more efficient than a typical closure, as it would avoid the need for a function pointer and the ugliness of indirection. HOWEVER, there is not much information on how this affects things like 'arrays of closures', which may instead require a different approach. Note: this is how GCC implements nested functions; this is a relevant section.

So basically I propose that we approach this in the order I've given, and perhaps implement that builtin before integrating the closure x = x syntax.

@bnoordhuis
Contributor

Trampolines require that the stack is marked executable. That's not the default on most systems, and it reduces security by making shellcode injection much easier.

I have some ideas on how to tackle closures but I don't want to derail this issue with counterproposals.

@bronze1man

How are the closures allocated?
How is that allocation freed?

@BraedonWooding
Contributor Author

@bnoordhuis Huh, you got me there, I completely forgot about that, damn. I guess the big challenge with this problem is figuring out an efficient solution that isn't just 'every closure gets a struct with a function ptr', since that deliberately hides information from the coder. Counterproposals are fine, since it's more of an implementation detail. The other idea I had was a 'fat ptr'-like solution; however, that just bloats a lot of calls and forces us to use C-style function calls (that allow extra parameters).

@BraedonWooding
Contributor Author

BraedonWooding commented Jun 4, 2018

@bronze1man sorry I don't understand your questions? No need to free closures as we are talking about stack allocations, you can read up on trampolining if you really want the nitty gritty implementation detail.

@bheads

bheads commented Jun 4, 2018

Could we use @newStackCall as part of the allocation?

var needle : i32 = 10;
var lambda = fn [needle] (x: var) bool { return x >needle; };  // yes cpp capture syntax

What I imagine is that the lambda is a struct that holds the captured stack instance and a function pointer:

struct {
    stack: Struct { needle : i32 },
    func: (x: var) bool,
}

Calling the lambda would use newStackCall, I am just not sure how to put the stack into scope for the call.

@BraedonWooding
Contributor Author

BraedonWooding commented Jun 4, 2018

> Could we use @newStackCall as part of the allocation.

Actually I think you are onto something. Originally I dismissed your idea since I was like "nah that can't work", but upon considering it I definitely think it could work. What if you could bind a stack to a function, regardless of the function's purpose, such as:

This is what the lambda would decompose to basically.

var stack: [8]u8 = undefined;
var x: *i32 = &stack[0]; // skipping cast to make it easier to understand
x.* = 10;
var lambda = fn (a: var) bool { return a < @stack(0, i32); }; // Longer syntax for clearness, @stack giving you the stackptr, then asking for an 'i32' integer
doWhatever(@bindstack(stack[4..], lambda, i32)); // @bindstack returns a function ptr, 4.. to indicate stack begins from the '4'th position

Basically the language would have the concept of allowing you to bind a stack to a function, so there is no need to 'implicitly' cast. In effect yes, it would look like that struct you gave (kinda), but instead of requiring some weird obscure cast or changing how functions work, we just allow functions to carry a different stack than the one we give it. Of course this requires the function to have a scope within the caller; if the scope is outside, perhaps we allocate at the start of the program and deallocate at the end?

As a side note I'm also not too sure about cpp syntax, it looks clunky; and is the cause of much confusion and bugs due to having some finer details that are a little odd. I prefer the more common style that is prevalent in languages like C#, Go, and Python for example since it is much clearer.

@bheads

bheads commented Jun 4, 2018

I get that the capture syntax is tricky when you're learning, but it was added to C++ for good reason. One of the good things about the capture syntax is that it gives you control over what goes into the closure. This fits nicely with the zen and my personal rule of the principle of least surprise (i.e. I have had no idea what was getting captured in a closure, and I have run into performance issues and bugs in D where it was capturing and trying to make deep copies of things I did not want/need).

@BraedonWooding
Contributor Author

Of course, however I still feel that the syntax I originally proposed is clearer i.e. (a: i32, x = x) => { ... } where x = x is by value and x = &x would be by reference; of course you could change the name of the variable to better match its context i.e. upperbound = x, and maybe could even just not include the new name if it is going to be the same therefore being (a: i32, =x, =&y). I guess it also comes down to not having the [] at the beginning of each one which kinda makes it look uglier.

Honestly it's a personal preference, and we both agree that a clear by-copy/by-ref distinction is really needed, which is the important part. Which way the syntax goes is not important right now; more important is figuring out how the lambda will be implemented.

@bheads

bheads commented Jun 4, 2018

Ah sorry, I was not trying to defend the c++ syntax, just the concept of capturing. So yeah I am on the same page as you.

@andrewrk added the proposal label Jun 4, 2018
@andrewrk added this to the 0.4.0 milestone Jun 4, 2018
@bjornpagen

@BraedonWooding
I really like how neatly this proposal fits into Zig! Making all functions anonymous is a great way to prevent the language from becoming more complex, especially in implementing closures. It would probably be the best way to implement closures in Zig.

When I say that more discussion is needed before Zig gets closures, it has nothing to do with your proposal, only with the concept in general. I just don't think that anonymous functions and closures are a good idea in Zig.

Zig aims to be a C alternative, not a C++ alternative. C does not have anonymous functions. Lambdas have not been added to C because in 45+ years, no one has ever felt they needed them. Anonymous functions are simply not useful in imperative languages such as C and Zig.

I love Zig because it aims to be an alternative for C, and it is the only language with this goal that is on the right track. As of now, Zig is small and simple, and fit for low-level/embedded/os development. Rust is horrible for this purpose: it is endlessly bloated with features, lang items, and 3 standard libraries (core, alloc, std). But individually, these features are very small, like Rust closures. But boy, do they add up, and make for a hell on earth.

So, unless you can come up with some real-life code examples of why we need lambdas, rather than just some quick convenience in sort(), I remain unconvinced.

Everyone, please give your own perspectives and ideas! I don't want to be "that guy", just spouting disapproval in every Zig github issue. ;)

@ghost

ghost commented Jul 14, 2018

this proposal seems way too complicated to me

the Syntax of funcs should always be nearly the same.

A normal func looks like this:

fn foo(a : i32, b : i32) bool {
    return a < b;
}

thus a lambda should look like this,
with no magic $ involved:

sort(   some_arr, 
        (a : i32, b : i32) bool {
            return a < b;
        } 
    )

And it should be possible to have a func inside a func so that it can only be used locally. This is good for code hygiene and refactoring. The Syntax is the same as for a global function thus its very easy to move them around.

closures are simple as well

fn adder( x : i32) type {
    return fn(y : i32) i32 {
        return x + y;
    };
}

var add40 = adder(40);
assert(add40(2)==42);

> Lambdas have not been added to C because in 45+ years, no one has ever felt they needed them.

that is obviously false; just because it's not in C in no way means no one needed it

@bjornpagen

bjornpagen commented Jul 14, 2018

> that is obviously false; just because it's not in C in no way means no one needed it

Here are your examples written in current zig.

First:

fn foo(a : i32, b : i32) bool {
    return a < b;
}

sort(some_arr, foo);

Second:

fn add(a: i32, b: i32) i32 {
    return a+b;
}

assert(add(40,2) == 42);

I don't see the benefit of these functional programming features here. I'm not saying I hate closures: I just want to see an example of them being legitimately useful.

Zig zen bullet 4:

Only one obvious way to do things.

@ghost

ghost commented Jul 14, 2018

local functions would be useful as described

> And it should be possible to have a func inside a func so that it can only be used locally. This is good for code hygiene and refactoring. The Syntax is the same as for a global function thus its very easy to move them around.

Making foo available although it is just used for sorting is bad, because it becomes subject to being used elsewhere; thus unwanted dependencies arise and refactoring becomes a problem. That is why local functions and lambdas are useful. Of course those are contrived examples.

I'm not all for closures; I was merely pointing out that they can be added to the language without fuss or really new syntax.

> Only one obvious way to do things.

If you look at how Zig does OO, this is not the case currently: #1205 (comment) (it's clearly not the obvious way but rather the hacked way). Thus some language features might indeed be needed to improve code quality.

@bjornpagen

bjornpagen commented Jul 14, 2018

@monouser7dig

> If you look at how Zig does OO, this is not the case currently: #1205 (comment) (it's clearly not the obvious way but rather the hacked way). Thus some language features might indeed be needed to improve code quality.

Ok, finished my case study of std/io.zig. It's a bit cluttered, and you're right: it's a bit hacked. But the Plan 9 extension thing is the solution: not closures. I can imagine an InputStream interface being defined as a struct with only abstract functions, which would be anonymously included in a FileInputStream. That would be a nice, clean, "OO" way to do this. (Besides, the code for FileInputStream and InputStream could just be merged into one "InputStream", everything should be abstracted as a File in the first place, but that's a discussion for another issue/MR)

EDIT:
Even better, a File can be directly embedded into an InputStream. No need for crazy abstractions. A big problem here is the implementations for os.File and friends for different operating systems is all mushed together in the same code. This means a nice UNIX "everything is a file" ideal is thrown out the window, since it's forced to share its codebase with something like Windows. A Zig "file" should be higher level than an OS file, in such a case.

@bronze1man

Is a function object one pointer in size, or two?
Closure syntax like Go's cannot avoid hidden allocation if Zig wants to support Apple's iOS, which does not allow generating code at runtime.
I think a closure means a function with a context.
If you want to carry some context data with a function object, and you do not generate machine code at runtime, you need two pointers: one for the function address and one for the context object.
But when you pass in a global function, it only has a function address, and the context object is null.
I think Go implements its function type with two pointers. The context data may also be larger than one pointer, so the runtime needs to allocate that context and store a pointer to the allocation in the function object. That causes a hidden allocation if you store the function object in a hash map or list.
I know Zig hates hidden allocations. I think allocation is unavoidable if you use a closure with a lot of context data and store it in a hash map to use after the function returns.

If Zig designs a syntax that supports closures without hidden allocation, then there may be two runnable types: one for plain functions with no context, which is one pointer in size, and one for closures with context, which is two pointers in size.
Having two runnable types will cause a lot of other problems.
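
To make the "two pointer" layout concrete, here is a minimal sketch in present-day Zig (assuming 0.11 or later for the `@ptrCast(@alignCast(...))` form). `Closure`, `LessThan`, `invoke`, and `closure` are hypothetical names for illustration and not anything from std. The context lives wherever the caller puts it (here, the stack), so nothing is allocated behind your back; the price is that the closure must not outlive that context.

```zig
const std = @import("std");

// Hypothetical "fat" callable: one pointer for the code, one for the context.
const Closure = struct {
    ctx: *const anyopaque,
    call_fn: *const fn (ctx: *const anyopaque, a: i32) bool,

    fn call(self: Closure, a: i32) bool {
        return self.call_fn(self.ctx, a);
    }
};

// Hypothetical captured environment, owned by the caller.
const LessThan = struct {
    limit: i32,

    // Recovers the typed context from the opaque pointer, then does the work.
    fn invoke(ctx: *const anyopaque, a: i32) bool {
        const self: *const LessThan = @ptrCast(@alignCast(ctx));
        return a < self.limit;
    }

    // Packs the context pointer and the function pointer together.
    fn closure(self: *const LessThan) Closure {
        return .{ .ctx = self, .call_fn = &invoke };
    }
};

test "two-pointer closure with caller-owned context" {
    var limit: i32 = 10;
    limit += 1; // keep the value runtime-known
    const env = LessThan{ .limit = limit };
    const c = env.closure();

    try std.testing.expect(c.call(5));
    try std.testing.expect(!c.call(11));
    // Two pointers wide, as described above.
    try std.testing.expect(@sizeOf(Closure) == 2 * @sizeOf(usize));
}
```

Storing such a `Closure` in a list or hash map past the lifetime of `env` is exactly the case where the context would have to be copied somewhere longer-lived, with an allocator the caller chooses.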

@tiehuis
Member

tiehuis commented Jul 15, 2018

Just a note, local, anonymous functions can technically be done already, albeit with a very non-intuitive syntax.

const std = @import("std");

pub fn main() void {
    var s = []u8{ 0, 41, 0, 3 };

    std.sort.sort(u8, s[0..], (struct {
        fn local(a: u8, b: u8) bool {
            return a < b;
        }
    }).local);

    for (s) |b| {
        std.debug.warn("{} ", b);
    }
    std.debug.warn("\n");
}

@ghost

ghost commented Jul 15, 2018

> Closure syntax like Go's cannot avoid hidden allocation if Zig wants to support Apple's iOS, which does not allow generating code at runtime.
> I think a closure means a function with a context.

then just do it like coroutines:

// coroutines 
async<*std.mem.Allocator> fn simpleAsyncFn() void {
    x += 1;
}

const p = try async<std.debug.global_allocator> simpleAsyncFn();
// closure 
fn adder( x : i32) type {
    return fn<*std.mem.Allocator>(y : i32) i32 {
        return x + y;
    };
}

var add40 = adder<std.debug.global_allocator>(40);

the syntax is already established in zig you just have to combine it if needed

I put the <> behind the fn because there is no keyword like async and I think it is more readable.
Async syntax could be changed to

// coroutines changed 
async fn simpleAsyncFn<*std.mem.Allocator>() void {
    x += 1;
}

const p = try async simpleAsyncFn<std.debug.global_allocator>();

@binary132

binary132 commented Jul 15, 2018

@bronze1man, why can't a closure be allocated the same way any other object on the stack is allocated?

@bronze1man

bronze1man commented Jul 15, 2018

@binary132
You cannot always allocate a closure on the stack.
You have to allocate it on the heap if you want all the closure functionality that Go supports,
because you can store data in the context of the closure,
and the closure can be returned to the caller or stored in a global hash map.

@ghost

ghost commented Jul 15, 2018

> It looks like a coroutine with a context added is a closure. It already allocates its stack on the heap or some other place, so adding a context to it should not add a hidden allocation.

what do you mean?

> You have to allocate it on the heap if you want all the closure functionality that Go supports,
> because you can store data in the context of the closure,

yes I think this would be possible with the proposed syntax that would mimic coroutines, right?

@bronze1man

bronze1man commented Jul 15, 2018

> what do you mean?

Sorry, I read the documentation on coroutines, and I found I was wrong.

@bronze1man

bronze1man commented Jul 15, 2018

I have a proposal for the syntax of anonymous functions and closures:

anonymous function:

const std = @import("std");

pub fn main() void {
    var s = []u8{ 0, 41, 0, 3 };

    std.sort.sort(u8, s[0..],
        fn (a: u8, b: u8) bool {
            return a < b;
        }
    );

    for (s) |b| {
        std.debug.warn("{} ", b);
    }
    std.debug.warn("\n");
}

An anonymous function has a function type and is one pointer in size. It cannot capture any local variables.

closure:

const std = @import("std");

pub fn main() void {
    var s = []u8{ 0, 41, 0, 3 };
    var biggerThan: i32 = 10;

    std.sort.sort(u8, s[0..],
        std.closure(std.debug.global_allocator,
            fn (a: u8, b: u8,biggerThan: i32) bool {
                if (biggerThan>a and biggerThan>b){
                    return a>b;
                }else{
                    return a < b;
                }
            }
        ),biggerThan);

    for (s) |b| {
        std.debug.warn("{} ", b);
    }
    std.debug.warn("\n");
}

A closure is a type defined in the std package; it is not the same as a function type. Local variables can be passed into it through a function call.

@binary132

binary132 commented Jul 15, 2018

If the user wants to capture variables in a closure, they should be careful not to capture stack-locals by reference if that closure is going to be passed back. I guess that's why C++ lambdas default to capture by-value. Otherwise, I don't see the problem, and I definitely do not agree that "you have to alloc it on the heap" except in specific cases.

Zig should be careful not to introduce a syntax that causes heap allocation implicitly or by default.

On that note, I actually really like the C++ syntax for lambdas. It is maybe too featureful, but it is cool.

Another point re. C++ closures: if the value is moved into the lambda, then you can just deallocate it when the lambda's lifetime ends, just as you would with any other value. I don't know if that is supported in Zig, but it would be great.

@ghost

ghost commented Jul 15, 2018

@binary132
Zig does not have move although it probably needs to, see #782 (comment)

... the other issue about "CPP syntax is nice": it really destroys the language, because having two totally unrelated ways of defining functions/closures makes for a very bad experience, as described earlier.

Zig must keep the syntax coherent and not just copy paste different languages together.

@binary132

binary132 commented Jul 16, 2018

I said I personally like the C++ lambda syntax. I did not say I think Zig should use it. As a matter of fact, I criticized it. I am not at all suggesting Zig should bolt on syntax features from other languages. I am suggesting Zig should learn from the semantics of C++ lambdas.

I agree, having a single consistent syntax for functions is a good thing. But if function declarations can capture variables, that syntax should take capture semantics (move, stack reference, allocation? please no. etc.) into account. This is an unending source of foot-shootings in Go.

I personally like the way Rust does it where the closure context is an implicit struct having the closure as a "method". If you then disallow structs borrowing stack-local references to leave the lifetime of those references, that might be one way around it. But that would add a lot of language complication to get something which could be solved just by letting the user decide whether to capture by reference or not.

@isaachier
Contributor

@binary132 C++ does the same thing as Rust. If you know C++03, there were only hand-written functor structs. C++11 just added compiler-generated anonymous structs.

@isaachier
Contributor

@monouser7dig borrowing from other languages can be an invaluable tool. Programming languages have been around for a lot longer than Zig, and many encountered similar issues. Looking at how they solved certain problems should be encouraged.

@binary132

binary132 commented Jul 16, 2018

You could just have a comptime function to create a struct type, and construct the struct, from some names in local scope, optionally by reference, and a function that would become an "eval" method on that struct. That new struct type would then be your "functor".... To evaluate the functor, you'd need to then call .eval() on the returned functor (or add a feature for eval on a struct itself, like an anonymous method?) Maybe that's too explicit, but then the more you capture, the more you have to be aware that you're capturing?

That might fit nicely with the "everything is a struct" issue.
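
A rough sketch of what that explicit "functor" could look like, written in present-day Zig and with the captured values spelled out by hand rather than gathered from local scope (which comptime cannot do). `Functor`, `addTo`, and `eval` are hypothetical names for illustration only.

```zig
const std = @import("std");

// Hypothetical helper: pairs a captured environment with a function that
// takes that environment as its first argument.
fn Functor(
    comptime Env: type,
    comptime Arg: type,
    comptime Ret: type,
    comptime f: fn (Env, Arg) Ret,
) type {
    return struct {
        env: Env, // captured by value; use a pointer type for Env to capture by reference

        pub fn eval(self: @This(), arg: Arg) Ret {
            return f(self.env, arg);
        }
    };
}

fn addTo(env: i32, x: i32) i32 {
    return env + x;
}

test "functor captures a runtime value explicitly" {
    var base: i32 = 39;
    base += 1; // keep the value runtime-known

    const Add = Functor(i32, i32, i32, addTo);
    const add40 = Add{ .env = base };
    try std.testing.expect(add40.eval(2) == 42);
}
```

Everything that is captured shows up in the `Env` type and in the struct literal, so the more you capture, the more visible it is at the call site.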

@isaachier
Contributor

AFAIK the idea of comptime does not extend to complete code generation. It isn't the same as a preprocessor. From what I understand, there is no way to generate a struct on the fly from a set of identifiers using comptime. I came across a simpler instance of this recently. As far as I know, there is no way to generate an array of strings corresponding to enum members at compile time. This is a somewhat unrelated issue, but thought it was worth bringing up to point out comptime isn't a panacea.

@binary132

you'd need some more type/reflection capabilities in comptime, and something like a "context" object containing metadata of the local context.

@ghost

ghost commented Jul 16, 2018

@isaachier that's no excuse for inventing new language constructs when it could have been done equally well with existing syntax, as I showed above. Or are there any remaining shortcomings you see in that syntax?

@isaachier
Contributor

@monouser7dig 100% agree there. If there is a straightforward path from existing syntax to a certain feature, that is the best approach.

@ghost

ghost commented Jul 24, 2018

http://number-none.com/blow/blog/programming/2014/09/26/carmack-on-inlined-code.html
from John Carmack

Besides awareness of the actual code being executed, inlining functions also has the benefit of not making it possible to call the function from other places. That sounds ridiculous, but there is a point to it. As a codebase grows over years of use, there will be lots of opportunities to take a shortcut and just call a function that does only the work you think needs to be done

making another point for local functions #1048 (comment)

(and a sane syntax, not structs, that encourage usage of such)

and as a further IMO very good suggestion

It would be kind of nice if C had a “functional” keyword to enforce no global references.

@jarble

jarble commented Oct 11, 2020

It's already possible to define a "closure" using a function within a struct. This program compiles successfully, for example:

pub fn main() void {
    const j = 1;
    var b = struct{
        fn function(x: i32) i32 {
            return x+j;
        }
    }.function;

    var c = b(1);
}

@ikskuh
Contributor

ikskuh commented Oct 11, 2020

@jarble Note that this only works because j is comptime-known and thus can be captured at compile time. When you initialize j with a runtime-known value, this code will fail.

@phykos

phykos commented Nov 29, 2020

> Lambdas have not been added to C because in 45+ years, no one has ever felt they needed them.

@bjornpagen this is not true.
GCC and Clang have lambda like blocks.

See: https://gist.github.com/gburd/4113660

@leira

leira commented Feb 13, 2021

> @jarble Note that this only works because j is comptime-known and thus can be captured at compile time. When you initialize j with a runtime-known value, this code will fail.

This works at runtime:

test "closure" {
  var a: i32 = 1;
  a += 1;

  const addX = (struct {
    a: i32,
    fn call(self: @This(), x: i32) i32 {
      return x + self.a;
    }
  } { .a = a }).call;

  testing.expect(addX(2) == 4);
}

@kevinswiber

@leira This works because of first-class function support in the language. For an example of where it falls short, notice that addX can't be passed as a parameter, as it's a bound function.

First-class functions without closure support can be a bit frustrating, which leads me to this issue in the first place. I was hoping to build parser combinators with function composition, and lack of closure support is kinda biting me, mostly because I was really determined to try to find a workaround but have so far been unsuccessful. 😅
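
One workaround, sketched here in present-day Zig and only applicable when the receiving function can be generic: pass the struct value itself and let the receiver accept `anytype`, calling a method on it. `AddX` and `applyTwice` are hypothetical names for illustration.

```zig
const std = @import("std");

// The captured state, written out as an ordinary struct.
const AddX = struct {
    a: i32,

    fn call(self: AddX, x: i32) i32 {
        return x + self.a;
    }
};

// Hypothetical consumer: accepts any value that has a `call(i32) i32` method.
fn applyTwice(f: anytype, x: i32) i32 {
    return f.call(f.call(x));
}

test "pass the context value instead of a bound function" {
    var a: i32 = 1;
    a += 1; // keep the value runtime-known

    try std.testing.expect(applyTwice(AddX{ .a = a }, 2) == 6);
}
```

Each distinct context type instantiates `applyTwice` separately, so this does not help when callables of different types need to live in one array or one struct field.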

@iacore
Contributor

iacore commented Aug 21, 2021

I wonder how no one mentions https://www.nmichaels.org/zig/interfaces.html

@JustAnotherCodemonkey

I don't know why Andrew marked this as completed.

Maybe a little unrelated but it's a sticking point where someone mentions "I like to be able to create code dynamically like via macros. It dramatically eases some workflows" and points to Rust macros or even the C preprocessor and then you point out Zig's compiletime system and say "no need" but they say "for structs; what about functions?" and I really don't have much of an answer. I know being able to construct functions at compile time may be a bit much but to me it does seem very much possible with my understanding even without macros.

I agree with those that say that using structs to define anonymous functions is a major weakness and easily needlessly complicated. To those who say "why add a language feature when it's already possible", I point to the entire language and say "why add a language when it's already possible via C". Having a common feature be achieved via a workaround (even if small), especially if users of all levels will have to see, do, and deal with it is, in fact, a major weakness. We shouldn't need to use the struct workaround.

Honestly, when I first came to this language, I was extremely confused by the fact that functions aren't just all consts bound to anonymous functions / closures. It's very bizarre for a language seemingly on the surface as fluid and reflective as the ocean, be full of these hard, inflexible poles sticking out.

@mlugg
Member

mlugg commented Sep 8, 2023

> I don't know why Andrew marked this as completed.

Issue close reasons are a pretty recent GitHub feature and didn't exist at the time this was closed. (GitHub marked all old closures as "completed" when they added this feature.)

Functions returning functions is absolutely possible: comptime values can be captured into a nested struct scope, and logic in the function can use those comptime-known values to do conditional compilation.

fn isMultipleOfFn(comptime n: u64) fn (u64) bool {
    return struct {
        fn f(x: u64) bool {
            return x % n == 0;
        }
    }.f;
}

This is a little bit clunky, but that's kind of intentional. Zig to an extent discourages this kind of metaprogramming pattern, because logic is almost always significantly easier to write, understand, and debug when functions contain "direct" logic, even if it's a little longer. If you're using callbacks, it's almost always more correct to use a context struct with methods, to avoid requiring globals. But this functionality exists and works, and you can use it if it's genuinely the right tool.
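
For illustration, a small usage sketch of the function above (repeated so the snippet stands alone; the test values are mine). The captured `n` must be comptime-known, while the argument passed to the returned function can be a runtime value.

```zig
const std = @import("std");

fn isMultipleOfFn(comptime n: u64) fn (u64) bool {
    return struct {
        fn f(x: u64) bool {
            return x % n == 0;
        }
    }.f;
}

test "comptime-captured divisor, runtime argument" {
    const isEven = isMultipleOfFn(2);
    try std.testing.expect(isEven(10));

    var v: u64 = 9;
    v += 3; // runtime-known value, now 12
    try std.testing.expect(isMultipleOfFn(3)(v));
    try std.testing.expect(!isMultipleOfFn(5)(v));
}
```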

For more complex code generation, the Zig build system makes it really easy to add custom build steps that generate Zig source code if required. Again, this isn't a thing you should have to do often, but if it feels like the right choice then there's nothing wrong with doing it!

As for the rest of your comment, it sounds a lot like you're discussing #1717: I recommend taking a look at the last comment on that issue for a bit more context as to why functions retain the syntax they have today.

@JustAnotherCodemonkey

Hm yes that does help to elaborate. I disagree but I can see a bit more. I still think that if we have try expressions, the anonymous function syntax is entirely reasonable.

Although I adore functional programming, I completely understand sticking to your imperative guns but there are many times where using callbacks is not only very warranted but necessary so I feel like punishing users for doing something they may have to do frequently if they're working with certain C libraries and many low-level APIs is not good. I'm specifically thinking about windowing and audio which tends to rely heavily on you supplying callbacks. Of course that's all very much possible with Zig and furthermore, naming those callbacks is a good idea but it makes the point that there are plenty of times that a user would have to supply a small lambda / callback and anonymous functions would make that so much easier.

I also don't like how they said that "anonymous functions don't show up in call stacks", because Rust proves this (like a lot of anti-anon-func arguments) not necessarily correct. Anonymous functions may not have names, but that doesn't mean they can't have an ID. In Rust, the name and type of a closure (closures are also unique anonymous types) is something like [closure@1e48baf89faaufdhsfundsu]. Also, having all functions be named is actually a drawback in and of itself, as small callback functions can pollute the namespace when they're only used for a single little thing.

Oh well I suppose

@mlugg
Member

mlugg commented Sep 8, 2023

...but there are many times where using callbacks is not only very warranted but necessary...

In pretty much any scenario, plain callbacks actually aren't what you want: instead you want a context type. The core issue here is that Zig doesn't have closures, so you can't easily refer to locals scoped outside your callback. So let's say you're using some library which streams data to you in a callback, and you want to buffer all that data into an ArrayList. With simple function callbacks, that would look like this (in a fictional world with anon functions):

test {
    var data_stream = try beginDataStream();
    var bytes = std.ArrayList(u8).init(std.testing.allocator);
    defer bytes.deinit();
    try data_stream.stream(fn (data: []const u8) !void {
        // Hmm...
    });
}

What do we put in the function body? We can't easily refer to bytes, because it's from an outer scope. We'd have to put a pointer to bytes into a global - this means the logic can't be reentrant and you also have to mark the global thread-local to avoid nasty bugs. Yuck!

C libraries generally handle this by allowing the callback to receive an arbitrary user-supplied pointer as a parameter, generally called the "user pointer" or "user data" or similar. This might be e.g. an integer you've cast to a pointer, or maybe it's an actual pointer to some bigger structure in memory. This solution totally works, but it comes at the cost of type-safety and may add unnecessary pointer indirections. It makes sense for C, but in Zig we can do better!
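To make the user-pointer approach concrete, here is a minimal sketch of it written in Zig rather than C, assuming Zig 0.11 or later (for the single-argument @ptrCast). The names streamChunks, addLength and totalBytes are hypothetical and not part of any real library.

```zig
const std = @import("std");

// Hypothetical C-style streaming API: the callback gets an untyped user pointer.
const Callback = *const fn (user_data: ?*anyopaque, data: []const u8) void;

fn streamChunks(callback: Callback, user_data: ?*anyopaque) void {
    callback(user_data, "abc");
    callback(user_data, "de");
}

// The callback has to cast the pointer back itself; nothing checks that the
// caller really passed a *usize.
fn addLength(user_data: ?*anyopaque, data: []const u8) void {
    const total: *usize = @ptrCast(@alignCast(user_data.?));
    total.* += data.len;
}

fn totalBytes() usize {
    var total: usize = 0;
    streamChunks(&addLength, &total);
    return total;
}

pub fn main() void {
    std.debug.print("{d}\n", .{totalBytes()}); // prints 5
}
```

The cast inside addLength is where the type safety is lost: handing streamChunks a pointer to some other type would still compile.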

This is where context structs come in. Instead of taking a single function as our callback, we instead take a type with a method on it (we would usually put all the callbacks we need on that one type). We then take a value of that type, and call the methods on it. So to implement the above example, you do this:

var data_stream = try beginDataStream();
var bytes = std.ArrayList(u8).init(std.testing.allocator);
defer bytes.deinit();
const Context = struct {
    out_bytes: *std.ArrayList(u8),
    pub fn process(ctx: @This(), data: []const u8) !void {
        try ctx.out_bytes.appendSlice(data);
    }
};
try data_stream.stream(Context{ .out_bytes = &bytes });

This solution gives us more type safety, improves clarity, and allows the context to be stored and passed directly, potentially avoiding an unnecessary pointer indirection.
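For anyone who wants to try the pattern end to end, here is a self-contained sketch, assuming Zig 0.11 or later. The stream function, the Collector type and the collect helper are hypothetical stand-ins for the library side, and a fixed buffer replaces the ArrayList only to keep the example independent of the allocator API.

```zig
const std = @import("std");

// Hypothetical data source: calls `ctx.process` once per chunk.
fn stream(ctx: anytype) !void {
    const chunks = [_][]const u8{ "hello, ", "world" };
    for (chunks) |chunk| {
        try ctx.process(chunk);
    }
}

// Context struct: carries typed references to the caller's state.
const Collector = struct {
    buf: []u8,
    len: *usize,

    pub fn process(ctx: @This(), data: []const u8) !void {
        const start = ctx.len.*;
        if (start + data.len > ctx.buf.len) return error.OutOfSpace;
        @memcpy(ctx.buf[start..][0..data.len], data);
        ctx.len.* = start + data.len;
    }
};

// Streams everything into `buf` and returns the filled part.
fn collect(buf: []u8) ![]const u8 {
    var len: usize = 0;
    try stream(Collector{ .buf = buf, .len = &len });
    return buf[0..len];
}

pub fn main() !void {
    var storage: [32]u8 = undefined;
    std.debug.print("{s}\n", .{try collect(&storage)}); // prints hello, world
}
```

Because stream takes ctx as anytype, a context without a matching process method is rejected at compile time rather than at run time.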

I don't want to assert without evidence that this pattern is always applicable where you want to use a callback, but it definitely makes sense in every case I've seen. The fact is, plain anonymous functions in a language without closures aren't actually overly useful. They sometimes make sense for, say, sorting functions (to which you are providing a comparison function), but sorting can still make use of contexts sometimes, so it still makes sense for us to provide this more powerful (and still safe) API.
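As an illustration of a sort that needs a context, here is a small sketch that sorts indices by keys held in a separate array. It assumes Zig 0.11 or later, where std.mem.sort takes a context value and a comparison function; ByKey and sortedIndices are hypothetical names.

```zig
const std = @import("std");

// Context for the comparison: the keys live outside the slice being sorted.
const ByKey = struct {
    keys: []const u32,

    fn lessThan(ctx: ByKey, a: usize, b: usize) bool {
        return ctx.keys[a] < ctx.keys[b];
    }
};

fn sortedIndices() [3]usize {
    const keys = [_]u32{ 30, 10, 20 };
    var indices = [_]usize{ 0, 1, 2 };
    std.mem.sort(usize, &indices, ByKey{ .keys = &keys }, ByKey.lessThan);
    return indices; // ordered by key: 1, 2, 0
}

pub fn main() void {
    std.debug.print("{any}\n", .{sortedIndices()});
}
```

A plain comparison function with no context could not reach keys here without a global.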

@expikr
Contributor

expikr commented Nov 5, 2023

@jarble Note that this only works as j is comptime known and thus can be captured at compile time. When you initialize j with a runtime-known value, this code will fail

This works at runtime without context structs:

const print = @import("std").debug.print;

pub fn main() void {

    var j: usize = do: {
        for(0..101) |i| {
            if (i*2 == 100) break :do i;
        }
        unreachable; // the loop always breaks at i == 50
    };

    procedure(j);

    j = 100;
    
    procedure(j);
}


fn procedure(j: usize) void {


    const closure = (opaque {
        var hidden_variable: usize = 0;
        pub fn init(state: usize) *const @TypeOf(run) {
            hidden_variable = state;
            return &run;
        }
        fn run() void {
            print("{}\n",.{hidden_variable});
            hidden_variable += 1;
        }
    }).init(j);

    useClosure(closure, 10);


}


fn useClosure(func: anytype, times: usize) void {
    for (0..times) |_| {
        func();
    }
}

@ianprime0509
Contributor

@expikr container-level variables have static lifetimes, so unfortunately that pattern doesn't have the intended effect: all the "closures" created using that method share the same hidden_variable, as in the following example:

const print = @import("std").debug.print;

pub fn main() void {
    const counter1 = counter(0);
    counter1();
    const counter2 = counter(5);
    counter2();
    counter1();
}

fn counter(j: usize) *const fn () void {
    return (opaque {
        var hidden_variable: usize = 0;
        pub fn init(state: usize) *const @TypeOf(run) {
            hidden_variable = state;
            return &run;
        }

        fn run() void {
            print("{}\n", .{hidden_variable});
            hidden_variable += 1;
        }
    }).init(j);
}

This prints 0, 5, 6, while it would be expected to print 0, 5, 1 if these were true closures. Using a context struct avoids this by leaving the caller in control of the memory for the state data.
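A minimal sketch of the context-struct version of the same counter, where each instance owns its state. Counter and run are hypothetical names chosen for illustration.

```zig
const std = @import("std");

// Each Counter value carries its own state; the caller decides where it lives.
const Counter = struct {
    value: usize,

    // Returns the current value, then increments it.
    fn run(self: *Counter) usize {
        const current = self.value;
        self.value += 1;
        return current;
    }
};

pub fn main() void {
    var counter1 = Counter{ .value = 0 };
    var counter2 = Counter{ .value = 5 };
    std.debug.print("{d}\n", .{counter1.run()}); // prints 0
    std.debug.print("{d}\n", .{counter2.run()}); // prints 5
    std.debug.print("{d}\n", .{counter1.run()}); // prints 1
}
```

Here the two counters are independent, so the output is 0, 5, 1.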

@expikr
Contributor

expikr commented Nov 5, 2023

One could push the hack further with inline fn, which essentially declares a distinct static object at each call site:

const print = @import("std").debug.print;

pub fn main() void {
    const counter1 = counter(0);
    counter1();
    const counter2 = counter(5);
    counter2();
    counter1();
}

inline fn counter(j: usize) *const fn () void {
    return (opaque {
        var hidden_variable: usize = 0;
        pub fn init(state: usize) *const @TypeOf(run) {
            hidden_variable = state;
            return &run;
        }

        fn run() void {
            print("{}\n", .{hidden_variable});
            hidden_variable += 1;
        }
    }).init(j);
}

Of course, this is still static instancing at compile time, but for use cases where you don't expect to have runtime-known instancing counts it could be more concise under specific circumstances.

Mostly it's just for fun to figure out code golfing tricks though

@htqx

htqx commented Apr 22, 2024

Key points of anonymous functions:

  1. Used as an expression
  2. The essence of a closure is a structure
    1. var members have side effects and are a poor design that should be avoided (or at least shouldn't be too easy)
    2. Either a const member
    3. Or a field. Fields are passed as parameters

To summarize, I recommend:

  3. Simple and easy-to-read syntax (for use as expressions)
  4. Don't capture variables
// 1
const f : fn(i32,i32) i32 = (a,b) => a+b;
// 2
const Func = fn(i32,i32)i32;  // Arbitrarily complex function types
const f: Func = (a,b)=>a+b;
// 3
const f = (a,b)=> a+b; // fn(a:anytype, b:anytype) @TypeOf(a,b) {return a+b;}
// 4
var a:i32 = 123;
var b:i32 = 456;
const f = (self)=> self.a + self.b;
_ = f(.{.a = a, .b = b}); // capture var
// 5
const a = 123;
const b = 456;
const f = ()=> a + b; // capture comptime const
// 6
const f = (a,b)=>{return a+b;} // Similar to below
//const f = struct { fn anonymous(a:anytype, b:anytype) @TypeOf(a,b) {return a + b;}}.anonymous;
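For comparison, the struct-wrapped form from the last comment line already works in current Zig. A minimal sketch, assuming Zig 0.11 or later, with hypothetical names add and apply and concrete i32 parameters to keep it simple:

```zig
const std = @import("std");

// A function declared inside an unnamed struct and pulled out by name.
const add = struct {
    fn call(a: i32, b: i32) i32 {
        return a + b;
    }
}.call;

// Hypothetical consumer taking the function as a comptime parameter.
fn apply(comptime f: fn (i32, i32) i32, a: i32, b: i32) i32 {
    return f(a, b);
}

pub fn main() void {
    std.debug.print("{d}\n", .{apply(add, 2, 3)}); // prints 5
}
```

The same struct expression can be written inline at the call site, which is the closest thing to an anonymous function expression available without new syntax.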

@justin330

justin330 commented Jun 4, 2024

Would this work?

fn someFunction() void {

  // some local vars to capture
  var x: u32 = 0;
  const y: i32 = 10;
  
  // proposed syntax A, like C++'s auto x = [&]() -> void {...}
  const lambda = .{&x, &y}() void {
  ...
  };
  
  // or (preferably) proposed syntax B
  const lambda = @lambda(.{&x, y}, inline fn () void {
    // pointer
    capture[0].* = ...
    
    // copy
    capture[1] = ...

     ...
  });
  
  // which expands into this, for a single @lambda() call with unique captures: (works in 0.12)
  const lambda = blk:{
    const _capture = .{&x, y};
    
    const Lmb_Ptr_x_Val_y = struct { 
	    var capture: @TypeOf(_capture) = undefined;
    
	    inline fn func() void {
		    ...
		    // pointer
		    capture[0].* = ...
		    
		    // this is a copy
		    capture[1] = ...
	    }
    };
    
    Lmb_Ptr_x_Val_y.capture = _capture;
    
    break:blk Lmb_Ptr_x_Val_y.func; // or &Lmb_Ptr_x_Val_y.func
  };
  
  lambda();
}

Otherwise if more than a single lambda uses the same set of captures (and all of them being pointers), this could be expanded/optimised into:

  var x: u32 = 0;
  var y: i32 = 10;
  
  // inserted directly below the last capture
  const Lambda_Ptr_x_Ptr_y = struct {
    var capture: @TypeOf(.{&x, &y}) = undefined;
    
    inline fn funcA() void {
	    ...
        // pointer
        capture[0].* = ...

        // also a pointer in this variant
        capture[1].* = ...
    }

    inline fn funcB() void {
        ...
    }
  };

  Lambda_Ptr_x_Ptr_y.capture = .{&x, &y};

  ...

  const lambda = Lambda_Ptr_x_Ptr_y.funcA;
	  
  const lambda_b = Lambda_Ptr_x_Ptr_y.funcB;
