Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Direct2CPS Scheduling/Calling Problem #141

Open
NeuralCoder3 opened this issue Nov 10, 2022 · 6 comments
Open

Direct2CPS Scheduling/Calling Problem #141

NeuralCoder3 opened this issue Nov 10, 2022 · 6 comments
Labels
bug Something isn't working

Comments

@NeuralCoder3
Copy link
Collaborator

NeuralCoder3 commented Nov 10, 2022

The direct2cps dialect has a bug as demonstrated in
https://github.com/NeuralCoder3/thorin2/blob/test/direct_old_var/lit/direct/ad_mem.thorin.disabled

The bug manifests in the test case only in complex circumstances.
Two cps2ds calls are used with functions that contain cps2ds where a function is created that is called.

Removing some of the complexity resolves the bug.

The dialect rewrites the function but only introduces new lambdas for the continuations.
To do so, the body of the old functions is replaced.
Therefore, the origin of the bug is unclear.

@NeuralCoder3 NeuralCoder3 added the bug Something isn't working label Nov 10, 2022
@NeuralCoder3 NeuralCoder3 reopened this Dec 2, 2022
@NeuralCoder3
Copy link
Collaborator Author

Current branch: https://github.com/NeuralCoder3/thorin2/tree/direct_fix
Example command ./build/bin/thorin -d direct -o - -VVVV lit/direct/ad_mem.thorin.
The issue persists even if the pass degenerates to identity by replacing
https://github.com/NeuralCoder3/thorin2/blob/direct_fix/dialects/direct/passes/cps2ds.cpp#L48
with
if (auto app = def->isa<App>(); app && false) {

@NeuralCoder3
Copy link
Collaborator Author

NeuralCoder3 commented Dec 5, 2022

The issue was that nested lambda rewrites overwrote the current lambda screwing up the call trace.
The debugging problem was that the error was hidden and only occurred at the debug dump of the world.
The recursive lambda call is self was also hidden in op rewrites in some cases.
Additionally, this error only occurs in complex situations (like a nested call with multiple ds2cps calls on each level).

The solution is to explicitly handle lambdas and to manage a meta call stack.

Note: There is still a case where either

  • cps2ds remains
  • old var occurs (if rewritten_lams is ignored)

This issue might be related to scheduling.

@leissa
Copy link
Member

leissa commented Dec 6, 2022

I had similar probls when implementing SSAConstr, CopyProp, etc. To me the problem seems to be that you just shouldn't rewrite the body. What you should do instead is a reduction of the nominal in question and replace everything with the desired Defs up front.
So instead of this:

if (auto app = def->isa<App>()) {
    if (auto callee = app->callee()->isa_nominal<Lam>()) {
        map[some_old_value] = some_new_value;
        // and do sth later on with map
    }
}

You should do it like this:

if (auto app = def->isa<App>()) {
    if (auto callee = app->callee()->isa_nominal<Lam>()) {
        auto new_ops = callee->reduce(/*my_new_args*/);
        // ...
     }
}

@NeuralCoder3
Copy link
Collaborator Author

NeuralCoder3 commented Dec 6, 2022

The current issue does not happen in the original pass. Instead, partial evaluation inlines functions and causes an issue to surface.

Simplified ad_mem2:

.con .extern main __690558::[mem_690876: %mem.M, argc_691342: .Idx 4294967296, %mem.Ptr (%mem.Ptr (.Idx 256, 0), 0), return_690562: .Cn [%mem.M, .Idx 4294967296]] @(0:(.Idx 2)) = {
    .let A: [.Idx 4294967296, .Cn [.Idx 4294967296, .Cn «2; .Idx 4294967296»]] = %direct.cps2ds_dep («2; .Idx 4294967296», Uf_691251) eta_inner_mul_deriv_cps_691294 (42:(.Idx 4294967296), argc_691342);
    //                                                                                 ^
    .con comp_tup_pb__691250 _691347::[_691349: .Idx 4294967296, _691358: .Cn [%mem.M, .Idx 4294967296]] @(1:(.Idx 2)) = {
        A:(.Idx 2) (_691349, comp_tup_pb__cont_691356)
    };
    .let B: %mem.M = f(A);
    .let C: [%mem.M, .Idx 4294967296] = f(B);
    .let D: [%mem.M, .Cn [.Idx 4294967296, .Cn [%mem.M, .Idx 4294967296]]] = f(C);
    .con comp_tup_pb__690636 _691889::[_691891: .Idx 4294967296, _691894: .Cn [%mem.M, .Idx 4294967296]] @(1:(.Idx 2)) = {
        .con comp_tup_pb__cont_691892 _691940::[_691941: .Idx 4294967296, _692020: .Idx 4294967296] @(1:(.Idx 2)) = {
            .let _691942: [%mem.M, .Idx 4294967296] = %direct.cps2ds_dep (.Idx 4294967296, Uf_691899) zero_pb_691934 _691941;
            .let _692021: [%mem.M, .Idx 4294967296] = %direct.cps2ds_dep (.Idx 4294967296, Uf_691949) D _692020;
            .let _692028: .Idx 4294967296 = %core.wrap.add 4294967296 0 (_691942#1:(.Idx 2), _692021#1:(.Idx 2));
            _691894 (:%mem.M, _692028)
        };
        .let E: [.Idx 4294967296, .Cn [.Idx 4294967296, .Cn «2; .Idx 4294967296»]] = %direct.cps2ds_dep («2; .Idx 4294967296», Uf_690659) eta_inner_mul_deriv_cps_690710 (100:(.Idx 4294967296), C:(.Idx 2));
        E:(.Idx 2) (_691891, comp_tup_pb__cont_691892)
    };
    .let _692035: [%mem.M, .Idx 4294967296] = %direct.cps2ds_dep (.Idx 4294967296, Uf_690595) comp_tup_pb__690636 1:(.Idx 4294967296);
    .let _692095: [%mem.M, .Idx 4294967296] = %direct.cps2ds_dep (%mem.M, Uf_692052) zero_pb_692087 D:(.Idx 2);
    .let _692102: .Idx 4294967296 = %core.wrap.add 4294967296 0 (_692035#1:(.Idx 2), _692095#1:(.Idx 2));
    return_690562 (:%mem.M, _692102)
};

The underlying problem seems to be that a function (A) is cps2ds called in main before others but the iteration first inlines other calls (bottom) because the first call is not directly used as an argument.
Therefore, the call trace is altered and some child-parent relations get destroyed.

The functions are not correctly inlined/rewritten as cps in such complex circumstances.

@NeuralCoder3 NeuralCoder3 changed the title Direct2CPS Old var bug Direct2CPS Scheduling/Calling Problem Jan 6, 2023
@NeuralCoder3
Copy link
Collaborator Author

NeuralCoder3 commented Jan 6, 2023

Current work in branch direct_scheduling.
(Pass: cps2ds)

To recap:
The problem is that res = cps2ds fun args calls are unordered to a certain degree.
This freeness results in wrong translations to cps calls.

The general idea of the translation is to generate a cps call and use the argument
of the continuation in place of the previous direct result.

res = cps2ds fun args
...

becomes

fun(args, fun_cps_cont)

.cn fun_cps_cont [res] = {
  ...
}

However, this procedure becomes problematic in more complex situations (e.g. lit/direct/ad_mem2)
where many parallel and nested cps2ds calls are present.

One kind of problem:

{
  a = cps2ds f1 x
}
b = cps2ds f2 y

If x contains b or y contains a, a certain order and nesting of calls is required.
This concept can be nested more indirectly such that the dependency is not found using depth-first search.

We possibly need to first call a function found later on in an outer scope.

One Idea:
We could impose strict ordering using mem/locks. However, this would increase the complexity at many other places and impose artificial constraints.

Previous Translation:
At the point of encountering a cps2ds call (ds call for short):

  • We replace the body of the current function with the cps call and
  • Place the original body with the continuation argument substituted for the call in the continuation.
h:
b = f a
C[b]

=>

h:
    f'(a,h_cont)

h_cont(b):
    C[b]

f : A -> B
f': .Cn [A, ret: .Cn[B]]

Idea:
Place the call smartly and not directly at the old call to capture the dependencies.

Problem:
The default schedulers for the calls are not sufficient.
The dependencies/nestings of the calls between each other are not taken fully into account.

Attempt:
Separate the stages:

  • Find ds calls, prepare cps calls and continuations
  • Replace ds calls with continuation arguments (do not place cps calls yet)
  • Place the cps calls taking the uses of the continuation arguments and the cps call arguments into account

Possibly, a good enough place for the calls might be enough:

We want the place the call so low (depth large)
that all arguments of the call are in view.
But simultaneously it needs to be high (depth small)
enough that the result (from the continuation) can be accessed at the previous cps2ds call sites.

Linked to the depth is the call nesting.
As we place function calls, we order and nest functions.
This property is strictly linked to the depth:
If the place is main, we replace the body of main with the call to fun and continue
the body of main in fun_cps_cont.
If the place is other_cps_cont, we implicitly nest the function fun to be called after other.

The open question is how to find this place. Can the scope do this for us? If so, how (for which def) do I compute this place?

@leissa
Copy link
Member

leissa commented Jan 9, 2023

I think we should discuss this online in our next meeting. I've difficulties following you here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants