Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul implementation of for-generators #844

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

odenix
Copy link
Contributor

@odenix odenix commented Dec 10, 2024

Based on: #837

Motivation:

  • fix known bugs and limitations of for-generators
  • improve code health by removing complex workarounds

Changes:

  • simplify AstBuilder code related to for-generators
    • track for-generators via SymbolTable.enterForGenerator()
    • add RestoreForBindingsNode during initial AST construction
      instead of calling MemberNode.replaceBody() later on
    • simplify unnecessarily complex code such as objectMemberInserter
  • remove workarounds and band-aids such as:
    • isInIterable
    • executeAndSetEagerly
    • adding dummy slots in AmendFunctionNode
  • overhaul implementation of for-generators
    • store keys and values of for-generator iterations in regular instead of auxiliary frame slots
      • set them via TypeNode.executeAndSet()
      • ResolveVariableNode no longer needs to search auxiliary slots
      • Read(Enclosing)AuxiliarySlot is no longer needed
    • at the start of each for-generator iteration, create a new VirtualFrame
      that is a copy of the current frame (arguments + slots)
      and stores the iteration key and value in additional slots.
    • execute for-generator iteration with the newly created frame
      • childNode.execute(newFrame)
      • Pkl objects created during the iteration will materialize this frame
    • store newly created frames in owner.extraStorage
      if their for-generator slots may be accessed when a generated member is executed
      • resolving variable names to for-generator variables at parse time would make this analysis more precise
    • when a generated member is executed,
      * retrieve the corresponding frame stored in owner.extraStorage
      * copy the retrieved frame's for-generator slots into slots of the current frame

Result:

  • for-generators are implemented in a correct, reasonably simple, and reasonably efficient way
    • complexity is fully contained within package generator and AstBuilder
  • for-generator keys and values can be accessed from all nested scopes:
    • key and value expressions of generated members
    • condition expressions of nested when-generators
    • iterable expressions of nested for-generators
  • for-generator keys and values can be accessed from within objects created by the expressions listed above
  • sibling for-generators can use the same key/value variable names
  • parent/child for-generators can use the same key/value variable names
  • fixes Late-bound values of iteratees within nested for/spread fail to resolve for-generator variables #741

Limitations not addressed in this PR:

  • object spreading is eager in values
    This should be easy to fix.
  • for-generators are eager in values
    I think this could be fixed by:
    • resolving variable names to for-generator variables at parse time
    • replacing every access to a for-generator's value with iterable[key]
  • for/when-generator bodies can't have local properties/methods
    I think this could be fixed by:
    • resolving variable names to local properties/methods at parse time
    • internally renaming generated local properties/methods to avoid name clashes

Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This is a good improvement in how for-generators work!

I did a first pass, take a look at my comments.

Also, this is slower than the current implementation. Some quick numbers:

// test.pkl

amends "pkl:Benchmark"

outputBenchmarks {
  ["for-generator"] {
    sourceModule = import("test2.pkl")
  }
}
// test2.pkl
res {
  for (i in IntSeq(1, 10000)) {
    i
  }
}

Running this benchmark produces:

Current Pkl:

outputBenchmarks {
  ["for-generator"] {
    iterations = 15
    repetitions = 50
    min = 2.86.ms
    max = 3.01.ms
    mean = 2.96.ms
    stdev = 0.05.ms
    error = 0.03.ms
  }
}

This PR:

outputBenchmarks {
  ["for-generator"] {
    iterations = 15
    repetitions = 50
    min = 3.5.ms
    max = 3.88.ms
    mean = 3.64.ms
    stdev = 0.1.ms
    error = 0.06.ms
  }
}

Tested on macOS/M1, testing the native executable (built with -DreleaseBuild=true)

@odenix
Copy link
Contributor Author

odenix commented Dec 20, 2024

Also, this is slower than the current implementation

Correctness has a performance cost here. However, we can probably also find benchmarks where the new implementation performs better than the old one. For example, ResolveVariableNode now has less work to do, and iterables are no longer forced. Also, there is probably room to improve performance, especially if we have some representative benchmarks.

I'm not sure if this is the right model; there's many members for just this one frame. It's doing a lot of extra allocation here.

This instantiates 100 frames. Each frame captures one iteration's key/value binding. I'm pretty confident this is the right model (even discussed it in Truffle Slack). It's a more efficient form of having a root node for the loop body and calling it 100 times, which would also result in 100 frames. The old implementation did a similar allocation for every iteration: https://github.com/apple/pkl/blob/main/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/GeneratorMemberNode.java#L115-L117

Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: performance regression:

I tested these changes against some real-world code, and this is what time has to say about it:

Pkl 0.27:

________________________________________________________
Executed in  299.86 secs    fish           external
   usr time   29.09 mins    0.11 millis   29.09 mins
   sys time    1.01 mins    2.81 millis    1.01 mins

These changes:

________________________________________________________
Executed in  297.81 secs    fish           external
   usr time   28.74 mins   53.00 micros   28.74 mins
   sys time    1.00 mins  889.00 micros    1.00 mins

So, all in all, the changes introduced here are actually slightly faster in my one run (although not significant enough to rule out noise), so, my performance concerns really aren't that high.

I made another pass. I still need to get through more of this code, but want to provide some more feedback rather than delay too much more.

Motivation:
* fix known bugs and limitations of for-generators
* improve code health by removing complex workarounds

Changes:
* simplify AstBuilder code related to for-generators
  * track for-generators via `SymbolTable.enterForGenerator()`
  * add `RestoreForBindingsNode` during initial AST construction
    instead of calling `MemberNode.replaceBody()` later on
  * simplify some unnecessarily complex code
* remove workarounds and band-aids such as:
  * `isInIterable`
  * `executeAndSetEagerly`
  * adding dummy slots in `AmendFunctionNode`
* overhaul implementation of for-generators
  * store keys and values of for-generator iterations in regular instead of auxiliary frame slots
    * set them via `TypeNode.executeAndSet()`
    * `ResolveVariableNode` no longer needs to search auxiliary slots
    * `Read(Enclosing)AuxiliarySlot` is no longer needed
  * at the start of each for-generator iteration, create a new `VirtualFrame`
    that is a copy of the current frame (arguments + slots)
    and stores the iteration key and value in additional slots.
  * execute for-generator iteration with the newly created frame
    * `childNode.execute(newFrame)`
    * Pkl objects created during the iteration will materialize this frame
  * store newly created frames in `owner.extraStorage`
    if their for-generator slots may be accessed when a generated member is executed
    * resolving variable names to for-generator variables at parse time would make this analysis more precise
  * when a generated member is executed,
	  * retrieve the corresponding frame stored in `owner.extraStorage`
	  * copy the retrieved frame's for-generator slots into slots of the current frame

Result:
* for-generators are implemented in a correct, reasonably simple, and reasonably efficient way
  * complexity is fully contained within package `generator` and `AstBuilder`
* for-generator keys and values can be accessed from all nested scopes:
  * key and value expressions of generated members
  * condition expressions of nested when-generators
  * iterable expressions of nested for-generators
* for-generator keys and values can be accessed from within objects created by the expressions listed above
* sibling for-generators can use the same key/value variable names
* parent/child for-generators can use the same key/value variable names
* fixes apple#741

Limitations not addressed in this PR:
* object spreading is eager in values
  This should be easy to fix.
* for-generators are eager in values
  I think this could be fixed by:
  * resolving variable names to for-generator variables at parse time
  * replacing every access to a for-generator's `value` with `iterable[key]`
* for/when-generator bodies can't have local properties/methods
  I think this could be fixed by:
  * resolving variable names to local properties/methods at parse time
  * internally renaming generated local properties/methods to avoid name clashes
@odenix
Copy link
Contributor Author

odenix commented Jan 23, 2025

I believe I've addressed all requested changes.

Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this is an excellent improvement! As always, thanks for the high quality contributions :)

Would like one more thumb from either @holzensp or @stackoverflow before merging.

@bioball bioball added this to the Pkl 0.28.0 milestone Jan 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Late-bound values of iteratees within nested for/spread fail to resolve for-generator variables
2 participants