Pull request overview
Adds a new optional Monty-backed REPL environment to the rlm.environments routing system, along with dependency wiring and basic tests/docs so users can select environment="monty".
Changes:
- Introduce MontyREPL (non-isolated) with stdout capture and AST-based variable persistence across code blocks.
- Register the new environment type (monty) in environment routing/types and add an optional dependency extra.
- Add import-guarded tests for Monty availability and basic REPL behavior; update README installation/docs.
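As a rough illustration of what "AST-based variable persistence" can involve, the sketch below collects the names a code block binds at its top level, so that a REPL wrapper would know which values to carry into the next block. It is not the PR's implementation: the class name is borrowed from the review discussion, but the body and the `assigned_names` helper are assumptions written for this example.

```python
import ast


class AssignedNameCollector(ast.NodeVisitor):
    """Collect names bound at the top level of a code block (illustrative sketch)."""

    def __init__(self) -> None:
        self.names: set[str] = set()

    def visit_Name(self, node: ast.Name) -> None:
        # Store context covers plain, augmented and annotated assignment,
        # for/with targets, and walrus (:=) targets.
        if isinstance(node.ctx, ast.Store):
            self.names.add(node.id)

    def visit_FunctionDef(self, node) -> None:
        # Record the function name but do not descend: its locals are not persisted.
        self.names.add(node.name)

    visit_AsyncFunctionDef = visit_FunctionDef
    visit_ClassDef = visit_FunctionDef

    def visit_Lambda(self, node: ast.Lambda) -> None:
        # Lambda bodies have their own scope; nothing to collect.
        return None

    def _visit_comprehension(self, node: ast.AST) -> None:
        # Loop variables stay inside the comprehension; only walrus targets leak out.
        for sub in ast.walk(node):
            if isinstance(sub, ast.NamedExpr):
                self.names.add(sub.target.id)

    visit_ListComp = visit_SetComp = _visit_comprehension
    visit_DictComp = visit_GeneratorExp = _visit_comprehension

    def visit_Import(self, node: ast.Import) -> None:
        for alias in node.names:
            self.names.add(alias.asname or alias.name.split(".")[0])

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        for alias in node.names:
            if alias.name != "*":
                self.names.add(alias.asname or alias.name)

    def visit_MatchAs(self, node) -> None:
        # `case ... as name` and bare capture patterns (Python 3.10+).
        if node.name:
            self.names.add(node.name)
        self.generic_visit(node)

    def visit_MatchStar(self, node) -> None:
        # `case [first, *rest]` binds `rest`.
        if node.name:
            self.names.add(node.name)


def assigned_names(source: str) -> set[str]:
    """Return the set of names that `source` binds at its top level."""
    collector = AssignedNameCollector()
    collector.visit(ast.parse(source))
    return collector.names
```

A wrapper could call `assigned_names(code)` before running a block and then copy only those names out of the execution result, rather than guessing which variables changed.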
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| rlm/environments/monty_repl.py | New Monty-based REPL implementation with state persistence and LM query helpers |
| rlm/environments/__init__.py | Registers "monty" in get_environment() routing |
| rlm/core/types.py | Extends EnvironmentType Literal to include "monty" |
| tests/test_monty_repl.py | Adds Monty REPL smoke tests (import-skipped if dependency missing) |
| tests/test_imports.py | Adds Monty import checks and optional-module circular import coverage |
| pyproject.toml | Adds monty optional extra dependency |
| uv.lock | Locks pydantic-monty and adds monty extra metadata |
| README.md | Documents MontyREPL and installation via optional extra |
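The routing and optional-dependency rows above could fit together roughly as follows. This is a minimal sketch and not the code in rlm/environments/__init__.py: only the names `get_environment`, `EnvironmentType` and the `"monty"` key come from the PR. The stub classes, the `monty_module` parameter, the function signature, and the import name `pydantic_monty` (guessed from the locked `pydantic-monty` package) are assumptions made for illustration.

```python
import importlib.util
from typing import Literal, get_args

# Hypothetical subset of the environment keys; the real Literal may list more.
EnvironmentType = Literal["local", "monty"]


class StubLocalEnv:
    """Stand-in for an always-available environment."""
    name = "local"


class StubMontyEnv:
    """Stand-in for the Monty-backed environment."""
    name = "monty"


def optional_module_available(module_name: str) -> bool:
    """Check for a top-level module without importing it."""
    return importlib.util.find_spec(module_name) is not None


def get_environment(environment: str, *, monty_module: str = "pydantic_monty"):
    """Route an environment key to an instance, guarding the optional dependency."""
    if environment not in get_args(EnvironmentType):
        raise ValueError(f"Unknown environment: {environment!r}")
    if environment == "monty":
        # Fail with an actionable message instead of a bare ModuleNotFoundError.
        if not optional_module_available(monty_module):
            raise ImportError(
                "The 'monty' environment needs the optional 'monty' extra installed."
            )
        return StubMontyEnv()
    return StubLocalEnv()
```

Checking availability at routing time keeps the base package importable when the extra is not installed, which is the same property the import-guarded tests are meant to protect.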
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
Addressed review comments in latest push (README grammar, stderr capture, in-block FINAL_VAR/SHOW_VARS, AssignedNameCollector walrus/match, tests, persistence guard/docstring, monty persistent completion test). Happy to resolve threads if needed.
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
Fix cleanup() to clear stderr_parts and reset counters, guard state restoration to avoid silently setting variables to None, remove dead final_var()/show_vars() instance methods, and add execution-level stderr capture test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts: # rlm/core/types.py # rlm/environments/__init__.py
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.
looks awesome, can't wait!
That's great work, but I have some bad news... I was looking into this a couple of days ago, and unfortunately it's not that easy to just plug in Monty because of its limitations.

1. Functions and closures don't survive across turns

This affects any user code that defines a function or closure in one block and calls it in a later block. It essentially means Monty does not support a real REPL, at least not yet without an awkward workaround. That makes it pretty hard to use with RLM, which assumes an actual REPL and not just independent snippets of code.

Monty's input boundary converts function objects to their string representation. A function defined in turn N is stored in

Tests (

```python
# Import path taken from the file list in the review summary.
from rlm.environments.monty_repl import MontyREPL


class TestMontyREPLValueRoundTrip:
    """Tests that non-primitive values survive across execution turns."""

    def test_function_survival(self):
        """Define a function in block 1, call it in block 2."""
        repl = MontyREPL()
        repl.execute_code("def helper():\n    return 42")
        assert "helper" in repl.locals
        result = repl.execute_code("print(helper())")
        assert "42" in result.stdout  # FAILS: stdout is ''

    def test_closure_survival(self):
        """Define a closure in block 1, call it in block 2."""
        repl = MontyREPL()
        repl.execute_code(
            "def make_adder(n):\n    return lambda x: x + n\nadd5 = make_adder(5)"
        )
        assert "add5" in repl.locals
        result = repl.execute_code("print(add5(10))")
        assert "15" in result.stdout  # FAILS: stdout is ''

    def test_function_referencing_cross_turn_variable(self):
        """Variable from turn 1, function closing over it in turn 2, called in turn 3."""
        repl = MontyREPL()
        repl.execute_code("data = [1, 2, 3]")
        repl.execute_code("def total():\n    return sum(data)")
        result = repl.execute_code("print(total())")
        assert "6" in result.stdout  # FAILS: stdout is ''
```

Why is this problematic

LLMs naturally define helper functions in early turns and reuse them later. For example: turn 1 defines

The naive workaround, re-executing all previous code from scratch each turn (source replay), avoids the function problem but introduces a speed problem. Every previous network call is repeated on each new turn: if turn 1 fetched a URL, that fetch runs again in turn 2, and again in turn 3, and so on. Replay time per turn grows linearly with the number of fetches made so far in the session. For any code that interacts with external services, this quickly becomes impractical.

Source replay also introduces side-effect problems. File system operations are the worst case: if turn 1 writes a file and turn 2 appends to it, replaying both turns on a later turn writes the file again and appends again, so side effects are repeated and, depending on the operations, the result is corrupted. Similarly, deleting and recreating directories, moving files, or incrementing counters stored on disk would all produce wrong results on replay.
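The replay trade-off can be reproduced without Monty at all. The toy below is an assumption-laden sketch: `ReplayREPL` and `demo` are hypothetical names, and plain `exec` stands in for the sandbox. It re-runs every earlier block before each new one, so functions survive across turns, but a side effect requested once is performed again on every later turn.

```python
import os
import tempfile


class ReplayREPL:
    """Toy REPL that re-executes the whole history on every turn (source replay)."""

    def __init__(self) -> None:
        self.history: list[str] = []

    def execute_code(self, code: str) -> dict:
        self.history.append(code)
        namespace: dict = {}
        # Start from an empty namespace and replay every block, side effects included.
        for block in self.history:
            exec(block, namespace)
        return namespace


def demo() -> tuple[int, list[str]]:
    """Run three turns; the append in turn 2 is replayed during turn 3."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "log.txt")
        repl = ReplayREPL()
        # Turn 1: define a helper and remember where the log file lives.
        repl.execute_code(f"path = {log_path!r}\ndef helper():\n    return 42")
        # Turn 2: a side effect the user asked for exactly once.
        repl.execute_code("with open(path, 'a') as f:\n    f.write('fetched\\n')")
        # Turn 3: the helper from turn 1 is callable, but turn 2 runs again first.
        namespace = repl.execute_code("answer = helper()")
        with open(log_path) as f:
            lines = f.read().splitlines()
        return namespace["answer"], lines
```

Calling `demo()` returns the helper's result together with the log contents; the log holds two lines even though the append was requested once, which is the duplication described above.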
Why Monty?
Summary
Testing
Haiku
Tiny sandbox hums
Code whispers in quiet loops
Monty guards the sparks