User-defined classes #515

Open
samuelcolvin wants to merge 3 commits into main from impl-classes

@samuelcolvin samuelcolvin commented Jun 28, 2026

Member

Support class Foo: ... with instance methods, __init__, __repr__/__str__, and literal class variables. Methods are ordinary functions whose first parameter is self, so defaults/keyword args come for free.

  • Parse: new Node::ClassDef; reject inheritance/metaclasses, class/method decorators, and non-literal class bodies at parse time.
  • Prepare: class name binds in the enclosing scope; methods are prepared with register_name=false so class scope is skipped for free-var resolution while still capturing enclosing-function locals (matching CPython).
  • Compile: extract emit_make_function; new BuildClass opcode (appended).
  • Heap: new Class, Instance, BoundMethod types wired into HeapData, HeapReadOutput, GC walkers, Type, and PyTrait dispatch.
  • Instantiation: Foo(...) allocates the instance, runs __init__(self, ...) as a real (suspendable) frame flagged is_initializer; ReturnValue discards the None and leaves the instance. The flag round-trips through frame serialization so a suspended __init__ resumes correctly.
  • repr/str dispatch to user __repr__/__str__ via evaluate_function, intercepted at the Value level (where the heap id is available).
  • type(obj) returns the class object; type(obj) is Foo and isinstance work.

Identity-only equality, always-truthy instances, no inheritance/other dunders yet; divergences documented in limitations/classes.md.
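
As a rough illustration of the surface described above, the following plain-Python sketch exercises instance methods, `__init__` with a default argument, `__repr__`, a literal class variable, and the `type()`/`isinstance` checks. The `Point` class and its members are invented for illustration; the results shown reflect CPython, which the description says this PR matches for these features.

```python
class Point:
    # Literal class variable, shared by every instance.
    dimensions = 2

    def __init__(self, x, y=0):
        # `self` is just the first parameter, so defaults and keyword
        # arguments behave as they do for any ordinary function.
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"

    def shifted(self, dx=0, dy=0):
        return Point(self.x + dx, self.y + dy)


p = Point(1, y=2)
q = p.shifted(dx=3)

print(repr(q))               # Point(4, 2)
print(type(p) is Point)      # True
print(isinstance(p, Point))  # True
print(p == Point(1, 2))      # False: without __eq__, equality is identity
print(bool(p))               # True: instances are truthy by default
```

Identity-only equality and always-truthy instances are also what CPython does when a class defines no `__eq__`, `__bool__`, or `__len__`, so the divergences only show up once those dunders are involved.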


Summary by cubic

Adds user-defined classes with a real, CPython-like class-body scope. Class variables can be any expressions (and reference earlier ones), class bodies and __init__ run as suspendable frames, and instances work with type()/isinstance().

  • New Features
    • Parser/AST: new Node::ClassDef; reject inheritance/metaclasses and class/method decorators; compile the class body as a synthetic zero-arg function that runs top-to-bottom.
    • Name resolution: class name binds in the enclosing scope; methods skip the class scope for free vars while capturing enclosing-function locals; conflicting captures vs class members return a clean NotImplementedError.
    • Bytecode/VM: BuildClass opcode assembles (name, value) pairs into a class; calling a class allocates an instance and runs __init__ as a real, suspendable frame (is_initializer, serialized for resume); bound methods pass self.
    • Builtins/types/repr: type(x) for instances returns the class object; isinstance supports user classes; added Class/Instance/BoundMethod and Type::Instance; repr/str dispatch to user dunders.
    • Tests/docs: added class__basic.py, class__scope.py, class__body_external.py, class__init_external.py, class__repr.py, plus TRACEBACK tests class__name_error.py, class__attribute_error.py, class__init_raises.py, class__method_raises.py; updated parse_errors.rs (simple classes compile; inheritance/decorators rejected), limitations/classes.md and limitations/language.md; traceback harness now strips CPython’s “Did you mean: '…'?” suffix for byte-for-byte parity.

Written for commit c802ca9. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

8 issues found across 28 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/monty/src/heap_data.rs">

<violation number="1" location="crates/monty/src/heap_data.rs:729">
P2: Bound methods are introduced as identity-equality values but are omitted from `py_hash` dispatch, making them unexpectedly unhashable. This breaks using `obj.method` as dict/set keys and is inconsistent with other identity-hashed callable objects.</violation>
</file>

<file name="crates/monty/src/bytecode/op.rs">

<violation number="1" location="crates/monty/src/bytecode/op.rs:748">
P2: BuildClass stack-effect arithmetic overflows for large but accepted class bodies. Either cap class member_count to the representable stack-effect range or compute/report this case before emitting the opcode.</violation>
</file>

<file name="crates/monty/src/value.rs">

<violation number="1" location="crates/monty/src/value.rs:1815">
P3: Update py_set_attr docs to include user-defined instances; current comment is now false and will mislead callers about supported setattr targets.</violation>
</file>

<file name="crates/monty/src/types/instance.rs">

<violation number="1" location="crates/monty/src/types/instance.rs:255">
P2: Bound-method attribute access leaks references when heap allocation fails. Handle the allocation error by dropping the owned BoundMethod values with the heap, or avoid taking extra refs until allocation succeeds.</violation>
</file>

<file name="crates/monty/src/bytecode/vm/collections.rs">

<violation number="1" location="crates/monty/src/bytecode/vm/collections.rs:71">
P2: Drop replaced class-member values when duplicate names overwrite earlier entries. Ignoring Dict::set's return leaks the old value's heap ownership/refcount for duplicate methods or class variables.</violation>
</file>

Comment thread crates/monty/src/bytecode/vm/mod.rs
Comment thread crates/monty/src/prepare.rs
Self::Dataclass(dc) => dc.py_hash(self_id, vm),
// Classes and instances hash by identity.
Self::Class(class) => class.py_hash(self_id, vm),
Self::Instance(instance) => instance.py_hash(self_id, vm),

Contributor

P2: Bound methods are introduced as identity-equality values but are omitted from py_hash dispatch, making them unexpectedly unhashable. This breaks using obj.method as dict/set keys and is inconsistent with other identity-hashed callable objects.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/heap_data.rs, line 729:

<comment>Bound methods are introduced as identity-equality values but are omitted from `py_hash` dispatch, making them unexpectedly unhashable. This breaks using `obj.method` as dict/set keys and is inconsistent with other identity-hashed callable objects.</comment>

<file context>
@@ -691,6 +724,9 @@ impl<'h> PyTrait<'h> for HeapReadOutput<'h> {
             Self::Dataclass(dc) => dc.py_hash(self_id, vm),
+            // Classes and instances hash by identity.
+            Self::Class(class) => class.py_hash(self_id, vm),
+            Self::Instance(instance) => instance.py_hash(self_id, vm),
             Self::Range(r) => r.py_hash(self_id, vm),
             Self::Slice(s) => s.py_hash(self_id, vm),
</file context>
Suggested change
-Self::Instance(instance) => instance.py_hash(self_id, vm),
+Self::Instance(instance) => instance.py_hash(self_id, vm),
+Self::BoundMethod(_) => {
+    let mut hasher = DefaultHasher::new();
+    self_id.hash(&mut hasher);
+    Ok(Some(HashValue::new(hasher.finish())))
+},
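
For reference, here is a small sketch of how bound methods behave as dict and set keys in CPython, which is the use case the comment above says is broken. The `Counter` class is invented for illustration. Note that CPython compares bound methods by their `__self__` and `__func__` rather than by identity, so this shows the reference behaviour only; how Monty's identity-based equality interacts with a second attribute access is not stated in this thread.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1
        return self.count


counter = Counter()

# Each attribute access builds a fresh bound-method object...
first = counter.bump
second = counter.bump

# ...yet CPython treats the two as equal and hashes them consistently,
# so either one can be used to look up an entry stored under the other.
handlers = {first: "bump handler"}

print(first is second)              # False
print(first == second)              # True
print(hash(first) == hash(second))  # True
print(handlers[second])             # bump handler
```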

Comment thread crates/monty/src/bytecode/compiler.rs Outdated
(LoadGlobalCallable, Operand::U16U16(..)) => 1,
// `BuildClass(name_const, member_count)` pops `2 * member_count`
// key/value pairs and pushes the class object.
(BuildClass, Operand::U16U16(_, member_count)) => 1 - 2 * member_count.cast_signed(),

Contributor

P2: BuildClass stack-effect arithmetic overflows for large but accepted class bodies. Either cap class member_count to the representable stack-effect range or compute/report this case before emitting the opcode.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/bytecode/op.rs, line 748:

<comment>BuildClass stack-effect arithmetic overflows for large but accepted class bodies. Either cap class member_count to the representable stack-effect range or compute/report this case before emitting the opcode.</comment>

<file context>
@@ -733,6 +743,9 @@ impl Opcode {
             (LoadGlobalCallable, Operand::U16U16(..)) => 1,
+            // `BuildClass(name_const, member_count)` pops `2 * member_count`
+            // key/value pairs and pushes the class object.
+            (BuildClass, Operand::U16U16(_, member_count)) => 1 - 2 * member_count.cast_signed(),
 
             // === Jumps: fall-through effect (what the tracker absorbs after the bytes are written).
</file context>
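
The overflow can be checked with a little arithmetic. The sketch below models `1 - 2 * member_count.cast_signed()` in Python, assuming the expression is evaluated in `i16` (which is what `cast_signed()` on a `u16` produces; the surrounding function's return type is not shown in this thread). All function names here are hypothetical helpers, not part of the crate.

```python
I16_MIN, I16_MAX = -(2**15), 2**15 - 1


def cast_signed_u16(value: int) -> int:
    """Reinterpret a u16 bit pattern as an i16 (two's complement)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("member_count must fit in a u16")
    return value - 0x1_0000 if value >= 0x8000 else value


def exact_stack_effect(member_count: int) -> int:
    """Pop 2 * member_count values, push the class object."""
    return 1 - 2 * member_count


def fits_in_i16_arithmetic(member_count: int) -> bool:
    """True when the cast keeps the value and no intermediate leaves i16 range."""
    signed = cast_signed_u16(member_count)
    doubled = 2 * signed
    result = 1 - doubled
    return (
        signed == member_count
        and I16_MIN <= doubled <= I16_MAX
        and I16_MIN <= result <= I16_MAX
    )


# Largest member count for which the i16 arithmetic is still exact.
largest_safe = max(n for n in range(0x1_0000) if fits_in_i16_arithmetic(n))
print(largest_safe)  # 16383
```

Under that assumption, any class body with 16384 or more members is accepted by the `u16` operand but cannot have its stack effect represented, which is the range the comment suggests capping or reporting.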

let mut namespace = Dict::new();
let mut iter = items.into_iter();
while let (Some(key), Some(value)) = (iter.next(), iter.next()) {
namespace.set(key, value, self)?;

Contributor

P2: Drop replaced class-member values when duplicate names overwrite earlier entries. Ignoring Dict::set's return leaks the old value's heap ownership/refcount for duplicate methods or class variables.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/bytecode/vm/collections.rs, line 71:

<comment>Drop replaced class-member values when duplicate names overwrite earlier entries. Ignoring Dict::set's return leaks the old value's heap ownership/refcount for duplicate methods or class variables.</comment>

<file context>
@@ -46,6 +48,34 @@ impl<T: ResourceTracker> VM<'_, T> {
+        let mut namespace = Dict::new();
+        let mut iter = items.into_iter();
+        while let (Some(key), Some(value)) = (iter.next(), iter.next()) {
+            namespace.set(key, value, self)?;
+        }
+
</file context>
Suggested change
-namespace.set(key, value, self)?;
+if let Some(old_value) = namespace.set(key, value, self)? {
+    old_value.drop_with_heap(self);
+}
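
The duplicate-name case the comment describes looks like this at the Python level. This is a sketch of CPython's behaviour, with an invented `Greeter` class: the later definition wins, and the replaced function is released once nothing else refers to it. The weak reference is only there to observe that release; immediate collection is a CPython reference-counting detail.

```python
import gc
import weakref

replaced = []


class Greeter:
    def greet(self):
        return "first"

    # Hold only a weak reference, so this sketch does not itself keep
    # the first function alive.
    replaced.append(weakref.ref(greet))

    def greet(self):  # same name: overwrites the earlier namespace entry
        return "second"


gc.collect()

print(Greeter().greet())      # second
print(replaced[0]() is None)  # True on CPython: the old function was freed
```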

instance: Value::Ref(self_id),
func: member,
};
let id = vm.heap.allocate(HeapData::BoundMethod(bound))?;

Contributor

P2: Bound-method attribute access leaks references when heap allocation fails. Handle the allocation error by dropping the owned BoundMethod values with the heap, or avoid taking extra refs until allocation succeeds.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/types/instance.rs, line 255:

<comment>Bound-method attribute access leaks references when heap allocation fails. Handle the allocation error by dropping the owned BoundMethod values with the heap, or avoid taking extra refs until allocation succeeds.</comment>

<file context>
@@ -0,0 +1,373 @@
+                instance: Value::Ref(self_id),
+                func: member,
+            };
+            let id = vm.heap.allocate(HeapData::BoundMethod(bound))?;
+            Ok(CallResult::Value(Value::Ref(id)))
+        } else {
</file context>

Comment thread crates/monty/src/value.rs
old_value.drop_with_heap(vm);
Ok(())
}
HeapReadOutput::Instance(mut instance) => {

Contributor

P3: Update py_set_attr docs to include user-defined instances; current comment is now false and will mislead callers about supported setattr targets.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/value.rs, line 1815:

<comment>Update py_set_attr docs to include user-defined instances; current comment is now false and will mislead callers about supported setattr targets.</comment>

<file context>
@@ -1796,6 +1812,15 @@ impl Value {
                     old_value.drop_with_heap(vm);
                     Ok(())
                 }
+                HeapReadOutput::Instance(mut instance) => {
+                    let name_value = match name {
+                        EitherStr::Interned(string_id) => Self::InternString(*string_id),
</file context>

@codspeed-hq

codspeed-hq Bot commented Jun 28, 2026

Merging this PR will not alter performance

✅ 20 untouched benchmarks
⏩ 15 skipped benchmarks [1]


Comparing impl-classes (c802ca9) with main (5d0b268)

Footnotes

  1. 15 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions

github-actions Bot commented Jun 28, 2026

Codecov Results 📊

❌ Patch coverage is 0.12%. Project has 39124 uncovered lines.
❌ Project coverage is 49.7%. Comparing base (base) to head (head).

Files with missing lines (17)
| File | Patch % | Lines |
|---|---|---|
| crates/monty/src/prepare.rs | 0.00% | ⚠️ 229 Missing |
| crates/monty/src/types/instance.rs | 0.00% | ⚠️ 204 Missing |
| crates/monty/src/parse.rs | 0.00% | ⚠️ 99 Missing |
| crates/monty/src/bytecode/compiler.rs | 0.00% | ⚠️ 77 Missing |
| crates/monty/src/types/class.rs | 0.00% | ⚠️ 72 Missing |
| crates/monty/src/bytecode/vm/call.rs | 0.00% | ⚠️ 50 Missing |
| crates/monty/src/heap.rs | 0.00% | ⚠️ 30 Missing |
| crates/monty/src/bytecode/vm/mod.rs | 0.00% | ⚠️ 19 Missing |
| crates/monty/src/heap_data.rs | 0.00% | ⚠️ 19 Missing |
| crates/monty/src/builtins/isinstance.rs | 0.00% | ⚠️ 18 Missing |
| crates/monty/src/bytecode/vm/collections.rs | 0.00% | ⚠️ 15 Missing |
| crates/monty/src/value.rs | 0.00% | ⚠️ 13 Missing |
| crates/monty/src/builtins/type_.rs | 0.00% | ⚠️ 6 Missing |
| crates/monty/src/bytecode/builder.rs | 0.00% | ⚠️ 3 Missing |
| crates/monty/src/bytecode/vm/async_exec.rs | 0.00% | ⚠️ 2 Missing |
| crates/monty/src/types/type.rs | 50.00% | ⚠️ 1 Missing and 1 partials |
| crates/monty/src/bytecode/op.rs | 0.00% | ⚠️ 1 Missing |
Coverage diff
@@            Coverage Diff             @@
##          main       #PR       +/-##
==========================================
- Coverage    49.87%    49.70%    -0.17%
==========================================
  Files          304       308        +4
  Lines        76360     77789     +1429
  Branches    162849    165638     +2789
==========================================
+ Hits         38085     38665      +580
- Misses       38275     39124      +849
- Partials      3054      3086       +32

Generated by Codecov Action

Support `class Foo: ...` with instance methods, `__init__`, `__repr__`/`__str__`,
and literal class variables. Methods are ordinary functions whose first parameter
is `self`, so defaults/keyword args come for free.

- Parse: new `Node::ClassDef`; reject inheritance/metaclasses, class/method
  decorators, and non-literal class bodies at parse time.
- Prepare: class name binds in the enclosing scope; methods are prepared with
  `register_name=false` so class scope is skipped for free-var resolution while
  still capturing enclosing-function locals (matching CPython).
- Compile: extract `emit_make_function`; new `BuildClass` opcode (appended).
- Heap: new `Class`, `Instance`, `BoundMethod` types wired into HeapData,
  HeapReadOutput, GC walkers, `Type`, and PyTrait dispatch.
- Instantiation: `Foo(...)` allocates the instance, runs `__init__(self, ...)` as
  a real (suspendable) frame flagged `is_initializer`; `ReturnValue` discards the
  `None` and leaves the instance. The flag round-trips through frame
  serialization so a suspended `__init__` resumes correctly.
- repr/str dispatch to user `__repr__`/`__str__` via `evaluate_function`,
  intercepted at the `Value` level (where the heap id is available).
- `type(obj)` returns the class object; `type(obj) is Foo` and `isinstance` work.

Identity-only equality, always-truthy instances, no inheritance/other dunders yet;
divergences documented in limitations/classes.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Model the class body as a synthetic zero-arg function (like CPython's
class-body code object): it runs the class statements top-to-bottom in
its own scope, assembles the namespace, and returns the Class. This is
confined to parse -> prepare -> compile; the heap types, BuildClass
opcode, instantiation and builtins are unchanged.

- Class variables may now be arbitrary expressions and reference earlier
  class variables (the literal-only restriction is removed).
- Methods skip the class scope for free-var resolution: a bare member
  name resolves to a global/NameError, never a sibling member.
- Class-body values may suspend on external/OS calls (real frame).
- The same-name collision (an enclosing local and a class member sharing
  a name, captured by a method) is rejected with a clean
  NotImplementedError rather than miscompiling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
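
The name-resolution rules in the commit message above can be seen in plain CPython. The sketch below uses invented names (`make_class`, `Foo`, `Bar`) and shows three cases: a class variable referencing an earlier one, a method's bare name skipping the class scope, and a method capturing an enclosing-function local.

```python
greeting = "global greeting"


def make_class():
    prefix = "enclosing local"

    class Foo:
        greeting = "class member"
        # The class body runs top-to-bottom in its own scope, so later
        # class variables can use earlier ones.
        shout = greeting.upper()

        def bare(self):
            # A bare name inside a method skips the class scope and
            # resolves to the module-level global instead.
            return greeting

        def via_self(self):
            return self.greeting

        def captured(self):
            # Enclosing-function locals are still captured as usual.
            return prefix

    return Foo


class Bar:
    limit = 10

    def check(self):
        return limit  # no global named `limit`, so this raises NameError


foo = make_class()()
print(foo.bare())      # global greeting
print(foo.via_self())  # class member
print(foo.captured())  # enclosing local
print(type(foo).shout) # CLASS MEMBER

try:
    Bar().check()
except NameError as exc:
    print(type(exc).__name__)  # NameError
```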

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

2 issues found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/monty/src/prepare.rs">

<violation number="1" location="crates/monty/src/prepare.rs:2211">
P3: Move or rewrite the stale `FunctionScopeInfo` doc comment; it now attaches to `FinalizedScope` and misdocuments the closure-slot helper struct.</violation>
</file>

<file name="crates/monty/src/parse.rs">

<violation number="1" location="crates/monty/src/parse.rs:729">
P2: Class-var RHS named expressions can create class-body locals that are not recorded in `members`, so those bindings disappear from the class namespace. Reject named expressions in class RHS or add their targets to namespace assembly.</violation>
</file>

Comment thread crates/monty/src/parse.rs
));
};
let ident = self.identifier(id, *name_range);
let object = self.parse_expression(*value)?;

Contributor

P2: Class-var RHS named expressions can create class-body locals that are not recorded in members, so those bindings disappear from the class namespace. Reject named expressions in class RHS or add their targets to namespace assembly.
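
A minimal CPython example of the pattern at issue, with an invented `Config` class: a named expression on the right-hand side of a class variable binds its target in the class body, so it ends up in the class namespace next to the assignment target.

```python
class Config:
    # The walrus target `base` is bound in the class body's own scope,
    # alongside the assignment target `limit`.
    limit = (base := 10) * 2


print(Config.limit)            # 20
print(Config.base)             # 10
print("base" in vars(Config))  # True
```

If only assignment targets are recorded as members, `base` would be the binding that goes missing.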

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/parse.rs, line 729:

<comment>Class-var RHS named expressions can create class-body locals that are not recorded in `members`, so those bindings disappear from the class namespace. Reject named expressions in class RHS or add their targets to namespace assembly.</comment>

<file context>
@@ -708,14 +721,16 @@ impl<'a> Parser<'a> {
                     };
                     let ident = self.identifier(id, *name_range);
-                    class_vars.push((ident, self.parse_class_var_value(*value)?));
+                    let object = self.parse_expression(*value)?;
+                    members.push(ident);
+                    body.push(Node::Assign { target: ident, object });
</file context>

/// each captured name against the parent. These are exactly the fields the
/// compiler needs to emit `MakeFunction`/`MakeClosure` and install cells at
/// call time; see [`crate::function::Function`] for their meaning.
struct FinalizedScope {

Contributor

P3: Move or rewrite the stale FunctionScopeInfo doc comment; it now attaches to FinalizedScope and misdocuments the closure-slot helper struct.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/monty/src/prepare.rs, line 2211:

<comment>Move or rewrite the stale `FunctionScopeInfo` doc comment; it now attaches to `FinalizedScope` and misdocuments the closure-slot helper struct.</comment>

<file context>
@@ -2016,6 +2201,20 @@ impl<'i, 'g> Prepare<'i, 'g> {
+/// each captured name against the parent. These are exactly the fields the
+/// compiler needs to emit `MakeFunction`/`MakeClosure` and install cells at
+/// call time; see [`crate::function::Function`] for their meaning.
+struct FinalizedScope {
+    free_var_enclosing_slots: Vec<NamespaceId>,
+    free_var_slots: Vec<NamespaceId>,
</file context>

Cover the bare-name NameError (a method referencing a class member by
bare name), missing-attribute AttributeError, and an exception raised in
__init__ with TRACEBACK tests so frames, line numbers and caret markers
are verified against CPython.

To keep byte-for-byte parity, the traceback harness now strips CPython's
"Did you mean: '...'?" spelling suggestion, which Monty does not emit
(its "Did you forget to import '...'" hint, produced by both, is kept).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
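
The stripping step described in the commit message above might look roughly like the following. This is a sketch only: the real harness code is not shown in this thread, so the function name, the regular expression, and the sample messages are all assumptions made for illustration.

```python
import re

# Matches only a trailing spelling suggestion such as ". Did you mean: 'foo'?".
# The "Did you forget to import" hint is deliberately left untouched.
_DID_YOU_MEAN = re.compile(r"\. Did you mean: '[^']*'\?$")


def strip_spelling_suggestion(line: str) -> str:
    """Remove a trailing "Did you mean" suggestion from one traceback line."""
    return _DID_YOU_MEAN.sub("", line)


with_suggestion = "NameError: name 'limit' is not defined. Did you mean: 'self.limit'?"
with_import_hint = "NameError: name 'sys' is not defined. Did you forget to import 'sys'?"

print(strip_spelling_suggestion(with_suggestion))
# NameError: name 'limit' is not defined
print(strip_spelling_suggestion(with_import_hint) == with_import_hint)
# True
```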
Comment on lines +364 to +371
/// Whether this frame is a class `__init__` running for `Foo(...)`.
///
/// When `true`, the `ReturnValue` handler discards the frame's return value
/// (`__init__` returns `None`) and leaves the instance — pushed onto the
/// caller's operand stack before this frame was created — as the result of the
/// construction. Threaded through serialization (`SerializedFrame`) so a
/// suspended initializer resumes correctly.
is_initializer: bool,

Collaborator

This seems like a suspicious special case; the way that this works in CPython is that __call__ for type objects is an outer layer which handles creating a new instance (e.g. via __new__ then calling __init__ on the new object)
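
For readers less familiar with that layering, here is an approximate Python rendering of what calling a plain class does in CPython. `construct` is a hypothetical stand-in for `type.__call__`, simplified to look up `__new__` and `__init__` directly on the class, and `Point` and `BadInit` are invented examples.

```python
def construct(cls, *args, **kwargs):
    """Approximate `cls(*args, **kwargs)`: call __new__, then __init__."""
    instance = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs when __new__ returned an instance of cls.
    if isinstance(instance, cls):
        result = cls.__init__(instance, *args, **kwargs)
        if result is not None:
            raise TypeError(
                f"__init__() should return None, not '{type(result).__name__}'"
            )
    # The caller receives the instance, never __init__'s return value.
    return instance


class Point:
    def __init__(self, x, y=0):
        self.x = x
        self.y = y


class BadInit:
    def __init__(self):
        return 1


manual = construct(Point, 1, y=2)
print(type(manual) is Point, manual.x, manual.y)  # True 1 2

try:
    construct(BadInit)
except TypeError as exc:
    print(exc)  # __init__() should return None, not 'int'
```

In this shape the outer call owns both the instance and the check on `__init__`'s result, so the initializer frame itself needs no special flag.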

Comment on lines +523 to +532
/// Pop `2 * member_count` key/value pairs, build a class object, push it.
/// Operands: u16 const index of the class name (an interned string) + u16
/// member count.
///
/// The compiler emits, for a `class Foo: ...`, the `(name, value)` pairs for
/// each method and class variable, then this op pops them, builds the class
/// namespace dict, and wraps it in a [`HeapData::Class`](crate::heap::HeapData).
/// Stack: `[..., k1, v1, ..., kN, vN] -> [..., class]`.
/// Appended at the end to preserve the serialized byte values of all older opcodes.
BuildClass,

Collaborator

I don't think this needs to be a new opcode; if we implemented the multi-arg form to type() which dynamically creates new types, we could make the compiler emit codegen which calls that.
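
The multi-argument form referred to here is CPython's `type(name, bases, namespace)`. The sketch below, using an invented `Animal` class, shows it building the same kind of object a `class` statement would. How Monty's compiler would lower a class body onto such a call is not specified in this thread.

```python
def init(self, name):
    self.name = name


def describe(self):
    return f"{self.name} has {self.legs} legs"


# type(name, bases, namespace) creates a new class dynamically; an empty
# bases tuple means the class derives from `object`.
Animal = type("Animal", (), {"legs": 4, "__init__": init, "describe": describe})

cat = Animal("cat")
print(Animal.__name__)    # Animal
print(cat.describe())     # cat has 4 legs
print(type(cat) is Animal)  # True
```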
