[runtime] Add multi-headed deterministic runtime support #2492

commonware-llm · 2025-12-13T01:59:37Z

Add a Manager and Instance API for creating multiple isolated runtime
instances that share a single deterministic executor. Each instance has:

Its own IP address for network operations
Its own storage namespace (partitions are prefixed)
Its own metrics namespace (metrics are prefixed)

This enables simulating multiple distributed system nodes in tests where
each node needs network isolation while sharing the same simulated time
and task scheduler.

Closes #2417

Add a Manager and Instance API for creating multiple isolated runtime instances that share a single deterministic executor. Each instance has: - Its own IP address for network operations - Its own storage namespace (partitions are prefixed) - Its own metrics namespace (metrics are prefixed) This enables simulating multiple distributed system nodes in tests where each node needs network isolation while sharing the same simulated time and task scheduler. Closes #2417

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

runtime/src/deterministic.rs

patrick-ogrady · 2025-12-13T02:25:33Z

runtime/src/network/deterministic.rs

+    ///
+    /// This shares the ephemeral port counter and listeners with other networks
+    /// created from the same base, but uses a different IP for its own connections.
+    pub fn with_ip(&self, ip: Ipv4Addr) -> Self {


Require an IP (no defaults)

The Instance::context() method was incorrectly cloning the tree directly with self.context.tree.clone() instead of creating a proper child tree via Tree::child(). This caused multiple contexts from the same instance to share the same tree node, leading to panics when spawning from multiple contexts. Fixed by using Tree::child() to create proper child trees, matching the behavior of the standard Context::clone() method. Added tests to verify: - Dropping one context does not affect others - Multiple contexts from the same instance can spawn independently

cloudflare-workers-and-pages · 2025-12-13T02:27:08Z

Deploying monorepo with Cloudflare Pages

Latest commit:	`5f23038`
Status:	✅ Deploy successful!
Preview URL:	https://b5f32c03.monorepo-eu0.pages.dev
Branch Preview URL:	https://claude-issue-2417-0167rvrpew.monorepo-eu0.pages.dev

View logs

patrick-ogrady · 2025-12-13T02:28:03Z

runtime/src/deterministic.rs

+            // Return a context with the instance namespace as its label
+            // Create a proper child tree for supervision
+            let (child, _) = Tree::child(&self.context.tree);
+            let name = format!("i{}", self.namespace);


Calling encode() on this spawned context won't lead to isolated metrics?

Maybe we wan't joined (just a guarantee not to conflict)?

It would be cool to have both...

Add encode() methods to Instance and Manager: - Instance::encode() returns only that instance's metrics (filtered by prefix) - Manager::encode() returns all metrics from all instances This ensures proper isolation where each instance only sees its own metrics, while the manager can observe the complete picture. Updated test to verify: - Instance1's encode() only shows instance1's metrics - Instance2's encode() only shows instance2's metrics - Manager's encode() shows metrics from all instances

- Manager now tracks allocated IPs and panics on duplicate allocation - Instance::encode() strips the instance prefix from metric names, returning clean metrics without the namespace prefix - Manager::encode() still returns all metrics with their full prefixes Added test for duplicate IP panic behavior.

Migrate the link, bandwidth, and transmitter modules from p2p::simulated to runtime::simulated, making them generic over peer and channel types. This allows the network simulation infrastructure to be reused across different contexts (e.g., the multihead runtime with Ipv4Addr peers). Changes: - Add runtime/src/simulated/ module with Link, bandwidth, and transmitter - Make transmitter generic over peer type P and channel type C - Re-export runtime's Link from p2p::simulated::ingress - Update p2p::simulated::transmitter to use type aliases wrapping runtime - Delete p2p/src/simulated/bandwidth.rs (now in runtime) - Simplify multihead module by removing incompatible LinkAwareNetwork wrapper - Add required dependencies (num-rational, num-traits, rand_distr) to runtime

codecov · 2025-12-13T03:47:43Z

Codecov Report

❌ Patch coverage is 94.94128% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.51%. Comparing base (b9d9c51) to head (19e283a).

Files with missing lines	Patch %	Lines
runtime/src/simulated/transmitter.rs	95.83%	28 Missing ⚠️
runtime/src/deterministic.rs	92.65%	23 Missing ⚠️
runtime/src/network/deterministic.rs	76.92%	3 Missing ⚠️
runtime/src/simulated/link.rs	97.89%	2 Missing ⚠️

@@            Coverage Diff             @@
##             main    #2492      +/-   ##
==========================================
- Coverage   92.52%   92.51%   -0.02%     
==========================================
  Files         336      338       +2     
  Lines       96743    96999     +256     
==========================================
+ Hits        89510    89734     +224     
- Misses       7233     7265      +32

Files with missing lines	Coverage Δ
p2p/src/simulated/ingress.rs	`99.33% <ø> (ø)`
p2p/src/simulated/mod.rs	`98.05% <ø> (ø)`
p2p/src/simulated/transmitter.rs	`100.00% <100.00%> (+2.93%)`	⬆️
runtime/src/lib.rs	`96.95% <ø> (-0.14%)`	⬇️
runtime/src/simulated/bandwidth.rs	`95.01% <ø> (ø)`
runtime/src/simulated/link.rs	`97.89% <97.89%> (ø)`
runtime/src/network/deterministic.rs	`94.87% <76.92%> (-3.66%)`	⬇️
runtime/src/deterministic.rs	`95.08% <92.65%> (-0.95%)`	⬇️
runtime/src/simulated/transmitter.rs	`95.83% <95.83%> (ø)`

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b9d9c51...19e283a. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Add a high-level Router struct to runtime::simulated that combines link management with message scheduling. The Router handles latency sampling, success rate determination, and bandwidth-limited delivery. Refactor p2p::simulated to use the runtime's Router, making it a thinner wrapper that focuses on peer identity and network connections. The DeliveryLink struct now only handles actual network delivery since scheduling logic moved to the Router.

Add test examples demonstrating how to use multihead runtime instances with authenticated networks. Each peer runs on its own Instance with: - A unique IP address (same port, different IPs) - Isolated storage namespace - Isolated metrics prefix The lookup test (test_multihead_connectivity) demonstrates full peer-to-peer messaging where each peer sends its public key to all others. The discovery test (test_multihead_instances) demonstrates instance creation and isolation verification, showing how discovery networks can be configured with unique IPs per instance.

Add two tests to runtime::simulated::router demonstrating that: 1. `messages_only_flow_between_linked_ips`: Messages only flow between IPs that have explicit links configured. Links are unidirectional (A→B doesn't imply B→A), and removing a link stops message flow. 2. `network_partition_simulation`: Shows how to simulate network partitions by removing cross-partition links. Nodes within each partition can still communicate, but cross-partition messages fail. These tests confirm that the Router properly enforces link topology when used with IP addresses as peer identifiers.

patrick-ogrady · 2025-12-13T23:57:02Z

p2p/src/simulated/ingress.rs

    ///
    /// Link can be called before a peer is registered or bandwidth is specified.
-    pub async fn add_link(&mut self, sender: P, receiver: P, config: Link) -> Result<(), Error> {
+    pub async fn add_link(&mut self, sender: P, receiver: P, link: Link) -> Result<(), Error> {


At this point, it is probably better to just remove all link management from p2p::simulated and use runtime for it?

Simplify p2p::simulated by removing explicit link management. Peers in the same peer set can now communicate directly without needing to call add_link/remove_link. Key changes: - Remove Link struct and link-related methods from p2p::simulated - Remove bandwidth.rs and transmitter.rs from p2p::simulated - Move link/bandwidth simulation logic to runtime::simulated module - Add Router struct to runtime::simulated with add_link, update_link, remove_link, and bandwidth limiting capabilities - Update all dependent crates to remove explicit link calls - Update AGENTS.md documentation to reflect new architecture The network simulation (latency, jitter, bandwidth limiting) is now handled at the runtime layer via runtime::simulated::Router, enabling the upcoming multi-headed runtime to provide realistic network conditions.

chatgpt-codex-connector bot reviewed Dec 13, 2025

View reviewed changes

runtime/src/deterministic.rs Show resolved Hide resolved

cursor bot reviewed Dec 13, 2025

View reviewed changes

runtime/src/deterministic.rs Show resolved Hide resolved

patrick-ogrady reviewed Dec 13, 2025

View reviewed changes

patrick-ogrady marked this pull request as draft December 13, 2025 02:33

claude added 2 commits December 13, 2025 02:38

claude added 3 commits December 13, 2025 04:52

patrick-ogrady reviewed Dec 13, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[runtime] Add multi-headed deterministic runtime support #2492

[runtime] Add multi-headed deterministic runtime support #2492

Uh oh!

commonware-llm commented Dec 13, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

Uh oh!

patrick-ogrady Dec 13, 2025

Uh oh!

cloudflare-workers-and-pages bot commented Dec 13, 2025 •

edited

Loading

Uh oh!

patrick-ogrady Dec 13, 2025

Uh oh!

patrick-ogrady Dec 13, 2025

Uh oh!

codecov bot commented Dec 13, 2025

Uh oh!

patrick-ogrady Dec 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[runtime] Add multi-headed deterministic runtime support #2492

Are you sure you want to change the base?

[runtime] Add multi-headed deterministic runtime support #2492

Uh oh!

Conversation

commonware-llm commented Dec 13, 2025

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

patrick-ogrady Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

cloudflare-workers-and-pages bot commented Dec 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying monorepo with Cloudflare Pages

Uh oh!

patrick-ogrady Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

patrick-ogrady Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Dec 13, 2025

Codecov Report

Uh oh!

patrick-ogrady Dec 13, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

cloudflare-workers-and-pages bot commented Dec 13, 2025 •

edited

Loading