Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughA new regression testing harness ( Changes
Sequence DiagramsequenceDiagram
participant Test as Test Harness
participant RawSocket as RawSocketClient
participant Cmux as cmux Client
participant Terminal as Terminal Surface
participant Snapshot as Pixel Verification
Test->>RawSocket: connect()
RawSocket-->>Test: socket connected
Test->>Cmux: reset_to_fresh_workspace()
Cmux-->>Test: workspace_id
Test->>Cmux: build_workspace_grid()
Cmux-->>Test: list[workspace_ids]
Test->>Cmux: create_surface_targets()
Cmux-->>Test: list[SurfaceTarget]
loop For Each Target
Test->>Cmux: wait_for_visible_terminal(target)
Cmux-->>Test: terminal visible
Test->>RawSocket: command(type_token)
RawSocket->>Terminal: send typing input
Terminal-->>RawSocket: command response
RawSocket-->>Test: latency captured
Test->>Snapshot: panel_snapshot_retry()
Snapshot-->>Test: snapshot dict (pixel change verified)
Test->>Test: collect latency value
end
Test->>Test: compute_stats(baseline_latencies)
Test->>Test: compute_stats(stress_latencies)
Test->>Test: compare results with thresholds
Test-->>Test: regression decision
Test->>RawSocket: close()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds a visibility-aware typing-latency regression harness ( Key observations:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Test as test script
participant cmux as cmux client
participant Raw as RawSocketClient
participant App as cmux app
Test->>cmux: connect()
Test->>Raw: connect() [same socket]
Note over Test,App: Baseline scenario
Test->>cmux: reset_to_fresh_workspace()
cmux->>App: new_workspace / close_workspace
Test->>Raw: simulate_shortcut(ch) × TOKEN_LENGTH × BASELINE_TOKEN_COUNT
Raw-->>Test: OK (shortcut_latency_ms each)
Test->>cmux: read_terminal_text() [poll until token visible]
Test->>cmux: panel_snapshot() [verify changed_pixels]
Note over Test,App: Build stress targets
Test->>cmux: reset_to_fresh_workspace()
loop TOTAL_WORKSPACES
Test->>cmux: new_workspace / select_workspace
Test->>cmux: new_pane("right/down") × 3
loop PANES_PER_WORKSPACE panes
loop until TABS_PER_PANE tabs
Test->>cmux: new_surface(terminal)
end
end
end
Note over Test,App: Stress scenario
loop each SurfaceTarget
Test->>cmux: select_workspace / focus_pane / focus_surface
Test->>cmux: wait_for_visible_terminal()
Test->>Raw: simulate_shortcut(ch) × TOKEN_LENGTH
Raw-->>Test: OK (shortcut_latency_ms each)
Test->>cmux: read_terminal_text() [poll until token visible]
Test->>cmux: panel_snapshot() [verify changed_pixels ≥ MIN]
end
Note over Test: Compare baseline vs stress latency stats → PASS/FAIL
Last reviewed commit: cda2950 |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_workspace_split_tab_typing_lag.py (1)
612-618: Consider logging cleanup failures instead of silently ignoring.The
try-except-passat lines 616-617 silently swallows cleanup exceptions. While this is intentional to avoid masking the original test result, completely silent failures can hide environmental issues.♻️ Optional: Add minimal logging for cleanup failures
finally: if client is not None: try: reset_to_fresh_workspace(client) - except Exception: - pass + except Exception as cleanup_exc: + print(f"Warning: cleanup failed: {cleanup_exc}", file=sys.stderr) client.close()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_workspace_split_tab_typing_lag.py` around lines 612 - 618, The cleanup block currently swallows exceptions silently; change the except block to catch Exception as e and log the failure (e.g., logger.warning or logger.exception) including the exception message/traceback so cleanup errors are visible while still not failing the test; update the finally to log errors from reset_to_fresh_workspace and/or client.close (reference reset_to_fresh_workspace and client.close) and keep the test behavior of not re-raising the exception.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/test_workspace_split_tab_typing_lag.py`:
- Around line 612-618: The cleanup block currently swallows exceptions silently;
change the except block to catch Exception as e and log the failure (e.g.,
logger.warning or logger.exception) including the exception message/traceback so
cleanup errors are visible while still not failing the test; update the finally
to log errors from reset_to_fresh_workspace and/or client.close (reference
reset_to_fresh_workspace and client.close) and keep the test behavior of not
re-raising the exception.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e509676b-3ff5-4634-a1bb-c42d35938b79
📒 Files selected for processing (2)
tests/test_workspace_split_tab_typing_lag.pyvendor/bonsplit
There was a problem hiding this comment.
1 issue found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tests/test_workspace_split_tab_typing_lag.py">
<violation number="1" location="tests/test_workspace_split_tab_typing_lag.py:264">
P2: `build_workspace_grid` hard-codes a 4-pane layout, so `CMUX_TYPING_LAG_PANES_PER_WORKSPACE` values other than 4 will time out.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cda2950813
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 44c5d14fde
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Summary
cmux-macminiwith tagged builds so the test run does not steal local focusProfiling
Before the Bonsplit change,
sampleshowedSelectedTabFramePreferenceKey.reduce,TabBarView.tabItem, andGeometryProxy.frame(in:)dominating the main thread during dense workspace churn.After the change, the full-load run no longer showed that path dominating the captured sample.
Testing
python3 -m py_compile tests/test_workspace_split_tab_typing_lag.pyCMUX_SOCKET=/tmp/cmux-debug-task-workspace-typing-lag.sock CMUX_TYPING_LAG_TOTAL_WORKSPACES=2 python3 -u tests/test_workspace_split_tab_typing_lag.pyCMUX_SOCKET=/tmp/cmux-debug-task-workspace-typing-lag.sock python3 -u tests/test_workspace_split_tab_typing_lag.pyDependency
Summary by cubic
Adds a visibility-aware typing-lag regression test that spins up many workspaces, splits, and
bonsplittabs, types in each visible terminal, and fails on regressions. Also updatesvendor/bonsplitto reduce tab-bar layout churn and lower main-thread use.New Features
tests/test_workspace_split_tab_typing_lag.pyto measure shortcut and visible typing latency vs a clean baseline.sampleon failure; continues withoutcmuxPID (disables failure sampling); adds snapshot retries and focus recovery; refuses main sockets by default.Dependencies
vendor/bonsplitto include reduced tab-bar layout churn (only publishes the selected tab frame preference).Written for commit e2fc0d4. Summary will update on new commits.
Summary by CodeRabbit
Tests
Chores