[Newton] Graphed Workflows #3657
base: dev/newton
Conversation
this is amazing
Greptile Overview
Greptile Summary
This PR introduces a major architectural enhancement by implementing graphed workflows using Warp for GPU-accelerated RL training in Isaac Lab. The implementation adds a complete parallel pipeline alongside the existing Torch-based workflow.
Key Changes
- New Backend Abstraction: Factory pattern in `backend_utils.py` enables dynamic loading of Newton vs base implementations, with the backend currently hardcoded to "newton"
- Warp-based Environment: The `DirectRLEnvWarp` class implements CUDA graph capture for `step_warp_action()` and `step_warp_end()` to eliminate kernel launch overhead
- Newton Assets & Actuators: Complete reimplementation of articulation and actuator systems using Warp arrays and kernels instead of Torch tensors
- Performance Improvements: Benchmarks show significant speedups (Cartpole: 3.4ms→1.1ms, Ant: 7.6ms→4.6ms, Humanoid: 17.5ms→14.9ms per collection step)
- Example Environments: Warp implementations for Cartpole, Ant, and Humanoid demonstrating the new workflow
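The factory-style backend dispatch described above can be sketched as follows; all class and function names here are illustrative stand-ins, not the PR's actual `backend_utils.py` API:

```python
# Illustrative sketch of the backend-factory idea; names are hypothetical.

class TorchArticulation:
    """Stands in for the existing Torch/PhysX-based implementation."""
    backend = "base"

class NewtonArticulation:
    """Stands in for the new Warp/Newton-based implementation."""
    backend = "newton"

_ARTICULATION_IMPLS = {
    "base": TorchArticulation,
    "newton": NewtonArticulation,
}

def make_articulation(backend: str = "newton"):
    # The PR currently hardcodes the backend to "newton".
    try:
        return _ARTICULATION_IMPLS[backend]()
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}")
```

The point of the pattern is that calling code never imports a concrete implementation directly, so the Newton pipeline can live alongside the Torch one.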
Issues Found
- Critical: Line 375 in `direct_rl_env_warp.py` creates new tensors each step (`wp.from_torch(action)`), preventing full graph capture and undermining performance benefits
- Type inconsistency: the `_pre_physics_step` signature expects `torch.Tensor` but receives `wp.array`
- Code cleanup needed: significant commented-out code for event managers and noise models throughout `direct_rl_env_warp.py`
- Minor: missing newline at EOF in `__init__.py`
Confidence Score: 2/5
- This PR has significant architectural changes with critical performance-impacting issues that need resolution before merging
- Score reflects the critical issue at line 375 where tensor conversion breaks CUDA graph capture, undermining the core performance benefits of the graphed workflow. The extensive commented-out code suggests incomplete implementation of event management and noise models. While the architecture is sound and benchmarks show promise, the tensor conversion bug needs immediate attention as it prevents achieving the advertised performance gains in production.
- Pay special attention to `source/isaaclab/isaaclab/envs/direct_rl_env_warp.py` - the action tensor conversion at line 375 must be fixed to achieve full CUDA graph benefits
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| source/isaaclab/isaaclab/envs/direct_rl_env_warp.py | 3/5 | New warp-based RL environment with CUDA graph capture for performance. Contains commented-out event manager and noise model code that should be cleaned up or enabled. |
| source/isaaclab/isaaclab/__init__.py | 4/5 | Added Backend IntEnum to distinguish Newton from PhysX. Missing newline at end of file. |
| source/isaaclab/isaaclab/actuators/actuator_base.py | 3/5 | Refactored to support backend abstraction. Removed torch-specific attributes, simplified constructor signature for backend compatibility. |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation.py | 3/5 | Large new file (2500+ lines) implementing Newton-based articulation with warp kernels. Core implementation for the graphed workflow. |
| source/isaaclab/isaaclab/utils/backend_utils.py | 5/5 | Factory pattern implementation for dynamic backend loading (Newton vs base). Clean abstraction layer. |
| source/isaaclab_tasks/isaaclab_tasks/direct/cartpole/cartpole_warp_env.py | 4/5 | Warp-based Cartpole environment using kernels for observations, rewards, and resets. Good example of the new workflow. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant DirectRLEnvWarp
    participant Scene
    participant Articulation
    participant ActuatorModel
    participant NewtonSolver
    participant WarpKernels
    Note over User,WarpKernels: Initialization Phase
    User->>DirectRLEnvWarp: __init__(cfg)
    DirectRLEnvWarp->>Scene: InteractiveScene(cfg.scene)
    DirectRLEnvWarp->>Articulation: Articulation(cfg, frontend=Frontend.WARP)
    Articulation->>ActuatorModel: Initialize actuator models with warp arrays
    Note over User,WarpKernels: Training Loop
    User->>DirectRLEnvWarp: step(action)
    DirectRLEnvWarp->>DirectRLEnvWarp: _pre_physics_step(wp.from_torch(action))
    loop decimation times
        DirectRLEnvWarp->>DirectRLEnvWarp: step_warp_action() [CUDA Graph]
        DirectRLEnvWarp->>Articulation: _apply_action()
        Articulation->>ActuatorModel: compute()
        ActuatorModel->>WarpKernels: compute_pd_actuator / clip_efforts
        DirectRLEnvWarp->>Scene: write_data_to_sim()
        Scene->>NewtonSolver: Apply computed efforts
        DirectRLEnvWarp->>NewtonSolver: sim.step(render=False)
        NewtonSolver-->>Articulation: Update physics state
        DirectRLEnvWarp->>Scene: update(dt)
    end
    DirectRLEnvWarp->>DirectRLEnvWarp: step_warp_end() [CUDA Graph]
    DirectRLEnvWarp->>WarpKernels: add_to_env (episode_length_buf++)
    DirectRLEnvWarp->>DirectRLEnvWarp: _get_dones()
    DirectRLEnvWarp->>WarpKernels: get_dones kernel
    DirectRLEnvWarp->>DirectRLEnvWarp: _get_rewards()
    DirectRLEnvWarp->>WarpKernels: compute_rewards kernel
    DirectRLEnvWarp->>DirectRLEnvWarp: _reset_idx(reset_buf)
    DirectRLEnvWarp->>Scene: reset(mask=reset_buf)
    DirectRLEnvWarp->>DirectRLEnvWarp: _get_observations()
    DirectRLEnvWarp->>WarpKernels: get_observations kernel
    DirectRLEnvWarp-->>User: (obs, rewards, dones, extras)
```
94 files reviewed, 5 comments
```python
__version__ = ISAACLAB_METADATA["package"]["version"]


class Backend(IntEnum):
    NEWTON = 0
```
style: missing newline at end of file
```suggestion
    NEWTON = 0
    PHYSX = 1
```
```python
            action = self._action_noise_model(action)

        # process actions, #TODO pass the torch tensor directly.
        self._pre_physics_step(wp.from_torch(action))  # Creates a tensor and throws it away. --> Not graphable unless the training loop is using the same pointer.
```
logic: the TODO comment indicates this creates a new tensor each step, preventing CUDA graph capture. This will significantly impact the performance benefits of the graphed workflow since the action conversion happens outside the graph
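The usual fix is to allocate one persistent device buffer at initialization and copy new actions into it each step, so the pointer the captured graph sees never changes. The idea is sketched here in plain Python; in the real code a persistent `wp.array` plus an in-place device copy would play the role of `staging` (all names are illustrative):

```python
class ActionStager:
    """Reuse one preallocated buffer so the pointer a captured graph sees
    never changes; only the buffer contents are overwritten each step.
    Stands in for a persistent wp.array allocated once at env init."""

    def __init__(self, num_envs: int, action_dim: int):
        self.staging = [0.0] * (num_envs * action_dim)

    def load(self, actions):
        # In-place copy (the role wp.copy / Tensor.copy_ would play),
        # instead of wp.from_torch(actions), which rebinds a new array
        # every step and breaks CUDA graph capture.
        flat = [x for row in actions for x in row]
        self.staging[:] = flat
        return self.staging
```

Because `load` mutates the existing buffer rather than rebinding a new one, kernels captured against `staging` keep reading valid data on replay.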
```python
        pass

    @abstractmethod
    def _pre_physics_step(self, actions: torch.Tensor):
```
syntax: the type hint says `torch.Tensor` but should be `wp.array`, since this is the warp environment and line 375 converts the input with `wp.from_torch(action)`
```python
        # TODO We could split it out.
        # post-step: step interval event
        # if self.cfg.events:
        #     if "interval" in self.event_manager.available_modes:
        #         self.event_manager.apply(mode="interval", dt=self.step_dt)

        # update observations
        self._get_observations()

        # add observation noise
        # note: we apply no noise to the state space (since it is used for critic networks)
        # if self.cfg.observation_noise_model:
        #     self.obs_buf["policy"] = self._observation_noise_model(self.obs_buf["policy"])
```
style: large blocks of commented-out code for event management and observation noise should either be removed or properly implemented with a clear plan
```python
        # apply events such as randomization for environments that need a reset
        # if self.cfg.events:
        #     if "reset" in self.event_manager.available_modes:
        #         env_step_count = self._sim_step_counter // self.cfg.decimation
        #         self.event_manager.apply(mode="reset", env_ids=env_ids, global_env_step_count=env_step_count)

        # reset noise models
        # if self.cfg.action_noise_model:
        #     self._action_noise_model.reset(env_ids)
        # if self.cfg.observation_noise_model:
        #     self._observation_noise_model.reset(env_ids)
```
style: commented-out event manager and noise model reset logic should be addressed before merging
Greptile Overview
Greptile Summary
Introduces a separate Newton-based graphed workflow pipeline for Isaac Lab using Warp kernels to achieve significant performance improvements. The implementation adds a complete parallel execution path with warp-based actuators, articulation management, and a new DirectRLEnvWarp environment class.
Major changes include:
- New Newton backend actuator implementations (`actuator_base.py`, `actuator_pd.py`, `actuator_net.py`) using warp arrays instead of torch tensors
- Complete articulation system rewrite for Newton (`articulation.py`, `articulation_data.py`) with dual Torch/Warp frontend support
- Warp kernels for PD control, DC motor clipping, and effort limiting
- New `DirectRLEnvWarp` environment that aims to support CUDA graph capture for performance
- Performance improvements demonstrated: Cartpole overhead reduced from 2560 µs to 357 µs, Ant from 3404 µs to 471 µs
Key concerns:
- The DC motor torque-speed clipping formula may not correctly handle negative velocities
- TODOs indicate tensor conversions in the action pipeline block CUDA graph capture, limiting the full performance benefit
- Large commented-out sections for event management and noise models suggest incomplete implementation
Confidence Score: 3/5
- Safe to merge with caveats - the parallel pipeline is isolated but has incomplete features and one potential physics bug
- The PR introduces a complete parallel execution path which limits risk to existing code. However, the DC motor clipping logic has a potential bug in handling negative velocities, and the CUDA graph support (a key performance feature) is blocked by tensor conversion issues noted in TODOs. The implementation is well-structured but incomplete (missing event management, noise models).
- Files needing closest review: `source/isaaclab/isaaclab/newton/actuators/kernels.py` for DC motor physics verification, and `source/isaaclab/isaaclab/envs/direct_rl_env_warp.py` for resolving CUDA graph blocking issues
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| source/isaaclab/isaaclab/newton/actuators/kernels.py | 3/5 | New warp kernels for PD control and DC motor clipping; DC motor torque-speed curve logic needs verification |
| source/isaaclab/isaaclab/newton/actuators/actuator_base.py | 4/5 | Base actuator class for Newton backend with warp array support; clean implementation of parameter parsing |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation.py | 4/5 | Core articulation implementation for Newton with extensive state management and frontend conversions |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation_data.py | 4/5 | Data container with dual Torch/Warp frontend support; comprehensive property accessors for articulation state |
| source/isaaclab/isaaclab/envs/direct_rl_env_warp.py | 2/5 | Warp-based RL environment with CUDA graph support; contains TODOs about tensor conversion blocking graphs |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as Training Script
    participant Env as DirectRLEnvWarp
    participant Art as Articulation (Newton)
    participant Act as Actuator Models
    participant Sim as Newton Solver
    participant WP as Warp Kernels
    User->>Env: step(actions)
    Env->>Env: wp.from_torch(actions)
    Note over Env: TODO: Tensor conversion blocks CUDA graphs
    Env->>Env: _pre_physics_step(wp_actions)
    Env->>Art: set_joint_*_target()
    Art->>Art: _torch_to_warp_dual_index()
    Art->>Act: Store targets in data._actuator_target
    Env->>Art: write_data_to_sim()
    Art->>Act: _apply_actuator_model()
    Act->>WP: compute_pd_actuator()
    Note over WP: PD control: k_p*(target-pos) + k_d*(-vel)
    WP->>Act: computed_effort
    Act->>WP: clip_efforts_with_limits()
    WP->>Act: applied_effort
    Act->>Art: Update data._sim_bind_joint_effort
    Env->>Sim: step(dt)
    Sim->>Sim: Newton physics integration
    Env->>Art: update(dt)
    Art->>Art: _data.update(dt)
    Note over Art: Read new states from Newton
    Env->>Env: _get_observations()
    Env->>Env: _get_rewards()
    Env->>User: obs, rewards, dones, info
```
21 files reviewed, 1 comment
```python
    joint_vel: float,
    effort: float,
):
    max_effort = saturation_effort * (1.0 - joint_vel) / vel_limit
```
logic: DC motor clipping formula appears incorrect. The max_effort calculation uses (1.0 - joint_vel) but this doesn't account for the sign of velocity correctly. When joint_vel is negative (motor spinning backward), (1.0 - joint_vel) becomes larger than 1.0, which seems incorrect for a DC motor torque-speed curve. Standard DC motor model: torque = k * (V - k * omega) suggests the formula should handle forward/backward rotation symmetrically.
Greptile Overview
Greptile Summary
This PR introduces a comprehensive Warp-based graphed workflow pipeline for Isaac Lab, providing significant performance improvements (up to 3x speedup for Cartpole, 40-65% overhead reduction across environments).
Key Changes
- New Newton backend structure: Refactored articulations and rigid objects into base classes with separate PhysX and Newton implementations
- Warp kernel library: Added extensive kernel collection for state manipulation, joint operations, velocity transforms, and actuator computations
- DirectRLEnvWarp: New warp-native RL environment with partial CUDA graph support
- Environment implementations: Warp versions of Cartpole, Ant, and Humanoid with kernel-based observations/rewards/resets
Major Issues Found
- DC motor clipping formula: incorrect torque-speed curve implementation in the actuator kernels (lines 72-73)
- Gravity normalization: uses the sum of components instead of the vector magnitude in both `rigid_object_data.py:84` and `articulation_data.py:91`
- Partial graph capture: the `wp.from_torch(action)` conversion at line 375 of `direct_rl_env_warp.py` creates new tensors each step, preventing full CUDA graph capture and limiting performance benefits
Performance Impact
The reported speedups are impressive but the TODO comment indicates the action conversion issue limits the actual achievable performance. The CUDA graph currently only captures action processing and post-processing separately, not the full pipeline.
Confidence Score: 3/5
- This PR has critical bugs in motor physics and partial CUDA graph support that limit performance gains
- Score reflects three critical logical errors (DC motor clipping, gravity normalization) and architectural limitation (incomplete graph capture). The DC motor bug will cause incorrect actuator behavior, the gravity bug affects projections in locomotion tasks, and the graph capture issue means the performance improvements are not as significant as claimed. Extensive codebase with many TODOs and commented-out sections suggests incomplete implementation.
- Focus review on `newton/actuators/kernels.py` (motor physics), `newton/assets/*/data.py` (gravity), and `envs/direct_rl_env_warp.py` (graph capture)
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| source/isaaclab/isaaclab/newton/actuators/kernels.py | 2/5 | new warp kernels for PD actuators and DC motor clipping, but DC motor clipping formula is incorrect for torque-speed curve |
| source/isaaclab/isaaclab/newton/assets/rigid_object/rigid_object_data.py | 3/5 | comprehensive warp-based rigid object data implementation with lazy evaluation and timestamped buffers, has gravity normalization bug |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation_data.py | 3/5 | extensive articulation data container with warp arrays and lazy evaluation, includes same gravity normalization issue |
| source/isaaclab/isaaclab/envs/direct_rl_env_warp.py | 3/5 | new warp-based RL environment with CUDA graph capture, but wp.from_torch(action) in _pre_physics_step creates new tensors preventing full graph capture |
| source/isaaclab/isaaclab/newton/assets/rigid_object/rigid_object.py | 4/5 | newton-backend rigid object implementation with comprehensive warp kernel integration for state manipulation |
| source/isaaclab/isaaclab/newton/assets/articulation/articulation.py | 4/5 | large newton-backend articulation with extensive warp operations for joints, actuators, and body manipulations |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as Training Script
    participant Env as DirectRLEnvWarp
    participant Scene as InteractiveScene
    participant Robot as Newton Articulation
    participant Actuator as Warp Actuators
    participant Solver as Newton Solver
    participant Kernels as Warp Kernels
    User->>Env: step(actions)
    Note over Env: Convert torch to warp array<br/>(wp.from_torch - breaks graph)
    Env->>Env: _pre_physics_step(warp_action)
    activate Env
    loop decimation steps
        Env->>Kernels: update_actions kernel
        Note over Kernels: Scale and process actions
        Env->>Actuator: compute_warp()
        Actuator->>Kernels: compute_pd_actuator kernel
        Kernels-->>Actuator: computed_effort
        Actuator->>Kernels: clip_efforts (DC motor/limits)
        Kernels-->>Actuator: clipped_effort
        Actuator->>Robot: set joint efforts
        Env->>Solver: sim.step()
        activate Solver
        Solver->>Solver: MuJoCo constraint solver
        Solver->>Robot: update state (pose, velocity)
        deactivate Solver
        Env->>Scene: scene.update(dt)
        Scene->>Robot: robot.update(dt)
        Robot->>Robot: Update timestamped buffers
    end
    deactivate Env
    Env->>Kernels: get_observations kernel
    Kernels-->>Env: observations (warp array)
    Env->>Kernels: get_dones kernel
    Kernels-->>Env: termination flags
    Env->>Kernels: compute_rewards kernel
    Kernels-->>Env: rewards (warp array)
    Env->>Kernels: reset kernel (for terminated envs)
    Kernels->>Robot: write state to sim
    Note over Env: Convert warp to torch<br/>(wp.to_torch)
    Env-->>User: obs, rewards, dones, extras
    Note over User,Kernels: CUDA Graphs capture action<br/>processing and post-processing<br/>separately (not full pipeline)
```
19 files reviewed, 4 comments
```python
    joint_vel: float,
    effort: float,
):
    max_effort = saturation_effort * (1.0 - joint_vel) / vel_limit
```
logic: the DC motor clipping numerator uses (1.0 - joint_vel) instead of (vel_limit - joint_vel). When joint_vel = vel_limit, max_effort should be 0; when joint_vel = 0, max_effort should be saturation_effort. The formula should be saturation_effort * (vel_limit - joint_vel) / vel_limit.
```suggestion
    max_effort = saturation_effort * (vel_limit - joint_vel) / vel_limit
```
```python
    effort: float,
):
    max_effort = saturation_effort * (1.0 - joint_vel) / vel_limit
    min_effort = saturation_effort * (-1.0 - joint_vel) / vel_limit
```
logic: Same DC motor issue for negative velocity direction. Should be saturation_effort * (-vel_limit - joint_vel) / vel_limit.
```suggestion
    min_effort = saturation_effort * (-vel_limit - joint_vel) / vel_limit
```
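With the corrected formulas, the available torque collapses to zero at the velocity limit and the envelope is symmetric for reverse rotation. The full clip can be sketched in plain Python as follows (function and parameter names are illustrative, mirroring the behavior of Isaac Lab's torch-based `DCMotor` clipping rather than the PR's exact kernel):

```python
def dc_motor_clip(effort, joint_vel, saturation_effort, effort_limit, vel_limit):
    # Torque-speed curve: available torque shrinks linearly as the joint
    # speeds up in the direction of actuation, and grows (up to the
    # absolute limit) when actuating against the motion.
    max_effort = saturation_effort * (vel_limit - joint_vel) / vel_limit
    min_effort = saturation_effort * (-vel_limit - joint_vel) / vel_limit
    # Also respect the absolute effort limit in both directions.
    max_effort = min(max_effort, effort_limit)
    min_effort = max(min_effort, -effort_limit)
    return min(max(effort, min_effort), max_effort)
```

Sanity checks: at `joint_vel == vel_limit` the upper bound is 0, at `joint_vel == 0` the bounds are ±min(saturation_effort, effort_limit), and a negative velocity mirrors the positive case.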
```python
        self._sim_timestamp = 0.0
        # obtain global simulation view
        gravity = wp.to_torch(NewtonManager.get_model().gravity)[0]
        gravity_dir = [float(i) / sum(gravity) for i in gravity]
```
logic: gravity normalization divides by the sum of components instead of the vector magnitude. For gravity (0, 0, -9.81) the sum is -9.81, so this yields (0, 0, 1) with the sign flipped; the result is also incorrect whenever more than one component is nonzero.
```suggestion
        gravity_length = float(torch.linalg.norm(gravity))
        gravity_dir = [float(i) / gravity_length for i in gravity]
```
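A minimal standalone check of the fix, in pure Python with illustrative names (the real code operates on a torch tensor):

```python
import math

def normalize_gravity(gravity):
    # Divide by the Euclidean magnitude. Dividing by sum(gravity) flips
    # the sign whenever the sum is negative (e.g. (0, 0, -9.81) maps to
    # (0, 0, 1)) and is wrong whenever more than one component is nonzero.
    magnitude = math.sqrt(sum(c * c for c in gravity))
    return [c / magnitude for c in gravity]
```

For standard downward gravity this returns (0, 0, -1), which is what the gravity-projection code in the locomotion tasks expects.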
```python
        self._sim_timestamp = 0.0
        # obtain global simulation view
        gravity = wp.to_torch(NewtonManager.get_model().gravity)[0]
        gravity_dir = [float(i) / sum(gravity) for i in gravity]
```
logic: Same gravity normalization issue as in rigid_object_data.py:84. Uses sum of components instead of vector magnitude.
Description
Adds graphed workflows as a separate pipeline for now.
How to use:
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-Warp-v0 --num_envs 4096 --headless --max_iterations 300
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Ant-Warp-v0 --num_envs 4096 --headless --max_iterations 300
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Humanoid-Warp-v0 --num_envs 4096 --headless --max_iterations 300
Results
Provides some speed-up on the Isaac front:
Benchmark screenshots for Cartpole, Ant, and Humanoid are attached to the PR.
Checklist
- I have run the `pre-commit` checks with `./isaaclab.sh --format`
- I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- I have added my name to `CONTRIBUTORS.md` or my name already exists there