Releases: NVIDIA/warp
v1.5.1
Changelog
[1.5.1] - 2025-01-02
Added
- Add PyTorch basics and custom operators notebooks to the notebooks directory.
- Update PyTorch interop docs to include section on custom operators (docs).
Fixed
- warp.sim: Fix a bug in which the color-balancing algorithm was not updating the colorings.
- Fix custom colors not being updated when rendering meshes with static topology in OpenGL (GH-343).
- Fix wp.launch_tiled() not returning a Launch object when passed record_cmd=True.
- Fix default arguments not being resolved for wp.func when called from Python's runtime (GH-386).
- Array overwrite tracking: Fix issue with not marking arrays passed to wp.atomic_add(), wp.atomic_sub(), wp.atomic_max(), or wp.atomic_min() as being written to (GH-378).
- Fix for occasional failure to update .meta files into Warp kernel cache on Windows.
- Fix the OpenGL renderer not being able to run without a CUDA device available (GH-344).
- Fix incorrect CUDA driver function versions (GH-402).
v1.5.0
Changelog
[1.5.0] - 2024-12-02
Added
- Support for cooperative tile-based primitives using cuBLASDx and cuFFTDx; please see the tile documentation for details.
- Expose a reversed() built-in for iterators (GH-311).
- Support for saving Volumes into .nvdb files with the save_to_nvdb method.
- warp.fem: Add wp.fem.Trimesh3D and wp.fem.Quadmesh3D geometry types for 3D surfaces with new example_distortion_energy example.
- warp.fem: Add "add" option to wp.fem.integrate() for accumulating integration result to existing output.
- warp.fem: Add "assembly" option to wp.fem.integrate() for selecting between more memory-efficient or more computationally efficient integration algorithms.
- warp.fem: Add Nédélec (first kind) and Raviart-Thomas vector-valued function spaces providing conforming discretization of curl and div operators, respectively.
- warp.sim: Add a graph coloring module that supports converting trimesh into a vertex graph and applying coloring. The wp.sim.ModelBuilder now includes methods to color particles for use with wp.sim.VBDIntegrator(); users should call builder.color() before finalizing assets.
- warp.sim: Add support for a per-particle radius for soft-body triangle contact using the wp.sim.Model.particle_radius array (docs), replacing the previous hard-coded value of 0.01 (GH-329).
- Add a particle_radius parameter to wp.sim.ModelBuilder.add_cloth_mesh() and wp.sim.ModelBuilder.add_cloth_grid() to set a uniform radius for the added particles.
- Document wp.array attributes (GH-364).
- Document time-to-compile tradeoffs when using vector component assignment statements in kernels.
- Add introductory Jupyter notebooks to the notebooks directory.
Changed
- Drop support for Python 3.7; Python 3.8 is now the minimum-supported version.
- Promote the wp.Int, wp.Float, and wp.Scalar generic annotation types to the public API.
- warp.fem: Simplify querying neighboring cell quantities when integrating on sides using new wp.fem.cells(), wp.fem.to_inner_cell(), wp.fem.to_outer_cell() operators.
- Show an error message when the type returned by a function differs from its annotation, which would have led to the compilation stage failing.
- Clarify that wp.randn() samples a normal distribution of mean 0 and variance 1.
- Raise error when passing more than 32 variadic arguments to the wp.printf() built-in.
Fixed
- Fix place setting of paddle backend.
- warp.fem: Fix tri-cubic shape functions on quadrilateral meshes.
- warp.fem: Fix caching of integrand kernels when changing code-generation options.
- Fix wp.expect_neq() overloads missing for scalar types.
- Fix an error when a wp.kernel or a wp.func object is annotated to return a None value.
- Fix error when reading multi-volume, BLOSC-compressed .nvdb files.
- Fix wp.printf() erroring out when no variadic arguments are passed (GH-333).
- Fix memory access issues in soft-rigid contact collisions (GH-362).
- Fix gradient propagation for in-place addition/subtraction operations on custom vector-type arrays.
- Fix the OpenGL renderer's window not closing when clicking the X button.
- Fix the OpenGL renderer's camera snapping to a different direction from the initial camera's orientation when first looking around.
- Fix custom colors being ignored when rendering meshes in OpenGL (GH-343).
- Fix topology updates not being supported by the OpenGL renderer.
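
To make the 1.5.0 graph coloring entry more concrete, here is a minimal plain-Python sketch of the idea: build a vertex graph from triangle indices, then color it greedily so that no two connected vertices share a color. The function names and the greedy strategy are assumptions chosen for illustration. This is not the warp.sim implementation, which may use a different algorithm.

```python
def trimesh_to_vertex_graph(num_vertices, triangles):
    """Return one adjacency set per vertex; vertices sharing a triangle edge are neighbors."""
    adjacency = [set() for _ in range(num_vertices)]
    for a, b, c in triangles:
        adjacency[a].update((b, c))
        adjacency[b].update((a, c))
        adjacency[c].update((a, b))
    return adjacency


def greedy_coloring(adjacency):
    """Assign each vertex the smallest color not already used by a colored neighbor."""
    colors = [-1] * len(adjacency)  # -1 means "not colored yet"
    for vertex in range(len(adjacency)):
        used = {colors[n] for n in adjacency[vertex] if colors[n] >= 0}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


# Two triangles sharing the edge (1, 2); hypothetical example data.
adjacency = trimesh_to_vertex_graph(4, [(0, 1, 2), (1, 2, 3)])
colors = greedy_coloring(adjacency)
print(colors)
```

Vertices with the same color share no edge, so a solver can in principle update one color group at a time without neighboring vertices writing to each other.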
v1.4.2
Changelog
[1.4.2] - 2024-11-13
Changed
- Make the output of wp.print() in backward kernels consistent for all supported data types.
Fixed
- Fix to relax the integer types expected when indexing arrays (regression in 1.3.0).
- Fix printing vector and matrix adjoints in backward kernels.
- Fix kernel compile error when printing structs.
- Fix an incorrect user function being sometimes resolved when multiple overloads are available with array parameters with different dtype values.
- Fix error being raised when static and dynamic for-loops are written in sequence with the same iteration variable names (GH-331).
- Fix an issue with the Texture Write node, used in the Mandelbrot Omniverse sample, sometimes erroring out in multi-GPU environments.
- Fix code generation of in-place multiplication and division operations (regression introduced in a69d061) (GH-342).
v1.4.1
Changelog
[1.4.1] - 2024-10-15
Fixed
- Fix iter_reverse() not working as expected for ranges with steps other than 1 (GH-311).
- Fix potential out-of-bounds memory access when a wp.sparse.BsrMatrix object is reused for storing matrices of different shapes.
- Fix robustness to very low desired tolerance in wp.fem.utils.symmetric_eigenvalues_qr.
- Fix invalid code generation error messages when nesting dynamic and static for-loops.
- Fix caching of kernels with static expressions.
- Fix ModelBuilder.add_builder(builder) to correctly update articulation_start and thereby articulation_count when builder contains more than one articulation.
- Re-introduced the wp.rand*(), wp.sample*(), and wp.poisson() onto the Python scope to revert a breaking change.
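
The iter_reverse() fix in 1.4.1 concerns reversing ranges whose step is not 1. The plain-Python sketch below shows why that case is easy to get wrong: the last element of a stepped range is generally not stop - 1. Both functions are hypothetical illustrations written for this note. They are not Warp's code, and the naive variant is only one plausible way such a bug can arise.

```python
def naive_reversed_range(start, stop, step):
    """Incorrect for step != 1: assumes the last element is always stop - 1."""
    return range(stop - 1, start - 1, -step)


def reversed_range(start, stop, step):
    """Reverse range(start, stop, step) by computing its true last element."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        count = max(0, (stop - start + step - 1) // step)
    else:
        count = max(0, (start - stop - step - 1) // (-step))
    last = start + (count - 1) * step
    # Walk back from the last element until just past the first one.
    return range(last, start - step, -step)


print(list(range(0, 10, 4)))                 # [0, 4, 8]
print(list(naive_reversed_range(0, 10, 4)))  # [9, 5, 1]
print(list(reversed_range(0, 10, 4)))        # [8, 4, 0]
```

The corrected version matches Python's own reversed(range(...)) for positive and negative steps, including empty ranges.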
v1.4.0
Changelog
[1.4.0] - 2024-10-01
Added
- Support for a new wp.static(expr) function that allows arbitrary Python expressions to be evaluated at the time of function/kernel definition (docs).
- Support for stream priorities to hint to the device that it should process pending work in high-priority streams over pending work in low-priority streams when possible (docs).
- Adaptive sparse grid geometry to warp.fem (docs).
- Support for defining wp.kernel and wp.func objects from within closures.
- Support for defining multiple versions of kernels, functions, and structs without manually assigning unique keys.
- Support for default argument values for user functions decorated with wp.func.
- Allow passing custom launch dimensions to jax_kernel() (GH-310).
- JAX interoperability examples for sharding and matrix multiplication (docs).
- Interoperability support for the PaddlePaddle ML framework (GH-318).
- Support wp.mod() for vector types (GH-282).
- Expose the modulo operator % to Python's runtime scalar and vector types.
- Support for fp64 atomic_add, atomic_max, and atomic_min (GH-284).
- Support for quaternion indexing (e.g. q.w).
- Support shadowing builtin functions (GH-308).
- Support for redefining function overloads.
- Add an ocean sample to the omni.warp extension.
- warp.sim.VBDIntegrator now supports body-particle collision.
- Add a contributing guide to the Sphinx docs.
- Add documentation for dynamic code generation (docs).
Changed
- wp.sim.Model.edge_indices now includes boundary edges.
- Unexposed wp.rand*(), wp.sample*(), and wp.poisson() from the Python scope.
- Skip unused functions in module code generation, improving performance.
- Avoid reloading modules if their content does not change, improving performance.
- wp.Mesh.points is now a property instead of a raw data member; its reference can be changed after the mesh is initialized.
- Improve error message when invalid objects are referenced in a Warp kernel.
- if/else/elif statements with constant conditions are resolved at compile time with no branches being inserted in the generated code.
- Include all non-hidden builtins in the stub file.
- Improve accuracy of symmetric eigenvalues routine in warp.fem.
Fixed
- Fix for wp.func erroring out when defining a Tuple as a return type hint (GH-302).
- Fix array in-place op (+=, -=) adjoints to compute gradients correctly in the backwards pass.
- Fix vector, matrix in-place assignment adjoints to compute gradients correctly in the backwards pass, e.g.: v[1] = x
- Fix a bug in which Python docstrings would be created as local function variables in generated code.
- Fix a bug with autograd array access validation in functions from different modules.
- Fix a rare crash during error reporting on some systems due to glibc mismatches.
- Handle --num_tiles 1 in example_render_opengl.py (GH-306).
- Fix the computation of body contact forces in FeatherstoneIntegrator when bodies and particles collide.
- Fix bug in FeatherstoneIntegrator where eval_rigid_jacobian could give incorrect results or reach an infinite loop when the body and joint indices were not in the same order. Added Model.joint_ancestor to fix the indexing from a joint to its parent joint in the articulation.
- Fix wrong vertex index passed to add_edges() called from ModelBuilder.add_cloth_mesh() (GH-319).
- Add a workaround for uninitialized memory read warning in the compute-sanitizer initcheck tool when using wp.Mesh.
- Fix name clashes when Warp functions and structs are returned from Python functions multiple times.
- Fix name clashes between Warp functions and structs defined in different modules.
- Fix code generation errors when overloading generic kernels defined in a Python function.
- Fix issues with unrelated functions being treated as overloads (e.g., closures).
- Fix handling of stream argument in array.__dlpack__().
- Fix a bug related to reloading CPU modules.
- Fix a crash when kernel functions are not found in CPU modules.
- Fix conditions not being evaluated as expected in while statements.
- Fix printing Boolean and 8-bit integer values.
- Fix array interface type strings used for Boolean and 8-bit integer values.
- Fix initialization error when setting struct members.
- Fix Warp not being initialized upon entering a wp.Tape context.
- Use kDLBool instead of kDLUInt for DLPack interop of Booleans.
v1.3.3
[1.3.3] - 2024-09-04
- Bug fixes
  - Fix an aliasing issue with zero-copy array initialization from NumPy introduced in Warp 1.3.0.
  - Fix wp.Volume.load_from_numpy() behavior when bg_value is a sequence of values.
[1.3.2] - 2024-08-30
- Bug fixes
  - Fix accuracy of 3x3 SVD wp.svd3 with fp64 numbers (GH-281).
  - Fix module hashing when a kernel argument contained a struct array (GH-287).
  - Fix a bug in wp.bvh_query_ray() where the direction instead of the reciprocal direction was used (GH-288).
  - Fix errors when launching a CUDA graph after a module is reloaded. Modules that were used during graph capture will no longer be unloaded before the graph is released.
  - Fix a bug in wp.sim.collide.triangle_closest_point_barycentric() where the returned barycentric coordinates may be incorrect when the closest point lies on an edge.
  - Fix 32-bit overflow when array shape is specified using np.int32.
  - Fix handling of integer indices in the input_output_mask argument to autograd.jacobian and autograd.jacobian_fd (GH-289).
  - Fix ModelBuilder.collapse_fixed_joints() to correctly update the body centers of mass and the ModelBuilder.articulation_start array.
  - Fix precedence of closure constants over global constants.
  - Fix quadrature point indexing in wp.fem.ExplicitQuadrature (regression from 1.3.0).
- Documentation improvements
  - Add missing return types for built-in functions.
  - Clarify that atomic operations also return the previous value.
  - Clarify that wp.bvh_query_aabb() returns parts that overlap the bounding volume.
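
The 32-bit overflow fix listed under 1.3.2 belongs to a general class of bug: multiplying dimensions in a fixed-width 32-bit integer wraps around once the product exceeds 2**31 - 1. The sketch below reproduces the effect with the standard library's ctypes.c_int32 standing in for np.int32. The function names and the example shape are assumptions for illustration, and the code does not reflect how Warp computes array sizes internally.

```python
import ctypes


def element_count_int32(shape):
    """Multiply the dimensions, truncating to a signed 32-bit value after every step."""
    total = 1
    for dim in shape:
        total = ctypes.c_int32(total * dim).value  # wraps silently on overflow
    return total


def element_count(shape):
    """Multiply the dimensions using Python's arbitrary-precision integers."""
    total = 1
    for dim in shape:
        total *= int(dim)
    return total


shape = (50_000, 50_000)  # 2.5 billion elements, more than 2**31 - 1
print(element_count_int32(shape))  # -1794967296
print(element_count(shape))        # 2500000000
```

Converting each dimension to a wider integer type before multiplying, as the second function does, avoids the wraparound.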
[1.3.1] - 2024-07-27
- Remove wp.synchronize() from PyTorch autograd function example
- Tape.check_kernel_array_access() and Tape.reset_array_read_flags() are now private methods.
- Fix reporting unmatched argument types
[1.3.0] - 2024-07-25
- Warp Core improvements
  - Update to CUDA 12.x by default (requires NVIDIA driver 525 or newer), please see README.md for commands to install CUDA 11.x binaries for older drivers
  - Add information to the module load print outs to indicate whether a module was compiled (compiled), loaded from the cache (cached), or was unable to be loaded (error).
  - wp.config.verbose = True now also prints out a message upon the entry to a wp.ScopedTimer.
  - Add wp.clear_kernel_cache() to the public API. This is equivalent to wp.build.clear_kernel_cache().
  - Add code-completion support for wp.config variables.
  - Remove usage of a static task (thread) index for CPU kernels to address multithreading concerns (GH-224)
  - Improve error messages for unsupported Python operations such as sequence construction in kernels
  - Update wp.matmul() CPU fallback to use dtype explicitly in np.matmul() call
  - Add support for PEP 563's from __future__ import annotations (GH-256).
  - Allow passing external arrays/tensors to wp.launch() directly via __cuda_array_interface__ and __array_interface__, up to 2.5x faster conversion from PyTorch
  - Add faster Torch interop path using return_ctype argument to wp.from_torch()
  - Handle incompatible CUDA driver versions gracefully
  - Add wp.abs() and wp.sign() for vector types
  - Expose scalar arithmetic operators to Python's runtime (e.g.: wp.float16(1.23) * wp.float16(2.34))
  - Add support for creating volumes with anisotropic transforms
  - Allow users to pass function arguments by keyword in a kernel using standard Python calling semantics
  - Add additional documentation and examples demonstrating wp.copy(), wp.clone(), and array.assign() differentiability
  - Add __new__() methods for all class __del__() methods to handle when a class instance is created but not instantiated before garbage collection
  - Implement the assignment operator for wp.quat
  - Make the geometry-related built-ins available only from within kernels
  - Rename the API-facing query types to remove their _t suffix: wp.BVHQuery, wp.HashGridQuery, wp.MeshQueryAABB, wp.MeshQueryPoint, and wp.MeshQueryRay
  - Add wp.array(ptr=...) to allow initializing arrays from pointer addresses inside of kernels (GH-206)
- warp.autograd improvements:
  - New warp.autograd module with utility functions gradcheck(), jacobian(), and jacobian_fd() for debugging kernel Jacobians (docs)
  - Add array overwrite detection: if wp.config.verify_autograd_array_access is true, in-place operations on arrays on the Tape that could break gradient computation will be detected (docs)
  - Fix bug where modification of @wp.func_replay functions and native snippets would not trigger module recompilation
  - Add documentation for dynamic loop autograd limitations
- warp.sim improvements:
  - Improve memory usage and performance for rigid body contact handling when self.rigid_mesh_contact_max is zero (default behavior).
  - The mask argument to wp.sim.eval_fk() now accepts both integer and boolean arrays to mask articulations.
  - Fix handling of ModelBuilder.joint_act in ModelBuilder.collapse_fixed_joints() (affected floating-base systems)
  - Fix and improve implementation of ModelBuilder.plot_articulation() to visualize the articulation tree of a rigid-body mechanism
  - Fix ShapeInstancer __new__() method (missing instance return and *args parameter)
  - Fix handling of upaxis variable in ModelBuilder and the rendering thereof in OpenGLRenderer
- warp.sparse improvements:
  - Sparse matrix allocations (from bsr_from_triplets(), bsr_axpy(), etc.) can now be captured in CUDA graphs; exact number of non-zeros can be optionally requested asynchronously.
  - bsr_assign() now supports changing block shape (including CSR/BSR conversions)
  - Add Python operator overloads for common sparse matrix operations, e.g. A += 0.5 * B, y = x @ C
- warp.fem new features and fixes:
  - Support for variable number of nodes per element
  - Global wp.fem.lookup() operator now supports wp.fem.Tetmesh and wp.fem.Trimesh2D geometries
  - Simplified defining custom subdomains (wp.fem.Subdomain), free-slip boundary conditions
  - New field types: wp.fem.UniformField, wp.fem.ImplicitField and wp.fem.NonconformingField
  - New streamlines, magnetostatics and nonconforming_contact examples, updated mixed_elasticity to use a nonlinear model
  - Function spaces can now export VTK-compatible cells for visualization
  - Fixed edge cases with NanoVDB function spaces
  - Fixed differentiability of wp.fem.PicQuadrature w.r.t. positions and measures
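
The 1.3.0 entry about adding __new__() methods alongside __del__() addresses a general Python pitfall: if __init__ raises, or an object is created without being initialized, __del__ still runs and may touch attributes that were never set. The sketch below shows the pattern in plain Python. The Resource class, its handle attribute, and the released list are hypothetical names invented for this illustration and are not part of Warp.

```python
released = []  # records handles freed by __del__, for demonstration only


class Resource:
    """Hypothetical wrapper around a native handle."""

    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)
        # Give every attribute read by __del__ a safe default up front.
        instance.handle = None
        return instance

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.handle = f"buffer[{size}]"

    def __del__(self):
        # Safe even if __init__ raised before assigning a real handle.
        if self.handle is not None:
            released.append(self.handle)
            self.handle = None


try:
    Resource(-1)  # __init__ raises; __del__ still runs on the half-built object
except ValueError:
    pass

partial = Resource.__new__(Resource)  # created but never initialized
print(partial.handle)  # None
```

Without the __new__() default, __del__ would hit an AttributeError on the half-built object, which Python reports as an ignored exception during garbage collection.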
v1.2.2
[1.2.2] - 2024-07-04
- Support for NumPy >= 2.0
[1.2.1] - 2024-06-14
- Fix generic function caching
- Fix Warp not being initialized when constructing arrays with wp.array()
- Fix wp.is_mempool_access_supported() not resolving the provided device arguments to wp.context.Device
[1.2.0] - 2024-06-06
- Add a not-a-number floating-point constant that can be used as wp.NAN or wp.nan.
- Add wp.isnan(), wp.isinf(), and wp.isfinite() for scalars, vectors, matrices, etc.
- Improve kernel cache reuse by hashing just the local module constants. Previously, a module's hash was affected by all wp.constant() variables declared in a Warp program.
- Revised module compilation process to allow multiple processes to use the same kernel cache directory. Cached kernels will now be stored in hash-specific subdirectory.
- Add runtime checks for wp.MarchingCubes on field dimensions and size
- Fix memory leak in wp.Mesh BVH (GH-225)
- Use C++17 when building the Warp library and user kernels
- Increase PTX target architecture up to sm_75 (from sm_70), enabling Turing ISA features
- Extended NanoVDB support (see warp.Volume):
  - Add support for data-agnostic index grids, allocation at voxel granularity
  - New wp.volume_lookup_index(), wp.volume_sample_index() and generic wp.volume_sample()/wp.volume_lookup()/wp.volume_store() kernel-level functions
  - Zero-copy aliasing of in-memory grids, support for multi-grid buffers
  - Grid introspection and blind data access capabilities
  - warp.fem can now work directly on NanoVDB grids using warp.fem.Nanogrid
  - Fixed wp.volume_sample_v() and wp.volume_store_*() adjoints
  - Prevent wp.volume_store() from overwriting grid background values
- Improve validation of user-provided fields and values in warp.fem
- Support headless rendering of wp.render.OpenGLRenderer via pyglet.options["headless"] = True
- wp.render.RegisteredGLBuffer can fall back to CPU-bound copying if CUDA/OpenGL interop is not available
- Clarify terms for external contributions, please see CONTRIBUTING.md for details
- Improve performance of wp.sparse.bsr_mm() by ~5x on benchmark problems
- Fix for XPBD incorrectly indexing into the joint actuation joint_act arrays
- Fix for mass matrix gradients computation in wp.sim.FeatherstoneIntegrator()
- Fix for handling of --msvc_path in build scripts
- Fix for wp.copy() params to record dest and src offset parameters on wp.Tape()
- Fix for wp.randn() to ensure return values are finite
- Fix for slicing of arrays with gradients in kernels
- Fix for function overload caching, ensure module is rebuilt if any function overloads are modified
- Fix for handling of bool types in generic kernels
- Publish CUDA 12.5 binaries for Hopper support, see https://github.com/nvidia/warp?tab=readme-ov-file#installing for details
types in generic kernels - Publish CUDA 12.5 binaries for Hopper support, see https://github.com/nvidia/warp?tab=readme-ov-file#installing for details