Commit 71a4b63

Merge pull request #10735 from amastbaum/v1.19.x_PUBLISH
NEWS: Add v1.19.0-rc1 description
2 parents bb0794a + 856c0bc commit 71a4b63

1 file changed: +71 -0 lines changed

NEWS

Lines changed: 71 additions & 0 deletions
@@ -11,6 +11,77 @@
 ### Features:
 ### Bugfixes:
 
+## 1.19.0 (June 18, 2025)
+### Features:
+#### UCP
+* Enabled multi-GPU support within a single process
+* Added dynamic selection between strong and weak fences in RMA flush operations
+* Improved endpoint reconfiguration capabilities
+* Added All2All lane selection for multi-NIC-GPU systems
+* Improved rkey debug info when config cache limit is reached
+* Improved UCP protocol selection based on available memory types
+* Removed dummy memory key from irrelevant transports (TCP, CMA and CUDA)
+* Improved RNDV performance with device-local staging buffers
+* Enabled error handling for RMA get_offload protocols
+#### UCT
+* Defined uct_rkey_unpack_v2 API to support passing sys-dev
+#### RDMA CORE (IB, ROCE, etc.)
+* Added SRD transport support in EFA with reordering, AM, and control operations
+* Removed XGVMI BF2 support (umem)
+* Removed device memory indirect key
+* Fixed VFS objects for DCIs and pools
+* Added routing table cache to the reachability check
+* Fixed strict order usage in IB auxiliary rkeys
+* Improved various init logging messages
+#### CUDA
+* Added multi-context support for remote key unpacking to CUDA IPC
+* Added context switching aware resource management to CUDA IPC
+* Use buffer ID to detect VA recycling in CUDA IPC
+* Added support for allocating CUDA memory on specific system devices
+* Added multi-device support in CUDA copy
+* Improved protocol lane selection for GPU memory operations
+* Relaxed CUDA context requirements in CUDA copy
+* Added deadlock prevention in CUDA copy
+* Added support for address range detection for VMM
+* Enabled memory attributes query after switching CUDA GPU
+* Added multi-GPU send tests for CUDA transports
+* Removed host-to-host performance estimation from CUDA copy transport
+* Replaced cuCtxCreate by cuDevicePrimaryCtxRetain
+* Improved various init logging messages
+#### ROCM
+* Added control parameters for IPC handle cache and signal pool size
+* Optimized ROCm memory type detection with caching
+#### UCS
+* Removed compilation warnings
+#### Tools
+* Added name filter option (-F 'str') to ucx_info for config and feature dumps
+* Improved ucx_info input validation
+### Bugfixes:
+#### UCP
+* Made UCX_TLS=^ib disable all transports including auxiliary
+* Fixed send request status handling
+* Fixed performance degradation in RNDV by optimizing md cache updates
+* Fixed protocol selection when first lane is filtered out by fragment size
+* Fixed rkey selection by using memory registration flag
+#### UCT
+#### RDMA CORE (IB, ROCE, etc.)
+* Improved reliability of DC transport by adding DCI validation and separating connection logic
+* Fixed segfault in DC fence operation
+#### GPU (CUDA, ROCM)
+* Updated ROCm configuration for ROCm 6.3 compatibility
+* Fixed system device detection for CUDA async memory operations
+* Fixed legacy type detection during CUDA IPC mpack
+* Fixed CUDA IPC RMA operations by using correct context for local buffers
+#### UCS
+* Use UCS function for counting leading zeros on x86 architecture
+* Fixed a compilation warning
+#### Shared Memory
+* Fixed FIFO availability check for sm transport
+#### Documentation
+* Fixed open-mpi clone instruction
+#### Build
+* Fixed enum-int-mismatch warnings with GCC 15
+
 ## 1.18.0 (January 17, 2025)
 ### Features:
 #### UCP
