@iyastreb iyastreb commented Jul 29, 2025

What?

This PR introduces protocol variants for multi-protocols.
The problem with the existing implementation is that multi-protocol selection always prefers the high-bandwidth combination of lanes (for instance cuda_ipc), which might not be the best choice for small messages, because high-bandwidth protocols often have higher latency.
This PR attempts to solve this problem by introducing protocol variants, so that a single multi-protocol can produce several lane selections: currently a high-bandwidth selection and a low-latency selection.
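
To illustrate the trade-off, here is a minimal sketch that assumes a simple linear cost model (time = latency + size / bandwidth). The names `variant_perf_t`, `variant_time` and `variant_select` are hypothetical and are not taken from the UCX code base; the actual estimation in UCX may differ.

```c
#include <stddef.h>

/* Hypothetical performance estimate of one variant (not a UCX type). */
typedef struct {
    const char *name;
    double      latency_sec;   /* fixed per-message overhead, seconds */
    double      bandwidth_bps; /* sustained bandwidth, bytes per second */
} variant_perf_t;

/* Assumed linear model: fixed latency plus transfer time. */
static double variant_time(const variant_perf_t *v, size_t msg_size)
{
    return v->latency_sec + ((double)msg_size / v->bandwidth_bps);
}

/* Return the variant with the lowest estimated time, or NULL if count is 0. */
static const variant_perf_t *
variant_select(const variant_perf_t *variants, size_t count, size_t msg_size)
{
    const variant_perf_t *best = NULL;
    size_t i;

    for (i = 0; i < count; ++i) {
        if ((best == NULL) ||
            (variant_time(&variants[i], msg_size) <
             variant_time(best, msg_size))) {
            best = &variants[i];
        }
    }

    return best;
}
```

Under this model, a variant with lower latency wins for small messages even if its bandwidth is lower, while the high-bandwidth variant takes over once the transfer time dominates.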

Why?

Improve performance of multi-protocols for small messages.

How?

  • Added infrastructure for protocol variants that can be extended with further variants
  • Added a latency variant in addition to the existing bandwidth one
  • Moved protocol selection logic to proto_common, so that we can use it for single protocols as well
  • Added the UCX_PROTO_VARIANTS option that enables this feature, with default value n
  • Allocated the selection on the heap and added a memory pool for that
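
As a rough illustration of the extensible-variants idea, the sketch below keeps one scoring function per variant in a table, so adding a variant means adding one enum value and one function. All names here (`variant_id_t`, `lane_perf_t`, `select_variants`) are hypothetical, and the sketch picks a single lane per variant, whereas the real multi-protocol code selects a combination of lanes. The `variants_enabled` flag stands in for the effect of UCX_PROTO_VARIANTS.

```c
#include <stddef.h>

/* Hypothetical variant identifiers; new variants go before VARIANT_LAST. */
typedef enum {
    VARIANT_BW,  /* existing behaviour: prefer bandwidth */
    VARIANT_LAT, /* new: prefer low latency */
    VARIANT_LAST
} variant_id_t;

/* Hypothetical per-lane performance estimate. */
typedef struct {
    double latency_sec;
    double bandwidth_bps;
} lane_perf_t;

typedef double (*lane_score_fn_t)(const lane_perf_t *lane);

static double score_bw(const lane_perf_t *lane)
{
    return lane->bandwidth_bps;
}

static double score_lat(const lane_perf_t *lane)
{
    return -lane->latency_sec; /* lower latency gives a higher score */
}

/* One scoring function per variant. */
static const lane_score_fn_t variant_score[VARIANT_LAST] = {
    [VARIANT_BW]  = score_bw,
    [VARIANT_LAT] = score_lat
};

/* Index of the best-scoring lane for a variant, or -1 if there are none. */
static int variant_best_lane(variant_id_t id, const lane_perf_t *lanes,
                             size_t num_lanes)
{
    int best = -1;
    size_t i;

    for (i = 0; i < num_lanes; ++i) {
        if ((best < 0) ||
            (variant_score[id](&lanes[i]) > variant_score[id](&lanes[best]))) {
            best = (int)i;
        }
    }

    return best;
}

/* Fill best[] for every active variant and return how many were filled.
 * With variants disabled only the bandwidth variant is produced. */
static size_t select_variants(int variants_enabled, const lane_perf_t *lanes,
                              size_t num_lanes, int best[VARIANT_LAST])
{
    size_t count = variants_enabled ? VARIANT_LAST : 1;
    size_t id;

    for (id = 0; id < count; ++id) {
        best[id] = variant_best_lane((variant_id_t)id, lanes, num_lanes);
    }

    return count;
}
```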

TODO in next PRs:

  • Enable UCX_PROTO_VARIANTS=y by default
  • Use protocol variants in single protocols
  • Long-term task: implement proper aggregation of multi-path selection (according to single path BW)

Testing results

@iyastreb iyastreb force-pushed the ucp-multi-proto-variants branch from 9ff9447 to 641e4c9 Compare July 29, 2025 13:59
@iyastreb iyastreb force-pushed the ucp-multi-proto-variants branch from 641e4c9 to 2d3196d Compare July 29, 2025 14:23
@iyastreb iyastreb marked this pull request as ready for review August 15, 2025 08:55