Currently, we store only a simple `priority_matrix` in the metadata, which lists the preferred and available NICs for each CPU NUMA node and GPU.
I propose storing a richer, less processed topology matrix in the metadata, similar to NCCL_TOPO_FILE, that carries more useful information, such as whether GDR is supported and whether NVLink or NVSwitch is present.
With that in place, we would no longer need to store the protocol in the metadata; instead, we could choose a suitable protocol based on the topology.
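To make the last point concrete, here is a minimal sketch of how a protocol could be derived from topology data instead of being stored. Everything in it is an assumption for illustration: the `GpuTopo` record, its field names, the `choose_protocol` helper, and the protocol labels (`"nvlink"`, `"rdma+gdr"`, `"rdma"`, `"tcp"`) are hypothetical and do not reflect the project's actual metadata schema or the NCCL topology file format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GpuTopo:
    """Hypothetical per-GPU topology entry; field names are illustrative only."""
    name: str
    numa_node: int
    nics: list[str] = field(default_factory=list)          # NICs reachable from this GPU
    gdr_supported: bool = False                             # GPUDirect RDMA available
    nvlink_peers: set[str] = field(default_factory=set)     # GPUs reachable over NVLink/NVSwitch


def choose_protocol(src: GpuTopo, dst: GpuTopo, same_host: bool) -> str:
    """Pick a transport label from topology alone (assumed policy, not the project's)."""
    # Prefer NVLink when both GPUs sit on the same host and are directly linked.
    if same_host and dst.name in src.nvlink_peers:
        return "nvlink"
    # Otherwise use RDMA if both sides have a NIC, with GDR only if both support it.
    if src.nics and dst.nics:
        if src.gdr_supported and dst.gdr_supported:
            return "rdma+gdr"
        return "rdma"
    # Fall back to TCP when no RDMA-capable NIC is recorded.
    return "tcp"


gpu0 = GpuTopo("gpu0", 0, ["mlx5_0"], gdr_supported=True, nvlink_peers={"gpu1"})
gpu1 = GpuTopo("gpu1", 0, ["mlx5_0"], gdr_supported=True, nvlink_peers={"gpu0"})
remote = GpuTopo("gpu0", 0, ["mlx5_1"], gdr_supported=False)

print(choose_protocol(gpu0, gpu1, same_host=True))     # nvlink
print(choose_protocol(gpu0, remote, same_host=False))  # rdma
```

The point of the sketch is only that, once link and GDR information is in the metadata, the protocol becomes a derived value rather than a stored one; the actual selection policy would need to be agreed on separately.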
Here is a simple sample topology from NCCL: