Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/IB/EFA/SRD: Initial interface add #10412

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
5 changes: 5 additions & 0 deletions src/uct/ib/base/ib_iface.h
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,11 @@ enum {
#else
UCT_IB_QPT_DCI = UCT_IB_QPT_UNKNOWN,
#endif
#if HAVE_EFA
UCT_IB_QPT_SRD,
#else
UCT_IB_QPT_SRD = UCT_IB_QPT_UNKNOWN,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Do we really need to define UCT_IB_QPT_SRD is HAVE_EFA is 0?
  2. Shall we define UCT_IB_QPT_SRD to IBV_QPT_DRIVER? Or unify them somehow? Seems weird to define UCT_IB_QPT_SRD without any reference to an IB macro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed both

#endif
};


Expand Down
4 changes: 4 additions & 0 deletions src/uct/ib/base/ib_log.c
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,10 @@ const char *uct_ib_qp_type_str(int qp_type)
#if HAVE_TL_DC
case UCT_IB_QPT_DCI:
return "DCI";
#endif
#if HAVE_EFA
case UCT_IB_QPT_SRD:
return "SRD";
#endif
default:
ucs_bug("invalid qp type: %d", qp_type);
Expand Down
7 changes: 5 additions & 2 deletions src/uct/ib/efa/Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -15,9 +15,12 @@ libuct_ib_efa_la_LIBADD = $(top_builddir)/src/ucs/libucs.la \
$(top_builddir)/src/uct/libuct.la \
$(top_builddir)/src/uct/ib/libuct_ib.la

libuct_ib_efa_la_SOURCES = ib_efa_md.c
libuct_ib_efa_la_SOURCES = ib_efa_md.c \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we move this to efa/base, now that we added the srd subdir?
to align with other transports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

srd/srd_iface.c

noinst_HEADERS = ib_efa.h
noinst_HEADERS = ib_efa.h \
srd/srd_iface.h \
srd/srd_def.h

PKG_CONFIG_NAME=ib-efa

Expand Down
4 changes: 4 additions & 0 deletions src/uct/ib/efa/ib_efa_md.c
Original file line number Diff line number Diff line change
Expand Up @@ -104,12 +104,16 @@ static uct_ib_md_ops_t uct_ib_efa_md_ops = {

UCT_IB_MD_DEFINE_ENTRY(efa, uct_ib_efa_md_ops);

extern uct_tl_t UCT_TL_NAME(srd);

void UCS_F_CTOR uct_efa_init(void)
{
ucs_list_add_head(&uct_ib_ops, &UCT_IB_MD_OPS_NAME(efa).list);
uct_tl_register(&uct_ib_component, &UCT_TL_NAME(srd));
}

void UCS_F_DTOR uct_efa_cleanup(void)
{
uct_tl_unregister(&UCT_TL_NAME(srd));
ucs_list_del(&UCT_IB_MD_OPS_NAME(efa).list);
}
74 changes: 74 additions & 0 deletions src/uct/ib/efa/srd/srd_def.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
/**
* Copyright (c) NVIDIA CORPORATION & AFFILIATES, 2025. ALL RIGHTS RESERVED.
*
* See file LICENSE for terms.
*/

#ifndef SRD_DEF_H_
#define SRD_DEF_H_

#include <uct/ib/base/ib_iface.h>
#include <ucs/datastruct/frag_list.h>


#define UCT_SRD_SEND_DESC_ALIGN UCS_SYS_CACHE_LINE_SIZE
#define UCT_SRD_SEND_OP_ALIGN UCS_SYS_CACHE_LINE_SIZE


typedef ucs_frag_list_sn_t uct_srd_psn_t;
typedef struct uct_srd_iface uct_srd_iface_t;
typedef struct uct_srd_iface_addr uct_srd_iface_addr_t;
typedef struct uct_srd_iface_peer uct_srd_iface_peer_t;
typedef struct uct_srd_ep_addr uct_srd_ep_addr_t;
typedef struct uct_srd_send_op uct_srd_send_op_t;
typedef struct uct_srd_send_desc uct_srd_send_desc_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need all these forward declarations (and in this PR)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed for now


typedef uint32_t uct_srd_ep_conn_sn_t;

typedef struct uct_srd_neth {
uint32_t packet_type;
uint8_t fc;
uct_srd_psn_t psn;
} UCS_S_PACKED uct_srd_neth_t;

typedef struct uct_srd_recv_desc {
uct_ib_iface_recv_desc_t super;
} uct_srd_recv_desc_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in order to reduce the PR size to the required LOC count, can we move the mpool and desc related code to next PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we strictly add when used, it will be added all at once, so it thought to include the creation/init of qp in pre posting buffers, init + teardown path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to split it somehow to be <500 LOC in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok fixed


enum {
#if UCS_ENABLE_ASSERT
UCT_SRD_SEND_OP_FLAG_INVALID = UCS_BIT(7), /* send op was released */
#else
UCT_SRD_SEND_OP_FLAG_INVALID = 0,
#endif
};

typedef struct uct_srd_peer_name {
char name[16];
int pid;
} uct_srd_peer_name_t;

typedef struct uct_srd_iface_addr {
uct_ib_uint24_t qp_num;
} uct_srd_iface_addr_t;

typedef struct uct_srd_ep_addr {
uct_srd_iface_addr_t iface_addr;
uct_ib_uint24_t ep_id;
} uct_srd_ep_addr_t;

/* Send operation (TX or RX) */
struct uct_srd_send_op {
uint16_t flags;
} UCS_V_ALIGNED(UCT_SRD_SEND_OP_ALIGN);

/*
* Send descriptor
* - used for any send op (including RDMA READ) that
* requires some form of handling after completion
*/
struct uct_srd_send_desc {
uct_srd_send_op_t super;
uint32_t lkey;
} UCS_S_PACKED UCS_V_ALIGNED(UCT_SRD_SEND_DESC_ALIGN);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space line after

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

#endif
Loading
Loading