-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT/GGA: basic iface and ep infrastructure #9845
base: master
Are you sure you want to change the base?
UCT/GGA: basic iface and ep infrastructure #9845
Conversation
@iyastreb can you pls review? |
@@ -72,6 +72,7 @@ enum { | |||
UCT_IB_MLX5_CMD_OP_2ERR_QP = 0x507, | |||
UCT_IB_MLX5_CMD_OP_2RST_QP = 0x50a, | |||
UCT_IB_MLX5_CMD_OP_QUERY_QP = 0x50b, | |||
UCT_IB_MLX5_CMD_OP_INIT2INIT_QP = 0x50e, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes covered by unit/integration tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is part of GGA ep_create
flow
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
@@ -29,6 +29,46 @@ typedef struct uct_gga_mlx5_iface_config { | |||
uct_rc_mlx5_iface_common_config_t rc_mlx5_common; | |||
} uct_gga_mlx5_iface_config_t; | |||
|
|||
typedef struct uct_gga_mlx5_dma_opaque { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: here and below, struct name is redundant, typedef should be sufficient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is "UCX code style"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, in many places in UCX I see only typedef, but up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need names for internal structs (unless it's a forward declaration)
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
return UCS_ERR_NO_MEMORY; | ||
} | ||
|
||
ep->dma_opaque.mr = ibv_reg_mr(md->super.pd, ep->dma_opaque.buf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we reuse here uct_ib_reg_mr?
uct_ib_reg_mr handles EAGAIN error by repeating the call, should it be the similar in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we reuse here uct_ib_reg_mr?
IMO it's overcomplicated in this case
uct_ib_reg_mr handles EAGAIN error by repeating the call, should it be the similar in this case?
no, that's only for ODP case
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
iface_attr->device_addr_len += ucs_offsetof(uct_gga_mlx5_dev_addr_t, | ||
ib_addr); | ||
iface_attr->ep_addr_len = sizeof(uct_gga_mlx5_ep_address_t); | ||
iface_attr->cap.flags = /* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, placeholder for a next patch
src/uct/ib/rc/accel/rc_mlx5_ep.c
Outdated
uct_rc_ep_cleanup_qp(&self->super.super, &cleanup_ctx->super, | ||
self->super.tx.wq.super.qp_num, | ||
outstanding - wqe_count); | ||
cleanup_ctx = ucs_malloc(sizeof(*cleanup_ctx), "mlx5_qp_cleanup_ctx"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unusual to allocate memory during cleanup. Where this memory is released?
Maybe allocation on stack would suffice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is async cleanup context, see IBV_EVENT_QP_LAST_WQE_REACHED
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
uct_gga_mlx5_iface_qp_cleanup_ctx_t *gga_ctx = | ||
ucs_derived_of(ctx, uct_gga_mlx5_iface_qp_cleanup_ctx_t); | ||
|
||
ibv_dereg_mr(gga_ctx->dma_opaque.mr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe uct_ib_dereg_mr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be aligned with https://github.com/openucx/ucx/pull/9845/files#r1587560718
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
@@ -29,6 +29,46 @@ typedef struct uct_gga_mlx5_iface_config { | |||
uct_rc_mlx5_iface_common_config_t rc_mlx5_common; | |||
} uct_gga_mlx5_iface_config_t; | |||
|
|||
typedef struct uct_gga_mlx5_dma_opaque { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need names for internal structs (unless it's a forward declaration)
src/uct/ib/mlx5/dv/ib_mlx5_dv.c
Outdated
UCT_IB_MLX5DV_SET(qpc, qpc, rq_type, !!attr->super.srq_num); | ||
UCT_IB_MLX5DV_SET(qpc, qpc, rq_type, | ||
attr->super.srq_num ? 1 /* SRQ */ : | ||
max_rx ? 0 /* RQ */ : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do we have a case of "RQ?
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
|
||
uct_ib_pack_uint24(gga_addr->qp_num, ep->super.tx.wq.super.qp_num); | ||
if (uct_rc_iface_flush_rkey_enabled(&iface->super)) { | ||
gga_addr->flags = UCT_GGA_MLX5_EP_ADDRESS_FLAG_FLUSH_RKEY; | ||
gga_addr->flush_rkey = (md->flush_rkey >> 16); | ||
gga_addr->flush_rkey = md->flush_rkey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flush_rkey in the address could be 24 bit since its lower 8 bits are always 0
see
ucx/src/uct/ib/mlx5/dv/ib_mlx5dv_md.c
Line 1739 in 407ab27
ucs_assert((md->super.flush_rkey & 0xff) == 0); |
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
iface = ucs_derived_of(tl_iface, uct_base_iface_t); | ||
md = ucs_derived_of(iface->md, uct_ib_mlx5_md_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
init inline
src/uct/ib/base/ib_iface.h
Outdated
@@ -718,6 +718,10 @@ uct_ib_iface_roce_dscp(uct_ib_iface_t *iface) | |||
return iface->config.traffic_class >> 2; | |||
} | |||
|
|||
int uct_ib_iface_dev_addr_is_reachable(uct_ib_iface_t *iface, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls move this along with other function prototypes in this file
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
} | ||
|
||
static ucs_status_t | ||
uct_gga_mlx5_iface_init_rx(uct_rc_iface_t *rc_iface, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need this wrapper?
src/uct/ib/rc/accel/gga_mlx5.c
Outdated
return UCS_OK; | ||
} | ||
|
||
static void uct_gga_mlx5_iface_qp_cleanup(uct_rc_iface_qp_cleanup_ctx_t *ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need cleanup mechanism for GGA. Cleanup is about waiting for LAST_WQE_REACHED event, which is not relevant since we don't have SEND/RECV operations. I'd suggest to not add it in this PR, unless we see some failure. Can test error handling and add it later if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoid LAST_WQE_REACHED
flow adding async
parameter which is false for GGA, this approach adds some overhead on ep_destroy
but re-uses dtors of base classed without significant change, otherwise need to refactor them deeper
src/uct/ib/rc/accel/rc_mlx5_common.c
Outdated
@@ -210,7 +210,8 @@ void uct_rc_mlx5_iface_common_prepost_recvs(uct_rc_mlx5_iface_common_t *iface) | |||
{ | |||
/* prepost recvs only if quota available (recvs were not preposted | |||
* before) */ | |||
if (iface->super.rx.srq.quota == 0) { | |||
if ((iface->super.rx.srq.quota == 0) || | |||
(iface->rx.srq.type == UCT_IB_MLX5_OBJ_TYPE_NULL)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the check iface->super.rx.srq.quota == 0 is not enough? can we set quota to 0 for gga?
@@ -526,6 +527,7 @@ void uct_rc_mlx5_iface_fill_attr(uct_rc_mlx5_iface_common_t *iface, | |||
srq->verbs.srq); | |||
break; | |||
case UCT_IB_MLX5_OBJ_TYPE_DEVX: | |||
case UCT_IB_MLX5_OBJ_TYPE_NULL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why UCT_IB_MLX5_OBJ_TYPE_NULL is not an empty operation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question, I don't know why SRQ obj type defines QP attrs for TX too :)
ucs_mpool_params_reset(&mp_params); | ||
mp_params.elem_size = init_attr->fc_req_size; | ||
mp_params.elem_size = ucs_max(init_attr->fc_req_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why needed to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because fc_req_size == 0
for GGA
test/gtest/uct/ib/test_ib.cc
Outdated
@@ -237,6 +237,10 @@ test_uct_ib_with_specific_port::test_uct_ib_with_specific_port() { | |||
} | |||
|
|||
void test_uct_ib_with_specific_port::init() { | |||
if (has_transport("gga_mlx5")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the skip
/azp run |
Azure Pipelines successfully started running 4 pipeline(s). |
What
Implement and test basic GGA iface and ep infrastructure
Why ?
Continue #9811