Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update collective framework count/disp arrays for bigcount #12621

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jtronge
Copy link
Contributor

@jtronge jtronge commented Jun 14, 2024

This updates the collective framework functions using count/displacement arrays to support bigcount. Instead of directly using pointers to bigcount and non-bigcount arrays, this adds special descriptor types, ompi_count_array and ompi_disp_array, that internally hold a union of either type. Collective components can now access count and displacement values through inline get functions on the descriptors, allowing use of both bigcount/non-bigcount arrays, depending on how the descriptors were initialized.

Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This touches more code than I thought it would but I like the mechanism. Just some nitpicks.

ompi/util/count_disp_array.h Outdated Show resolved Hide resolved
ompi/util/count_disp_array.h Outdated Show resolved Hide resolved
ompi/util/count_disp_array.h Outdated Show resolved Hide resolved
ompi/util/count_disp_array.h Outdated Show resolved Hide resolved
Comment on lines 50 to 57
#define OMPI_COUNT_ARRAY_INIT(array, data) \
do { \
if (sizeof(*data) == sizeof(int)) { \
ompi_count_array_init(array, (const int *) data); \
} else if (sizeof(*data) == sizeof(size_t)) { \
ompi_count_array_init_c(array, (const size_t *) data); \
} \
} while (0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parens around macro parameters:

Suggested change
#define OMPI_COUNT_ARRAY_INIT(array, data) \
do { \
if (sizeof(*data) == sizeof(int)) { \
ompi_count_array_init(array, (const int *) data); \
} else if (sizeof(*data) == sizeof(size_t)) { \
ompi_count_array_init_c(array, (const size_t *) data); \
} \
} while (0)
#define OMPI_COUNT_ARRAY_INIT(array, data) \
do { \
if (sizeof(*(data)) == sizeof(int)) { \
ompi_count_array_init((array), (const int *) (data)); \
} else if (sizeof(*(data)) == sizeof(size_t)) { \
ompi_count_array_init_c((array), (const size_t *) (data)); \
} \
} while (0)

ompi/util/count_disp_array.h Outdated Show resolved Hide resolved
Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a full review. Most points are minor (I hope).

ompi/mca/coll/han/coll_han_gatherv.c Show resolved Hide resolved
@@ -121,7 +121,8 @@ int mca_coll_hcoll_allgatherv(const void *sbuf, int scount,
mca_coll_hcoll_module_t *hcoll_module = (mca_coll_hcoll_module_t*)module;
stype = ompi_dtype_2_hcoll_dtype(sdtype, NO_DERIVED);
rtype = ompi_dtype_2_hcoll_dtype(rdtype, NO_DERIVED);
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype))) {
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype)
|| ompi_count_array_is_64bit(rcount))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably something that someone from @janjust may want to look at: should hcoll disqualify itself if the count argument is 64bit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janjust Ping

ompi/mca/coll/hcoll/coll_hcoll_ops.c Show resolved Hide resolved
Comment on lines +110 to +121
/* TODO:BIGCOUNT: Remove tehese temporaries once ompi_datatype is updated for bigcount */
int *tmp_rcounts = malloc(sizeof(int) * size);
int *tmp_disps = malloc(sizeof(int) * size);
if (NULL == tmp_rcounts || NULL == tmp_disps) {
err = OMPI_ERR_OUT_OF_RESOURCE;
goto exit;
}
for (i = 0; i < size; ++i) {
tmp_rcounts[i] = (int) ompi_count_array_get(rcounts, i);
tmp_disps[i] = (int) ompi_disp_array_get(disps, i);
}
ompi_datatype_create_indexed(size,tmp_rcounts,tmp_disps,rdtype,&ndtype);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a PR for the datatype interface? I wonder whether we can reuse the array structure there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a draft PR: #12351. It seems like these structs could be used there as well.

ompi/mca/coll/inter/coll_inter_scatterv.c Outdated Show resolved Hide resolved
ompi/mca/coll/inter/coll_inter_scatterv.c Outdated Show resolved Hide resolved
ompi/mca/coll/inter/coll_inter_scatterv.c Outdated Show resolved Hide resolved
ompi/mca/coll/ucc/coll_ucc_allgatherv.c Outdated Show resolved Hide resolved
ompi/mca/io/ompio/io_ompio.c Outdated Show resolved Hide resolved
ompi/mca/io/ompio/io_ompio.c Outdated Show resolved Hide resolved
This updates the coll framework functions using count/displacement
arrays to support bigcount. Instead of directly using pointers to
bigcount/non-bigcount type arrays, this adds special descriptor types,
ompi_count_array_t and ompi_disp_array_t, that internally hold a union of
either type. Collective components can now access count and displacement
values through inline get functions on the descriptors, allowing use of
both bigcount/non-bigcount arrays, depending on how the descriptors were
initialized.

Co-authored-by: Howard Pritchard <[email protected]>
Signed-off-by: Jake Tronge <[email protected]>
@jtronge jtronge force-pushed the coll-framework-bigcount-arrays branch from a551039 to 2565796 Compare June 19, 2024 16:30
@jtronge jtronge marked this pull request as ready for review June 19, 2024 16:31
Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jtronge! LGTM now but I'd like for @bosilca and @janjust to take a look as well.

ompi/mca/coll/hcoll/coll_hcoll_ops.c Show resolved Hide resolved
@@ -121,7 +121,8 @@ int mca_coll_hcoll_allgatherv(const void *sbuf, int scount,
mca_coll_hcoll_module_t *hcoll_module = (mca_coll_hcoll_module_t*)module;
stype = ompi_dtype_2_hcoll_dtype(sdtype, NO_DERIVED);
rtype = ompi_dtype_2_hcoll_dtype(rdtype, NO_DERIVED);
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype))) {
if (OPAL_UNLIKELY(HCOL_DTE_IS_ZERO(stype) || HCOL_DTE_IS_ZERO(rtype)
|| ompi_count_array_is_64bit(rcount))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janjust Ping

@hppritcha
Copy link
Member

I believe hcoll is being replaced by UCC so i don't think there's going to be an investement of time in embiggening hcoll unless it already is.

@hppritcha hppritcha requested a review from janjust June 20, 2024 19:10
Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks legit. I have 2 comments:

  1. I don't really like the new ompi_count_array_t and ompi_disp_array_t structs because they misalign everything due to their sizes (enum, aka int, + pointer). Taking in account that we only deal with 32 or 64 bits arrays, we could taken advantage of their implicit alignment (to 4 or 8 bytes), and repurpose the last bit as a 32/64 indicator.
typedef struct ompi_count_array_t {
    union {
        const int *int_array;
        const size_t *size_t_array;
    } data;
} ompi_count_array_t;

static inline void ompi_count_array_init(ompi_count_array_t *array, const int *data)
{
    array->data.int_array = data | 0x1;
}

static inline void ompi_count_array_init_c(ompi_count_array_t *array, const size_t *data)
{
    array->data.size_t_array = data;
}

static inline bool ompi_count_array_is_64bit(ompi_count_array_t *array)
{
    return !(array->data.size_t_array & 0x1);
}

static inline const void *ompi_count_array_ptr(ompi_count_array_t *array)
{
    return array->data.size_t_array;
}

/* Get a count in the array at index i */
static inline size_t ompi_count_array_get(ompi_count_array_t *array, size_t i)
{
    if (OPAL_LIKELY(array->data.int_t_array & 0x1)) {
        return array->data.int_array[i];
    }
    return array->data.size_t_array[i];
}
  1. Why do you call the union field data but the corresponding accessor is called ompi_count_array_ptr (basically ptr) ?

@hppritcha
Copy link
Member

well the suggested code doesn't even compile but I'll see about using the idea

make[2]: Entering directory '/usr/projects/artab/users/hpp/ompi/ompi/datatype'
  CC       ompi_datatype_module.lo
In file included from ../../ompi/mca/coll/coll.h:85,
                 from ../../ompi/instance/instance.h:22,
                 from ompi_datatype_module.c:42:
../../ompi/util/count_disp_array.h: In function 'ompi_count_array_init':
../../ompi/util/count_disp_array.h:26:48: error: invalid operands to binary | (have 'const int *' and 'int')
   26 |     array->data.int_array = (const int *)(data | 0x1);
      |                                                ^
../../ompi/util/count_disp_array.h: In function 'ompi_count_array_is_64bit':
../../ompi/util/count_disp_array.h:55:39: error: invalid operands to binary & (have 'const size_t *' {aka 'const long unsigned int *'} and 'int')
   55 |     return !(array->data.size_t_array & 0x1);
      |              ~~~~~~~~~~~~~~~~~~~~~~~~ ^
      |                         |
      |                         const size_t * {aka const long unsigned int *}

Signed-off-by: Howard Pritchard <[email protected]>
@hppritcha
Copy link
Member

@bosilca okay there's a commit with your suggested change - modified so its compilable and correct.
@devreal what do you think of this?

static inline const void *ompi_disp_array_ptr(ompi_disp_array_t *array)
{
if (OPAL_LIKELY((uint64_t)array->data.int_array & 0x1L)){
return (const void *)((uint64_t)array->data.int_array & ~0x1L);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small suggestion: I'd prefer intptr_t instead of uint64_t. long (as in ~0x1L) is not guaranteed to be 64bit on all machines but the promotion to (signed) intptr_t in two-compliment will guarantee that the sign is propagated and we end up with a correct 64bit mask. I'm not sure this will work with unsigned 64bit. Also, intptr_t is guaranteed to be large enough to hold a pointer.

Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better. One thing I'm concerned about is the use of pointers to ompi_count_array_t and ompi_disp_array_t.

First, because it suggests the possible use of a pointer to an object located outside the scope of the collective. This is not true for persistent or nonblocking collectives, as this object must be available for the entire duration of the operations. I know in the current code this is not really a pointer to an object, but the user pointer disguised, but other casual readers might be unaware of this.

Moreover, if we change the prototype to the internal functions to use ompi_count_array_t and ompi_disp_array_t, in addition to a cleaner code, it will also prevent the use of these values directly (without going through the accessors that returns the pointers to the arrays).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants