[Question]: bootstrap persistence & multi-module coexistence #68

@Enigmatisms

Question

NVSHMEM 3.x | MoE training (DeepSeek DeepEP + custom modules in one process)

Background

We're building a training framework where multiple independent modules (e.g., DeepEP for MoE all-to-all, a custom overlap module) each call nvshmemx_init_attr(NVSHMEMX_INIT_WITH_UNIQUEID, ...) with potentially different rank/nranks. We ran into some issues and did a source audit — hoping to confirm our understanding and get advice.

Q1: Is it possible to change nranks via finalize + re-init?

From reading the source, it appears that nvshmem_finalize() resets the initialized flag but not the bootstrapped flag. So after finalize + re-init, the new rank/nranks/uid parameters are silently ignored — bootstrap is skipped entirely and the old boot_handle (pg_rank, pg_size, node topology, etc.) is reused.

Is this intended? Is there any supported way to fully reset bootstrap state within a process so that a subsequent init can join a different-sized communication world?
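
To make the behaviour we think we are seeing concrete, here is a toy Python model of the two flags. It is only a sketch of our reading of the source, not NVSHMEM's actual code: the class name `ToyNvshmemState` and its fields are made up for illustration, with `pg_rank` and `pg_size` standing in for the contents of the boot handle.

```python
class ToyNvshmemState:
    """Toy model of the process-global state described above (hypothetical, not NVSHMEM code)."""

    def __init__(self):
        self.bootstrapped = False
        self.initialized = False
        self.pg_rank = None
        self.pg_size = None

    def init(self, rank, nranks):
        if not self.bootstrapped:
            # In this model, bootstrap runs only once per process.
            self.pg_rank, self.pg_size = rank, nranks
            self.bootstrapped = True
        # On a re-init the new rank/nranks are dropped without any error.
        self.initialized = True

    def finalize(self):
        # Only the initialized flag is reset; the bootstrap state survives.
        self.initialized = False


state = ToyNvshmemState()
state.init(rank=0, nranks=8)    # first world: 8 PEs
state.finalize()
state.init(rank=0, nranks=32)   # attempt to join a 32-PE world
print(state.pg_size)            # still the size from the first bootstrap
```

In this model the second `init` succeeds but `pg_size` stays at 8, which matches the silent reuse described above.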

Q2: Multiple modules calling init with different nranks — is this safe?

For example:

  • Module A inits with nranks=8 (intra-node only)
  • Module B inits with nranks=32 (all expert-parallel ranks)

Since NVSHMEM is a process-global singleton, the second init just bumps the refcount and its parameters are discarded. Module B ends up in Module A's 8-PE world without knowing it. This seems fundamentally unsafe: wrong PE numbering, nvshmem_malloc blocking on a global barrier whose participants do not match the ranks Module B expects, etc.
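
A toy sketch of the refcount behaviour, plus the kind of guard a module could run right after init, might look like the following. Everything here is hypothetical (`ToySingleton`, `checked_init`); in a real module the equivalent check would presumably compare `nvshmem_my_pe()` and `nvshmem_n_pes()` against the values the module asked for.

```python
class ToySingleton:
    """Toy model of a refcounted, process-global library (hypothetical)."""

    def __init__(self):
        self.refcount = 0
        self.my_pe = None
        self.n_pes = None

    def init(self, rank, nranks):
        if self.refcount == 0:
            # Only the first caller's parameters take effect.
            self.my_pe, self.n_pes = rank, nranks
        self.refcount += 1
        return self.my_pe, self.n_pes


def checked_init(lib, rank, nranks):
    """Init, then fail loudly if the effective world is not the requested one."""
    my_pe, n_pes = lib.init(rank, nranks)
    if (my_pe, n_pes) != (rank, nranks):
        raise RuntimeError(
            f"requested PE {rank}/{nranks}, but the process is already PE {my_pe}/{n_pes}"
        )
    return my_pe, n_pes


lib = ToySingleton()
checked_init(lib, rank=5, nranks=8)        # Module A: first init, accepted

try:
    checked_init(lib, rank=13, nranks=32)  # Module B: parameters are discarded
    mismatch_detected = False
except RuntimeError as err:
    mismatch_detected = True
    print(err)
```

Without such a check, Module B would carry on with its own idea of rank and world size while the library uses Module A's.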

What's the recommended practice here? Our best guess is:

  1. Coordinate a single init with the superset of all PEs
  2. Use nvshmem_team_split_strided() for per-module sub-groups
  3. Use team-based collectives + nvshmem_team_translate_pe() for RMA
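
For step 3, the mapping we have in mind between team-local and world PE numbers is sketched below as plain Python arithmetic. This is a toy model under our assumptions (4 nodes with 8 PEs each, a hypothetical `StridedTeam` helper), not a call into NVSHMEM. It mirrors our understanding that a strided team's PE `i` corresponds to parent PE `start + i * stride`, and that translation returns -1 for a PE outside the team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StridedTeam:
    """Toy description of a team split from the world team (hypothetical helper)."""
    start: int   # first world PE in the team
    stride: int  # distance between consecutive members, must be >= 1
    size: int    # number of members

    def to_world(self, team_pe):
        # Team PE -> world PE, or -1 if team_pe is out of range.
        if not 0 <= team_pe < self.size:
            return -1
        return self.start + team_pe * self.stride

    def from_world(self, world_pe):
        # World PE -> team PE, or -1 if world_pe is not a member.
        offset = world_pe - self.start
        if offset < 0 or offset % self.stride != 0:
            return -1
        index = offset // self.stride
        return index if index < self.size else -1


# Assumed layout: 32 PEs, 8 per node.
node1 = StridedTeam(start=8, stride=1, size=8)       # intra-node team of node 1
cross_node = StridedTeam(start=2, stride=8, size=4)  # local PE 2 on every node

print(node1.to_world(3))          # 11
print(cross_node.to_world(3))     # 26
print(cross_node.from_world(10))  # 1
print(cross_node.from_world(11))  # -1, not a member
```

A module would then keep only team-local indices internally and translate to world PE numbers at the point where it issues RMA.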

Is this the right approach? Any guidance on coordinating NVSHMEM across independently-developed modules would be very helpful.

Context

This came up while integrating DeepEP (which has its own init/finalize lifecycle) into PaddlePaddle alongside other NVSHMEM-based modules. Thanks in advance!
