Fix the usage of meta information #23
Conversation
BTW, the remote token number (stored in

It is used in combine
While reviewing the code, I noticed that the kernels copy data from CUDA buffers to NVSHMEM buffers before sending. For example, in dispatch here. This effectively disables the zero copy that is possible, since NVSHMEM can send from both NVSHMEM and plain CUDA buffers. The copy is currently required for two reasons:
In the proposed variant, no piggy-backing is required, so if it is acceptable performance-wise to send the token and its scales separately, no copy is needed. I want to play with it to see if it gives any improvement.
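
For illustration, a minimal sketch of the two send paths is below. It assumes, as the comment above states, that the source of a put can be a plain CUDA allocation while only the destination has to live in symmetric (NVSHMEM) memory. All names (`x`, `rdma_send_buf`, `rdma_recv_buf`, `hidden_bytes`, `dst_slot`, `dst_pe`) are hypothetical placeholders, not the actual identifiers used in the kernels.

```cuda
#include <cstdint>
#include <nvshmem.h>
#include <nvshmemx.h>

// Hypothetical per-token send executed by one warp; all names are placeholders.
__device__ void send_token_staged(const uint8_t* x,          // plain CUDA buffer holding tokens
                                  uint8_t* rdma_send_buf,    // symmetric staging buffer (local)
                                  uint8_t* rdma_recv_buf,    // symmetric receive buffer (remote)
                                  int token_idx, int dst_slot, int dst_pe,
                                  size_t hidden_bytes) {
    const int lane = threadIdx.x % 32;

    // Current pattern: copy the token into the symmetric staging buffer first ...
    uint8_t* staged = rdma_send_buf + (size_t)token_idx * hidden_bytes;
    for (size_t i = lane; i < hidden_bytes; i += 32)
        staged[i] = x[(size_t)token_idx * hidden_bytes + i];
    __syncwarp();

    // ... then issue the put from the staging buffer.
    nvshmemx_putmem_nbi_warp(rdma_recv_buf + (size_t)dst_slot * hidden_bytes,
                             staged, hidden_bytes, dst_pe);
}

// Proposed zero-copy pattern: put straight from the plain CUDA buffer,
// skipping the staging copy (assumes a non-symmetric put source is allowed).
__device__ void send_token_zero_copy(const uint8_t* x,
                                     uint8_t* rdma_recv_buf,
                                     int token_idx, int dst_slot, int dst_pe,
                                     size_t hidden_bytes) {
    nvshmemx_putmem_nbi_warp(rdma_recv_buf + (size_t)dst_slot * hidden_bytes,
                             x + (size_t)token_idx * hidden_bytes,
                             hidden_bytes, dst_pe);
}
```

If the scales are no longer piggy-backed with the token payload, they could simply go out as a second, smaller put to their own slot in the receive buffer.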


Hello,
While reviewing the code, I spotted a potential misuse of the meta-arrays generated during the dispatch phase.
Please find my reasoning below.
Here, instead of getting sequential indices of the tokens sent to this specific expert, the global token numbers are obtained.
This results in tokens being written in a non-contiguous fashion (instead of being compacted at the beginning of the corresponding buffer segment).
Because the target buffer is always large enough to accommodate all tokens, this does not cause a buffer overflow.
And due to the absence (so far?) of data verification, the issue has gone unnoticed.
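
To make the distinction concrete, here is a small sketch of the two indexing schemes, using hypothetical names (`recv_x`, `expert_counters`, `src_token_idx`, `capacity`) rather than the actual meta-arrays produced by the dispatch kernel:

```cuda
// Hypothetical write of one dispatched token into an expert's receive segment,
// executed by one warp per token. recv_x is laid out as
// [num_experts][capacity][hidden]; expert_counters holds one counter per expert.

// Intended behaviour: a sequential per-expert index compacts the tokens
// at the beginning of the expert's segment.
__device__ void write_token_compacted(float* recv_x, int* expert_counters,
                                      const float* token, int expert_id,
                                      size_t capacity, size_t hidden) {
    const int lane = threadIdx.x % 32;
    int slot = 0;
    if (lane == 0)
        slot = atomicAdd(&expert_counters[expert_id], 1);  // 0, 1, 2, ... per expert
    slot = __shfl_sync(0xffffffff, slot, 0);               // broadcast the slot to the warp
    float* dst = recv_x + (expert_id * capacity + slot) * hidden;
    for (size_t i = lane; i < hidden; i += 32)
        dst[i] = token[i];
}

// Behaviour described above: the global (source) token number is used as the
// slot, so tokens land at scattered offsets within the segment. The segment is
// sized for the worst case, so nothing overflows, but the data is no longer
// compacted at the start of the segment.
__device__ void write_token_scattered(float* recv_x, const float* token,
                                      int expert_id, int src_token_idx,
                                      size_t capacity, size_t hidden) {
    const int lane = threadIdx.x % 32;
    float* dst = recv_x + (expert_id * capacity + src_token_idx) * hidden;
    for (size_t i = lane; i < hidden; i += 32)
        dst[i] = token[i];
}
```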