[Question] Tensor parallelism for tensorrt_llm #79

Open
JoeLiu996 opened this issue Jul 5, 2024 · 1 comment
Assignees
Labels
non-stale This label can be used to prevent marking issues or PRs as Stale

Comments

@JoeLiu996

Is your feature request related to a problem? Please describe.
I am aware that PyTriton already has an example of using PyTriton with tensorrt_llm, but I noticed that the example only supports single-GPU inference. Are there any other examples or reference docs that use tensorrt_llm with PyTriton and support tensor parallelism?

Describe the solution you'd like
The current example is excellent, but it would be more comprehensive with a multi-GPU (tensor-parallel) inference example, since this is likely to be a widely used case.

@JoeLiu996 JoeLiu996 changed the title Tensor parallelism for tensorrt_llm [Question] Tensor parallelism for tensorrt_llm Jul 5, 2024
@piotrm-nvidia piotrm-nvidia self-assigned this Jul 8, 2024

This issue is stale because it has been open 21 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 30, 2024
@piotrm-nvidia piotrm-nvidia added the non-stale label and removed the Stale label Aug 1, 2024
3 participants