Apply dualpipe from deepseek-v3 to a trainer or model #36439

Closed
jp1924 opened this issue Feb 27, 2025 · 0 comments

Labels: Feature request (request for a new feature)

Comments

Contributor

jp1924 commented Feb 27, 2025

Feature request

Apply DeepSeek's DualPipe to transformers or accelerate.

Motivation

DeepSeek recently released the code for DualPipe, the pipeline-parallelism algorithm proposed in the DeepSeek-V3 Technical Report.
Looking at the code structure, it is 100% Python, so it seems easy to apply.

Your contribution

We can modify the DualPipe code and apply it to transformers.

jp1924 added the "Feature request" label on Feb 27, 2025
jp1924 closed this as completed on Feb 27, 2025