
Add support for Qwen2 models #746

Merged
merged 11 commits into from
Dec 5, 2024

Conversation

dacorvo
Collaborator

@dacorvo dacorvo commented Dec 3, 2024

What does this PR do?

This adds support for the Qwen2.5 language models, including pretrained and instruction-tuned models in seven sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.

It also supports the Qwen2.5 Coder and Math variants.
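Once this lands, loading one of these checkpoints should look roughly like the sketch below. This assumes an AWS Neuron instance (inf2/trn1) with the Neuron SDK installed; the model id and compilation arguments are illustrative, not prescriptive.

```python
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

# Compile-and-load sketch: export=True triggers Neuron compilation on
# first load (compilation arguments below are illustrative examples).
model = NeuronModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    export=True,
    batch_size=1,
    sequence_length=2048,
    num_cores=2,
    auto_cast_type="bf16",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

inputs = tokenizer("Hello, Qwen!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```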

@JingyaHuang I think my previous pull request bumping the AWS Neuron SDK version to 2.20.2 made the T5 export unstable (at least on an 8xlarge): could you check?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@dacorvo dacorvo marked this pull request as ready for review December 3, 2024 16:18
@pagezyhf
Collaborator

pagezyhf commented Dec 4, 2024

Could you add an inference tutorial to the doc for this one? I think it would be good material for comms.

Collaborator

@JingyaHuang JingyaHuang left a comment


Left some nits; LGTM in general.

optimum/neuron/models/qwen2/model.py
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Neuron export configurations for models using transformers_neuronx."""
Collaborator


Why name the file decoder config? It sounds a bit confusing, as the other one is named traced config...

Collaborator Author


I can use another name if you prefer; these are just the names already used in modeling.

optimum/neuron/models/qwen2/modules.py
@@ -18,7 +18,7 @@

import torch

from ...utils import (
Collaborator


For the optimum subpackage design, the previous path was preferred.

Collaborator Author


No, on the contrary: this is precisely the kind of relative import that is the root cause of the issues we have.
optimum is a namespace, not a package, and we should never use relative imports between packages inside the namespace.

Collaborator


Could you elaborate on what issues? The optimum namespace package is mandatory for all subpackages. Honestly, absolute or relative imports are both fine with me: absolute paths are what I used when starting this subpackage, and I only adopted relative imports because that was the rule of thumb I was told to follow. If they are causing issues, I will change those imports to absolute.

Collaborator Author


Yes, you must use absolute imports, because otherwise you can't install the package in editable mode (among other side effects).
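The namespace-package behavior at stake here can be sketched with PEP 420 implicit namespace packages. In this runnable demo, `demo_ns` is a hypothetical stand-in for the `optimum` namespace, split across two separate distributions; an absolute import resolves across both, just as `from optimum.utils import ...` would across optimum subpackages:

```python
import os
import sys
import tempfile

# Build two fake distributions that share the `demo_ns` namespace
# (demo_ns, dist_a, dist_b, and helper are illustrative names).
root = tempfile.mkdtemp()
for dist, mod, body in [
    ("dist_a", "utils", "def helper():\n    return 'from utils'\n"),
    ("dist_b", "neuron", "from demo_ns.utils import helper\n"),
]:
    pkg = os.path.join(root, dist, "demo_ns", mod)
    os.makedirs(pkg)
    # No __init__.py directly under demo_ns/: per PEP 420 that makes
    # demo_ns a namespace whose parts can live in different distributions.
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write(body)
    sys.path.insert(0, os.path.join(root, dist))

# The absolute import inside demo_ns.neuron resolves across both
# sys.path entries, regardless of how each distribution is installed.
from demo_ns.neuron import helper

print(helper())  # prints 'from utils'
```

A relative import crossing the namespace boundary ties the two distributions to a single on-disk layout, which is exactly what breaks under some editable-install schemes.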

Member

@michaelbenayoun michaelbenayoun left a comment


Not very knowledgeable on the topic, but LGTM overall.

Comment on lines 30 to 44
def __init__(
self,
config,
*,
n_positions=2048,
batch_size=1,
amp="f32",
tp_degree=2,
context_length_estimate=None,
context_unroll=None,
unroll=None,
neuron_config=None,
prefixed_length=0,
**kwargs,
):
Member


nit: can we add type annotations please?
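The annotated signature could look something like this sketch. The class name is hypothetical and the `config` annotation is a placeholder (it is a transformers `Qwen2Config` in the real code):

```python
from typing import Any, Optional


class Qwen2DecoderSketch:
    """Hypothetical sketch of the constructor with type annotations added."""

    def __init__(
        self,
        config: Any,  # a transformers Qwen2Config in the real code
        *,
        n_positions: int = 2048,
        batch_size: int = 1,
        amp: str = "f32",
        tp_degree: int = 2,
        context_length_estimate: Optional[int] = None,
        context_unroll: Optional[int] = None,
        unroll: Optional[int] = None,
        neuron_config: Optional[Any] = None,
        prefixed_length: int = 0,
        **kwargs: Any,
    ) -> None:
        # Only a couple of attributes are kept here to keep the sketch short.
        self.config = config
        self.n_positions = n_positions
        self.batch_size = batch_size
```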

@dacorvo dacorvo requested a review from JingyaHuang December 5, 2024 09:12
Collaborator

@JingyaHuang JingyaHuang left a comment


LGTM, thanks @dacorvo !

@JingyaHuang
Collaborator

I will take care of the T5 TP test failure in a coming PR, no worries.

@dacorvo dacorvo merged commit 6ba3db9 into main Dec 5, 2024
10 of 11 checks passed
@dacorvo dacorvo deleted the qwen2 branch December 5, 2024 13:45