Add support for Qwen2 models #746
Conversation
These configs depend on the availability of transformers_neuronx, so it makes sense to only register them if the package is available.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Could you add an inference tutorial to the doc for this one? I think it would be good material for comms.
Left some nits, lgtm in general.
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Neuron export configurations for models using transformers_neuronx."""
why name the file decoder config? it sounds a bit confusing as the other one is named traced config...
I can use another name if you prefer, it is just that these are the names that are already used in modeling.
@@ -18,7 +18,7 @@
import torch

from ...utils import (
For the optimum subpkg design, the previous path was preferred.
No, on the contrary: this is precisely the kind of relative import that is the root cause of all the issues we have.
optimum is a namespace, not a package, and we should never use relative imports between packages inside the namespace.
Could you elaborate on what issues? The optimum namespace package is mandatory for all subpackages. Tbh, absolute and relative imports are both fine by me: absolute paths are what I used when starting this subpackage, and I only adopted relative imports because that was the rule of thumb I was told to follow. If they are causing issues, I will change those imports to absolute.
Yes, you must use absolute, because otherwise you can't install the package in editable mode (among other side-effects).
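To make the namespace-package point above concrete, here is a minimal, self-contained sketch: it builds an `optimum`-style PEP 420 namespace (a root directory with no `__init__.py`) containing two subpackages, and shows one subpackage reaching the other through an absolute import. The names (`ns`, `subpkg_a`, `subpkg_b`, `VALUE`) are illustrative, not optimum's real layout, and the editable-install failure itself depends on packaging tooling, so it is not reproduced here.

```python
# Sketch: absolute imports across a PEP 420 namespace package.
# The namespace root "ns" deliberately has NO __init__.py, mirroring how
# "optimum" is a namespace rather than a regular package.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "ns", "subpkg_a"))
os.makedirs(os.path.join(root, "ns", "subpkg_b"))

with open(os.path.join(root, "ns", "subpkg_a", "__init__.py"), "w") as f:
    f.write("VALUE = 42\n")

# subpkg_b reaches subpkg_a via an ABSOLUTE path through the namespace,
# which is the style recommended in this thread.
with open(os.path.join(root, "ns", "subpkg_b", "__init__.py"), "w") as f:
    f.write("from ns.subpkg_a import VALUE\n")

sys.path.insert(0, root)
import ns.subpkg_b

print(ns.subpkg_b.VALUE)  # 42
# A namespace package has no single source file of its own:
print(getattr(ns, "__file__", None))  # None
```

Because the two subpackages only meet under the shared namespace root, an absolute path is the one spelling that stays valid regardless of how each subpackage is installed.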
Not very knowledgeable on the topic but lgtm overall.
optimum/neuron/models/qwen2/model.py
Outdated
def __init__(
    self,
    config,
    *,
    n_positions=2048,
    batch_size=1,
    amp="f32",
    tp_degree=2,
    context_length_estimate=None,
    context_unroll=None,
    unroll=None,
    neuron_config=None,
    prefixed_length=0,
    **kwargs,
):
nit: can we add type annotations please?
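As an illustration of the nit, here is one possible annotated version of the signature quoted above. The concrete types are assumptions inferred from the default values (e.g. `amp="f32"` suggests `str`, `tp_degree=2` suggests `int`), not the types optimum-neuron actually uses, and the class name is hypothetical.

```python
# Hypothetical annotated sketch of the __init__ signature quoted above;
# the annotation choices are guesses from the defaults, not the real API.
from typing import Any, Optional


class Qwen2ModelSketch:
    def __init__(
        self,
        config: Any,
        *,
        n_positions: int = 2048,
        batch_size: int = 1,
        amp: str = "f32",
        tp_degree: int = 2,
        context_length_estimate: Optional[int] = None,
        context_unroll: Optional[int] = None,
        unroll: Optional[int] = None,
        neuron_config: Optional[Any] = None,
        prefixed_length: int = 0,
        **kwargs: Any,
    ):
        # Store a couple of fields just so the sketch is exercisable.
        self.n_positions = n_positions
        self.tp_degree = tp_degree


m = Qwen2ModelSketch(config=None, tp_degree=8)
print(m.tp_degree)  # 8
```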
LGTM, thanks @dacorvo !
I will take care of the t5 tp test failure in a coming PR, no worries.
What does this PR do?
This adds support for the Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.
It also supports the Qwen2.5 Coder and Math variants.
@JingyaHuang I think my previous pull-request bumping the AWS Neuron SDK version to 2.20.2 made the T5 export unstable (at least on an 8xlarge): could you check?