Add support for Qwen2 models #746
Conversation
These configs depend on the availability of transformers_neuronx, so it makes sense to only register them if the package is available.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Could you add an inference tutorial to the doc for this one? I think it would be good material for comms.
Left some nits, lgtm in general.
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Neuron export configurations for models using transformers_neuronx."""
why name the file decoder config? it sounds a bit confusing as the other one is named traced config...
I can use another name if you prefer, it is just that these are the names that are already used in modeling.
@@ -18,7 +18,7 @@
import torch

from ...utils import (
For the optimum subpkg design, the previous path was preferred.
No, on the contrary: this is precisely the kind of relative import that is the root cause of all the issues we have.
optimum is a namespace, not a package, and we should never use relative imports between packages inside the namespace.
Could you elaborate on what issues? The optimum namespace package is mandatory for all subpackages. Tbh, absolute and relative imports are both fine by me: absolute paths are what I used when starting this subpackage, and I only adopted relative imports because that was the rule of thumb I was told to follow. If they are causing issues, I will change those imports to absolute.
Yes, you must use absolute, because otherwise you can't install the package in editable mode (among other side-effects).
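To make the namespace-package point above concrete, here is a minimal, self-contained sketch: it builds an `optimum`-style PEP 420 namespace (a root directory with no `__init__.py`) containing two subpackages, and shows one subpackage reaching the other through an absolute import. The names (`ns`, `subpkg_a`, `subpkg_b`, `VALUE`) are illustrative, not optimum's real layout, and the editable-install failure itself depends on packaging tooling, so it is not reproduced here.

```python
# Sketch: absolute imports across a PEP 420 namespace package.
# The namespace root "ns" deliberately has NO __init__.py, mirroring how
# "optimum" is a namespace rather than a regular package.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "ns", "subpkg_a"))
os.makedirs(os.path.join(root, "ns", "subpkg_b"))

with open(os.path.join(root, "ns", "subpkg_a", "__init__.py"), "w") as f:
    f.write("VALUE = 42\n")

# subpkg_b reaches subpkg_a via an ABSOLUTE path through the namespace,
# which is the style recommended in this thread.
with open(os.path.join(root, "ns", "subpkg_b", "__init__.py"), "w") as f:
    f.write("from ns.subpkg_a import VALUE\n")

sys.path.insert(0, root)
import ns.subpkg_b

print(ns.subpkg_b.VALUE)  # 42
# A namespace package has no single source file of its own:
print(getattr(ns, "__file__", None))  # None
```

Because the two subpackages only meet under the shared namespace root, an absolute path is the one spelling that stays valid regardless of how each subpackage is installed.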
Not very knowledgeable on the topic but lgtm overall.
optimum/neuron/models/qwen2/model.py
Outdated
def __init__(
    self,
    config,
    *,
    n_positions=2048,
    batch_size=1,
    amp="f32",
    tp_degree=2,
    context_length_estimate=None,
    context_unroll=None,
    unroll=None,
    neuron_config=None,
    prefixed_length=0,
    **kwargs,
):
nit: can we add type annotations please?
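As an illustration of the nit, here is one possible annotated version of the signature quoted above. The concrete types are assumptions inferred from the default values (e.g. `amp="f32"` suggests `str`, `tp_degree=2` suggests `int`), not the types optimum-neuron actually uses, and the class name is hypothetical.

```python
# Hypothetical annotated sketch of the __init__ signature quoted above;
# the annotation choices are guesses from the defaults, not the real API.
from typing import Any, Optional


class Qwen2ModelSketch:
    def __init__(
        self,
        config: Any,
        *,
        n_positions: int = 2048,
        batch_size: int = 1,
        amp: str = "f32",
        tp_degree: int = 2,
        context_length_estimate: Optional[int] = None,
        context_unroll: Optional[int] = None,
        unroll: Optional[int] = None,
        neuron_config: Optional[Any] = None,
        prefixed_length: int = 0,
        **kwargs: Any,
    ):
        # Store a couple of fields just so the sketch is exercisable.
        self.n_positions = n_positions
        self.tp_degree = tp_degree


m = Qwen2ModelSketch(config=None, tp_degree=8)
print(m.tp_degree)  # 8
```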
LGTM, thanks @dacorvo !
I will take care of the t5 tp test failure in a coming PR, no worries.
What does this PR do?
This adds support for the Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B.
It also supports the Qwen2.5 Coder and Math variants.
@JingyaHuang I think my previous pull-request bumping the AWS Neuron SDK version to 2.20.2 made the T5 export unstable (at least on an 8xlarge): could you check?