support nemo llama 70b lora train by youth123 · Pull Request #1084 · flagos-ai/FlagScale

youth123 · 2026-01-26T07:21:49Z

PR Category
Train

PR Types
New Features

PR Description

Supports loading and saving checkpoints in the NeMo zarr format
Supports training with packed sequences
Fixes wandb finalization failing to find the latest_checkpointed_iteration file
Fixes LoRA not loading layernorm weights and not supporting the NeMo zarr format
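
For readers unfamiliar with packed sequences, here is a minimal sketch of the general idea, not the code in this PR: several samples are concatenated into one row, and cumulative sequence lengths mark the boundaries so attention and position ids can restart per sample. The names `pack_sequences`, `cu_seqlens` and `position_ids` are illustrative assumptions.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def pack_sequences(seqs: Sequence[Sequence[int]]) -> Tuple[List[int], List[int]]:
    """Concatenate token sequences and return (packed_tokens, cu_seqlens).

    cu_seqlens[i] is the offset where sequence i starts in the packed row;
    the last entry is the total number of packed tokens.
    """
    packed = [tok for seq in seqs for tok in seq]
    cu_seqlens = [0] + list(accumulate(len(seq) for seq in seqs))
    return packed, cu_seqlens


def position_ids(cu_seqlens: Sequence[int]) -> List[int]:
    """Restart position ids at 0 at every sequence boundary."""
    ids: List[int] = []
    for start, end in zip(cu_seqlens, cu_seqlens[1:]):
        ids.extend(range(end - start))
    return ids
```

For example, packing three samples of lengths 3, 2 and 1 gives boundaries `[0, 3, 5, 6]`, which a variable-length attention kernel can use to keep samples from attending to each other.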

The checkpoint file format is as follows:
Load (zarr format):
-context
-weights
-module.decoder.xxx._extra_state
-module.decoder.xxx.weight
-optimizer.state.fp32_param.xxx.weight
-optimizer.state.fp32_param.xxx.weight.sync
common.pt
metadata.json

Save (zarr format):
-iter_xxx
-module.decoder.xxx._extra_state
-module.decoder.xxx.weight
-optimizer.state.fp32_param.xxx.weight
-optimizer.state.fp32_param.xxx.weight.sync
common.pt
metadata.json
latest_checkpointed_iteration.txt
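
As a rough illustration of how the save layout above can be consumed (for instance by a finalization step that needs the latest checkpoint), the sketch below reads latest_checkpointed_iteration.txt and resolves the matching iter_xxx directory. It is not taken from this PR: the helper names are hypothetical, and the 7-digit zero padding of the directory name is an assumption, since the description only shows `iter_xxx`.

```python
from pathlib import Path
from typing import Optional

TRACKER_FILENAME = "latest_checkpointed_iteration.txt"


def read_latest_iteration(save_dir) -> Optional[int]:
    """Return the iteration stored in the tracker file, or None if it is
    missing or does not contain an integer."""
    tracker = Path(save_dir) / TRACKER_FILENAME
    if not tracker.is_file():
        return None
    text = tracker.read_text().strip()
    try:
        return int(text)
    except ValueError:
        return None


def latest_checkpoint_dir(save_dir, width: int = 7) -> Optional[Path]:
    """Resolve the iter_xxx directory for the latest iteration.

    The zero-padding width is an assumption, not confirmed by the PR.
    """
    iteration = read_latest_iteration(save_dir)
    if iteration is None:
        return None
    return Path(save_dir) / f"iter_{iteration:0{width}d}"
```

Returning `None` instead of raising lets a caller skip finalization cleanly when no checkpoint has been written yet.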

CLAassistant · 2026-01-26T07:21:59Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

liji seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

tengqm · 2026-02-03T11:36:04Z

@youth123 Please double-check that you are submitting the PR with the email address tied to your GitHub account (git config user.email). Also, please make sure you have signed the CLA so that your PR can be reviewed and approved. Thanks.

support nemo llama 70b lora train

97394c8

youth123 requested review from aoyulong, heavyrain-lzy and zhaoyinglia as code owners January 26, 2026 07:21

youth123 wants to merge 1 commit into flagos-ai:main-legacy from youth123:support_mlperf_nemo_v2

3 participants
