template differences? #77

tsw123678 · 2024-06-11T13:02:21Z

Does the `_make_masks` function differ across LLMs? Don't they all compute the loss only on the response part? What causes the variation among them?

jiajunlong · 2024-06-12T04:05:45Z

Different models use different tokenizers. Each tokenizer splits the same text into a different token sequence, so the positions of the response tokens (the label positions) differ from one model to another.
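As a rough illustration of the point above, here is a minimal sketch using two toy tokenizers (whitespace-level and character-level) instead of real model tokenizers. The function names, the prompt format, and the `response_mask` helper are all hypothetical and are not taken from the repository's `_make_masks`; they only show that the same prompt/response pair yields different mask boundaries under different tokenizations.

```python
# Toy stand-ins for real tokenizers (assumption: real ones behave analogously
# in that they produce different token counts for the same text).
def whitespace_tokenize(text):
    return text.split()

def char_tokenize(text):
    return list(text)

def response_mask(tokenize, prompt, response):
    """Return a boolean mask: False for prompt tokens, True for response tokens.

    Hypothetical helper: positions marked True are the ones that would
    contribute to the loss; the rest would be ignored.
    """
    n_prompt = len(tokenize(prompt))
    n_response = len(tokenize(response))
    return [False] * n_prompt + [True] * n_response

prompt = "USER: Hi ASSISTANT: "
response = "Hello there"

mask_ws = response_mask(whitespace_tokenize, prompt, response)
mask_ch = response_mask(char_tokenize, prompt, response)

# Index of the first response token under each tokenizer.
start_ws = mask_ws.index(True)
start_ch = mask_ch.index(True)
print(start_ws, len(mask_ws))
print(start_ch, len(mask_ch))
```

Both masks cover only the response, yet the boundary falls at a different index in each case. With real tokenizers there is an extra wrinkle: tokenizing the prompt and response separately does not always give the same tokens as tokenizing the concatenated string, and special tokens or separators vary per model, which is one plausible reason the masking logic is written per template.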
