add BOS & EOS
grencez committed Jul 27, 2024
1 parent 18c7820 commit 8d8ccad
Showing 2 changed files with 5 additions and 4 deletions.
2 changes: 1 addition & 1 deletion example/prompt/assistant_chatml/README.md
@@ -3,5 +3,5 @@
This example should be run with [ChatML](https://github.com/openai/openai-python/blob/main/chatml.md)-style models that are tuned to behave like an instruction-following assistant chatbot.

The model typically should have special `<|im_start|>` and `<|im_end|>` tokens, but `setting.sxpb` configures fallbacks that attempt to support any model.
-Models that don't support ChatML may produce nonsense, but Gemma seems to behave well, so we specifically recognize Gemma-style `<start_of_turn>` and `<end_of_turn>` tokens in this example.
+Models that don't support ChatML may produce nonsense, but Gemma seems to behave well, so we specifically try Gemma-style `<start_of_turn>` and `<end_of_turn>` tokens as fallbacks.
When no special tokens are found, we fall back to using BOS and EOS tokens to support jondurbin's Bagel finetunes like [bagel-7b-v0.5](https://huggingface.co/jondurbin/bagel-7b-v0.5).
7 changes: 4 additions & 3 deletions example/prompt/assistant_chatml/setting.sxpb
@@ -8,21 +8,22 @@
)
)
 (substitution
-  ; Uncomment the next 2 lines if your model doesn't support ChatML tokens.
-  ;(bos_token_alias "<|im_start|>")
-  ;(eos_token_alias "<|im_end|>")
+  (bos_token_alias "<bos_token>")
+  (eos_token_alias "<eos_token>")
   (special_tokens (())
     (()
       (alias "<|im_start|>")
       (candidates (())
         "<|im_start|>"
         "<start_of_turn>"  ; For Gemma models.
+        "<bos_token>"
       ))
     (()
       (alias "<|im_end|>")
       (candidates (())
         "<|im_end|>"
         "<end_of_turn>"  ; For Gemma models.
+        "<eos_token>"
       ))
   )
 )
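
The candidate lists above read as an ordered fallback chain: ChatML tokens first, then Gemma's, then the BOS/EOS aliases. The sketch below illustrates that reading in Python. It is not the project's implementation: `resolve_alias` and the vocabularies are hypothetical, and it assumes the first candidate present in the model's vocabulary is the one used, and that the `<bos_token>` and `<eos_token>` aliases are always available.

```python
# Hypothetical sketch of ordered fallback resolution; not the project's code.
# Assumption: the first candidate found in the model's vocabulary wins.

IM_START_CANDIDATES = ["<|im_start|>", "<start_of_turn>", "<bos_token>"]
IM_END_CANDIDATES = ["<|im_end|>", "<end_of_turn>", "<eos_token>"]


def resolve_alias(candidates, vocab):
    """Return the first candidate that exists in vocab, or None if none do."""
    for token in candidates:
        if token in vocab:
            return token
    return None


# Illustrative vocabularies (made up for this example). The BOS/EOS aliases
# are treated as always present, since every model has BOS and EOS tokens.
chatml_vocab = {"<|im_start|>", "<|im_end|>", "<bos_token>", "<eos_token>"}
gemma_vocab = {"<start_of_turn>", "<end_of_turn>", "<bos_token>", "<eos_token>"}
plain_vocab = {"<bos_token>", "<eos_token>"}

for name, vocab in [("chatml", chatml_vocab), ("gemma", gemma_vocab), ("plain", plain_vocab)]:
    start = resolve_alias(IM_START_CANDIDATES, vocab)
    end = resolve_alias(IM_END_CANDIDATES, vocab)
    print(name, start, end)
```

Under these assumptions, a model without ChatML or Gemma turn tokens ends up with `<|im_start|>` mapped to BOS and `<|im_end|>` mapped to EOS, which matches the README's description of the Bagel fallback.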