Skip to content

Commit

Permalink
chore: Update tests/test_tokenization.py
Browse files Browse the repository at this point in the history
  • Loading branch information
le1nux authored Jun 14, 2024
1 parent 8af49c1 commit 1cffee8
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion tests/test_tokenization.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ def _assert_tokenization(tokenizer: TokenizerWrapper):
# If padding="max_length", we want a sequence to be padded to the max_length, irrespective of the truncation flag
# and only if max_length is specified.
# If max_length is not specified, we pad to the max model input length (i.e., 1024 for the gpt2 model).
# NOTE: "AAAAAAAA" is a single token for the gpt2 tokenizer, there is "A" sequence longer than that in the vocabulary
# NOTE: "AAAAAAAA" is a single token for the gpt2 tokenizer, there is no "A" sequence longer than that in the vocabulary.
(
"AAAAAAAA" * 6,
PreTrainedHFTokenizerConfig(
Expand Down

0 comments on commit 1cffee8

Please sign in to comment.