An informative comment.
codetalker7 committed Jun 2, 2024
1 parent b73c39a commit a9b266d
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions src/modelling/tokenization/doc_tokenization.jl
@@ -27,6 +27,7 @@ function tensorize(doc_tokenizer::DocTokenizer, tokenizer::Transformers.TextEnco
     if ismissing(bsize)
         return integer_ids, integer_mask
     else
+        # we sort passages by length to do batch packing for more efficient use of the GPU
         integer_ids, integer_mask, reverse_indices = _sort_by_length(integer_ids, integer_mask, bsize)
         batches = _split_into_batches(integer_ids, integer_mask, bsize)
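
The added comment describes sorting passages by length before batching. A minimal sketch of that idea follows. It is not the repository's implementation: `sort_by_length_sketch` and `split_into_batches_sketch` are hypothetical stand-ins for `_sort_by_length` and `_split_into_batches`, and the sketch assumes `ids` and `mask` are `(max_len, num_docs)` matrices with one column per passage and a `Bool` mask marking non-padding tokens.

```julia
# Hypothetical stand-in for `_sort_by_length`; not the actual implementation.
# Assumes one column per passage and a Bool mask that is true on real tokens.
function sort_by_length_sketch(ids::AbstractMatrix, mask::AbstractMatrix{Bool})
    lengths = vec(sum(mask; dims = 1))   # unpadded token count per passage
    order = sortperm(lengths)            # shortest passage first
    reverse_indices = invperm(order)     # maps sorted columns back to input order
    return ids[:, order], mask[:, order], reverse_indices
end

# Hypothetical stand-in for `_split_into_batches`: consecutive groups of
# at most `bsize` columns, so each batch holds passages of similar length.
function split_into_batches_sketch(ids::AbstractMatrix, mask::AbstractMatrix{Bool}, bsize::Int)
    n = size(ids, 2)
    return [(ids[:, i:min(i + bsize - 1, n)], mask[:, i:min(i + bsize - 1, n)])
            for i in 1:bsize:n]
end

# Toy input: four passages with 3, 2, 1 and 4 real tokens; 0 is padding.
ids = [11 21 31 41;
       12 22  0 42;
       13  0  0 43;
        0  0  0 44]
mask = ids .!= 0

sorted_ids, sorted_mask, reverse_indices = sort_by_length_sketch(ids, mask)
batches = split_into_batches_sketch(sorted_ids, sorted_mask, 2)

# Indexing with `reverse_indices` restores the original passage order.
restored_ids = sorted_ids[:, reverse_indices]
```

Because passages of similar length land in the same batch, a batch of short passages carries little padding relative to its own longest passage, which is presumably the efficiency gain the comment refers to. The `reverse_indices` permutation lets results computed on the sorted batches be put back in the caller's original order.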
