Commit

Update utils
Javclaude committed May 30, 2020
1 parent a022919 commit d789533
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/utils.py
@@ -43,7 +43,7 @@ def encode_texts(df: pd.DataFrame, texts_col: str, tokenizer: str = "bert-base-u
    return np.array of encoded sequence
    """
    pretrained_tokenizer = AutoTokenizer.from_pretrained(tokenizer, use_fast=True)
    print(pretrained_tokenizer)

    texts = list(df[texts_col].astype(str))

    encoded_sequence = pretrained_tokenizer.batch_encode_plus(texts,
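The hunk is cut off before the arguments to batch_encode_plus, so the padding and truncation settings used in src/utils.py are not visible here. As a rough illustration of what "return np.array of encoded sequence" could mean, the sketch below casts every input to str (as the diff does), maps words to integer IDs, and packs them into a fixed-width array. The names build_vocab, encode_texts_sketch and max_length are hypothetical, and the whitespace tokenizer is only a stand-in for the pretrained tokenizer so that the example runs offline; it is not the method used in this commit.

```python
import numpy as np


def build_vocab(texts):
    """Assign an integer ID to each distinct lowercase word; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            # setdefault only inserts when the word has not been seen yet
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode_texts_sketch(texts, max_length=4):
    """Encode texts as a (len(texts), max_length) integer array.

    Hypothetical stand-in for the tokenizer call in the diff: sequences
    longer than max_length are truncated, shorter ones are zero-padded.
    """
    texts = [str(t) for t in texts]  # mirrors the astype(str) cast in the diff
    vocab = build_vocab(texts)
    encoded = np.zeros((len(texts), max_length), dtype=np.int64)
    for row, text in enumerate(texts):
        ids = [vocab[word] for word in text.lower().split()][:max_length]
        encoded[row, :len(ids)] = ids
    return encoded


encoded = encode_texts_sketch(["the cat sat", "the dog", 42])
print(encoded.shape)
```

Because every row has the same width, the result can be stacked into a single array regardless of how long each input text is, which is presumably why the original function returns an np.array rather than a list of lists.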
