
Aspect Polarity Classification doesn't work for the multilingual checkpoint: Exception: 'DebertaV2TokenizerFast' object has no attribute 'clean_up_tokenization_spaces' #407

@QuentinDiago

Version
PyABSA version 2.4.1.post1
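
Because the failing class (`DebertaV2TokenizerFast`) comes from `transformers` rather than from PyABSA itself, the versions of the related packages may also matter here. A minimal sketch for collecting them, assuming `pyabsa`, `transformers`, and `torch` are the relevant packages (the helper name `installed_versions` is only illustrative):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version,
    or to None when the package is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumption: these are the packages most likely involved in the error.
    for name, ver in installed_versions(["pyabsa", "transformers", "torch"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Pasting this output next to the traceback makes it easier to check whether the checkpoint and the installed libraries are compatible.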

Describe the bug
Loading the multilingual SentimentClassifier checkpoint fails: PyABSA raises a RuntimeError caused by an underlying AttributeError:

[2024-07-30 17:27:08] (2.4.1.post1) Please specify the task code, e.g. from pyabsa import TaskCodeOption
[2024-07-30 17:27:09] (2.4.1.post1) ********** Available APC model checkpoints for Version:2.4.1.post1 (this version) **********
[2024-07-30 17:27:09] (2.4.1.post1) ********** Available APC model checkpoints for Version:2.4.1.post1 (this version) **********
[2024-07-30 17:27:09] (2.4.1.post1) Downloading checkpoint:multilingual 
[2024-07-30 17:27:09] (2.4.1.post1) Notice: The pretrained model are used for testing, it is recommended to train the model on your own custom datasets
[2024-07-30 17:27:09] (2.4.1.post1) Checkpoint already downloaded, skip
[2024-07-30 17:27:09] (2.4.1.post1) Load sentiment classifier from checkpoints\APC_MULTILINGUAL_CHECKPOINT
[2024-07-30 17:27:09] (2.4.1.post1) config: checkpoints\APC_MULTILINGUAL_CHECKPOINT\fast_lcf_bert.config
[2024-07-30 17:27:09] (2.4.1.post1) state_dict: checkpoints\APC_MULTILINGUAL_CHECKPOINT\fast_lcf_bert.state_dict
[2024-07-30 17:27:09] (2.4.1.post1) model: None
[2024-07-30 17:27:09] (2.4.1.post1) tokenizer: checkpoints\APC_MULTILINGUAL_CHECKPOINT\fast_lcf_bert.tokenizer
[2024-07-30 17:27:09] (2.4.1.post1) Set Model Device: cpu
[2024-07-30 17:27:09] (2.4.1.post1) Device Name: Unknown

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File c:\Users\helpd\envs\eval\lib\site-packages\pyabsa\tasks\AspectPolarityClassification\prediction\sentiment_classifier.py:83, in SentimentClassifier.__init__(self, checkpoint, **kwargs)
     82 if state_dict_path:
---> 83     self.model = APCEnsembler(
     84         self.config, load_dataset=False, **kwargs
     85     )
     86     self.model.load_state_dict(
     87         torch.load(
     88             state_dict_path, map_location=DeviceTypeOption.CPU
     89         ),
     90         strict=False,
     91     )

File c:\Users\helpd\envs\eval\lib\site-packages\pyabsa\tasks\AspectPolarityClassification\instructor\ensembler.py:79, in APCEnsembler.__init__(self, config, load_dataset, **kwargs)
     73 for i in range(len(models)):
     74     config_str = re.sub(
     75         r"<.*?>",
     76         "",
     77         str(
     78             sorted(
---> 79                 [
     80                     str(self.config.args[k])
     81                     for k in self.config.args
     82                     if k != "seed"
...
    111 if isinstance(self.config.model, list):
    112     if hasattr(APCModelList, self.config.model[0].__name__):

RuntimeError: Fail to load the model from multilingual! Please make sure the version of checkpoint and PyABSA are compatible. Try to remove he checkpoint and download again 
Exception: 'DebertaV2TokenizerFast' object has no attribute 'clean_up_tokenization_spaces'

Code To Reproduce

%pip install pyabsa -U
from pyabsa import AspectPolarityClassification as APC

from pyabsa import available_checkpoints

ckpts = available_checkpoints()
# find a suitable checkpoint and use the name:
sentiment_classifier = APC.SentimentClassifier(
    checkpoint="multilingual"
) 

Expected behavior
PyABSA should download and load the multilingual checkpoint without errors.
