
During training: warnings for ATE Classification Report #394

@KadriMufti

Description


Version

Name: pyabsa
Version: 2.4.1.post1
Summary: This tool provides the state-of-the-art models for aspect term extraction (ATE), aspect polarity classification (APC), and text classification (TC).
Home-page: https://github.com/yangheng95/PyABSA
Author: Yang, Heng
Author-email: hy345@exeter.ac.uk
License: MIT
Location: /usr/local/lib/python3.8/dist-packages
Requires: metric-visualizer, boostaug, networkx, seqeval, torch, sentencepiece, protobuf, update-checker, pytorch-warmup, transformers, tqdm, findfile, pandas, typing-extensions, gitpython, termcolor, spacy, autocuda
Required-by: boostaug
Name: torch
Version: 2.2.1+cu121
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /usr/local/lib/python3.8/dist-packages
Requires: networkx, sympy, nvidia-cuda-nvrtc-cu12, triton, nvidia-cusolver-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, typing-extensions, fsspec, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, jinja2, nvidia-curand-cu12, filelock, nvidia-cuda-runtime-cu12, nvidia-cusparse-cu12
Required-by: trl, torchvision, torchaudio, timm, pytorch-warmup, pyabsa, peft, OpenNMT-py, flash-attn, deepspeed, accelerate
Name: transformers
Version: 4.40.0.dev0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /usr/local/lib/python3.8/dist-packages
Requires: regex, requests, packaging, pyyaml, filelock, tokenizers, numpy, safetensors, tqdm, huggingface-hub
Required-by: trl, pyabsa, peft

Describe the bug
During training I get the warnings about the evaluation metrics shown below. Why are recall and F-score reported as "ill-defined"? (A minimal seqeval example reproducing these warnings follows the report below.)

  warnings.warn('{} seems not to be NE tag.'.format(chunk))
/usr/local/lib/python3.8/dist-packages/seqeval/metrics/sequence_labeling.py:171: UserWarning: [CLS] seems not to be NE tag.
  warnings.warn('{} seems not to be NE tag.'.format(chunk))
/usr/local/lib/python3.8/dist-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
[2024-03-29 12:45:21] (2.4.1.post1) 
---------------------------- ATE Classification Report ----------------------------


[2024-03-29 12:45:21] (2.4.1.post1)
               precision    recall  f1-score   support

         ASP     0.0558    0.5437    0.1013      9428
        CLS]     0.0000    0.0000    0.0000         0
        SEP]     1.0000    0.9997    0.9999      3667

   micro avg     0.0895    0.6714    0.1579     13095
   macro avg     0.3519    0.5145    0.3670     13095
weighted avg     0.3202    0.6714    0.3529     13095

[2024-03-29 12:45:21] (2.4.1.post1) 
---------------------------- ATE Classification Report ----------------------------
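
For context on why these warnings appear: in its default mode, seqeval only treats tags of the form B-/I-/E-/S- (plus O) as named-entity tags, so the [CLS] and [SEP] special tokens that end up in the label sequences trigger the "seems not to be NE tag" warning and show up in the report with their first character stripped ("CLS]", "SEP]"). Since "CLS]" never occurs in the gold labels (support 0), its recall and F-score are undefined and set to 0.0. Below is a minimal sketch that reproduces the warnings with seqeval directly (toy sequences of my own, not PyABSA code):

from seqeval.metrics import classification_report

# Toy sequences: the gold tags contain [SEP] but never [CLS],
# while the predictions contain both special tokens.
y_true = [["O", "B-ASP", "O", "[SEP]"]]
y_pred = [["[CLS]", "B-ASP", "O", "[SEP]"]]

# Prints a report listing the truncated labels "CLS]" and "SEP]", warns that
# "[CLS] seems not to be NE tag.", and warns that recall/F-score are ill-defined
# for "CLS]" because it has no true samples.
print(classification_report(y_true, y_pred, digits=4))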

Code To Reproduce

from pyabsa import AspectTermExtraction as ATEPC, DeviceTypeOption  # imports implied by the snippet below

# config is an ATEPC config object (its construction is not shown here; see the sketch after this snippet)
config.model = ATEPC.ATEPCModelList.FAST_LCF_ATEPC
config.evaluate_begin = 0
config.max_seq_len = 512
config.num_epoch = 50
config.batch_size = 16
config.patience = 10
config.log_step = -1
config.seed = [1]
config.show_metric = True
config.gradient_accumulation_steps = 4
config.verbose = False  # If verbose == True, PyABSA will output the model structure and several processed data examples

config.pretrained_bert = "MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7" 
config.notice = (
    f"This is a finetuned aspect term extraction model, based on {config.pretrained_bert}, using combined Arabic and English data from various sources."  # for memos usage
)

model = config.pretrained_bert.split('/')[-1]
base_path = f'/app/aspect/code4_pyabsa/NEW_ATEPC_MULTILINGUAL_CHECKPOINT_{model}_3/'
# checkpoint = base_path + get_latest_checkpoint(base_path)
trainer = ATEPC.ATEPCTrainer(
    config=config,
    dataset=my_dataset,  # a custom ATEPC dataset (see the sketch after this snippet)
    auto_device=DeviceTypeOption.AUTO, 
    checkpoint_save_mode=2,
    load_aug=False, 
    path_to_save=base_path
)
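
For completeness, the snippet above does not show how config and my_dataset are created. A minimal sketch of the assumed setup (names and paths are my guesses, not the original code), following the usual PyABSA 2.x API:

from pyabsa import AspectTermExtraction as ATEPC, DatasetItem

# Assumed: a default multilingual ATEPC config, onto which the options above are applied
config = ATEPC.ATEPCConfigManager.get_atepc_config_multilingual()

# Assumed: a custom dataset wrapper pointing at a directory of ATEPC-formatted train/test files
my_dataset = DatasetItem("my_dataset", ["/path/to/atepc_dataset"])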

Expected behavior

I expect training to continue without these warnings, since I checked my training and test data and found no errors or mistakes in the formatting.

Screenshots

(screenshot of the training log attached in the original issue)
