
Title: Integrate Custom Dataset from Databricks for Aspect Sentiment Triplet Extraction #406

@Surajhulketa

Description

I am trying to integrate a custom dataset stored in Databricks for Aspect Sentiment Triplet Extraction (ASTE) using the pyabsa library. However, I am encountering an error related to dataset loading. Below are the details of my implementation and the issues I am facing.

Code Implementation
```python
from pyabsa import (
    ModelSaveOption,
    DeviceTypeOption,
    DatasetItem,
)

from pyabsa import AspectSentimentTripletExtraction as ASTE
import pandas as pd

if __name__ == "__main__":
    config = ASTE.ASTEConfigManager.get_aste_config_english()
    config.max_seq_len = 120
    config.log_step = -1
    config.pretrained_bert = "bert-base-chinese"
    config.num_epoch = 100
    config.learning_rate = 2e-5
    config.use_amp = True
    config.cache_dataset = True
    config.spacy_model = "zh_core_web_sm"

    # Load dataset from Databricks
    dataset_path = "datasets/atepc_datasets/300.vokols/vokols.test.txt.atepc"
    dataset = "300.vokols"

    trainer = ASTE.ASTETrainer(
        config=config,
        dataset=dataset,
        checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT,
        auto_device=True,
    )
    triplet_extractor = trainer.load_trained_model()

    examples = [
        "I love this laptop, it is very good.",
        "I hate this laptop, it is very bad.",
        "I like this laptop, it is very good.",
        "I dislike this laptop, it is very bad.",
    ]
    for example in examples:
        prediction = triplet_extractor.predict(example)
        print(prediction)
```

Error Encountered
```
ValueError: Cannot find dataset: 300.vokols, you may need to remove existing integrated_datasets and try again. Please note that if you are using keywords to let findfile search the dataset, you need to save your dataset(s) in integrated_datasets/task_name/dataset_name
```
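Based on the hint in the error about integrated_datasets/task_name/dataset_name, this is roughly how I would try to copy the files exported from Databricks into a layout that findfile can discover. The export path is a placeholder for my environment, and I am only guessing that atepc_datasets is the right task folder for ASTE, so please correct me if that is wrong:

```python
import os
import shutil

# Placeholder for wherever the Databricks export lands locally (e.g. a /dbfs mount)
export_dir = "/dbfs/FileStore/vokols"
# Guessed layout from the error message: integrated_datasets/task_name/dataset_name
target_dir = "integrated_datasets/atepc_datasets/300.vokols"

os.makedirs(target_dir, exist_ok=True)
for fname in os.listdir(export_dir):
    if fname.endswith(".atepc"):
        shutil.copy(os.path.join(export_dir, fname), target_dir)
```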
Issues Faced

1. Dataset Loading: Clarification is needed on how to properly format and load a custom dataset from Databricks into the pyabsa library (a sketch of what I have tried to piece together follows this list).
2. Integration: Guidance on ensuring that the custom dataset is correctly integrated and utilized during the training process.
3. Directory Structure: Instructions on the required directory structure for custom datasets to be recognized by pyabsa.
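For the Dataset Loading point above, this is the alternative I would try next: passing a DatasetItem with an explicit path instead of the "300.vokols" keyword string. I am assuming that DatasetItem accepts a dataset name plus a list of paths, so treat this as a guess at the intended usage rather than something I have confirmed:

```python
from pyabsa import AspectSentimentTripletExtraction as ASTE
from pyabsa import DatasetItem, ModelSaveOption

config = ASTE.ASTEConfigManager.get_aste_config_english()

# Assumption: DatasetItem(name, [paths]) lets pyabsa load directly from the
# given folder instead of searching integrated_datasets by keyword.
custom_dataset = DatasetItem(
    "300.vokols",
    ["datasets/atepc_datasets/300.vokols"],  # folder holding vokols.test.txt.atepc etc.
)

trainer = ASTE.ASTETrainer(
    config=config,
    dataset=custom_dataset,
    checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT,
    auto_device=True,
)
```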
Steps to Reproduce

1. Place a custom dataset in Databricks (ensure it is in .atepc format).
2. Use the provided code to load the dataset and attempt to train the model.
3. Observe the error related to dataset loading.

Expected Behavior
The custom dataset should be loaded correctly, and the model should train and predict without errors.
