-
Pay attention that the directory of data(named "data") should be placed in the same directory as .sh and .py files, and it should contain four files, namely
sst_train.csv,sst_test.csv,yelp_train.csv,yelp_test.csv. -
The structure of working directory should be:
-main.py
-model.py
-config.py
-cleandata.py
-utils.py
-run.sh
-data
|----sst_train.csv
|----sst_test.csv
|----yelp_train.csv
|----yelp_test.csv
main.py contains the main routine of the procedure, which includes loading data, pre-processing data, training model and evaluation.
model.py contains the DIY model, and many sub-functions defined in it.
config.py contains the operation of getting options by using argparse.ArgumentParser. --dataset and --alpha could be defined by users in shell.
cleandata.py contains functions of doing data pre-processing.
utils.py defines metrics and other basic functions.
Python 3.8.9(64-bit)
NLTK 3.5
numpy 1.20.2
Please run the shell to check the program by typing as follows:
- If you are MAC user, then:
bash run_mac.sh {sst, yelp}
- Otherwise:
bash run.sh {sst, yelp}
The argument(chosen dataset) for .sh file will be passed to the program. By default, it'll run on sst-5.