Implementation of a decision tree classifier, compared with other classification algorithms from the sklearn library.
Main programming files are:
main.py preprocessing.py data_splitting.py classification.py
decision_tree.py my_decision_tree.py naive_bayes.py svm.py
Input files are:
amazon_cells_labelled.txt imdb_labelled.txt yelp_labelled.txt
Extra files:
- rawFtAccuracy.png and reducedFtAccuracy.png: saved output bar charts
- HW 3 Report.pdf
Output:
- console output: each algorithm's time and accuracy performance
- bar charts to visualize the output data
How to run:
- Run the main.py file
- When prompted, enter the input files: amazon_cells_labelled.txt imdb_labelled.txt yelp_labelled.txt
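The three input file names match the UCI Sentiment Labelled Sentences dataset, where each line holds a sentence, a tab, and a 0/1 label. Assuming that layout, reading a file might look roughly like the sketch below. `load_labelled_lines` is a hypothetical helper used for illustration, not a function from this repository.

```python
from typing import Iterable, List, Tuple


def load_labelled_lines(lines: Iterable[str]) -> Tuple[List[str], List[int]]:
    """Split 'sentence<TAB>label' lines into parallel lists.

    Assumption: the label is the last tab-separated field and is 0 or 1.
    A line without a valid integer label raises ValueError.
    """
    sentences: List[str] = []
    labels: List[int] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the LAST tab so tabs inside the sentence are preserved.
        text, _, label = line.rpartition("\t")
        sentences.append(text.strip())
        labels.append(int(label))
    return sentences, labels
```

The helper accepts any iterable of lines, so it can be called on an open file handle, for example one returned by `open("yelp_labelled.txt", encoding="utf-8")`.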
Using raw feature matrix:
After applying the feature selection algorithm (result_chart/top K frequent words):
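One way to read "top K frequent words" is to keep only the K most common tokens across all sentences and count those per sentence. The sketch below illustrates that idea with the standard library. The whitespace tokenization, lowercasing, and the names `top_k_vocabulary` and `to_count_matrix` are assumptions, not the repository's actual code.

```python
from collections import Counter
from typing import List


def top_k_vocabulary(docs: List[str], k: int) -> List[str]:
    """Return the k most frequent lowercase tokens across all documents."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    return [word for word, _ in counts.most_common(k)]


def to_count_matrix(docs: List[str], vocab: List[str]) -> List[List[int]]:
    """Build one row per document with the count of each vocabulary word."""
    index = {word: i for i, word in enumerate(vocab)}
    rows = []
    for doc in docs:
        row = [0] * len(vocab)
        for word in doc.lower().split():
            if word in index:  # words outside the top-k vocabulary are dropped
                row[index[word]] += 1
        rows.append(row)
    return rows
```

In sklearn, `CountVectorizer(max_features=K)` applies a similar frequency-based cut when building its vocabulary.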
Compare the performance on time:
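A timing comparison like this can be sketched by wrapping each model's fit and predict calls in a wall-clock timer. The helper below is a hypothetical illustration, not the repository's code. `MajorityClassifier` is a stand-in so the example runs without sklearn; any object with `fit` and `predict` methods, such as an sklearn estimator, could be passed instead.

```python
import time
from typing import Tuple


class MajorityClassifier:
    """Stand-in model exposing the fit/predict interface (always predicts the most common label)."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def time_and_score(model, X_train, y_train, X_test, y_test) -> Tuple[float, float]:
    """Return (seconds spent in fit + predict, accuracy on the test set)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    elapsed = time.perf_counter() - start
    correct = sum(int(p == t) for p, t in zip(predictions, y_test))
    return elapsed, correct / len(y_test)
```

Calling `time_and_score` once per algorithm on the same train/test split yields the time and accuracy pairs that a bar chart can then visualize.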
Leon Cai (https://github.com/LeonCai1)