LeonCai1/Data-Classification-Algorithms-Analysis

Classification Algorithms Analysis

Implements a Decision Tree classifier and compares it with other classification algorithms in the sklearn library.
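As a rough illustration of what a hand-written decision tree has to compute, here is a minimal sketch of an entropy-based split criterion. It is an assumption that the tree uses information gain on binary features; my_decision_tree.py may use a different criterion, and the function names below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy reduction from splitting `labels` on a binary (zero / non-zero) feature."""
    left = [y for x, y in zip(feature_values, labels) if x == 0]
    right = [y for x, y in zip(feature_values, labels) if x != 0]
    total = len(labels)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(labels) - weighted


def best_feature(rows, labels):
    """Index of the feature column with the highest information gain."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    return max(range(n_features), key=gains.__getitem__)
```

A tree builder would call best_feature at each node, split the rows on that column, and recurse until the labels are pure or a depth limit is reached.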

Structure

Main programming files are:

main.py preprocessing.py data_splitting.py classification.py 
decision_tree.py my_decision_tree.py naive_bayes.py svm.py

Input files are:

amazon_cells_labelled.txt imdb_labelled.txt yelp_labelled.txt
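The input files can be read with a few lines of Python. This sketch assumes each line holds a sentence, a tab, and a 0/1 label, which is the layout of the public Sentiment Labelled Sentences files that these names match; the helper names are hypothetical and are not taken from preprocessing.py.

```python
from pathlib import Path


def parse_labelled_lines(lines):
    """Split 'sentence<TAB>label' lines into parallel lists of text and labels."""
    sentences, labels = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so tabs inside the sentence do not break parsing.
        text, _, label = line.rpartition("\t")
        sentences.append(text.strip())
        labels.append(int(label))
    return sentences, labels


def load_labelled_file(path):
    """Read one input file; the path is supplied by the caller."""
    with Path(path).open(encoding="utf-8") as handle:
        return parse_labelled_lines(handle)
```

For example, load_labelled_file("yelp_labelled.txt") would return the sentences and their labels as two lists of equal length, provided the file follows the assumed layout.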

Extra files:

  • rawFtAccuracy.png and reducedFtAccuracy.png: saved output bar charts
  • HW 3 Report.pdf

Output:

  • console output: each algorithm's time and accuracy performance
  • bar charts to visualize the output data
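The time and accuracy figures listed above could be collected with a small harness like the following. This is a sketch under assumptions, not the code in classification.py: evaluate and MajorityClassifier are hypothetical names, and the stand-in classifier exists only so the example runs without extra dependencies.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit `model`, predict on the test set, and report elapsed time and accuracy."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    elapsed = time.perf_counter() - start
    return {"seconds": elapsed, "accuracy": accuracy(y_test, predictions)}


class MajorityClassifier:
    """Hypothetical stand-in: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


result = evaluate(MajorityClassifier(), [[0], [1], [1]], [1, 1, 0], [[0], [1]], [1, 0])
print(f"accuracy={result['accuracy']:.2f} time={result['seconds']:.6f}s")
```

Because sklearn estimators expose the same fit and predict methods, an sklearn classifier or a custom decision tree with that interface can be passed to evaluate unchanged.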

Usage

  1. Run the main.py file.
  2. When prompted, enter the input files: amazon_cells_labelled.txt imdb_labelled.txt yelp_labelled.txt

Result bar charts

Using raw feature matrix:

Accuracy before using feature selection

After applying the feature selection algorithm (result_chart/top K frequent words):

Accuracy after using feature selection

Compare the performance on time:

Time comparison
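The "top K frequent words" feature selection mentioned above can be sketched as follows. The tokenizer and function names are assumptions made for illustration; the repository's own preprocessing (for example stop-word handling) may differ.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", sentence.lower())


def top_k_words(sentences, k):
    """Return the k most frequent words across all sentences."""
    counts = Counter(word for s in sentences for word in tokenize(s))
    return [word for word, _ in counts.most_common(k)]


def to_feature_matrix(sentences, vocabulary):
    """Bag-of-words counts restricted to `vocabulary`, one row per sentence."""
    matrix = []
    for s in sentences:
        counts = Counter(tokenize(s))
        matrix.append([counts[word] for word in vocabulary])
    return matrix
```

Restricting the vocabulary to the K most frequent words shrinks each feature vector from the full vocabulary size to K columns, which is the kind of reduction the time comparison above measures.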

License and Author

Leon Cai (https://github.com/LeonCai1)
