Principal component analysis (PCA) and supervised classification are two widely used techniques in data science. In this project, we combine them to classify news articles by their word-frequency content. We find that the articles can be classified accurately after projecting onto a small subset of principal components, which reduces the feature space from nearly 10,000 dimensions to only 4. We also compare results from the traditional and robust PCA formulations, and discuss what additional semantic information can be inferred from the principal components.
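The pipeline described above can be illustrated with a minimal sketch. This is not the project's code: the toy corpus is synthetic (Poisson word counts with a per-class rate profile), the nearest-centroid classifier is a stand-in for whichever supervised method the project uses, and all function names (`pca_fit`, `pca_project`, `make_toy_corpus`, and so on) are hypothetical. Only traditional PCA via the SVD is shown; robust PCA is not covered here.

```python
import numpy as np

def pca_fit(X, k):
    """Return the feature mean and the top-k principal directions (rows = documents)."""
    mean = X.mean(axis=0)
    # Economy SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_project(X, mean, components):
    """Project documents onto the retained principal directions."""
    return (X - mean) @ components.T

def nearest_centroid_fit(Z, y):
    """Compute one centroid per class in the reduced space."""
    labels = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def nearest_centroid_predict(Z, labels, centroids):
    """Assign each document to the class with the closest centroid."""
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]

def make_toy_corpus(rng, n_per_class=60, n_words=500, n_classes=3):
    """Synthetic stand-in for a word-frequency matrix (assumption, not real data)."""
    rates = rng.gamma(shape=0.5, scale=2.0, size=(n_classes, n_words))
    X = np.vstack([rng.poisson(rates[c], size=(n_per_class, n_words))
                   for c in range(n_classes)])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X.astype(float), y

rng = np.random.default_rng(0)
X, y = make_toy_corpus(rng)

# Hold out 25% of the documents for testing.
idx = rng.permutation(len(y))
split = int(0.75 * len(y))
train, test = idx[:split], idx[split:]

# Fit PCA on the training documents only, then keep 4 components.
mean, comps = pca_fit(X[train], k=4)
Z_train = pca_project(X[train], mean, comps)
Z_test = pca_project(X[test], mean, comps)

labels, centroids = nearest_centroid_fit(Z_train, y[train])
pred = nearest_centroid_predict(Z_test, labels, centroids)
accuracy = float((pred == y[test]).mean())
print(f"reduced {X.shape[1]} features to {comps.shape[0]}; test accuracy = {accuracy:.2f}")
```

Fitting the mean and principal directions on the training split alone keeps the held-out documents from influencing the projection, which gives a more honest accuracy estimate.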
kels271828/582FinalProject
About
Final Project for AMATH 582: Computational Methods for Data Analysis