These are the seven main projects that made up 49% of my grade in CS320: Data Programming II at the University of Wisconsin-Madison. The course was taught by Tyler Caraza-Harter, and I took it in Spring 2022.
Most of this project reviews Python concepts from the previous course, CS220; the end of it focuses on new topics: check_output, time, and git.
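As a rough illustration of two of those topics, here is a minimal sketch that uses `check_output` to run a child process and the `time` module to measure how long it took. The command being run (a one-line Python script that prints "hello") is an assumption chosen for the example, not something taken from the project itself.

```python
import sys
import time
from subprocess import check_output

# Start a timer, run a child Python process, and capture its stdout as bytes.
# The child command here is a made-up example, not the project's actual task.
start = time.perf_counter()
raw = check_output([sys.executable, "-c", "print('hello')"])
elapsed = time.perf_counter() - start

# check_output returns bytes, so decode before treating it as text.
message = raw.decode("utf-8").strip()
print(f"{message} (took {elapsed:.3f} s)")
```

Using `sys.executable` instead of a hard-coded `python` keeps the sketch independent of how the interpreter is named on a given machine.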
In this project I analyzed every loan application made in Wisconsin in 2020, and practiced classes, large datasets, trees, recursion, testing, and writing modules.
In this project I built a module that could search through graphs using both BFS and DFS, and then applied it to search through graphs, matrices, and the web.
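To show the difference between the two search orders, here is a minimal sketch on a small adjacency-list graph. The function names `bfs` and `dfs` and the dictionary-based graph are assumptions made for illustration; the actual module from the project was organized differently and also handled matrices and web pages.

```python
from collections import deque


def bfs(graph, start):
    """Return nodes in breadth-first order, using a queue."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


def dfs(graph, start, seen=None):
    """Return nodes in depth-first order, using recursion."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for neighbor in graph.get(start, []):
        if neighbor not in seen:
            order.extend(dfs(graph, neighbor, seen))
    return order


# Hypothetical example graph: A points to B and C, which both point to D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
bfs_order = bfs(graph, "A")
dfs_order = dfs(graph, "A")
print(bfs_order)
print(dfs_order)
```

BFS visits both of A's neighbors before reaching D, while DFS follows the path through B all the way to D before coming back to C.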
For this project I built a website using Flask that has a home page displaying multiple plots, a page for browsing the table behind the plots, a link to a donation page that is optimized via A/B testing, and a subscribe button that only accepts valid email addresses.
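The email check behind the subscribe button can be sketched with a regular expression. The pattern and the helper name `is_valid_email` below are assumptions for illustration; the pattern is deliberately simple and is not the exact rule the project used.

```python
import re

# Simplified pattern (an assumption, not the project's exact rule):
# a local part, one "@", then a domain with at least one dot.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def is_valid_email(address):
    """Return True if the whole string looks like an email address."""
    return EMAIL_PATTERN.fullmatch(address) is not None


for candidate in ["user@example.com", "user@example", "a@b@c.com"]:
    print(candidate, is_valid_email(candidate))
```

In a Flask app, a check like this would typically run in the route that receives the submitted address, before the address is saved.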
In this project I analyzed the statements and reports that companies filed to the SEC's EDGAR database over the course of one day. This was great practice in working with a huge dataset, manipulating data with modules, using regex, and visualizing data.
For this project I made predictions about census data for Wisconsin using regression models. I used SQL queries to extract data from four files and construct DataFrames suitable for training machine learning models. I also used matrices to visualize land data in Milwaukee. Finally, I had to develop and deploy a model that predicts population for each census tract, with a required explained variance of at least 0.35.
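For context on the 0.35 requirement, explained variance is commonly defined as 1 - Var(y - y_pred) / Var(y). The sketch below computes it with the standard library only; the helper name `explained_variance` and the sample numbers are made up for illustration and are not the project's data or grading code.

```python
from statistics import pvariance


def explained_variance(y_true, y_pred):
    """Compute 1 - Var(residuals) / Var(y_true) using population variance."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - pvariance(residuals) / pvariance(y_true)


# Made-up values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = explained_variance(y_true, [1.0, 2.0, 3.0, 4.0])
baseline = explained_variance(y_true, [2.5, 2.5, 2.5, 2.5])
print(perfect, baseline)
```

A model that predicts every value exactly scores 1, and one that always predicts the mean scores 0, so a threshold of 0.35 asks the model to account for at least about a third of the variation in the target.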
In the final project of the semester, I used web browsing data to develop and deploy a logistic regression classification model that would predict which users are interested in receiving an email promotion, with a requirement of 75% accuracy on the test data.