Please view the blog at https://jamiepotter17.github.io/stackoverflow/
The code in the Jupyter Notebook file 'CRISP-DM Process.ipynb' should run on a standard Anaconda distribution of Python 3.*.
If you need a known-good fallback, it ran without errors on the following installation:
- python 3.6.13
- jupyter 1.0.0
- matplotlib 3.3.4
- numpy 1.17.0
- pandas 1.1.5
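
If the notebook misbehaves, one way to see how a local environment differs from the list above is to compare installed versions against the tested ones. The following is a minimal sketch, not part of the project itself: `TESTED_VERSIONS` and `compare_versions` are hypothetical names chosen for illustration, and it assumes each package exposes a `__version__` attribute (true for matplotlib, numpy and pandas). The jupyter metapackage is left out because it may not expose a version in this way.

```python
import importlib

# Versions copied from the list in this README.
TESTED_VERSIONS = {
    "matplotlib": "3.3.4",
    "numpy": "1.17.0",
    "pandas": "1.1.5",
}


def compare_versions(expected=TESTED_VERSIONS):
    """Return {package: (tested_version, installed_version)}.

    installed_version is None when the package cannot be imported
    or does not expose a __version__ attribute.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            installed = None
        else:
            installed = getattr(module, "__version__", None)
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: tested {wanted}, installed {installed} ({status})")
```

A differing version does not necessarily mean the notebook will fail; the list above is only the installation on which it is known to have run cleanly.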
This project was completed as part of Udacity's Data Scientist Nanodegree Program; it was the first project required. Information about this online course can be found here.
For this project, I was interested in using Stack Overflow survey data from multiple years to explore three questions:
- What Size Organisation do People Work For and Has This Changed?
- What Countries are Developers From?
- What Programming Languages are Developers Using?
- ./docs/ - contains files for the maintenance of the blog post, which is viewable at https://jamiepotter17.github.io/stackoverflow/.
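- .gitignore and ./.ipynb_checkpoints - housekeeping files: .gitignore tells Git which files to leave untracked, and ./.ipynb_checkpoints holds Jupyter's autosaved notebook checkpoints.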
- README.md - this file.
- CRISP-DM Process.ipynb - the Jupyter Notebook file used in the exploration and analysis of the Stack Overflow data.
- StackOverflowData.zip - contains the relevant .csv files used by the Jupyter Notebook file.
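
For readers who want to inspect the data outside the notebook, the sketch below shows one way to read the .csv files straight from StackOverflowData.zip. It is an illustration under assumptions, not necessarily how the notebook loads the data: `list_csv_members` and `load_csvs` are hypothetical helper names, the file names inside the archive are not assumed, and the survey files may need extra `pandas.read_csv` options (such as an encoding) that are not shown here.

```python
import zipfile


def list_csv_members(zip_path):
    """Return the sorted names of all .csv files inside the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".csv")
        )


def load_csvs(zip_path):
    """Return {member_name: DataFrame} for every .csv file in the archive."""
    # pandas is imported here so that list_csv_members works without it.
    import pandas as pd

    frames = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in list_csv_members(zip_path):
            # ZipFile.open gives a binary file object that read_csv accepts.
            with archive.open(name) as handle:
                frames[name] = pd.read_csv(handle)
    return frames
```

Calling `list_csv_members("StackOverflowData.zip")` first is a cheap way to check what the archive contains before loading everything into memory with `load_csvs`.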
The main findings are summarised on the blog at https://jamiepotter17.github.io/stackoverflow/.
Many thanks to Stack Overflow for collecting and publishing the data, which is available here. Feel free to use this information and code as you will.