This is a collection of self-paced resources for anyone looking to get into data science. The materials assume an absolute beginner and are intended to prepare students for the Galvanize Data Science interview process: http://www.galvanize.com/courses/data-science/
We see many aspiring data scientists come to us from a variety of backgrounds: statisticians, mechanical engineers, political scientists, business analysts, software engineers, and more. We have pretty much seen it all! And many of these folks come to us with one simple question:
Where do I get started?
This repository is a curated set of the best resources out there to provide an on-ramp to becoming a data scientist, no matter someone's background. The skills needed can be broken up into the following topics: Programming (Python for us!), Linear Algebra, Statistics, Probability, and SQL. As an extra, it helps to have a high-level overview of machine learning.
By no means do you need to be an expert in all of these, but these are the topics we have seen set students up for success. Anyone looking to apply to our Galvanize Data Science Immersive program can prepare for the interview/application process by completing these resources!
If you have any questions about Galvanize's educational offerings or about this material, please feel free to reach out!
Each subheading below has two sections: a Review section, intended for anyone who is familiar with the subject but needs a quick refresher, and an In-Depth section, intended for absolute newcomers who want a thorough treatment of the topic.
- Review
- In-Depth
- Review
- In-Depth
- Khan Academy: Independent and Dependent Events
- Khan Academy: Probability and Combinatorics
- Khan Academy: Random Variables and Probability Distributions
- Review
- In-Depth
- Khan Academy: Displaying and Describing Data
- Khan Academy: Modeling Distributions of data
- Khan Academy: Describing relationships in quantitative data
- Khan Academy: Confidence Intervals
- Khan Academy: Significance Tests
- Review
- In-Depth
- Review
- In-Depth
- Review
- In-Depth
- Andrew Ng Coursera (online | archive)
- Stanford Statistical Learning
Thanks to the following courses for putting their content online for all to leverage:
- Khan Academy
- UW: Introduction to ML
- MITx: Introduction to Computer Science
- Coursera: Machine Learning
This resource is intentionally curated, concise, and compact. It covers the absolute necessities. If you are looking for even more resources, we recommend The Open-Source Data Science Masters.