This course is designed to bring students from the fundamentals of probability theory through the professional use and reporting of regression models. There are five core learning objectives.
Understand the building blocks of probability theory that prepare learners for the study of statistical models.
- Understand the mathematical objects of probability theory and be able to apply their properties.
- Understand how high-level concepts from calculus and linear algebra are related to common procedures in data science.
- Translate problems defined in business or research terms into problems that can be solved with math.
Understand the theory of statistics to prepare students for inferential statements.
- Understand model parameters and high-level strategies for estimating them: means, least squares, and maximum likelihood.
- Choose an appropriate statistic and conduct a hypothesis test in the Neyman-Pearson framework.
- Interpret the results of a statistical test, including statistical significance and practical significance.
- Recognize limitations of the Neyman-Pearson hypothesis testing framework and be a conscientious participant in the scientific process.
Explore and wrangle data with the intention of understanding the information and relationships that are (and are not) present.
- Identify the goals of your analysis.
- Build a model that achieves the goals of an analysis.
Identify the audience and report process and findings in a manner appropriate to that audience.
- Construct regression-oriented reports that provide insight for stakeholders.
- Construct technical documents of process and code for collaboration and reproducibility with peer data scientists.
- Read, understand, and assess the claims made in technical, regression-oriented reports.
Contribute proficient, basic work to a modern data science team, using industry-standard tools and coding practices.
- Demonstrate programming proficiency by translating statistical problems into code.
- Understand and incorporate best practices for coding style and data carpentry.
- Use industry-standard tooling for collaboration.