We uploaded examples and answers!
Learning programming is rather challenging. There are so many programming languages, so many frameworks, so many libraries, so many IDEs, so many concepts, so many things to learn. It is easy to get lost in the sea of information 🤷.
What is the way out? The answer is simple: start with Python 🐍. Python is a general-purpose programming language that is becoming more and more popular in the data science community. It is easy to learn, it is 🆓, it is open-source, it is cross-platform, it is powerful, it is flexible, it is fun.
One of the core advantages of Python is that it has a huge community 🧑‍🤝‍🧑. This means that there are a lot of resources available online: tutorials, courses, books, and videos. Besides that, it has a great collection of packages 📦 for data science and machine learning. For example, Pandas is a library for data manipulation and analysis. It provides high-performance, easy-to-use data structures and data analysis tools. scikit-learn is a library for machine learning in Python. It offers a wide range of supervised and unsupervised learning algorithms through a consistent interface.
- What is Python, and what can it do?
- How to program in Python?
- Variables: types, names, and values;
- Operators: arithmetic, comparison, and logical;
- Conditional statements: `if`, `elif`, and `else`;
- Loops: `for` and `while`;
- Basic data structures: lists, tuples, and dictionaries;
- Graphs: matplotlib;
- Extra: sneak peek into Pandas and scikit-learn.
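
To give a feel for the core topics listed above, here is a minimal sketch that touches variables, operators, conditionals, loops, and the basic data structures. It is not taken from the workshop notebooks: all names (`describe`, `durations`, `labels`, and so on) are made up for illustration, and the numbers simply mirror the session lengths of this workshop.

```python
# Variables: a name bound to a value; the type follows from the value.
workshop_name = "Python basics"   # str
session_count = 4                 # int
break_minutes = 5.0               # float

# Operators: arithmetic, comparison, and logical.
total_breaks = (session_count - 1) * break_minutes   # arithmetic
is_short = total_breaks < 30                          # comparison
needs_coffee = is_short and session_count > 2         # logical

# Conditional statements: if, elif, and else.
def describe(minutes):
    """Hypothetical helper: label a time slot by its length in minutes."""
    if minutes >= 40:
        return "full session"
    elif minutes >= 30:
        return "short session"
    else:
        return "break"

# Basic data structures: list (mutable), tuple (immutable), dictionary (key -> value).
durations = [40, 40, 40, 30]
first_and_last = (durations[0], durations[-1])
labels = {}

# Loops: `for` walks over a sequence, `while` repeats as long as a condition holds.
for number, minutes in enumerate(durations, start=1):
    labels[number] = describe(minutes)

remaining = sum(durations)
sessions_done = 0
while remaining > 60:
    remaining -= durations[sessions_done]
    sessions_done += 1

print(labels)
print(first_and_last, total_breaks, sessions_done)
```

Each of these ideas gets its own part of the workshop, so it is fine if some lines are not clear yet.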
We're going to have four sessions. The first three will be 40 minutes long, and the last one will be 30 minutes long. We'll have a 5-minute break between sessions. Here is a more detailed schedule:
Session | Topic | Duration |
---|---|---|
1 | What is Python, and what can it do? | 40 min |
2 | Operators and conditionals | 40 min |
3 | Loops and data structures | 40 min |
4 | Graphs and sneak peek into Pandas and scikit-learn | 30 min |
We are going to use Anaconda Distribution. It is a free and open-source distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. The distribution includes data-science packages suitable for Windows, Linux, and macOS. It is the easiest way to start performing Python/R data science and machine learning on a single machine.
We are going to work in Jupyter Lab, which is a web-based interactive development environment for Jupyter notebooks, code, and data. It enables you to create and share documents that contain live code, equations, visualizations and narrative text. Later on you might explore other IDEs, such as PyCharm, Spyder or Visual Studio Code.
To install Anaconda Distribution follow the steps below:
- Download Anaconda Distribution from here.
- Open Anaconda Navigator and launch Jupyter Lab.
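
Once Jupyter Lab is running, you may want to confirm that the environment is ready. The cell below is a small sketch of such a check, assuming you paste it into a new notebook. It uses only the standard library, and `check_packages` is a hypothetical helper name, not part of the workshop materials.

```python
import sys
from importlib.util import find_spec

# Packages mentioned in the workshop topics.
# "sklearn" is the import name of scikit-learn.
packages = ["pandas", "matplotlib", "sklearn"]

def check_packages(names):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: find_spec(name) is not None for name in names}

print("Python version:", sys.version.split()[0])
for name, available in check_packages(packages).items():
    print(f"{name}: {'found' if available else 'missing'}")
```

If a package is reported as missing, it can usually be installed from Anaconda Navigator.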
We are going to use Jupyter Notebooks. You can find the workshop materials in this repository. To download the repository, click the green Clone or download button and then Download ZIP. After downloading the repository, unzip it and open the folder in Jupyter Lab.
├── images <- images used in the README.md and main.ipynb
├── answers.ipynb <- answers to the exercises
├── main.ipynb <- this workshop
├── main.slides.html <- slides of the workshop
├── examples.ipynb <- example exercises
├── exercises.ipynb <- exercises
├── extra.ipynb <- extra exercises
├── README.md <- this file