Commit 55891d6

author
Alexander Hentschel
committed
first version of python setup tutorial
1 parent abf0187 commit 55891d6

6 files changed

+176
-48
lines changed

01_install_python/01_install_python.md → 01_setup_python/01_setup_python.md

+100-48
@@ -1,7 +1,10 @@
11
# Installation of Python for Data Science
22

3-
This tutorial gives a recommendation for a data science python stack. This tutorial
4-
is merely a suggestion for one out of many different usefull alternatives.
3+
This tutorial gives a recommendation for installing and organizing a data-science python stack.
4+
In the tutorial I describe a system for organizing the different components. I found that
5+
this organization scheme scales particularly well when working on many different projects over time.
6+
It allows different python environments and versions to be used for different projects.
7+
However, this tutorial is merely one suggestion among many useful alternatives.
58

69
Overview:
710
- The first section describes the fastest path
@@ -12,7 +15,7 @@ I will point out where you are taking shortcuts.
1215

1316
# Pythonic Data-Science Stack: fast route
1417

15-
#### Install Anaconda Python
18+
### Install Anaconda Python
1619

1720
For data science in Python, you need a python interpreter plus various numerical packages.
1821
Continuum Analytics provides commercial images for various cloud platforms that include
@@ -93,71 +96,120 @@ Get the latest PyCharm from [jetbrains](https://www.jetbrains.com/pycharm/).
9396
/Users/alex/Development/Python/pycharm
9497
```
9598
96-
#### Create a PyCharm Project
99+
### Create a PyCharm Project
97100
98101
When PyCharm first opens, it presents you with a welcome screen
99102
and prompts you to create or open a project. In PyCharm, a project
100103
is simply a configuration that tells PyCharm which python
101-
packages and folders should be opened and what python environments to use to
104+
packages and folders should be opened and what python environments to use to
102105
execute what code. A project _points_ to python code, but
103-
the project configuration folder itself should _not contain_ python sources.
106+
the project configuration folder itself should _not contain_ python sources.
104107
I generally create a new Project for each topic I am working on. In addition,
105108
I keep a `sandbox` project for random little experiments.
106109
107-
As you might want to open some python sources in the context of several projects,
108-
I recommend keeping the PyCharm Projects separate from the python source code.
110+
As you might want to open some python sources in the context of several projects,
111+
I recommend keeping the PyCharm Projects separate from the python source code.
109112
While I check out all source code into `/Users/alex/Git/`,
110113
my PyCharm project configurations live in `/Users/alex/Development/Python/IDE-Project-Configurations/`.
111114
112115
**Creating the `sandbox` Project** (for MacOS):
113-
1. On the welcome screen, select `Create New Project`. Alternatively,
116+
1. On the welcome screen, select `Create New Project`. Alternatively,
114117
when closing the main window of the PyCharm IDE, it will go back to the
115118
welcome screen.
116-
2. You will be presented with:
117-
![alt text](https://github.com/AlexHentschel/python-tutorials/blob/master/figures/Project_conf_1.png)
118-
119-
120-
The project's name is implicitly determined by the tailing folder name. Hence,
121-
to create a project with the name `sandbox`, specify as folder:
119+
2. Your project configuration could look something like this:
120+
![project configuration](https://github.com/AlexHentschel/python-tutorials/blob/master/figures/Project_conf_1.png)
121+
- The project's name is implicitly determined by the trailing folder name. Hence,
122+
to create a project with the name `sandbox`, specify the `location`:
122123
```
123124
/Users/alex/Development/Python/IDE-Project-Configurations/sandbox
124125
```
125-
126-
Lets name our first project
127-
128-
3) Configure Anaconda as an interpreter in pycharm:
129-
130-
131-
/Users/alex/Development/Python/Python3.6-x64_Anaconda-5.2.0/bin/python
132-
133-
134-
135-
136-
https://www.anaconda.com/download/
137-
138-
1) dow Anaconda Python 3.6: https://www.anaconda.com/download/#macos
139-
140-
141-
142-
2) create virtual environment (Python 3 already comes with everything to create a virtual environment):
143-
`python -m venv --symlinks <path-to-new-virtual-environment>`
144-
`source <path-to-new-virtual-environment>/bin/activate`
145-
this will change your default python to the one you activated _only in the current shell session_
146-
to go back to default: `deactivate`
147-
details: https://docs.python.org/3/library/venv.html (edited)
148-
3) In the _activated_ python environment, install tensorflow (https://www.tensorflow.org/install/install_mac):
149-
• Ensure pip ≥8.1 is installed:
150-
`easy_install -U pip`
151-
• install TensorFlow
152-
`pip3 install --upgrade tensorflow`
153-
• install other useful dependencies for data science
154-
`pip install matplotlib pandas h5py` (edited)
126+
- Configure the default python interpreter that will be used to
127+
execute code in the project:
128+
- Select `Existing Interpreter` and click on the `...` button on the right.
129+
(If you have already configured PyCharm to use Anaconda, it
130+
should be available in the drop-down menu).
131+
- In the window ![interpreter configuration](https://github.com/AlexHentschel/python-tutorials/blob/master/figures/Project_conf_2.png)
132+
choose `System Interpreter` (left) and use the `...` button to select an
133+
already installed python environment. You need to select the `python` executable,
134+
in our example:
135+
```
136+
/Users/alex/Development/Python/Python3.6-x64_Anaconda-5.2.0/bin/python
137+
```
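To confirm that the console really runs the interpreter selected above, a small check like the following can help. It is a minimal sketch: `missing_packages` is a hypothetical helper, and the package list is only an illustrative assumption about what a data-science stack might contain.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Path of the interpreter executing this code. With the configuration above it
# should point into the Anaconda installation that was selected in PyCharm.
print("Interpreter:", sys.executable)

# Illustrative package list (an assumption); adjust it to your own stack.
print("Missing:", missing_packages(["numpy", "matplotlib", "pandas", "h5py"]))
```

An empty list after `Missing:` suggests that the selected environment already provides these packages.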
138+
139+
### PyCharm in Action
140+
141+
In the following, I will use the [python-tutorials repository](https://github.com/AlexHentschel/python-tutorials) as an example.
142+
I assume you have already cloned the repo so you can execute the provided examples. From here on,
143+
let's assume the repository is located in `/Users/alex/Git/python-tutorials`.
144+
145+
1. Open PyCharm and the `sandbox` project (`File` -> `Open Recent` lets you switch projects).
146+
2. Now, we are going to _add_ the `python-tutorials` folder to PyCharm's `sandbox` project.
147+
(This merely instructs PyCharm to add the folder you choose to a list of displayed folders.
148+
Files and code remain where they are.)
149+
- Go to `File` -> `Open` and select the folder `/Users/alex/Git/python-tutorials`.
150+
- Now `python-tutorials` should be listed on the left of the IDE with its location on your
151+
hard disk next to it in grey print.
152+
153+
### The iPython console
154+
155+
The iPython console allows you to _interactively_ execute code _while_ you are developing it.
156+
I find this immensely useful, specifically for data science and machine learning projects.
157+
158+
- Open (double-click) the python script `python-tutorials/example_code/hello_world.py`
159+
- Mark all lines of code and press `Control`+`Shift`+`e`. The `Python Console` will open
160+
and execute the selected code.
161+
- You can open _multiple_ iPython consoles and work with them in parallel.
162+
163+
### Interactive plotting
164+
165+
Similarly, iPython allows you to _interactively_ plot graphs.
166+
167+
- An example is given in `python-tutorials/example_code/hello_plot.py`
168+
- Again, execute all the lines of code using `Control`+`Shift`+`e`.
169+
Now try to edit the code while *keeping the plot open*.
170+
On my system, the plot always stays in front covering up part of the
171+
PyCharm editor. I found this rather irritating and counterproductive.
172+
- Execute `python-tutorials/example_code/hello_plot2.py` in a _newly opened_
173+
Python Console. In this example, we use the `TKAgg` backend for `matplotlib`
174+
which fixed this behaviour for me.
175+
- You can make the backend change permanent by editing your
176+
`~/.matplotlib/matplotlibrc` file. In its default configuration, your
177+
`matplotlibrc` states for MacOS
178+
```
179+
backend : macosx
180+
```
181+
Change this to
182+
```
183+
backend : TKAgg
184+
```
185+
(see [here](http://matplotlib.org/users/customizing.html#the-matplotlibrc-file) for
186+
more details)
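If you prefer to script the edit instead of changing the file by hand, the idea can be sketched as below. `set_backend` is a hypothetical helper that only transforms the text of a `matplotlibrc`; reading and writing the actual file is left out on purpose, and the `key : value` line format is assumed to match the snippet shown above.

```python
import re


def set_backend(rc_text, backend):
    """Return `rc_text` with its active `backend` entry set to `backend`.

    Commented-out entries (lines starting with `#`) are left untouched.
    If no active entry exists, one is appended at the end.
    """
    pattern = re.compile(r"^([ \t]*backend[ \t]*:[ \t]*)\S+", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + backend, rc_text)
    if count == 0:
        # Make sure the appended entry starts on its own line.
        prefix = rc_text if rc_text.endswith("\n") or not rc_text else rc_text + "\n"
        new_text = prefix + "backend : " + backend + "\n"
    return new_text


print(set_backend("backend : macosx\n", "TKAgg"))
```

Keeping the helper free of file access makes it easy to try on a copy of the configuration before touching `~/.matplotlib/matplotlibrc`.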
155187
156188
# Pythonic Data-Science Stack: best practices
157189
158-
#### Use Miniconda as root environment
190+
### Use Miniconda as root environment:
159191
160192
https://conda.io/miniconda.html
161193
162-
163-
###
194+
### Use Virtual Python Environment
195+
196+
**Creation of virtual environments**:
197+
- Python 3 already comes with everything to create a virtual environment.
198+
Execute in the command line:
199+
```
200+
/Users/alex/Development/Python/Python3.6-x64_Anaconda-5.2.0/bin/python -m venv --symlinks <path-to-new-virtual-environment>
201+
```
202+
Make sure you select the correct python distribution that should serve
203+
as a root.
204+
- On the command line, you can select a virtual environment by
205+
```
206+
source <path-to-new-virtual-environment>/bin/activate
207+
```
208+
this will change your default python to the one you just activated,
209+
but _only in the current shell session_.
210+
Note: the command prompt will change and display the python environment.
211+
To deactivate (go back to the default python environment), type
212+
```
213+
deactivate
214+
```
215+
(further reading on virtual environments: https://docs.python.org/3/library/venv.html)
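The same steps can also be driven from Python through the standard-library `venv` module. The following is a minimal sketch, not part of the tutorial's repository: `create_env` and `in_virtual_env` are hypothetical helper names, and `with_pip=False` is an assumption chosen to keep the example fast.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=False):
    """Create a virtual environment in `env_dir`, rooted at the running interpreter."""
    # symlinks=True mirrors the `--symlinks` flag of `python -m venv`.
    venv.create(env_dir, symlinks=True, with_pip=with_pip)
    # Every environment created by `venv` contains this marker file.
    return Path(env_dir) / "pyvenv.cfg"


def in_virtual_env():
    """Return True if the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix
```

The interpreter that runs `create_env` becomes the root of the new environment, so it should be started with the python distribution you want as a base. After activating an environment in the shell, `in_virtual_env()` should return `True` for a python started from that shell.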

example_code/__init__.py

Whitespace-only changes.

example_code/hello_plot.py

+24
@@ -0,0 +1,24 @@
1+
"""
2+
Hello World example
3+
4+
Author: Alexander Hentschel, [email protected]
5+
"""
6+
7+
import numpy as np
8+
import matplotlib.pyplot as plt
9+
10+
def sample_sin(start=0, stop=10, points=10000):
11+
    x = np.linspace(start, stop, num=points)
12+
    y = np.sin(x)
13+
    return x, y
14+
15+
x, y = sample_sin()
16+
17+
fig = plt.figure()
18+
plt.plot(x,y)
19+
plt.draw()
20+
21+
# Now try to edit the code while KEEPING the PLOT OPEN
22+
# ...
23+
# The plot always stays in front. I found this rather irritating.
24+
# If you would like to change that behaviour, see `hello_plot2.py`

example_code/hello_plot2.py

+32
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
"""
2+
Hello World example
3+
4+
Author: Alexander Hentschel, [email protected]
5+
"""
6+
7+
import numpy as np
8+
9+
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #
10+
# Change matplotlib backend from "MacOSX" (default) to "TKAgg"
11+
# This should fix a variety of issues under MacOS.
12+
# IMPORTANT: execute in a newly opened Python Console.
13+
# ............................................................................ #
14+
import matplotlib as mpl
15+
mpl.use("TKAgg", force=True)  # the `warn` keyword is no longer accepted by recent matplotlib releases
16+
17+
# For making this change permanent, modify your `~/.matplotlib/matplotlibrc`
18+
# as described in the 01_setup_python.md.
19+
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #
20+
21+
import matplotlib.pyplot as plt
22+
23+
def sample_sin(start=0, stop=10, points=10000):
24+
    x = np.linspace(start, stop, num=points)
25+
    y = np.sin(x)
26+
    return x, y
27+
28+
x, y = sample_sin()
29+
30+
fig = plt.figure()
31+
plt.plot(x,y)
32+
plt.draw()

example_code/hello_world.py

+20
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
"""
2+
Hello World example
3+
4+
Author: Alexander Hentschel, [email protected]
5+
"""
6+
7+
import sys
8+
9+
10+
def hello_world():
11+
    """
12+
    Prints a greeting in the console and the location of the currently used Python interpreter.
13+
    """
14+
    p = sys.executable  # absolute path of the running Python interpreter (a string)
15+
    print("Hello World, I am running Python from: '%s'" % p)
16+
17+
18+
hello_world()
19+
20+
