Photographed at the Christchurch Art Gallery in 2016
Two hour tutorial at Kiwi Pycon 2022 in Ōtautahi (Christchurch) New Zealand.
Covers the foundations needed to be effective with running computation across many workers.
Senior Data Scientist @ Orkestra Energy - I live in Christchurch - LinkedIn - blog.
- functional programming fundamentals -
map
,filter
,functools.reduce
, - CPU cores, threads & processes,
- concurrency, parallelism & asynchrony,
- why & how to use
multiprocessing
andasyncio
, - introduction to the ecosystem for distributing compute over many machines in Python,
- demonstration of using an EC2/Dask/Coiled/Prefect stack to deploy a Dask cluster to EC2.
1. Functional Programming - run in binder
20 min theory, 20 min practical
Warmup session introducing functional programming in Python.
2. Single Machine - run in binder
30 min theory, 20 min practical
Options for distributing computation on a single machine with the Python standard library - doing many things at once on single machine.
- threads &
threading
, processes &multiprocessing
,asyncio
, - why not to used threads in Python (possible but dangerous),
multiprocessing
for CPU bound tasks,asyncio
for IO bound tasks,- 3 short exercises & one large exercise,
3. Many Machines - read in binder
10 min theory, 15 min demonstration, balance practical
- options for distributing computation on the cloud (many machines) in Python,
- demonstration of distributing compute over an AWS EC2 cluster with Dask, Coiled & Prefect,
- note that this notebook
3-many-machines
will not run in Binder - you can run locally combined with a$ pip install -r requirements-coiled.txt
.
As this repo targets two environments (Binder & Coiled):
requirements.txt
= Binder (important to not install Dask in Binder),requirements-coiled.txt
= Coiled (will run on EC2 instances),requirements-coiled.txt
= local development.
Python version is important for Dask/Coiled.
To setup Python locally:
$ make setup