Welcome to the Python Intermediate two-day course for Equinor ASA. This entire course is available under the Creative Commons Attribution-ShareAlike 4.0 International license (CC BY-SA 4.0).
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
This license is acceptable for Free Cultural Works.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
This course is an intermediate level course in Python programming. We will go through and learn intermediate concepts, and get hands-on experience with these concepts. Obviously, reading about topics and concepts in programming is never a substitute for programming! You will not be an intermediate level programmer after this course, but you will have the tools available to become one.
As this is not an introductory course to Python, we assume familiarity with all basic types of Python, including `set`, `dict`, and `tuple`; some functional programming tools in Python such as `map`, `filter`, and `zip`; object oriented programming; the organization of Python modules; as well as file management.

We also assume a certain level of programming maturity; the Warming up exercise should be quite feasible.

Most sections have a references section for further reading. It is recommended that you take some time to read through the material.
- Warming up
- The iteration and iterable protocols
- Error handling and exceptions
- Closures
- Creating context managers
- Packaging and distribution of Python packages
- Calling, lambdas, and functions
- Decorators
- Object oriented programming members
- String representations and format strings
- Specialized numeric and scalar types
- Functional programming
- Containers ABC
- SQL and `sqlite`
- Test driven development
- Multiple inheritance, method resolution order, and super()
- Python 3.7, 3.8, 3.9 and beyond
Log into Advent of Code, 2019!
As a start, we will warm up with a very basic exercise in programming. Start by logging in to your GitHub account, and then proceed to log in to Advent of Code by authenticating yourself using GitHub.
You need to download a file, let's call it `input`, and your program should take the file path as input, i.e., you call your program like this:
$ python aoc01.py input
<answer>
... and out comes your answer. Remember proper use of functions, and that you should use the `__name__ == '__main__'` idiom as in all other scripts we write.
(To get it out of the way, the double underscores are pronounced dunder. We often skip saying the trailing dunder, and we thus pronounce the above idiom as "dunder name equals dunder main". We will see lots of dunders in this course.)
- Read a file containing one int per line, make into a list of ints.
- Solve Day 1, parts 1 and 2 of 2019, receiving a gold star.
- [optional] Solve the rest of Advent of Code 2019.
Iterating is one of the most fundamental things we do in programming. It means to consider one item at a time of a sequence of items. The question then becomes "what is a sequence of items"?
We are certainly familiar with some types of sequences, like `range(4)`, or the list `[0, 1, 2, 3]`, or the tuple `(0, 1, 2, 3)`, and you might know that in Python, strings are sequences of one-character strings.
These are things that we can do the following with:
for element in sequence:
    print(element)
But there are more sequences, like sets of the type `set` (which don't have a pre-determined order), and dictionaries (whose sequence becomes a sequence of the keys in the dictionary).
Whenever we use `for`, `map`, `filter`, `reduce`, list comprehensions, etc., Python iterates through an iterable. The iterable is any object that implements an `__iter__` function (or `__getitem__`, but we will skip that for now).

The `__iter__` function returns an iterator. An iterator is any type implementing `__next__`. That function returns elements in order, halting the iteration by raising `StopIteration`.

Quiz: Why not return `None`? (See: Sentinel values)
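The protocol can be exercised directly with the built-ins `iter()` and `next()`; a minimal demonstration:

```python
it = iter([10, 20])    # list.__iter__() hands us a list iterator
print(next(it))        # 10
print(next(it))        # 20
try:
    next(it)           # exhausted: __next__ raises StopIteration
except StopIteration:
    print('done')
```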
From the Python manual:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as `list`, `str`, and `tuple`) and some non-sequence types like `dict`, `file` objects, and objects of any classes you define with an `__iter__()` method or with a `__getitem__()` method that implements Sequence semantics.

Iterables can be used in a for loop and in many other places where a sequence is needed (`zip()`, `map()`, …). When an iterable object is passed as an argument to the built-in function `iter()`, it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call `iter()` or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop.
Slices
As you know by now, `x[·]` is equivalent to `x.__getitem__(·)`. Some objects can take a slice as an argument. A slice is an object of three elements:

- `start` (optional)
- `stop`
- `step` (optional)

We can create a slice object by running `slice(3, 100, 8)`. Let us create a list of the first 1000 integers and then fetch every 8th element up to 100, starting from the 3rd element:
lst = list(range(1000)) # [0, 1, 2, ..., 999]
spec = slice(3, 100, 8)
print(lst[spec])
# [3, 11, 19, 27, 35, 43, 51, 59, 67, 75, 83, 91, 99]
Python offers syntactic sugar for slicing used in subscripts:
lst = list(range(1000)) # [0, 1, 2, ..., 999]
print(lst[3:100:8])
# [3, 11, 19, 27, 35, 43, 51, 59, 67, 75, 83, 91, 99]
This can come in handy if we, e.g., want to get every other element from a list. Then we can write `lst[::2]`. More typically, suppose that we want to pair up elements like this:
lst = [1, 2, 3, 4, 5, 6, 7, 8]
# we want: [(1, 2), (3, 4), (5, 6), (7, 8)]
list(zip(lst[0::2], lst[1::2]))
# [(1, 2), (3, 4), (5, 6), (7, 8)]
If we want to accept slices in our own class, we simply use them as provided in the `__getitem__` function. The following should be enough to get you started.
class X:
    def __getitem__(self, index):
        if isinstance(index, int):
            return index
        elif isinstance(index, slice):
            start, stop, step = index.start, index.stop, index.step
            return start, stop, step
x = X()
x[5]
# 5
x[::]
# (None, None, None)
x[3:100:8]
# (3, 100, 8)
- Iterate over lists, sets, strings, tuples
- Iterate over dicts using raw iteration, over `dict.keys` and `dict.items`
- Iterate over `dict.items` with tuple unpacking. What happens when you use `zip(*dict.items())`?
- Create a class whose instances are iterable using `__getitem__`. Raise `IndexError` when you have no more items.
- Create a class whose instances are iterators, i.e., implement `__next__` and an `__iter__` that returns itself. Remember to raise `StopIteration`.
In Python, any function or method can throw an exception. An exception signals an exceptional situation, a situation that the function does not know how to deal with. It is worth mentioning already now that in Python, exceptions are sometimes also used to support flow control; for example, the exception `StopIteration` is thrown when there are no further items produced by an iterator. More about this later.
There is no way to completely avoid exceptional situations: the users can give the program malformed input, the filesystem or a file on the computer can be corrupt, the network connection can go down.
Suppose that you want to create the function `int_divide` that always returns an integer:

def int_divide(a: int, b: int) -> int:
    if b == 0:
        return 0  # ?? ... that's not correct‽
    return a // b
Obviously, this is not a good idea, since `3/0 ≠ 0`. Indeed, `3/0` is undefined, so the function should simply not return a value, but signal an error. We could of course push the responsibility over to the call site and say that if the user calls this function with illegal arguments, the output is undefined. However, this is not always possible. Consider this scenario:
def count_lines_in_file(filename: str) -> int:
    return len(open(filename, 'r').readlines())
But what if the file doesn't exist? What should we then return? We could again try to force the responsibility over to the user, but in this situation, that would not necessarily work due to possible race conditions.
if os.path.exists(filename):
    # between the check and the call, the file could be deleted, the filesystem unmounted
    wc = count_lines_in_file(filename)
It is for these situations that exceptions exist. (Students asking about monads are kindly asked to leave the premises.)
A bit of warning: Never catch an exception you don't know how to deal with. A program that crashes is nearly always better than a program that is wrong but doesn't inform you. The best is of course to have a bugfree program, but that is often unattainable; the second best (since there will be bugs) is a program that crashes on errors, informing users and developers that something went wrong!
Eric Lippert categorises exceptions into four classes:
- fatal exceptions
- Exceptions you cannot do anything about (e.g. out of memory error, corruption, etc), so do nothing about them
- boneheaded exceptions
- Exceptions that you could avoid being raised, such as index errors, name errors, etc. Write your code properly so that they are not triggered, and avoid catching them
- vexing exceptions
- Exceptions that often arise from poorly written library code and that are raised in non-exceptional cases. Catch them, but it is vexing.
- exogenous exceptions
- Exceptions that you need to deal with, such as I/O errors, network errors, corrupt files, etc. Try your operation and try to deal with any exception that comes.
In the past, it was considered normal program flow in Python to use exceptions. The opinions on this matter are today under debate, with many programmers arguing that it is an anti-pattern. You will often come across the advice "never use exceptions for program flow". An experienced developer can decide for themselves; in this course we recommend using exceptions for exceptional situations.
Exception handling in Python
The simplest way to trigger an exception in your terminal is to simply write `1/0`. You are asking Python to divide a number by zero, which Python determines it cannot deal with gracefully, and throws a division by zero error.

Another easy way to trigger an exception is to run `[][0]`, asking for the first element of an empty list. Again, Python doesn't know what to answer, so throws an `IndexError`.
All exceptions in Python derive from `BaseException`. For example, `ZeroDivisionError` is an `ArithmeticError`, which in turn is an `Exception` (which is a `BaseException`). The `IndexError` derives from `LookupError`, which again is an `Exception`. The exception hierarchy allows for very fine-grained error handling: you can for example catch any `LookupError` (`IndexError` or `KeyError`, or maybe one you define yourself?), while avoiding catching an `ArithmeticError` in case you don't know how to deal with such an error.
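To see the hierarchy in action, a small sketch: catching the base class `LookupError` handles both `IndexError` and `KeyError`, while an `ArithmeticError` would sail through untouched.

```python
for bad in (lambda: [][0], lambda: {}['k']):
    try:
        bad()
    except LookupError as err:    # the base class catches both subclasses
        print(type(err).__name__)
# IndexError
# KeyError
```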
def throwing():
    raise ValueError('This message explains what went wrong')
throwing()
The above will "crash" your Python instance. We can "catch" the error, and either suppress it, or throw a different exception, or even re-throw the same exception:
try:
    throwing()
except ValueError:
    print('Caught an exception')
    #raise  # <-- a single `raise` will re-throw the ValueError
To catch the specific error to get its message, we type `except ValueError as err`, where `err` is your variable name of choice.
try:
    throwing()
except ValueError as err:
    print('Caught an error', str(err))
Warning again: The above is very bad practice in general; never do that!!
`try...finally`: The `try` statement has three optional blocks: `except`, `else`, and `finally`.

Cleaning up: The `finally` block is a block that is always* run.
try:
    raise ValueError()
finally:
    print('Goodbye')
Since the `finally` block is always run, you should be very careful with returning from the `finally` block. Quiz: what is returned?
def throwing():
    try:
        return True
    finally:
        return False
The only block we haven't considered up until this point is the `else` block. The `else` block is executed if and only if the `try` block does not throw an exception.
def throwing(n):
    value = float('inf')
    try:
        value = 1/n
    except ZeroDivisionError:
        print('Division by zero')
    else:
        print('Divided successfully')
    finally:
        print('Returning', value)
    return value
Note that if you call `x = throwing('a')`, an exception will leak through - it is presented with an exception we have not considered - the `else` block is skipped, and `x` will remain undefined.
- Write a program that reads input from the user until the user types an integer. In case the user types a single `q`, the program should quit.
- A `try` statement can have several `except` handlers. Write a program that catches `IndexError` and `ValueError` and does different things depending on which error was thrown.
- Define your own exception class and throw and catch it.
Explain nested functions and their scope
See the scope of non-local variables
Overwrite the variable in the inner function and see what happens with the variable in the outer function:
def fun():
    a = 1
    def infun():
        a = 2
        print(a)
    infun()
    print(a)

fun()
(prints `2` and `1`, obviously)
Now,
def fun():
    a = 1
    def infun():
        nonlocal a
        a = 2
        print(a)
    infun()
    print(a)

fun()

(now prints `2` and `2`: the `nonlocal` declaration makes the inner assignment rebind the outer variable)
Creating functions with special behaviour

A more typical example of closures is the following:

def n_multiplier(n):
    def mul(x):
        return x * n
    return mul

quadruple = n_multiplier(4)
print(quadruple(100))  # prints 400
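A classic gotcha worth knowing (not from the course text above, but closely related): closures capture variables, not values, so lambdas created in a loop all share the loop variable.

```python
# All three lambdas close over the *same* variable i:
funcs = [lambda: i for i in range(3)]
print([f() for f in funcs])        # [2, 2, 2]

# A default argument is evaluated at definition time, binding the value:
funcs = [lambda i=i: i for i in range(3)]
print([f() for f in funcs])        # [0, 1, 2]
```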
- Create a function that defines an inner function and returns that function
- Create a function that defines a variable and an inner function and the inner function refers to the variable; return that function
- Experiment with the keywords `global` and `nonlocal`.
- Define two variables `a` and `b`, and change their values from inside a function. What happens with `a` and `b`? Try with `global a` afterwards.
- Bind up a mutable variable. Change it outside the function. Observe the behavior.
The use case of context managers is any situation where you find yourself writing one line of code and thinking "now I need to remember to do [something]". The most usual example is the following:
fh = open('file.txt', 'r') # now I must remember to close it
value = int(fh.readlines()[0])
See how easy it is to forget to close `fh`?
Indeed, we can try harder to remember to close it:
fh = open('file.txt', 'r')
value = int(fh.readlines()[0])
fh.close() # phew, I remembered
However, when the `int` call raises a `ValueError`, the file isn't closed after all!
How do we deal with this situation? Well, we have to do the following:
fh = open('file.txt', 'r')
try:
    value = int(fh.readlines()[0])
finally:
    fh.close()  # closes the file even if an exception is thrown
(See the section on exception handling if the `finally` keyword eludes you.)

The above scenario is handled with a context manager, which uses the `with` keyword:
with open('file.txt', 'r') as fh:
    value = int(fh.readlines()[0])
Unsurprisingly, there are many more examples where we need to remember to release or clean up stuff. A couple of examples are
- open a database connection: remember to close it lest we risk losing commits
- acquire a lock: remember to release it lest we get a dead/livelock
- start a web session: forgetting to close might result in temporary lockout
- temporarily modifying state (`pushd` example in exercises)
The context manager
A context manager is a class which has the methods

- `__enter__(self)`
- `__exit__(self, type, value, traceback)`
The `__enter__` method is called when the object is used in a context manager setting, and the return value of `__enter__` can be bound using the `as <name>` assignment.

Of course, then, the `__exit__` method is called whenever we exit the `with` block. As you see, the `__exit__` method takes a bunch of arguments, all related to whether there was an exception thrown from within the `with` block. If you return `True` from the `__exit__` method, you suppress any exception thrown from the `with` block.
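A minimal sketch of that suppression mechanism (the class name `Suppress` is our own invention; the standard library already ships an equivalent as `contextlib.suppress`):

```python
class Suppress:
    def __init__(self, exc_type):
        self._exc_type = exc_type

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        # Returning True tells Python to swallow the in-flight exception;
        # returning a falsy value lets it propagate.
        return type is not None and issubclass(type, self._exc_type)

with Suppress(ZeroDivisionError):
    1/0
print('still alive')  # the ZeroDivisionError was suppressed
```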
Here is how to implement `open(·, 'r')` by ourselves:
class Open:
    def __init__(self, fname):
        self._fname = fname
        self._file = None

    def __enter__(self):
        self._file = open(self._fname, 'r')
        print('\n\nFILE OPENED!\n\n')
        return self._file

    def __exit__(self, type, value, traceback):
        self._file.close()
        print('\n\nFILE CLOSED!\n\n')

with Open('myopen.py') as f:
    print(''.join(f.readlines()))
Now, we can force an exception by mis-spelling `readlines` and observe that the file is actually closed.
with Open('myopen.py') as f:
    print(''.join(f.readxxxlines()))
Defining context managers with the `contextlib` decorator

The `contextlib` library gives us a way to write context managers without needing to define a class, and without specifying `__enter__` and `__exit__`. Using the `@contextmanager` decorator, we can create a function that yields exactly once; whatever comes before the `yield` is interpreted as `__enter__`, and whatever comes after the `yield` is interpreted as `__exit__`. By `yield`-ing a value, we allow binding in the `as [name]` expression.
import contextlib

@contextlib.contextmanager
def mgr():
    print('hello')
    try:
        yield  # yield something
    finally:
        print('goodbye')

def myfun():
    print('pre')
    with mgr():
        x = 1
        print(x)
        int('ca')
        x = 2
        print(x)
    print('post')

myfun()
- Create a context manager using the `contextmanager` decorator
- Print before and after yield, observe
- Raise an exception and observe the post-print is present
- Implement the `open` context manager as `my_open`.
- Implement the `pushd` decorator as a context manager.
- Create a context manager using `__enter__` and `__exit__`. Experiment with suppressing exceptions from leaking through.
- Is it possible to have the name `pushd` as both a decorator and a context manager?
- Implement `tmpdir` as a context manager.
As a programmer, we constantly need to install packages; this is one of the great benefits of open source software. We all benefit from the combined efforts of the software community!
At the same time, when doing actual development, we need to have full control of our environment: which packages do we actually use, which versions of these packages, and which do we not need?

Especially when we are working on several projects with different requirements, this could potentially become problematic. Enter virtual environments.
Virtual environments
A virtual environment is what it sounds like; it is like a virtual machine that you can activate (or enable), install a lot of packages into, and the packages are then only installed inside that environment. When you are done, you can deactivate the environment, and you no longer see the packages.
(By the way, a virtual environment is only a folder!)
There are, in other words, three steps:
- create a virtual environment
- activate the virtual environment
- deactivate the virtual environment
[trillian@iid ~]$ python3 -m venv my_env
[trillian@iid ~]$ source my_env/bin/activate
(my_env) [trillian@iid ~]$ which python
/home/trillian/my_env/bin/python
(my_env) [trillian@iid ~]$ which pip
/home/trillian/my_env/bin/pip
(my_env) [trillian@iid ~]$ pip install numpy
Collecting numpy
...
Installing collected packages: numpy
Successfully installed numpy-1.18.1
(my_env) [trillian@iid ~]$ python -c "import numpy as np; print(np.__version__)"
1.18.1
A package in Python is a folder containing a file named `__init__.py`.

We will create a very short package `xl` containing one module called `xl`.
The file tree in our project looks like this:
[trillian@iid ~/proj]$ tree
.
├── requirements.txt
├── setup.py
├── tests
│ ├── __init__.py
│ └── test_units.py
└── xl
└── __init__.py
2 directories, 5 files
You can for now ignore the tests, which we will come back to in the section about Test Driven Development.
The `setup.py` file is the one that makes Python able to build this as a package, and it is very simple:
# setup.py
import setuptools

setuptools.setup(
    name='xl',
    packages=['xl'],
    description='A small test package',
    author='Trillian Astra (human)',
)
Run `python setup.py install` to install it (remember to activate your virtual environment first).
- Create two virtual environments where we install different versions of `numpy`
- Write a module
- Write a `setup.py` file
- Create a virtual environment, install the package, delete the virtual environment
- Add dependencies to the module (`xlrd`, `pandas`)
- Implement `entry_points` to call a function in `xl`.
- Make `xl` read an Excel file (input argument) and output its columns.
- `pip install` a package directly from GitHub.
- Install `black` and run it on your module. Why can it be good to use `black` in a project?
class X:
    pass

x = X()
x()  # raises TypeError: 'X' object is not callable
Okay, so `x`, which is of type `X`, is not callable. What is callable? Clearly functions, methods, and constructors? Even `type`s are callable!

Can we create a class of, e.g., signals that you could call? Yes, indeed, by simply implementing `__call__`:
class Signal:
    def __init__(self, val):
        self.val = val

    def __call__(self):
        return self.val

s = Signal(4)
s()  # returns 4 !
Occasionally we want to create functions, but do not care to name them. Suppose for some reason that you would like to pass the Euclidean distance function into a function call. Then you could be tempted to do something like this:
def the_function_we_currently_are_in(values, ...):
    def dist(a, b):
        return math.sqrt((a.x - b.x)**2 + (a.y - b.y)**2)
    return do_things(values, dist)
However, the name of the function `dist` is not really necessary. In addition, it is slightly annoying to write functions inside other functions. A lambda is an anonymous function, i.e. a function without a name, that we specify inline. With a lambda, the function call would look like this:
return do_things(
    values, lambda a, b: math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2)
)
A lambda expression is an inline function declaration with the form

lambda [arguments_list]: expression

and the lambda expression returns a function.

>>> type(lambda: None)
<class 'function'>
We can bind the function to a name as usual:
>>> dist = lambda a, b: abs(a - b)
>>> dist(5, 11)
6
If we want to sort a list by a special key, e.g. x², we can simply use the lambda `lambda x: x**2` as input to `sorted`, i.e.
>>> import random
>>>
>>> sorted([random.randint(-5, 5) for _ in range(10)], key=lambda x: x**2)
[-1, 1, 3, 3, 4, -5, -5, 5, -5, 5]
(Note that `sorted` is a stable sort in Python ...)
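Stability means that elements which compare equal keep their input order; a small illustration (the data here is made up):

```python
data = [(1, 'b'), (0, 'a'), (1, 'a'), (0, 'b')]
# Sort on the number only; ties keep their original relative order:
print(sorted(data, key=lambda t: t[0]))
# [(0, 'a'), (0, 'b'), (1, 'b'), (1, 'a')]
```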
Varargs, or `*args, **kwargs`

You will occasionally need to create functions that are variadic, i.e., a function that takes any number of arguments, and in fact, any kind of keyword argument. How can we make a function, say, `log`, which could accept both `log("hello")` and also `log("hello", a, b, c=x, d=y, e=z)` and so on?
Enter varargs. Consider this function:
def summit(v1, v2=0, v3=0, v4=0, v5=0, v6=0, v7=0, v8=0):
    return v1 + v2 + v3 + v4 + v5 + v6 + v7 + v8
It almost does the job, but not completely. It only handles eight arguments, and it only handles a fixed set of keyword arguments.
In Python, we actually implement the function like this:
def summit(v1, *vals):
    return v1 + sum(vals)
Now, we can call it like this:
>>> summit(2, 3, 4, 5, 6, 7, 8)
35
Let's inspect:
>>> def summit(v1, *vals):
...     print(type(vals))
...     return v1 + sum(vals)
...
>>> summit(2, 3, 4, 5, 6, 7, 8)
<class 'tuple'>
35
As you can see, `vals` becomes a `tuple` of the values the user provides. However, it doesn't fix all our problems:
>>> summit(2, x=2)
TypeError: summit() got an unexpected keyword argument 'x'
For this, we use the `**` operator:
def summit(v1, *vals, **namedvals):
    return v1 + sum(vals) + sum([val for _, val in namedvals.items()])
Calling it:
>>> summit(2, 3, 4, x=2, y=1000)
1011
You can also do the "opposite", namely using the `*` and `**` operators at the call site. Recall our `dist(a, b)` from above. If we have a list `pair = [Pos(1,0), Pos(0,1)]`, we can call `dist` simply like this: `dist(*pair)`.

You will very often see `zip` being called with the `*` operator; we leave the decoding of this as an exercise to the reader.
Keyword-only

In Python 3 (PEP 3102), the asterisk `*` was introduced as a separator between the ordinary parameters and the keyword-only parameters. A keyword-only parameter is a parameter that has to be given to a function with the keyword:
The following function call uses keyword-only arguments in the call:
dist(a=Pos(0,1), b=Pos(1,0))
As opposed to this call which uses positional arguments:
dist(Pos(0,1), Pos(1,0))
To force the user to call functions with keyword-only arguments, we use the asterisk:
def dist(a, b, *, scalar):
    return scalar * math.sqrt((a.x - b.x)**2 + (a.y - b.y)**2)
Now Python would not allow you to call this function with positional arguments only; `scalar` has to be passed using the keyword:
>>> dist(Pos(1,0), Pos(0,1), 2.71828)
# TypeError: dist() takes 2 positional arguments but 3 were given
>>> dist(Pos(1,0), Pos(0,1), scalar=2.71828)
3.844228442327537
Positional-only

In Python 3.8, the idea of keyword-only parameters was extended to also include positional-only parameters. The idea is that it could be made illegal to call `dist(a=Pos(1,0))`, forcing the user to call the function without the keyword, in other words, `dist(Pos(1,0))`.

Similar to the asterisk, the slash `/` is used as a delimiter between the positional-only parameters and the rest.
def dist(a, b, /):
    ...
Combining the two ideas yields this result:

def dist(a, b, /, *, scalar):
    return scalar * math.sqrt((a.x - b.x)**2 + (a.y - b.y)**2)

Now, the only way to call `dist` is as `dist(p1, p2, scalar=val)`.
Intermezzo: a quiz
def run_simulation(realisations=[], ignore=[]):
    ignore.append(1)  # Always ignore first realization
    return sum(realisations) - sum(ignore)

def main():
    assert run_simulation([1,2,3]) == 5
    print(run_simulation([1,2,3]))

if __name__ == '__main__':
    main()
Exercise:
What is printed from the above program?
Sentinel values

A sentinel value is a parameter (or return value) that is uniquely identified, and that typically conveys the meaning of not specified or non-existing. Usually, the `None` value is the one we use:
def compute(vals, cache=None):
    if cache is None:
        cache = {}
    result = []
    for v in vals:
        if v not in cache:
            cache[v] = expensive_compute(v)
        result.append(cache[v])
    return result
Here, not specifying `cache` is the same as using no cache, or `{}`, as input. `None` is used as a sentinel value.
However, occasionally, `None` is not a good choice, as `None` could be a reasonable value. In that case, we can use `object()` as a sentinel value:
SENTINEL = object()

def fun(val, default=SENTINEL):
    pass
The implementation is left as an exercise for the reader
- Create a function that takes a list (mutable object) as default argument.
- Create a class whose instances are callable.
- Experiment with `filter`, `map`, `reduce` on `lambda`s.
- Create functions that take keyword-only arguments.
- Experiment with things like this: `zip(*[[1,2,3], 'abc', [3,4,5]])`
- Spell out in detail what happens here:
  >>> list(zip(*[(1,2), (3,4), (5,6)]))
  [(1, 3, 5), (2, 4, 6)]
  >>> list(zip([(1,2), (3,4), (5,6)]))
  [((1, 2),), ((3, 4),), ((5, 6),)]
- Implement `min(iterable, default=None)` that tries to find a minimal element in the iterable, and returns `default` if `default` is provided, otherwise raises `ValueError`. What happens if `iterable = [None]`?
A decorator is a "metafunction"; a way to alter the behavior of a function that you write. A decorator can change the behavior completely, change the inputs, and the return value of the function.
Consider the very simple decorator `timeit`, which prints the time a function uses to return:
@timeit
def fib(n):
    return 1 if n <= 2 else fib(n-1) + fib(n-2)
Calling `fib(35)` should result in the following in the terminal:
>>> fib(35)
fib took 2.1 sec to complete on input 35
9227465
First we observe that
@timeit
def fib(n):
    pass

is syntactic sugar for

def fib(n):
    pass
fib = timeit(fib)
The main idea is that we implement `timeit` something like this:
def timeit(fib):
    def new_fib(n):
        start = now()
        result = fib(n)
        stop = now()
        print(stop - start)
        return result
    return new_fib
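A runnable sketch of this idea, using `time.perf_counter` and `functools.wraps` (the message format is our own; also note that because recursive calls go through the decorated name, a recursive `fib` prints a line per call):

```python
import functools
import time

def timeit(func):
    @functools.wraps(func)  # preserve func.__name__, docstring, etc.
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f'{func.__name__} took {elapsed:.4f} sec to complete')
        return result
    return wrapper

@timeit
def fib(n):
    return 1 if n <= 2 else fib(n - 1) + fib(n - 2)
```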
- Use `lru_cache` to memoize `fib`
- Create a decorator that hijacks a function, printing and returning
- Create a decorator that takes a string and hijacks a function, printing the string and returning the string
- Create a decorator that prints a string before the execution of the original function
- Create a decorator that takes a string and prints the string before and after the execution of the original function
- Create a decorator that takes a function as an argument, and calls that function before and after the execution of the original function
- Create a decorator that takes a function `f` and returns `f(val)`, where `val` is the output of the original function
- Create a class that acts like a decorator (see also callable objects)
- Use the `decorator` functool.
- Create a decorator `pushd` that changes `cwd` before and after the function call.
- Use the `singledispatch` functionality from `functools` to overload several functions
- Think about how you would implement `singledispatch` yourself.
- Use `functools.wraps` to define a decorator.
- Write a decorator that takes arbitrary arguments and keyword arguments.
- Implement `lru_cache`.
When we create a class `Pos` with members `x` and `y`, we allow a user to update `x` and `y` at will. In other words, it would be totally reasonable for a user of `Pos` to do the following:

location = Pos(0, 2)
location.x = 1
location = Pos(0, 2)
location.x = 1
But occasionally, we don't want people to touch our private parts, in which case we can call a member `_x` (or even `__x`). This tells the user that if you modify the content of `_x` (or `__x`), we no longer guarantee that the object will work as expected. This is called the visibility of the member `x`.
Now, in Java, one would add two methods (per member) called `getMember` and `setMember` (in this case `get_x` and `set_x`), but in Python, we can actually add an object that appears to be a member, `x`, but whose altering triggers a function call.
To make this a more concrete example, consider the class called `Square`, which has two properties, `width` and `height`. Obviously, since this is a square, they should be the same. So implementing it like this would be a bad idea:
class Square:
    def __init__(self, width, height):
        if width != height:
            raise ValueError('height and width must be the same')
        self.width = width
        self.height = height

    def get_area(self):
        return self.width * self.height
However, a user may do the following:
s = Square(3,3)
s.width = 4
s.get_area() # returns 12
Using private members with getters and setters looks like this, which is much better:
class Square:
    def __init__(self, width, height):
        if width != height:
            raise ValueError('height and width must be the same')
        self._width = width
        self._height = height

    def set_width(self, width):
        self._width = width
        self._height = width

    def set_height(self, height):
        self._width = height
        self._height = height

    def get_height(self):
        return self._height

    def get_width(self):
        return self._width

    def get_area(self):
        return self._width * self._height
Now, the user cannot make an illegal square, unless they access the private members.
However, the getters and setters belong to the Java community, in Python we can do something that looks nicer.
class Square:
    def __init__(self, width, height):
        if width != height:
            raise ValueError('height and width must be the same')
        self._width = width
        self._height = height

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, width):
        self._width = width
        self._height = width

    @property
    def height(self):
        return self._height

    @height.setter
    def height(self, height):
        self._height = height
        self._width = height

    @property
    def area(self):
        return self.width * self.height
Using the @property
decorator allows a user to write
>>> s = Square(5,5)
>>> s.area
25
>>> s.width
5
>>> s.height
5
>>> s.width = 100
>>> s.area
10000
>>> s.height
100
Note that using properties is a great way to make a public member variable private after users have started to (ab)use your (leaky) implementation!
Another case where you want to hide private members is a class that keeps some additional book-keeping, for example a collection that allows people to add elements while internally keeping track of how many have been added.
Comparing objects
One interesting thing about our Square implementation above is that if we make two "identical" objects, s = Square(10, 10) and t = Square(10, 10), we can observe that s != t. This is counter-intuitive at first sight, especially since both members of Square have the same value.
To be able to compare objects like this (equality), we implement the __eq__ method:
class Square:
# ...
def __eq__(self, other):
return self.width == other.width and self.height == other.height
This adds the following properties to the class:
- squares that are the same are equal
- squares that are not the same are non-equal
which sounds like all you want. But it has a bug: 's' == s makes your application crash!
When Python is asked to evaluate a == b, it first checks if a.__eq__(b) returns True or False. However, a.__eq__ can choose to return a special symbol, NotImplemented, which tells Python to instead check b.__eq__(a).
In the case above, 's' == s, the str.__eq__ method returns NotImplemented, so Python calls s.__eq__('s'), which in turn checks s.width == 's'.width. However, the str object has no property width, so an AttributeError is raised.
Here is a better implementation:
class Square:
# ...
def __eq__(self, other):
if type(self) != type(other):
return NotImplemented # leave decision to `other`
return self.width == other.width and self.height == other.height
In Python 3.7, a concept called dataclasses was introduced to the standard library. A dataclass is what it sounds like: a way to create classes primarily for storing data. Here is a simple implementation of Position as a dataclass:
import dataclasses

@dataclasses.dataclass(frozen=True, eq=True)
class Position:
x : float
y : float
def __add__(self, other):
return Position(self.x + other.x, self.y + other.y)
(The rest of the functionality is left as an exercise for the reader.)
p = Position(0.2, 0.3)
print(p)
# Position(x=0.2, y=0.3)
As you can (or will) see, the dataclass comes with a Pandora's box of pleasant surprises.
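A few of those surprises can be demonstrated directly (restating the Position dataclass above so the snippet is self-contained):

```python
import dataclasses

@dataclasses.dataclass(frozen=True, eq=True)
class Position:
    x: float
    y: float

p = Position(0.25, 0.5)
q = Position(0.25, 0.5)

print(p == q)    # True: the generated __eq__ compares field by field
print(p in {q})  # True: frozen=True makes instances hashable
try:
    p.x = 1.0    # frozen=True forbids assignment
except dataclasses.FrozenInstanceError:
    print('Position is immutable')
```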
If you have to make your classes mutable, consider implementing the class using design by contract (DbC) (also known as contract-driven development).
In contract-driven development, we try to specify (in code) the preconditions and postconditions of a method call, as well as a data invariant for each type.
Suppose that you have a class that keeps, e.g., fields keys, values, size, and name, and that len(keys) should always be at most size, and that len(keys) should always be the same as len(values). In addition, let's say that name should always be a non-empty string. In this case, our class could have such a method:
class RollingDict:
def _datainvariant(self):
assert len(self.keys) == len(self.values)
assert len(self.keys) <= self.size
assert isinstance(self.name, str) and self.name
return True
Now, we can, for each method call, call the _datainvariant before and after the method call:
class RollingDict:
def insert(self, k, v):
assert self._datainvariant()
self.keys.append(k)
self.values.append(v)
assert self._datainvariant()
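A usage sketch showing the invariant catching a bug at the call site (we assume a simple __init__, which the excerpt above omits):

```python
class RollingDict:
    def __init__(self, size, name):
        # hypothetical constructor; the original excerpt omits it
        self.keys, self.values = [], []
        self.size, self.name = size, name

    def _datainvariant(self):
        assert len(self.keys) == len(self.values)
        assert len(self.keys) <= self.size
        assert isinstance(self.name, str) and self.name
        return True

    def insert(self, k, v):
        assert self._datainvariant()   # precondition check
        self.keys.append(k)
        self.values.append(v)
        assert self._datainvariant()   # postcondition check

d = RollingDict(size=1, name='demo')
d.insert('a', 1)      # fine: one key, size limit respected
try:
    d.insert('b', 2)  # violates len(keys) <= size
except AssertionError:
    print('data invariant violated')
```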
Some third-party libraries exist that allow the data invariant to be applied as a decorator, and where you can also specify pre- and postconditions as decorators:
@contract(a='int,>0', b='list[N],N>0', returns='list[N]')
def my_function(a, b):
...
- Create a Position class with dist, norm, __add__, with @property
- Add repr, str and hash, eq
- Implement the same class with the @dataclass decorator
- Implement RollingDict from Design by Contract above, removing the oldest element if its size grows above the allowed size.
- Implement the _datainvariant call as a decorator.
When you try to print an object obj, Python calls str(obj), which looks for two methods, in order:
- __str__
- __repr__
The former, __str__, is meant to provide a string representation of an object that is usable for an end user. The __str__ function may choose to discard parts of the state of obj, and try to make the string "nice to look at".
However, the latter, __repr__, is meant as a debugging string representation, and is intended for developers to look at. If possible, it is a good idea to make the object constructable using only the information provided by __repr__. The __repr__ function can be called explicitly by using the built-in function repr.
Take a look at how Python implements __str__ and __repr__ for datetime:
import datetime
now = datetime.datetime.now()
str(now)
# '2020-02-03 09:03:43.147668'
repr(now)
# 'datetime.datetime(2020, 2, 3, 9, 3, 43, 147668)'
Observe that the latter returns a string that you could actually eval (warning: bad practice), and which would return an identical object (see also: __eq__).
now == eval(repr(now))
# True
Format strings
You have probably seen that it is possible to concatenate two strings using +:
'Hello, ' + 'world'
# 'Hello, world'
There are obvious shortcomings, especially when dealing with non-strings (they have to be manually cast), and when interleaving variables into larger strings:
x = 3.14
y = 2.71828
print('Here x (' + str(x) + ') is almost pi and y (' + str(y) + ') is ...')
# Here x (3.14) is almost pi and y (2.71828) is ...
As you can see, this is quite annoying to write, as well as to read. A minor improvement that we have seen is the string modulo operator %:
name = 'Arthur Dent'
print('Hello, %s, how are you?' % name)
# Hello, Arthur Dent, how are you?
The modulo operator introduces its own mini-language for dealing with integers, floating point numbers (e.g. rounding), for padding and centering strings, etc.
print('Integer: %2d, rounding float: %3.2f' % (1, 3.1415))
# Integer: 1, rounding float: 3.14
print('Percent: %.2f%%, (E/e)xponential: %5.2E' % (2.718281828459045, 149597870700))
# Percent: 2.72%, (E/e)xponential: 1.50E+11
However, the modulo operator gets confusing to work with when you have many arguments, and especially if some arguments are repeated. That is why format strings were introduced:
print('Here x is {x}, and x*y, {x}*{y} = {mulxy}'.format(x=2, y=3, mulxy=2*3))
# Here x is 2, and x*y, 2*3 = 6
As you can see, it supports repeated arguments, and the arguments do not have to be in the same order:
print('a = {a}, b = {b}'.format(b=4.12, a='A'))
# a = A, b = 4.12
It even nicely handles different types.
Occasionally (predominantly while debugging), we can even throw our entire environment into the format string:
print('x={x}, y={y}, a={a}'.format(**locals()))
# 'x=12, y=13, a=A str'
However, this is not good practice.
f-strings
As of Python 3.6, we get something which is even nicer than format strings, namely f-strings. When prefixing a string with a single f, you ask Python to replace all expressions in {·} with the evaluated expression:
x = 3.14
out = f'Here x is {x}, and x**2 is {x**2}'
print(out)
# Here x is 3.14, and x**2 is 9.8596
This is very convenient to write, and even more convenient to read, as we now can read everything inline and do not have to jump back and forth between braces and the format arguments.
You can even see now how easy it is to make nice __str__ and __repr__ strings by just writing:
def __repr__(self):
return f'Class(a={self.a}, b={self.b})'
Note that by default, f-strings use str, but they can be forced to use repr by specifying the conversion flag !r (recall now from above):
f'{now}'
# '2020-02-03 09:05:54.206678'
f'{now!r}'
# 'datetime.datetime(2020, 2, 3, 9, 5, 54, 206678)'
And of course, you can use the same type specifiers with f-strings to limit the number of decimals, using a single : in the expression:
e = 2.718281828459045
f'{e:.5f}'
# '2.71828'
Finally, in Python 3.8, f-strings support = for self-documenting expressions and debugging. Notice how the returned string prints the variable name as well.
x = 3.14
y = 2.71828
a = 'A str'
print(f'My vars are {x=}, {y=}, {a=}')
# My vars are x=3.14, y=2.71828, a='A str'
- Print a Pandas dataframe and a Numpy matrix.
- Print a function.
- Print a class (the class, not an object).
- Create a class and define the repr and str methods. What are the differences?
- Use str(·) on the object and observe.
- Use repr(·) on the object and observe. Conclude.
- Center a short string using f-strings, pad left using f-strings.
- Return a non-string in str.
- Print a class. Which method is being called?
- Create a class with only one of the two methods, see what happens.
- Create a very simple Complex class and implement __eq__, __str__, and __repr__. Ensure that eval(repr(c)) == c.
The most basic types are int, bool, and None. Integers have infinite precision, and bool is a subtype of int holding only the values 0 and 1, albeit named False and True, respectively. Since True is just a different name for 1, the fact that 2 + True*3 == 5 should not surprise you much. However, such hogwash should rarely be used.
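These claims can be checked directly in the REPL:

```python
# bool is a subclass of int; True and False are just named 1 and 0
print(issubclass(bool, int))     # True
print(isinstance(True, int))     # True
print(True + True)               # 2: addition falls back to int arithmetic
print(2 + True * 3)              # 5, as claimed above
print(sum([True, False, True]))  # 2: counting with booleans
```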
The None keyword is an object, and in fact the only object (a singleton, see id(None)) of type NoneType. It is often used as a sentinel value (see Calling, lambdas, and functions). Note that None is not the same as 0, False, '' or anything else except None. For every element, None == x if and only if x is None. Whenever we want to check if a variable is None, we use the is operator:
if x is None:
print('x was None')
Note that instead of writing not x is None, we prefer the more readable:
if x is not None:
print('x is not None')
Floating point numbers, however, are much more messy than the beautiful int, bool, and None. The issue being, of course, that
Squeezing infinitely many real numbers into a finite number of bits requires an approximate representation.
— What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Unfortunately, the curse of infinity materializes as follows:
>>> 1.2 - 1.0
0.19999999999999996
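For this reason, avoid comparing floats with ==; math.isclose (Python 3.5+) compares within a tolerance instead:

```python
import math

print(1.2 - 1.0 == 0.2)              # False: both sides carry rounding error
print(math.isclose(1.2 - 1.0, 0.2))  # True: equal within relative tolerance
```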
A Python float is typically backed by a C double, and we can get some information about the float on our computer and in our environment by inspecting sys.float_info:
sys.float_info(
max=1.7976931348623157e+308,
max_exp=1024,
max_10_exp=308,
min=2.2250738585072014e-308,
min_exp=-1021,
min_10_exp=-307,
dig=15,
mant_dig=53,
epsilon=2.220446049250313e-16,
radix=2,
rounds=1)
We have seen the bool, int, float, and None types many times. However, there is one more type that you don't see too often:
1.4142 + 0.7071j
# (1.4142+0.7071j)
complex(1, 2) creates the complex number 1 + 2i (denoted (1+2j)):
type(1.4142 + 0.7071j)
# complex
and it supports all the arithmetic operations you would expect, such as +, -, *, /, ** (exponentiation). However, since the complex numbers do not have a total order, you can neither compare them (using <), nor round them (any of round, math.ceil, math.floor triggers an error). The latter also means that whole division (//) is not defined.
a = 1.4142 + 0.7071j
((1 + (3*a)) ** 2).conjugate()
# (22.984941069999994-22.242254759999994j)
A complex number cannot be cast to an int (TypeError: can't convert complex to int), but you can get the real part and the imaginary part out by accessing their respective properties (see Object oriented programming members and Calling, lambdas, and functions):
print(a.real)
# 1.4142
print(a.imag)
# 0.7071
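For more, the cmath module provides complex versions of the math functions, for example modulus and phase:

```python
import cmath

a = 1.4142 + 0.7071j
print(abs(a))              # modulus: sqrt(real**2 + imag**2)
print(cmath.phase(a))      # argument (angle) in radians
r, phi = cmath.polar(a)    # polar coordinates: (modulus, phase)
print(cmath.rect(r, phi))  # back to rectangular form (up to rounding)
```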
In addition to the basic types, there are two more numeric types that occasionally come in handy: decimal.Decimal and fractions.Fraction.
decimal.Decimal(1.1)
→Decimal('1.100000000000000088817841970012523233890533447265625')
fractions.Fraction(42,12)
→Fraction(7, 2)
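Note that Decimal should normally be constructed from a string; constructed from a float, it faithfully inherits the float's binary rounding error (as the example above shows):

```python
from decimal import Decimal
from fractions import Fraction

# From a float, Decimal exposes the binary approximation:
print(Decimal(1.1))
# 1.100000000000000088817841970012523233890533447265625

# From a string, you get the exact decimal value you wrote:
print(Decimal('1.1') + Decimal('2.2'))  # 3.3, exactly

# Fractions do exact rational arithmetic:
print(Fraction(1, 3) + Fraction(1, 6))  # 1/2
```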
- Find the truthiness values for the basic types
- Create a function complex_str that prints a complex number (a+bi) using i instead of j.
- Parse a complex number from the user
- Without the REPL, what does -10 // 3 yield?
- Without the REPL, what does -10 % 3 yield?
- Without the REPL, what does 10 % -3 yield?
In functional programming we put a higher focus on the input and output of functions, with little storing of information. We are familiar with many examples that are "purely functional", e.g. dist(a, b) → abs(a-b), max, +, etc. These functions take some arguments and return a new value without modifying the input arguments.
Here is a non-functional example:
def __iadd__(a: Pos, b: Pos) -> None:
a.x += b.x
a.y += b.y
Here is the functional "equivalent":
def __add__(a: Pos, b: Pos) -> Pos:
return Pos(a.x + b.x, a.y + b.y)
While both examples are easy to grasp and understand, the latter is much simpler to reason about; in the second example, the state never changes. This makes the second version much easier to test as well!
An object which can be modified (like in the first example) is called a mutable object. The problem with mutable objects in Python is that you never really know who holds a reference to the object, which means that the object can be modified right under your own nose from far away.
An object which cannot be modified is called immutable. You will experience that immutable data is much easier to reason about and to deal with. An immutable object can safely be passed around to other functions and threads without worrying that they might change its content.
As a correctness-focused developer you should strongly prefer immutable data structures over mutable. Your code will be safer with fewer bugs, easier to understand, and easier to test.
Notice that even though a tuple is immutable, if its data is mutable, the tuple will only be reference-immutable: it will forever point to the same objects, but the object may be changed (under your nose).
>>> l = [1,2,3]
>>> t = (0, l, 4)
>>> print(t)
(0, [1, 2, 3], 4)
>>> l[0] = 5
>>> print(t)
(0, [5, 2, 3], 4)
In this example, the list l is the "same" list, but, being mutable, its content changed.
Functional-style programming
If your program has no side-effects, it is called a purely functional program. It is sometimes not easy to come up with the functional "equivalent" (if it exists), so let us look at one example. Suppose you have a list lst and you are no longer interested in keeping the first element. You want to define a function that removes the first element:
def remove_first(lst):
lst.pop(0)
return lst
However, there is a different way of looking at the problem, namely that we implement a function rest that returns the list containing all but the first element.
def rest(lst):
return lst[1:]
Now you can simply write lst = rest(lst), and you have the list without the first element. The benefit is that since there are no side-effects, you do not care whether other places have references to lst, and in addition the latter function is simpler to test.
Some of the benefits are:
- concurrency — the rest function above is thread-safe, whereas remove_first is not
- testability — functional programs are easier to test since you only test that the output is as expected given the correct input (they are always idempotent wrt input)
- modularity — functional style programming often forces you to make better design decisions and to split functions up into their atomic parts
- composability — when a function takes a type and returns the same (or another) type, it is very easy to compose several functions, e.g. sum(filter(·, map(·, ·)))
It is now time to revisit this exercise from Calling, lambdas, and functions:
- Experiment with filter, map, reduce on lambdas.
There are some modules in the standard library that are good to be aware of; in particular, we will mention:
Bisect
The bisect module is for keeping a list sorted, so this is not a very "functional" functionality; however, it also offers binary search via bisect.bisect_left.
Itertools
The itertools module contains a wide array of iterator building blocks, which is perfect for functional programming. If you ever need advanced iteration functionality, chances are it is already implemented in itertools; some examples are dropwhile, groupby, takewhile, zip_longest, count, cycle, repeat, and not to mention product, permutations, combinations, and combinations_with_replacement.
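A small taste of a few of these building blocks:

```python
import itertools

nums = [1, 2, 4, 1, 5]
# takewhile yields elements until the predicate first fails
print(list(itertools.takewhile(lambda x: x < 4, nums)))  # [1, 2]

# product gives the cartesian product of its arguments
print(list(itertools.product('ab', [0, 1])))
# [('a', 0), ('a', 1), ('b', 0), ('b', 1)]

# count is an infinite counter; islice takes a finite prefix
print(list(itertools.islice(itertools.count(10, 2), 3)))  # [10, 12, 14]
```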
Functools
The functools module is for higher-order functions: functions that act on or return other functions. In general, any callable object can be treated as a function for the purposes of this module.
The most commonly used member of functools is lru_cache, which is an exercise in the Decorators section.
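Besides lru_cache, two commonly used members are functools.partial (pre-filling arguments of a function) and functools.reduce (folding a sequence with a binary function):

```python
import functools
import operator

# partial pre-fills arguments: here, a base-2 string parser built from int
parse_binary = functools.partial(int, base=2)
print(parse_binary('1010'))  # 10

# reduce folds a sequence left-to-right with a binary function
print(functools.reduce(operator.mul, [1, 2, 3, 4]))  # 24
```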
Tuples
You should by now be aware of the tuple type: the "immutable list". A tuple behaves the same as a list: it is iterable, it has an order, you can slice it, etc., just as you would with a list. However, a tuple cannot be changed once it is created:
>>> t = (1,2,3)
>>> t[0] = 0
TypeError: 'tuple' object does not support item assignment
You can neither change its content, nor its size. A tuple is a great way to store vectors and other short lists that you do not want changed. However, once you start adding more data, it can become problematic. Suppose that you decide to store Employees as tuples:
employee = ('Alice', 1980, 'junior software developer')
# must remember which index corresponds to which field
position = employee[1] # d'oh
As you can see, it becomes difficult to remember how to interpret the tuple. Enter namedtuple. The namedtuple is a very neat data structure which is essentially a tuple, but instead of using indices as keys to look up, we pick our own names:
from collections import namedtuple
Employee = namedtuple('Employee', ['name', 'date_of_birth', 'position'])
alice = Employee('Alice', 1980, 'senior software developer')
position = alice.position
You can even see that due to her use of namedtuple, Alice has been promoted. Being immutable, we cannot change the content of the tuple:
alice.position = 'fired'
AttributeError: can't set attribute
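Rather than mutating, namedtuple offers the _replace method, which returns a new tuple with some fields changed:

```python
from collections import namedtuple

Employee = namedtuple('Employee', ['name', 'date_of_birth', 'position'])
alice = Employee('Alice', 1980, 'senior software developer')

# _replace returns a NEW Employee; alice itself is untouched
promoted = alice._replace(position='principal software developer')
print(promoted.position)  # principal software developer
print(alice.position)     # senior software developer
```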
Keys of a dictionary
There is another reason why immutability of a data structure is good: the hash function. Hashing an object is a way of assigning an (almost) unique integer value to an object; distinct objects can collide, but collisions are rare:
>>> hash("hello")
-8980137943331027800
>>> hash("hellu")
-3140687082901657771
This makes it possible to create a very simple way of making a set, or a hash map (aka dictionary). This (homemade) set has O(1) (constant time) lookup, insertion, and deletion. We leave it as an exercise to the reader to fix bugs and complete the implementation with __len__ and __repr__. The latter function should make it clear why this set is "unordered".
class Set:
_size = 127
def __init__(self):
self._data = [None for _ in range(Set._size)]
def add(self, elt):
self._data[hash(elt) % Set._size] = elt
def remove(self, elt):
self._data[hash(elt) % Set._size] = None
def __contains__(self, elt):
return self._data[hash(elt) % Set._size] is not None
This implementation of set works very well (except for the bugs); see what happens if we try to add a tuple, a string, and a list:
s = Set()
s.add('hello')
print(s)
s.add((1,2,3))
print(s)
s.add([4])
output:
Set(['hello'])
Set(['hello', (1, 2, 3)])
TypeError: unhashable type: 'list'
Command–query separation
Since we occasionally need to work with mutable data, it can be good to write your functions and methods so that each is of one of two types: either a command or a query. In command–query separation (CQS), a command is a function which changes its input data and does not return anything, whereas a query is a function which returns a value but does not modify its input data.
As with functional programming and immutable data, query functions are much simpler to reason about, and to test. They are also "completely safe" to call; Since there are no side-effects, there can be no unintended side-effects.
There are examples where it's beneficial to write a function that both
modifies its input and returns a value (such as pop
), but it can very
often be avoided.
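A minimal illustration of the split (the function names are our own):

```python
def shrink_to(items, n):
    # command: mutates items in place, returns nothing
    del items[n:]

def first_n(items, n):
    # query: returns a value, never touches items
    return items[:n]

data = [1, 2, 3, 4]
print(first_n(data, 2))  # [1, 2] -- data unchanged
shrink_to(data, 2)
print(data)              # [1, 2] -- data modified, nothing returned
```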
Fluent style programming
There is a programming style referred to as fluent programming that allows for very smooth creation and "modification" of objects, at the same time improving readability. In this style, you always return an object (even if it is the same object you got as input) to enable chaining of operations. An API that allows for this chaining is called a fluent interface.
Suppose that you have an SQL-like query object where you want to select, order, limit, and modify the data. In a "traditional" style, you would do something like this (pseudocode):
data = select("*")
only_cats = data.where("type = cat")
ascending_cats = only_cats.order_by("age")
youngest_cats = ascending_cats.limit(10)
result = youngest_cats.filter(str.upper)
Although the example is contrived, the following fluent style example illustrates the idea behind a fluent interface. Suppose that all function calls above returned a new query instance:
result = (
select("*")
.where("type = cat")
.order_by("age")
.limit(10)
.filter(str.upper)
)
It is not only easier to read and understand, but it neither overwrites any variables, nor does it introduce any "temporary" variables.
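One way such a fluent interface could be implemented is for every method to return a new instance, keeping each object immutable so that chaining composes safely. This Query class is a toy sketch of the pseudocode above, not a real SQL library:

```python
class Query:
    # a toy, illustrative query object
    def __init__(self, steps=()):
        self._steps = tuple(steps)

    def _with(self, step):
        # return a NEW Query; the original is never mutated
        return Query(self._steps + (step,))

    def where(self, cond):
        return self._with(f'WHERE {cond}')

    def order_by(self, col):
        return self._with(f'ORDER BY {col}')

    def limit(self, n):
        return self._with(f'LIMIT {n}')

    def __repr__(self):
        return ' '.join(('SELECT *',) + self._steps)

q = Query().where('type = cat').order_by('age').limit(10)
print(q)  # SELECT * WHERE type = cat ORDER BY age LIMIT 10
```

Since each call returns a fresh instance, partial queries can be shared and extended without affecting each other.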
- Use namedtuple for a Pos type
- Implement the Pos class immutable.
- Implement the Pos class immutable using dataclasses, add __add__ and distance (Euclidean).
- Create lists a = b = [1,2,3] and experiment with a.append and b.append.
- Implement the sum function as a recursive function.
- Implement the product function as a recursive function.
- Implement the reduce function as a recursive function.
- Complete the implementation of Set with __len__ and __repr__ and make it iterable.
- Fix the obvious bug in Set. (Hint: for i in range(128): s.add(i). Is 1000 in s true?)
- Make your own implementation of Dictionary.
- Implement a fluent interface for a light bulb with properties hue, saturation, and lightness.
We already know what an iterable and an iterator is. In general, programming has much to do with collections of things, and our treatment of these collections. A container is the most general type of collection, with only one "requirement", namely that we can ask whether an object o is contained in the container c, or in Python: o in c.
In Python, a collection is a sized iterable. To be sized means that we can find out a container's size. Why couldn't we find out any iterable's size programmatically?
This question about in can also be asked about iterables and iterators. Why? How can you implement in for an object you can iterate over?
To make Python able to answer in-questions for your home-made classes, you simply need to implement the method __contains__(self, o) (for iterable classes, Python can also fall back to __iter__ or __getitem__).
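A minimal example: the in operator checks the __contains__ method first, so implementing it directly answers membership questions, here for the (infinite) set of even integers:

```python
class EvenNumbers:
    # membership test for the infinite set of even integers;
    # note that this container cannot be iterated or sized
    def __contains__(self, o):
        return isinstance(o, int) and o % 2 == 0

evens = EvenNumbers()
print(4 in evens)  # True
print(7 in evens)  # False
```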
There are many other specializations we could imagine for a container and iterables. Some examples:
- We want to iterate backwards: implement __reversed__
- We want to know a container's size: implement __len__
- Until now, we haven't been able to actually modify a collection: implement __[get|set|del]item__ and insert
- Implement a function my_in(iterable, object) which checks if an object is contained in an iterable.
- Implement a class that implements the __getitem__ method.
- Experiment with the above examples, with len, reversed, in, iter, next, etc.
- Implement a multiset collection.
- Implement a linked list.
SQL, or Structured Query Language, is a language for reading from and writing to databases, which usually are collections of tables of data, and the data we are reading and writing are relational data, meaning that there exists relations within the database we are interested in.
Python comes bundled with sqlite3, a wrapper around the public-domain database sqlite. This database is a local database living exclusively in a file on your file system.
import sqlite3
conn = sqlite3.connect('example.db')
c = conn.cursor()
# Create table
c.execute('''CREATE TABLE stocks
(date text, trans text, symbol text, qty real, price real)''')
# Insert a row of data
c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.14)")
# Save (commit) the changes
conn.commit()
# We can also close the connection if we are done with it.
# Just be sure any changes have been committed or they will be lost.
conn.close()
The data you’ve saved is persistent and is available in subsequent sessions:
import sqlite3
conn = sqlite3.connect('example.db')
c = conn.cursor()
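Reading the data back is a SELECT away. The sketch below uses an in-memory database so the snippet is self-contained; with example.db from above it works the same:

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # stand-in for 'example.db'
c = conn.cursor()
c.execute('''CREATE TABLE stocks
             (date text, trans text, symbol text, qty real, price real)''')
c.execute("INSERT INTO stocks VALUES ('2006-01-05','BUY','RHAT',100,35.14)")

# Parameterized queries (the ? placeholder) avoid SQL injection
c.execute('SELECT * FROM stocks WHERE trans = ?', ('BUY',))
print(c.fetchall())
# [('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14)]
conn.close()
```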
A short note about SQL: very often it may be beneficial to simply go for an ORM like SQLAlchemy, Peewee, or PonyORM; however, before introducing an ORM in your project, The Vietnam of Computer Science is mandatory reading.
Data Access Object
When working with data, the data is almost always stored in a way which is not easy to work with as an end-user and higher-order programmer. For example, to find a specific book, you need specified join and select queries, sometimes also pagination. If you don't take care to design an API, before you start coding, you might find that your query strings (SQL) are scattered around your entire project, making it hard (and error prone) to change the design of your database.
A data access object (DAO) is an API that makes your data model concrete by abstracting away all the SQL specific tasks. This also makes it possible to easily change from (e.g.) SQLite to Postgres, or even to a completely different backend like a different API or a file. The DAO is a clear separation between the data backend and the more abstract functionality.
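A minimal sketch of what such a DAO boundary could look like. The class and method names (BookDAO, add, by_author) and the table layout are our own invention for illustration; the point is that all SQL lives behind the API:

```python
import sqlite3

class BookDAO:
    # All SQL lives here; callers never see query strings.
    def __init__(self, path=':memory:'):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            'CREATE TABLE IF NOT EXISTS books (author text, year int, title text)')

    def add(self, author, year, title):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute('INSERT INTO books VALUES (?, ?, ?)',
                               (author, year, title))

    def by_author(self, author):
        cur = self._conn.execute(
            'SELECT title FROM books WHERE author = ?', (author,))
        return [title for (title,) in cur.fetchall()]

dao = BookDAO()
dao.add('Douglas Adams', 1979, "The Hitchhiker's Guide to the Galaxy")
print(dao.by_author('Douglas Adams'))
```

Swapping SQLite for another backend then only means replacing the internals of BookDAO, not touching any caller.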
- import sqlite3
- create a book database with one table, books, with columns author, year, title, publisher, genre
- create a DAO for the book database
- normalize the database to 1NF, 2NF, 3NF, ...
- sqlite
- sqlite3 (DB-API in Python)
- ORMs
- The Vietnam of Computer Science
In test driven development (TDD), we write the tests before writing the actual implementation. This helps us create a better design, and to better think about the problem before starting writing code.
In this session, we will simply solve a bunch of problems, writing as strong tests as possible before implementing, and seeing how that helps us become better programmers.
In the real world, TDD is even more beneficial as the design is often non-trivial, and it can help us design our API.
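A tiny red-green illustration of the workflow (using a leap-year checker rather than one of the exercises below): write the test first, watch it fail, then write just enough code to make it pass:

```python
# Step 1 (red): write the tests first; running them now fails,
# because is_leap_year does not exist yet.
def test_is_leap_year():
    assert is_leap_year(2020)
    assert not is_leap_year(2019)
    assert not is_leap_year(1900)  # centuries are not leap years...
    assert is_leap_year(2000)      # ...unless divisible by 400

# Step 2 (green): write the simplest implementation that passes.
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

test_is_leap_year()
print('all tests pass')
```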
- Implement FizzBuzz with a test-driven development style
- Implement fromroman that parses a string such as 'vii' and returns a number, e.g. 7
- Implement a simple calculator that takes input such as '2 + (3 * 4)' and returns its value. For simplicity, you may use Polish notation
- Write a password strength function that approves or rejects a password if it has or does not have at least one upper and one lower case letter, a non-leading non-trailing digit and a non-leading non-trailing special character.
A class can inherit from an existing class. The class that inherits is called the subclass, and the class it inherits from is called the superclass.
class A:
pass
class B(A):
pass
But a class can inherit from several classes:
class C(A, B):
pass
The inheritance order determines the method resolution order (MRO) of method calls on objects. See C.__mro__:
(__main__.C, __main__.A, __main__.B, object)
Multiple inheritance is too difficult (maybe not for you, but for your colleagues), so there's rarely a need to use it.
- Make a class that inherits from two classes. Test it with several MROs.
- Create shared variable names, and play with and without super
- Method resolution order by Guido
It is important to be up to date on changes to the language, and to move to the newest released runtime as quickly as possible.
Therefore it is crucial that everyone who programs Python compiles the newest version from time to time, tests new functionality, and verifies that their old systems still work.
- PEP 539, new C API for thread-local storage
- PEP 545, Python documentation translations
- New documentation translations: Japanese, French, and Korean.
- PEP 552, Deterministic pyc files
- PEP 553, Built-in breakpoint()
- PEP 557, Data Classes
- PEP 560, Core support for typing module and generic types
- PEP 562, Customization of access to module attributes
- PEP 563, Postponed evaluation of annotations
- PEP 564, Time functions with nanosecond resolution
- PEP 565, Improved DeprecationWarning handling
- PEP 567, Context Variables
- Avoiding the use of ASCII as a default text encoding (PEP 538, legacy C locale coercion and PEP 540, forced UTF-8 runtime mode)
- The insertion-order preservation nature of dict objects is now an official part of the Python language spec.
- Notable performance improvements in many areas.
- PEP 572, Assignment expressions
- PEP 570, Positional-only arguments
- PEP 587, Python Initialization Configuration (improved embedding)
- PEP 590, Vectorcall: a fast calling protocol for CPython
- PEP 578, Runtime audit hooks
- PEP 574, Pickle protocol 5 with out-of-band data
- Typing-related: PEP 591 (Final qualifier), PEP 586 (Literal types), and PEP 589 (TypedDict)
- Parallel filesystem cache for compiled bytecode
- Debug builds share ABI as release builds
- f-strings support a handy = specifier for debugging
- continue is now legal in finally: blocks
- on Windows, the default asyncio event loop is now ProactorEventLoop
- on macOS, the spawn start method is now used by default in multiprocessing
- multiprocessing can now use shared memory segments to avoid pickling costs between processes
- typed_ast is merged back to CPython
- LOAD_GLOBAL is now 40% faster
- pickle now uses Protocol 4 by default, improving performance
- PEP 584 Union Operators in dict
- PEP 585 Type Hinting Generics In Standard Collections
- PEP 593 Flexible function and variable annotations
- PEP 602 Python adopts a stable annual release cadence
- PEP 616 String methods to remove prefixes and suffixes
- PEP 617 New PEG parser for CPython
- BPO 38379: garbage collection does not block on resurrected objects
- BPO 38692: os.pidfd_open added, which allows process management without races and signals
- BPO 39926: Unicode support updated to version 13.0.0
- BPO 1635741: memory leak fixes
- A number of Python builtins (range, tuple, set, frozenset, list) are now sped up using vectorcall (PEP 590)
Copyright 2020 Equinor ASA, (cc-by-4.0)