Taming the Snake

Understanding how Python works in N simple steps

(with N still growing)


Step 0. What this talk is about (and what it isn't)

Four basic things you need to learn to become a good programmer:

  1. Learn how to express algorithms in a computer programming language
  2. Learn how to design programs
  3. Learn how your tool (the programming language) works and what it can offer you
  4. Learn what others have already built for you

I will only teach you how Python works.

(And this is only the first part of it.)


Step 1. Everything is an object

In Python, everything is an object.

Examples:

42, 4.3, 'Hello world', True, False, None, [0, 1, 2, 3], {'key': 'value', 'other key': 'other value'}
('This', 'is', 'a', 'tuple'), np.eye(10)

Really everything!

math.sin, lambda x: x, class C, math, func.__code__

To understand Python, we should first understand what objects are!
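
A quick way to convince yourself is sketched below. The list things is just an illustrative selection, and func and C are placeholder names; every entry, including functions, classes, modules and code objects, is an instance of object and has a type.

```python
import math


def func(x):
    return x


class C(object):
    pass


# An illustrative selection: numbers, strings, containers, functions,
# a class, a module and a function's compiled code.
things = [42, 4.3, 'Hello world', True, None, [0, 1, 2], {'key': 'value'},
          ('a', 'tuple'), math.sin, lambda x: x, C, math, func.__code__]

# isinstance(..., object) holds for every single one of them.
all_objects = all(isinstance(thing, object) for thing in things)
print(all_objects)

for thing in things:
    # Every object has a type, and that type is an object as well.
    print(type(thing).__name__)
```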


Step 2. The three properties of an object

Objects have:

  1. Identity
  2. State
  3. Methods

Step 3. The Identity of an object

In [1]:
a = [0, 1, 2] 
b = [0, 1, 2]
a == b
Out[1]:
True
In [2]:
a is b
Out[2]:
False
In [3]:
id(a)
Out[3]:
140008350782240
In [4]:
id(b)
Out[4]:
140008350781880
In [5]:
(a is b) == (id(a) == id(b))
Out[5]:
True
In [6]:
x = 6
y = 6
x is y
Out[6]:
True
In [7]:
x = 666
y = 666
x is y
Out[7]:
False
In [8]:
id(a) is id(a)
Out[8]:
False

Do not use is unless you have a good reason!

Reasonable exceptions:

x is True
x is False
x is None

This works because True, False and None are singletons in Python, i.e. there is only one object True, etc., in the whole Python universe.
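
A minimal sketch of the recommended usage: compare values with ==, and reserve is for singletons such as None. The function name describe is made up for illustration. Whether two equal ints happen to be the same object (as in In [6] and In [7]) is an implementation detail of CPython's small-integer cache, so the sketch does not rely on it.

```python
def describe(value=None):
    # None is a singleton, so an identity check is the idiomatic test.
    if value is None:
        return 'nothing given'
    return 'got %r' % (value,)


a = [0, 1, 2]
b = [0, 1, 2]

same_value = (a == b)     # compares the contents
same_object = (a is b)    # compares the identity

print(same_value, same_object)
print(describe(), describe(0))
```

Note that describe(0) is not confused with "nothing given", which a plain truth test (if not value) would get wrong.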


Step 4. Understand assignment

In [9]:
import numpy as np
a = np.ones(10)
b = np.zeros(10)
print(a)
print(b)
[ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
In [10]:
a is b
Out[10]:
False
In [11]:
a = b
print(a)
print(b)
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
In [12]:
a[0] = 1
print(a)
print(b)
[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
In [13]:
a is b
Out[13]:
True

Definition of assignment

a = b

means

assign the name a to the object with name b

Let's repeat!

Definition of assignment

a = b

means

assign the name a to the object with name b
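
The same effect can be reproduced with plain lists, as in this small sketch. Assignment only attaches a second name to one object; to get an independent object you have to ask for a copy explicitly (here with list(...); for NumPy arrays the analogous call is b.copy()).

```python
a = [1, 1, 1]
b = [0, 0, 0]

a = b          # the name a now refers to the object named b
a[0] = 1       # so this changes the one shared list
print(b)       # [1, 0, 0]

c = list(b)    # list(...) builds a new object with equal contents
c[1] = 5       # only the copy changes
print(b)       # still [1, 0, 0]
print(a is b, c is b)
```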


Step 5. The state of an object

Objects have state (data) associated with them, which can change over the lifetime of an object.

The state is stored in the object's attributes.

In [14]:
from email.mime.text import MIMEText
m = MIMEText('Hi, this is an email!')
m
Out[14]:
<email.mime.text.MIMEText instance at 0x7f563c04cf38>
In [15]:
print(m.as_string())
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Hi, this is an email!
In [16]:
m._payload
m._headers
Out[16]:
[('Content-Type', 'text/plain; charset="us-ascii"'),
 ('MIME-Version', '1.0'),
 ('Content-Transfer-Encoding', '7bit')]

The leading underscore signals that _payload and _headers are private attributes.

You, the user, should not mess around with them.

Methods can change attributes:

In [17]:
m.add_header('From', 'stephan.rave@uni-muenster.de')
m.add_header('To', 'evil.overlord@the.hell.com')
m.add_header('Subject', 'World domination')
print(m.as_string())
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: stephan.rave@uni-muenster.de
To: evil.overlord@the.hell.com
Subject: World domination

Hi, this is an email!
In [18]:
m._headers
Out[18]:
[('Content-Type', 'text/plain; charset="us-ascii"'),
 ('MIME-Version', '1.0'),
 ('Content-Transfer-Encoding', '7bit'),
 ('From', 'stephan.rave@uni-muenster.de'),
 ('To', 'evil.overlord@the.hell.com'),
 ('Subject', 'World domination')]

We can also change attributes, even private ones:

(Do not try this at home!)

In [19]:
m._payload = 'We need to talk!'
print(m.as_string())
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: stephan.rave@uni-muenster.de
To: evil.overlord@the.hell.com
Subject: World domination

We need to talk!
In [20]:
m.my_attribute = 666
m.my_attribute
Out[20]:
666

Step 6. Understand the basics of attribute lookup

So where do all these attributes come from?

In [21]:
m.__dict__
Out[21]:
{'_charset': us-ascii,
 '_default_type': 'text/plain',
 '_headers': [('Content-Type', 'text/plain; charset="us-ascii"'),
  ('MIME-Version', '1.0'),
  ('Content-Transfer-Encoding', '7bit'),
  ('From', 'stephan.rave@uni-muenster.de'),
  ('To', 'evil.overlord@the.hell.com'),
  ('Subject', 'World domination')],
 '_payload': 'We need to talk!',
 '_unixfrom': None,
 'defects': [],
 'epilogue': None,
 'my_attribute': 666,
 'preamble': None}
In [22]:
m.__dict__['favourite_song'] = 'Hotel california'
m.__dict__['_payload'] = 'WE NEED TO TALK!!!'
In [23]:
m.favourite_song
Out[23]:
'Hotel california'
In [24]:
print(m.as_string())
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: stephan.rave@uni-muenster.de
To: evil.overlord@the.hell.com
Subject: World domination

WE NEED TO TALK!!!

Attribute lookup

a.b

means

a.__dict__['b']

In [25]:
a = np.eye(10)
a.secret_answer = 42
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-25-9d909bc47556> in <module>()
      1 a = np.eye(10)
----> 2 a.secret_answer = 42

AttributeError: 'numpy.ndarray' object has no attribute 'secret_answer'
In [26]:
a.__dict__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-26-87da7e691200> in <module>()
----> 1 a.__dict__

AttributeError: 'numpy.ndarray' object has no attribute '__dict__'
In [27]:
(42).__dict__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-27-f22e1a648ac9> in <module>()
----> 1 (42).__dict__

AttributeError: 'int' object has no attribute '__dict__'
In [28]:
[0, 1, 2].__dict__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-28-cb92ad523bf5> in <module>()
----> 1 [0, 1, 2].__dict__

AttributeError: 'list' object has no attribute '__dict__'

Attribute lookup

If a has a dict,

a.b

means

a.__dict__['b']

This is not the case for builtin types or C extension types.
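
A small sketch that puts both cases side by side. Record is a hypothetical example class; vars() is the builtin that returns an object's __dict__.

```python
class Record(object):   # hypothetical example class
    pass


r = Record()
r.title = 'Taming the Snake'

# vars(r) returns r.__dict__, the dictionary holding r's attributes.
print(vars(r) is r.__dict__)
print(r.__dict__['title'])

# Writing into the dict is visible through normal attribute access.
r.__dict__['language'] = 'Python'
print(r.language)

# Instances of builtin types such as int or list have no __dict__ ...
print(hasattr(42, '__dict__'), hasattr([0, 1, 2], '__dict__'))

# ... which is why setting a new attribute on them fails.
try:
    (42).answer = True
except AttributeError as error:
    print('AttributeError:', error)
```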

In [29]:
class C(object):
    pass

c = C()
c.__dict__
Out[29]:
{}
In [30]:
c.secret_answer = 42
c.__dict__
Out[30]:
{'secret_answer': 42}
In [31]:
class C(object):
    secret_answer = 42
    
c = C()
c.secret_answer
Out[31]:
42
In [32]:
c.__dict__
Out[32]:
{}
In [33]:
c.__class__
Out[33]:
__main__.C
In [34]:
c.__class__.__dict__
Out[34]:
<dictproxy {'__dict__': <attribute '__dict__' of 'C' objects>,
 '__doc__': None,
 '__module__': '__main__',
 '__weakref__': <attribute '__weakref__' of 'C' objects>,
 'secret_answer': 42}>

Attribute lookup (simplified)

If a has a dict,

a.b

means

if 'b' in a.__dict__:
    return a.__dict__['b']
elif 'b' in a.__class__.__dict__:
    return a.__class__.__dict__['b']
elif 'b' in 'base class dicts':
    return base_class.__dict__['b']
else:
    raise AttributeError

This is not the case for builtin types or C extension types.
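
The simplified rule can be written as a runnable sketch. simple_lookup, Base and the attribute names are made up for illustration; "base class dicts" is implemented by walking type(obj).__mro__, the class's method resolution order. The real lookup does more (for example it binds methods and handles descriptors and __getattr__), so treat this as a model, not as what the interpreter executes.

```python
def simple_lookup(obj, name):
    """Simplified attribute lookup; ignores descriptors and __getattr__."""
    # 1. The instance's own dictionary.
    if name in obj.__dict__:
        return obj.__dict__[name]
    # 2. The class, then its base classes, in method resolution order.
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    # 3. Nothing found anywhere.
    raise AttributeError(name)


class Base(object):
    inherited = 'from Base'


class C(Base):
    secret_answer = 42


c = C()
c.own = 'from instance'

print(simple_lookup(c, 'own'))
print(simple_lookup(c, 'secret_answer'))
print(simple_lookup(c, 'inherited'))
```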


Step 7. Be careful with class attributes

In [35]:
class C(object):
    great_list_of_awesomeness = []

c1 = C()
c2 = C()
In [36]:
c1.great_list_of_awesomeness.append(42)
c1.great_list_of_awesomeness
Out[36]:
[42]
In [37]:
c2.great_list_of_awesomeness
Out[37]:
[42]

This might not be what you want!
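
If every instance should get its own list, one common remedy is to create the list in __init__. A minimal sketch with two hypothetical classes, Shared and Separate:

```python
class Shared(object):
    items = []              # one list, stored on the class


class Separate(object):
    def __init__(self):
        self.items = []     # a fresh list for each instance


s1, s2 = Shared(), Shared()
s1.items.append(42)
print(s2.items)             # [42]: both names reach the same list

p1, p2 = Separate(), Separate()
p1.items.append(42)
print(p2.items)             # []: p2 has its own list
```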


Step 8. Understand methods

Methods are functions defined in class definitions as follows:

In [38]:
class C(object):
    
    def __init__(self, x):
        self.x = x
    
    def f(self, y):
        return self.x + y
    
c = C(11)
c.f(31)
Out[38]:
42

c.f(31)

translates to

C.f(c, 31)

i.e. calling a method on an object c magically inserts c as the first argument of the method.
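
Both spellings can be checked against each other. The sketch also shows that c.f on its own is a bound method object which remembers c in its __self__ attribute; the variable names are only for illustration.

```python
class C(object):

    def __init__(self, x):
        self.x = x

    def f(self, y):
        return self.x + y


c = C(11)

via_instance = c.f(31)      # the usual spelling
via_class = C.f(c, 31)      # the same call with c passed explicitly
print(via_instance, via_class)

bound = c.f                 # a bound method: it remembers c
print(bound.__self__ is c)
print(bound(31))
```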

This also explains this strange error:

In [39]:
c.f()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-69c69864e152> in <module>()
----> 1 c.f()

TypeError: f() takes exactly 2 arguments (1 given)

__init__ is one of Python's many special methods.

It is called after a new object has been created.

Thus

c = C(11)

translates to

c = newly_created_C_instance
C.__init__(c, 11)
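
The two steps can be reproduced by hand. C.__new__ is the standard hook that creates the bare instance; calling it directly like this is only meant as an illustration, not as something to do in real code.

```python
class C(object):

    def __init__(self, x):
        self.x = x


# Step 1: create an uninitialised instance (__init__ has not run yet).
c = C.__new__(C)
had_x_before = hasattr(c, 'x')
print(had_x_before)

# Step 2: initialise it, passing the new instance as self.
C.__init__(c, 11)
print(c.x)
```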

We can add new methods to the class at any time:

In [40]:
def g(ego, y):
    return ego.x * y

C.mult = g
c.mult(2)
Out[40]:
22

Note that self as first argument name is pure convention.

You should really stick to it!

Adding functions directly to objects does not work this way:

In [41]:
def h(self, y):
    return self.x - y

c.h = h
c.h(-31)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-41-f498a20a0d9d> in <module>()
      3 
      4 c.h = h
----> 5 c.h(-31)

TypeError: h() takes exactly 2 arguments (1 given)

Oh no, the magic is not working!

In [42]:
import types
c.h = types.MethodType(h, c)
c.h(-31)
Out[42]:
42
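
Why the difference? A function found in a class dictionary is turned into a bound method during lookup, while a function stored in the instance dictionary is handed back unchanged. The sketch below makes this visible; it is written for Python 3, where the two type names print as function and method.

```python
import types


class C(object):

    def __init__(self, x):
        self.x = x


def h(self, y):
    return self.x - y


c = C(11)

c.h = h                          # lands in c.__dict__, stays a plain function
print(type(c.h).__name__)
plain_result = c.h(c, -31)       # so we have to pass c ourselves

c.h = types.MethodType(h, c)     # bind h to c explicitly
print(type(c.h).__name__)
bound_result = c.h(-31)          # now c is inserted for us

print(plain_result, bound_result)
```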

Step 9. Understand immutable objects

Numbers, strings and tuples are immutable in Python:

In [43]:
t = (0, 1, 2)
t[1] = 2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-4ca911cddf58> in <module>()
      1 t = (0, 1, 2)
----> 2 t[1] = 2

TypeError: 'tuple' object does not support item assignment

There is absolutely no way to modify them!

So what about here:

In [44]:
x = 41
y = x
x += 1
In [45]:
y
Out[45]:
41
In [46]:
x is y
Out[46]:
False
In [47]:
x
Out[47]:
42

So, x refers to a new int object with value 42.

Think about what would have happened if the object had stayed the same!

This is how in-place addition works:

In [48]:
class MyNumber(object):
    def __init__(self, value):
        self.value = value
    def __iadd__(self, other):
        return MyNumber(self.value + other)
In [49]:
n = MyNumber(12)
print(n.value)
print(id(n))
n += 7
print(n.value)
print(id(n))
12
140008016975120
19
140008149531472

n += 7

is really the same as

n = n.__iadd__(7)
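
For contrast, here is a sketch of a mutable variant. MutableNumber is a hypothetical counterpart to MyNumber whose __iadd__ changes the state and returns self, so the identity survives the += and every other name for the object sees the change. Builtin lists behave this way.

```python
class MutableNumber(object):    # hypothetical counterpart to MyNumber

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        self.value += other     # change the state in place ...
        return self             # ... and hand back the same object


n = MutableNumber(12)
alias = n
n += 7
print(n.value, alias.value, n is alias)

# Lists do the same: += extends the existing list object.
xs = [0, 1]
ys = xs
xs += [2]
print(ys, xs is ys)
```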

What happens here?

In [50]:
class C(object):
    value = 42
    
c1 = C()
c2 = C()
c1.value = 666
In [51]:
c1.value
Out[51]:
666
In [52]:
c2.value
Out[52]:
42
In [53]:
print(c1.__dict__)
print(c2.__dict__)
print(C.__dict__)
{'value': 666}
{}
{'__dict__': <attribute '__dict__' of 'C' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'C' objects>, '__doc__': None, 'value': 42}
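
So the assignment c1.value = 666 did not touch the class attribute; it created a new entry in c1's own dictionary, which is found first during lookup. A short sketch of the consequences:

```python
class C(object):
    value = 42


c1 = C()
c1.value = 666              # creates c1.__dict__['value']; C.value is untouched
print(c1.value, C.value)

del c1.value                # removes the instance entry again
print(c1.value)             # the lookup falls back to the class

C.value = 7                 # a change on the class is seen by all instances
print(c1.value, C().value)
```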

Step 10. Be careful with default arguments

Default arguments are attributes of the function object:

In [54]:
def f(x, more_values=[]):
    more_values.append(x)
    print(more_values)
In [55]:
f.__defaults__
Out[55]:
([],)
In [56]:
print(dir(f))
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
In [57]:
def f(x, more_values=[]):
    more_values.append(x)
    print(more_values)
In [58]:
f(1)
[1]
In [59]:
f(42, [4, 8, 15, 16, 23, 42])
[4, 8, 15, 16, 23, 42, 42]
In [60]:
f(1)
f(1)
[1, 1]
[1, 1, 1]
In [61]:
f.__defaults__
Out[61]:
([1, 1, 1],)
In [62]:
f.__defaults__ = ([],)
f(1)
[1]
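
The usual way around this is to use None as the default and create the list inside the function, so that a fresh list is built on every call. A minimal sketch (this variant returns the list instead of printing it, to make the result easy to inspect):

```python
def f(x, more_values=None):
    # Create the list at call time instead of at definition time.
    if more_values is None:
        more_values = []
    more_values.append(x)
    return more_values


first = f(1)
second = f(1)
print(first, second)
print(f(42, [4, 8, 15, 16, 23, 42]))
print(f.__defaults__)
```

The default stored on the function object is now the immutable None, so there is nothing left that could accumulate state between calls.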