From the very beginning of my career as a software developer, I have enjoyed digging into the internals of programming languages. I have always been curious about how a particular construct or statement works, what lies under the hood of syntactic sugar, and so on. Recently I came across an interesting article with examples showing that mutable and immutable objects in Python do not always behave in obvious ways. In my opinion, the key point is how the behavior of the code changes depending on the type of data involved, even though the semantics and the language constructs used stay the same. It is a great example of something to keep in mind not only when writing code, but also when using it. I invite everyone to read the translation.


Try to solve these three problems, and then check the answers at the end of the article.

Tip: the tasks have something in common, so keep the solution to the first task in mind when you move on to the second or third; it will make them easier.

First task


There are several variables:

    x = 1
    y = 2
    l = [x, y]
    x += 5

    a = [1]
    b = [2]
    s = [a, b]
    a.append(5)

What will be displayed when printing l and s?

Second task


Define a simple function:

    def f(x, s=set()):
        s.add(x)
        print(s)

What happens if you call:

    >>> f(7)
    >>> f(6, {4, 5})
    >>> f(2)

Third task


Define two simple functions:

    def f():
        l = [1]
        def inner(x):
            l.append(x)
            return l
        return inner

    def g():
        y = 1
        def inner(x):
            y += x
            return y
        return inner

What do we get after executing these commands?

    >>> f_inner = f()
    >>> print(f_inner(2))
    >>> g_inner = g()
    >>> print(g_inner(2))

How confident are you in your answers? Let's check them.

Solution of the first task


    >>> print(l)
    [1, 2]
    >>> print(s)
    [[1, 5], [2]]

Why does the second list react to the change to its first element, a.append(5), while the first list completely ignores the similar change x += 5?

Solution of the second task


Let's see what happens:

    >>> f(7)
    {7}
    >>> f(6, {4, 5})
    {4, 5, 6}
    >>> f(2)
    {2, 7}

Wait a minute, shouldn't the last result be {2}?

Solution of the third task


The result will be like this:

    >>> f_inner = f()
    >>> print(f_inner(2))
    [1, 2]
    >>> g_inner = g()
    >>> print(g_inner(2))
    UnboundLocalError: local variable 'y' referenced before assignment

Why didn't g_inner(2) return 3? Why does the inner function of f() remember the enclosing scope, while the inner function of g() does not? They are almost identical!

Explanation


What if I tell you that all these examples of weird behavior are related to the difference between mutable and immutable objects in Python?

Mutable objects, such as lists, sets, or dictionaries, can be modified in place. Immutable objects, such as numbers, strings, and tuples, cannot be changed; "changing" them creates new objects.
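As a minimal sketch of this difference, the snippet below applies the same += operator to a list and to a tuple. The names nums, point, and their aliases are made up for illustration; the alias keeps a reference to the original object so that identity can be compared with the is operator.

```python
# A list is mutable: += extends the existing object in place.
nums = [1, 2]
nums_alias = nums            # second name for the same list
nums += [3]
print(nums is nums_alias)    # True: still one object
print(nums_alias)            # [1, 2, 3]

# A tuple is immutable: += builds a new tuple and rebinds the name.
point = (1, 2)
point_alias = point          # second name for the original tuple
point += (3,)
print(point is point_alias)  # False: point now names a new object
print(point_alias)           # (1, 2)
```

The operator looks identical in both cases, but only the list version is visible through the alias.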

Explanation of the first task


    x = 1
    y = 2
    l = [x, y]
    x += 5

    a = [1]
    b = [2]
    s = [a, b]
    a.append(5)

    >>> print(l)
    [1, 2]
    >>> print(s)
    [[1, 5], [2]]

Because the integer that x refers to is immutable, the operation x += 5 does not change the original object; it creates a new one and rebinds x to it. The first element of the list l still refers to the original object, so its value does not change.

Because a refers to a mutable object, the a.append(5) command changes the original object (rather than creating a new one), and the list s "sees" the change.
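The same effect can be reproduced with lists alone, which suggests that what matters is whether an operation mutates an object or rebinds a name. The sketch below reuses a, b, and s from the task; the final step, a = a + [9], is an illustrative addition that is not part of the original task.

```python
a = [1]
b = [2]
s = [a, b]

a.append(5)          # in-place mutation: s[0] and a are the same object
print(s[0] is a, s)  # True [[1, 5], [2]]

a = a + [9]          # rebinding: a now names a brand-new list
print(s[0] is a, s)  # False [[1, 5], [2]]
print(a)             # [1, 5, 9]
```

After the rebinding, s still holds the old list, exactly as l kept the old integer after x += 5.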

Explanation of the second task


    def f(x, s=set()):
        s.add(x)
        print(s)

    >>> f(7)
    {7}
    >>> f(6, {4, 5})
    {4, 5, 6}
    >>> f(2)
    {2, 7}

Everything is clear with the first two results: first the value 7 is added to the initially empty set, giving {7}; then the value 6 is added to the set {4, 5}, giving {4, 5, 6}.

And then strange things begin. The value 2 is added not to an empty set but to {7}. Why? The default value of the optional parameter s is evaluated only once, when the function is defined, and at that point s is initialized as an empty set. Since a set is mutable, the call f(7) changes it in place. The second call, f(6, {4, 5}), does not touch the default value: it is replaced by the set {4, 5}, that is, s refers to a different object during that call. The third call, f(2), uses the same default object as the first call; it is not reinitialized as an empty set, so its previous value {7} is used.
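One way to watch this happen is to inspect the function's __defaults__ attribute, a standard tuple in which Python stores the default values of positional parameters. The sketch below is only an illustration built around the f from the task.

```python
def f(x, s=set()):
    s.add(x)
    print(s)

print(f.__defaults__)  # (set(),)  the default exists before any call
f(7)                   # {7}
print(f.__defaults__)  # ({7},)    the first call mutated the default
f(6, {4, 5})           # {4, 5, 6}
print(f.__defaults__)  # ({7},)    an explicit argument leaves it alone
f(2)                   # {2, 7}
```

The default set lives on the function object itself, which is why it survives between calls.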

Therefore, you should not use mutable objects as default argument values. In this case, the function should be changed as follows:

    def f(x, s=None):
        if s is None:
            s = set()
        s.add(x)
        print(s)

Explanation of the third task


    def f():
        l = [1]
        def inner(x):
            l.append(x)
            return l
        return inner

    def g():
        y = 1
        def inner(x):
            y += x
            return y
        return inner

    >>> f_inner = f()
    >>> print(f_inner(2))
    [1, 2]
    >>> g_inner = g()
    >>> print(g_inner(2))
    UnboundLocalError: local variable 'y' referenced before assignment

Here we are dealing with closures: inner functions remember the namespaces of the enclosing functions in which they were defined. Or at least they should, but the second function keeps a poker face and behaves as if it had never heard of its enclosing namespace.

Why is this happening? When we execute l.append(x), the mutable object created when f was called is modified in place, and the name l still refers to the same address in memory; inner only reads l and never assigns to it. In the second function, however, y is immutable, so y += x cannot change the object in place: it has to make y refer to a different object, which is an assignment. Because inner assigns to y, Python treats y as a local variable of inner, and the y of the enclosing scope is forgotten. The statement then tries to read the local y before it has been given a value, which raises UnboundLocalError.
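If rebinding the enclosing variable really is what you want, Python 3 offers the nonlocal statement for this purpose. The following is a sketch of how g from the task could be adapted; it is not part of the original article's code.

```python
def g():
    y = 1

    def inner(x):
        nonlocal y  # rebind y in g's scope instead of creating a local y
        y += x
        return y

    return inner

g_inner = g()
print(g_inner(2))  # 3
print(g_inner(2))  # 5: the enclosing y keeps its updated value
```

With nonlocal, the assignment targets the variable of the enclosing function, so the closure accumulates state between calls.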

Conclusion


The difference between mutable and immutable objects in Python is very important. Avoid the weird behavior described in this article. In particular:

  • Do not use mutable objects as default argument values.
  • Do not attempt to rebind immutable closure variables in inner functions.
