Thursday 5 February 2009

Variable assignment in Python

Hello everyone, welcome to the blog that rarely has posts! I should really stop making jokes about this. I think it's time for another programming blog post.

The following confusion often arises with programmers new to Python who have learned that assignments in Python are done by reference, not by value. This is correct, after a fashion, and not really that different from other languages. But a little knowledge is often more dangerous than no knowledge at all. While the following session is not all that confusing for a C programmer who knows nothing about Python, it is confusing for one who has learned that all assignments are reference assignments:

>>> a = 1
>>> b = a
>>> a = 2
>>> b
1

The result such a programmer expects, of course, is 2, not 1. Doesn't a refer to the same data as b after the second statement? So shouldn't b reflect the change that was made in a?

No, it shouldn't, because integers are immutable, which means we can't change them. This might seem strange. We could change the value of a without problems, couldn't we? How can it be immutable then? Well, let's look at what happens when a simple assignment is executed in a fresh Python interpreter. What goes on inside if we do this:

Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 1

Two things are being created here. First, a name. Second, some data. These things exist separately, something that becomes relevant later. Let's visualize the names and data in the interpreter at this time (cheesy graphics will follow):



Right, so we have one name, "a," pointing to some data, in this case an integer object with the value of 1. Pretty straightforward so far. Let's continue along this line. We'll execute the second statement in the above little piece of code:

>>> b = a

The result of this is that a second name, b, is created. It points at the same integer object as a, just as we specified in the statement:
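The sharing can be checked from inside the interpreter. This is only a small sketch using the built-in id() function and the is operator; the names same_object and same_identity are made up for the illustration.

```python
a = 1
b = a   # binds a second name to the object a already refers to; no copy is made

# "is" compares object identity, not value.
same_object = a is b

# id() returns a number identifying the object; two names for the same
# object therefore report the same id.
same_identity = id(a) == id(b)

print(same_object)     # True
print(same_identity)   # True
```

Note that the result here follows from the assignment b = a itself. It does not rely on any caching of small integers that a particular interpreter might do.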



Now, let's shake things up: we're going to reassign a to something else:

>>> a = 2

Most people would think that the data would simply change its value to 2, and the picture would remain basically the same. But this is where Python catches you: integers are immutable. That means their value cannot change. Ever. So, what happens instead? A new integer object is created:



a is now pointing at the new integer object specified. But what about b? Well, we never told it to point to something else, so it is still pointing at the same old object that a used to be pointing at.
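The three statements can be replayed as a script, with an identity check at the end. This is a minimal sketch of the session above, nothing more.

```python
a = 1
b = a
a = 2   # rebinds a to a different int object; the object holding 1 is untouched

print(a)        # 2
print(b)        # 1
print(a is b)   # False: the two names now refer to different objects
```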

Here, then, is the fundamental aspect to grasp. You can make two names point to the same thing, but if that thing is immutable, it cannot be changed. Therefore, it makes no sense to expect a change made through one name to show up in the other: you cannot make changes at all. You can only make a name point to something else, and that is exactly what breaks the synchronization.
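Even an operation that looks like an in-place change behaves this way on integers. The sketch below uses augmented assignment; the variable changed_in_place is just a name picked for the example.

```python
a = 1000
b = a      # both names refer to the same int object

a += 1     # looks like an in-place change, but an int cannot change:
           # a new int object is built and the name a is rebound to it

changed_in_place = a is b   # False, because a now refers to a different object

print(a)   # 1001
print(b)   # 1000
```

If integers could be modified in place, b would have followed along to 1001. It stays at 1000 because the only thing that moved was the name a.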

In a language like C, creating two integers a and b immediately results in the creation of two integer objects in memory. The behaviour shown above is then expected, since the two names refer to two different objects. Upon learning that in Python the assignment b = a results in what is essentially the behaviour caused by the C statement int *b = &a, confusion arises. The missing gem here is immutability, which makes the Python behaviour sane again.
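For contrast, here is roughly what the pointer-style behaviour would look like in Python if the shared object were mutable. Box is a hypothetical class invented for this sketch, standing in for a C int that a pointer can reach; it is not something from the standard library.

```python
class Box(object):
    """Hypothetical mutable cell holding a single value."""

    def __init__(self, value):
        self.value = value


a = Box(1)
b = a           # b refers to the same Box, much like int *b = &a in C

a.value = 2     # mutates the shared object; the name a is not rebound
print(b.value)  # 2: the change is visible through b

a = Box(3)      # rebinding a, by contrast, leaves b on the old Box
print(b.value)  # 2
```

The first change shows up in b because the object itself was modified. The second does not, for the same reason as in the integer example: only the name a was pointed somewhere else.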

Python: intuitive to the newbie, yet without being inconsistent.
