Friday 28 March 2008

using reduce

Recently I've been reading this blog, about a guy who has set some goals for his life and is now trying to complete them. While I'm not especially interested in all that stuff, the guy is also trying to learn Python. I have a sizable amount of experience with the language, so I've been sort of helping him along with his code, pointing out things that could be improved. Recently he posted some code for a function that translates text to Morse code. You can go over there and read that post, but basically he wrote this:
def translate(text):
    text = text.lower()
    for char in text:
        print morse_code[char] + ' ',
morse_code represents a dictionary mapping characters to their Morse code equivalents. Now this function is actually pretty pythonic and overall good, but as I commented, it is unnecessarily limited. The job of the function is to translate text, but it also prints it. What if you wanted to save the text to a file? To accomplish this, the 'what you do with the data' part (printing) should be moved out of the function. A function should do only one thing and return its results.
You could easily modify this function: declare an empty string, add the characters to it in the loop, then return the result. But this makes the function a little bit unpythonic. There is a simpler, more elegant way to do it.
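For reference, the straightforward modification described above might look something like this minimal sketch. The two-entry morse_code dictionary is only a stand-in for illustration (the real one is in the original post), and the name translate_accumulate is made up here.

```python
# Stand-in dictionary for illustration only; the real table covers far more characters.
morse_code = {'s': '...', 'o': '---'}

def translate_accumulate(text):
    """Build the translation in a local string and return it instead of printing."""
    result = ''
    for char in text.lower():
        result += morse_code[char] + ' '
    return result

# The caller now decides what to do with the result: print it, write it to a file, ...
encoded = translate_accumulate('SOS')
```

It works, but it takes four lines of bookkeeping to say "translate each character and glue the pieces together".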

Cue reduce. Reduce is a really nice functional tool in Python that, in select situations, simplifies the code a lot. It's not needed often, but when it is, marvel at its simplicity. So, you ask eagerly, how do we use it?

reduce is a function used to reduce (aha) a sequence of values into a single value. It takes two arguments: a function and a sequence. First, the function is called with the first and second elements of the sequence. Then it is called with the result of that call and the third element, then with the result of that call and the next element. This carries on until there are no more elements to apply the function to. At that point the result of the last call is returned.
Let's look at a specific scenario:
import operator
reduce(operator.add, [1, 2, 3, 4, 5])
operator.add is simply a function that adds its two arguments together. So reduce takes the first and second elements and calls the function with them. The resulting value is of course 1 + 2 = 3. Then it takes this result and the next value and calls the function again. The result is 3 + 3 = 6. As we go on like this, we can see that the result of reduce, in this case, is the sum of the numbers in the sequence, 15. (Note that for adding a sequence of numbers we have the sum function, which is better, cleaner and faster than this method.)
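If it helps to see the mechanics spelled out, here is a rough hand-written equivalent of the two-argument form of reduce. The name my_reduce is made up for this sketch; it is not how Python actually implements reduce, just the same idea written as a plain loop.

```python
import operator

def my_reduce(function, sequence):
    """Hand-written stand-in for two-argument reduce, for illustration only."""
    iterator = iter(sequence)
    try:
        result = next(iterator)              # start with the first element
    except StopIteration:
        raise TypeError('my_reduce() of empty sequence')
    for element in iterator:
        result = function(result, element)   # combine the running result with the next element
    return result

total = my_reduce(operator.add, [1, 2, 3, 4, 5])
# Steps: 1 + 2 = 3, 3 + 3 = 6, 6 + 4 = 10, 10 + 5 = 15
```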

Using reduce to rewrite the translation function above, we can obtain the following:
def translate(text):
    return reduce(str.__add__, (morse_code[c]+' ' for c in text.lower()))
Now there is a short function. I use a generator expression to obtain the translated characters, and concatenate them with the reduce function. The str.__add__ function is what is called behind the scenes if you add two strings together (i.e. 'str' + 'str2'). A generator expression is another one of those really useful tools to clean up your code. They are, however, an advanced subject. If you don't know what they are yet, first go look up list comprehensions. Then check out generators, and finally generator expressions.
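As a quick taste of those two ingredients, here is a small sketch. The sample values are arbitrary and chosen only for illustration.

```python
# str.__add__ is the method behind the + operator on strings.
assert str.__add__('dot', 'dash') == 'dot' + 'dash'

words = ['a', 'b', 'c']

# List comprehension: builds the whole list in memory right away.
as_list = [w.upper() for w in words]

# Generator expression: same idea in parentheses, but the values are
# produced one at a time, only when something iterates over it.
as_generator = (w.upper() for w in words)

collected = list(as_generator)   # consuming the generator yields the same values
```

Note that a generator can only be consumed once, which is fine for reduce since it walks the sequence a single time.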

P.S.: While proofreading this post, I decided to check out the Python documentation. It turns out that joining a sequence of strings is such a common operation that it has a function just for that. Here is the correct, fastest and most pythonic way to write the above function:
def translate(text):
    return ' '.join(morse_code[c] for c in text.lower())
The join method joins together the strings in the sequence, and uses the string it is called on as a separator. Because the separator only goes between the items, the result has no trailing space. I didn't know about this function. The things we learn, right?
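To compare the reduce and join versions side by side, here is a small sketch. The two-entry morse_code dictionary and the names translate_reduce and translate_join are stand-ins invented for this comparison; the functools import is there because newer Python versions keep reduce in that module.

```python
# reduce is a builtin in Python 2; Python 3 moved it to functools
# (this import also works on Python 2.6 and later).
from functools import reduce

# Stand-in dictionary for illustration only.
morse_code = {'s': '...', 'o': '---'}

def translate_reduce(text):
    return reduce(str.__add__, (morse_code[c] + ' ' for c in text.lower()))

def translate_join(text):
    return ' '.join(morse_code[c] for c in text.lower())

with_reduce = translate_reduce('SOS')   # every code is followed by a space, including the last
with_join = translate_join('SOS')       # the separator goes only between codes

empty = translate_join('')              # join copes with empty input and returns ''
# translate_reduce('') would raise a TypeError instead, because reduce
# has no initial value to fall back on for an empty sequence.
```

So apart from being shorter, the join version also behaves more sensibly at the edges.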
