Pythonic means idiomatic and tasteful

Pythonic isn’t just idiomatic Python; it’s tasteful Python. It’s less an objective property of code than a compliment bestowed on especially nice Python code. Pythonistas have their own word for this because Python is a language that encourages good taste; Python programmers with poor taste tend to write un-Pythonic code.

This is highly subjective, but Pythonistas who have been with the language for a while recognize it easily.

Here’s some un-Pythonic code:

def xform(item):
    data = {}
    data["title"] = item["title"].encode("utf-8", "ignore")
    data["summary"] = item["summary"].encode("utf-8", "ignore")
    return data

This code is both un-Pythonic and unidiomatic. There’s some code duplication that can very easily be factored out. The programmer hasn’t used the concise, readability-enhancing facilities that the language makes available. Even lazy programmers will recognize this code’s clear downsides.

Here’s another version that is more idiomatic but is nonetheless still un-Pythonic:

class ItemTransformer(object):
    def __init__(self, item):
        self.item = item
 
    def encode(self, key):
        return self.item[key].encode("utf-8", "ignore")
 
    def transform(self):
        return dict(
            title=self.encode("title"), 
            summary=self.encode("summary"))

Nothing about this code is particularly unidiomatic. I might even see code like this in many popular open source projects. But it’s in poor taste. It’s un-Pythonic.

What is the code doing? It’s just taking an incoming dictionary, encoding its values using utf-8, and returning a new dictionary with those encoded values. There is no need to introduce an ItemTransformer object — it’s an extra abstraction and is just making the signal-to-noise ratio poorer. People coming from Java often write un-Pythonic code because Java is a language that does not reward good taste. The Pythonic view is: programming is hard enough — let’s not make it harder for ourselves.

Here’s a more Pythonic version:

def encode_news_item(item):
    def encoded(*keys):
        for key in keys: 
            yield (key, item[key].encode("utf-8", "ignore"))
    return dict(encoded("title", "summary"))

This code shows comfort with Python’s features, but does not abuse this comfort by obfuscating the code with mind-bending constructions. The programmer has reduced the problem to two comprehensible subproblems: creating a stream of tuples (key, encoded_value) and constructing the new dictionary from that stream. This leverages the elegant fact that in Python, dictionaries can be easily constructed from (key, value) tuples.
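As a minimal illustration of that last point (the sample pairs are made up for this sketch, not taken from a real feed), `dict()` accepts any iterable of two-item tuples, and a generator works just as well as a list:

```python
# Hypothetical sample data, invented for this sketch.
pairs = [("title", b"Hello"), ("summary", b"World")]

# dict() consumes any iterable of (key, value) pairs.
from_list = dict(pairs)


def stream():
    # A generator yields the same pairs lazily, one at a time.
    for key, value in pairs:
        yield (key, value)


from_generator = dict(stream())

print(from_list == from_generator)  # True
```

This is why the nested `encoded` generator can be handed straight to `dict()` without building an intermediate list.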

This version avoids code duplication while also making the last line (the return statement) a rough description of the entire function: “return a dictionary of the encoded values for the keys title and summary.” Idiomatic, yes. But also tasteful, and thus Pythonic.

Even Pythonic code can be improved:

def encode_news_item(item):
    encoded = lambda key: item[key].encode("utf-8", "ignore")
    keys = ("title", "summary")
    return {key: encoded(key) for key in keys}

This code uses a dictionary comprehension and a lambda instead of a nested generator. It also follows the Python principle of “flat is better than nested”.
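To see what the function actually returns, here is a short usage sketch. The sample item is invented for illustration, and the output shown assumes Python 3, where `str.encode` returns `bytes`:

```python
def encode_news_item(item):
    encoded = lambda key: item[key].encode("utf-8", "ignore")
    keys = ("title", "summary")
    return {key: encoded(key) for key in keys}


# Hypothetical input; keys other than "title" and "summary" are ignored.
item = {
    "title": "Caf\u00e9 news",
    "summary": "A short summary",
    "author": "someone",
}

result = encode_news_item(item)
print(result)
# {'title': b'Caf\xc3\xa9 news', 'summary': b'A short summary'}
```

Note that the result contains only the two requested keys, so the function both encodes and filters.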

Though Pythonic code is often shorter than its un-Pythonic counterpart, the experienced Pythonista knows the road to hell is paved with good intentions:

def encode_news_item(item):
    keys = ("title", "summary")
    return dict(zip(keys, map(lambda key: item[key].encode("utf-8", "ignore"), keys)))

Remember: “readability counts!” Ick… time to revert this Python code to the much more readable version, just above it!

8 thoughts on “Pythonic means idiomatic and tasteful”

  1. unless i’m missing something:

    def encode_news_item(item):
        return dict([(key, item[key].encode("utf-8", "ignore")) for key in ("title", "summary")])

  2. or how about implementing __iter__, so you can run dict on the item:

    def __iter__(self): return iter([(key, item[key].encode("utf-8", "ignore")) for key in ("title", "summary")])

  3. I’ll write something like this:

    class Dict(dict):
        def __setitem__(self, key, value):
            super(Dict, self).__setitem__(key, value.encode('utf8', 'ignore'))

  4. yet another function version:

    def encode_dict_items(dic, encoding='utf8', errors='ignore'):
        return dict((k, v.encode(encoding, errors)) for k, v in dic.items())

  5. Sometimes it takes time to come up with elegant solutions, but in a real world environment with deadlines, the “ugly” solution gets the job done, and that’s enuff.

  6. It’s worth noting that the least Pythonic code example posted is in fact the fastest. It hurts, but sometimes it pays to leave well enough alone.
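For anyone who wants to check the performance claim in the last comment, a rough timing sketch might look like the following. It uses the standard library's `timeit` module; the sample item and the call count are arbitrary choices for this sketch, and the numbers will vary by machine and Python version, so no result is shown here:

```python
import timeit


def xform(item):
    # The first, duplicated version from the post.
    data = {}
    data["title"] = item["title"].encode("utf-8", "ignore")
    data["summary"] = item["summary"].encode("utf-8", "ignore")
    return data


def encode_news_item(item):
    # The dictionary-comprehension version from the post.
    encoded = lambda key: item[key].encode("utf-8", "ignore")
    keys = ("title", "summary")
    return {key: encoded(key) for key in keys}


# Hypothetical sample input, invented for this sketch.
item = {"title": "A title", "summary": "A summary"}

# Sanity check: both versions must agree before timing means anything.
assert xform(item) == encode_news_item(item)

for func in (xform, encode_news_item):
    seconds = timeit.timeit(lambda: func(item), number=100_000)
    print(f"{func.__name__}: {seconds:.3f}s for 100,000 calls")
```

Any difference measured this way is per call and tiny in absolute terms, so it only matters if the function sits in a hot loop.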
