Beautiful Code and a Beautiful Bug

I am teaching a technical course on the popular and ubiquitous version control system, Subversion, this Monday. I thought it might be fun to give my class a little “extra credit” reading from the O’Reilly book, Beautiful Code. In it, one of the original authors of Subversion, Karl Fogel, shares what he considers to be the most beautiful internal design within the codebase: the SVN delta editor. Though this API is not directly used in doing Subversion development, I thought it might be cool for students to have a deeper understanding of the thought that went into SVN’s codebase. But when trying to print up some copies of the chapter for the class, I got more than I bargained for…

I highly recommend the entire book. It is not so much a book about beautiful code as about passionate and opinionated programmers and their tastes. But this is a good thing. It was one of the few books about software that I have read in the last decade or so that actually gave me entirely positive feelings about my profession. There is so much raw creativity and thought captured in these few essays. What Brian Kernighan finds beautiful is entirely different from what Matz or simonpj find beautiful. And that’s the thing about a fundamentally creative craft like software. You put five software engineers in a room with a piece of code, and you’re lucky if you come out with only six different opinions about it. It’s like art, or writing. Taste matters.

I don’t recommend people read Beautiful Code to try to imitate some of the code described therein. Instead, I recommend you read it as a sociological or psychological study of what makes proud and bright software engineers tick. For example, for Kernighan it is the simplicity and minimalism embodied in UNIX. For Matz, it is the notion that a programming language should be as syntactically flexible as our natural languages are. For simonpj, it’s that the complicated can be made easy, given the right abstractions. And for Jon Bentley, in one of the more thought-provoking essays in the book, beauty and elegance emerged only as the size of his code shrank.

The essay about the SVN Delta Editor not only illuminates the internals of SVN, but also illustrates the social dimension to software engineering and design. It is a story about programmers, debating an API, producing it, and then putting it into practice. It is about give and take, and about the unteachable skill of reducing a problem’s size and complexity. All this in C! There was a period of time in university when I actually programmed in C full-time, so I have a lot of respect for the elegance with which they crafted this powerful API. C gives you few tools (like OO or explicit interfaces) for doing this kind of work; they had to plan carefully and work in spite of the language’s limitations.

I have about twenty students in my class, so I was going to print up one copy and get it copied and stapled at a local print shop. (See note on copyright below.) I opened up my ebook PDF of Beautiful Code with acroread on UNIX. I navigated to the right chapter and realized that I wanted to print just that single chapter. I always remember being annoyed whenever I had to do this, for a number of reasons.

  1. PDF ebooks sometimes lack the proper “bookmark” information to navigate to the right section to print
  2. Since ebooks were once print copies, they tend to have page numbers at the bottom of each page. But since the ebook itself has a different page numbering scheme, all sorts of psychic dissonance occurs. You navigate to page 30 (in the print copy) but have to note that it’s actually page 42 in the ebook. You then navigate to page 45 (in the print copy) but have to note that it’s actually 57 in the ebook.
  3. OK, now I know what I need to print… I think. So now I have to enter one of those print ranges in the “Print” dialog. Is it 30-42? No, wait, it’s 42-45… I mean, 42-57 — that’s it! Is that inclusive or exclusive? 🙂 Oh, my…

It’s really not that bad, and it’s only an occasional annoyance, but it’s always there. I’m sure you know what I’m talking about.

I had recently upgraded acroread and noticed that the UI was all spiffed up. And I noticed that this ebook had the right metadata for the bookmarks. I thought to myself, “Wouldn’t it be nice if acroread supported printing a chapter?” I right-clicked on the first entry in the chapter bookmark and was astonished. Lo and behold, my feature existed! I clicked the “Print Pages…” menu item with a bit of discomfort. I don’t trust software too often, and am always suspicious when I find a feature I didn’t expect to be there. It’s like my inner programmer is saying, “Yeah, right. Too good to be true.”

A few minutes later, my chapter was printed. I looked it over, and brought it with my other materials to the local print shop. One hour later, I picked up my copies and brought them home to look them over.

I noticed something very strange. Instead of my copies containing pages 42-57, they contained pages 42, 43, 46, 51, 55, and 57. Damn it. There didn’t seem to be much rhyme or reason to the pages that were selected. What kind of sequence was this? I felt that there must be some pattern, some Fibonacci-like, non-obvious sequence that applied to these pages. I suspected the first, and obvious, culprit: that the print shop had made a mistake. Maybe it was human error. But then I looked over my original and indeed, the original only had those pages. Not a human error. I thought to myself, “How is this possible?”

Of course, I’ve probably given you enough information that you’ve already figured it out. Especially if you’re a programmer. We’re just wired to think this way. But in case you haven’t figured it out, I’ll indulge you.

When I went back into acroread, retracing my steps, I noticed something about that menu item I clicked. It didn’t say “Print Chapter”. Instead, it said “Print Pages…”. Now, conceptually that seems like a small distinction, but I picked up on it.

I started to think like a programmer, rather than a user. This function with a for loop emerged from the program and hovered above it, almost magically. It said:

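def print_pages(self, selected):
    # Gather the page that each selected bookmark points to.
    to_print = []
    for bookmark in selected.self_and_bookmarked_children():
        to_print.append(bookmark.page())
    # Send exactly those pages to the printer as one job.
    PrintSubsystem.queue_job(to_print)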

Then I realized the pattern in the pages it picked. There was no pattern. This was a beautiful little bug. A butterfly.

You see, within the narrow world of this Print Pages function, the “feature” works as expected. But from a user’s perspective, it makes absolutely no sense. Rather than printing everything from that bookmark to the next bookmark at the same level (that is, rather than printing a chapter), it printed each individual page that happened to be physically bookmarked (or ‘sub-bookmarked’) in the PDF, at or below that level. This resulted in a bunch of pages being printed that happened to be the pages on which subsections began. But this left out most of the chapter, somewhat randomly.
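
To make the difference concrete, here is a minimal Python sketch of the two behaviors side by side. Everything in it is an assumption for illustration: the `Bookmark` class, both function names, the subsection titles, and the guess that the next chapter starts on page 58. None of it reflects acroread's actual code or API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Bookmark:
    """Hypothetical stand-in for a PDF outline entry."""
    title: str
    page: int
    children: List["Bookmark"] = field(default_factory=list)

    def self_and_children(self) -> Iterator["Bookmark"]:
        """Yield this bookmark, then every bookmark nested beneath it."""
        yield self
        for child in self.children:
            yield from child.self_and_children()


def bookmarked_pages(selected: Bookmark) -> List[int]:
    """What the feature apparently did: only the pages a bookmark points at."""
    return sorted({b.page for b in selected.self_and_children()})


def chapter_pages(selected: Bookmark, siblings: List[Bookmark],
                  last_page: int) -> List[int]:
    """What I expected: every page up to the next bookmark at the same level."""
    later = [s.page for s in siblings if s.page > selected.page]
    end = min(later) - 1 if later else last_page
    return list(range(selected.page, end + 1))


# Outline modeled on the pages from my printout; titles are made up.
chapter = Bookmark("Delta editor chapter", 42, [
    Bookmark("Subsection A", 43),
    Bookmark("Subsection B", 46),
    Bookmark("Subsection C", 51),
    Bookmark("Subsection D", 55),
    Bookmark("Subsection E", 57),
])
outline = [chapter, Bookmark("Next chapter", 58)]

print(bookmarked_pages(chapter))                      # [42, 43, 46, 51, 55, 57]
print(chapter_pages(chapter, outline, last_page=600)) # pages 42 through 57
```

The first function reproduces my six-page printout exactly; the second yields the sixteen pages I actually wanted. The only extra information it needs is where the next sibling bookmark starts, which the outline already holds.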

The worst traits of our profession come out when it is at its least social. I have no doubt that this function that prints these pages was written by a single programmer in a windowless room, without any peer review, pair programming, or other check on his logic. I am sure that he was given the narrow and ill-defined requirement to enable an action to “print bookmark pages”. He needed to think, but instead, he decided to code. And coding got “it” done, for some very weird value of “it”. He was probably under time pressure. But one thing is certain to me: he was alone. No two programmers, debating the design and implementation of this feature, would let each other make this mistake.

The behavior it exhibited truly caught me by surprise. Strange as it sounds, I admired how easily I had been duped by this feature. The human error — the anti-social error — made by that programmer exhibited an odd and enigmatic computer behavior. A human inelegance created a strange sort of cruel machine elegance.

I found it ironic that in trying to print a chapter about beautiful design from a book called Beautiful Code, I came across this beautiful bug. I call the bug beautiful because it managed to fool me, to get me to suffer its wrath while thinking I was getting some convenience. It exhibited behavior that challenged me to identify a pattern, where there was none. It was so clever, it even cost me money (the printing charges). And even though I was a discerning programmer — skeptical of the feature, and so unsure of the software’s operation that I checked the output, albeit too briefly — this little bug managed to outsmart me.

My students will have to live without the chapter, or read it online on their own. I’m not upset about it. There can be beauty, even in failure.

A note about copyright: some readers on reddit and in my comments section suggested that I might be ignoring copyright issues by thinking that I could just photocopy a chapter from this book to distribute to my students. Trust me, I know about copyright. The content of this chapter happens to be available online for free under Creative Commons Attribution 3.0. Karl Fogel, the author of the article, has even given his informal blessing in my comments section. And finally, by most people’s interpretation of the rules of Fair Use, it was OK for me to copy a chapter for my classroom. I’m surprised no one suggested that this bug might be beautiful in another way: that it saved me from a copyright disaster. I don’t think it was that good… 🙂 If you are really interested in seeing a debate about this, you can read this thread on reddit. Warning, somewhat painful and longwinded.

11 thoughts on “Beautiful Code and a Beautiful Bug”

  1. “My students will have to live without the chapter, or read it online on their own.”

    That’s a bit of hyperbole there. You had time to write a whole blog post, but not enough time to click “Print Pages” a second time?

  2. “I have about twenty students in my class, so I was going to print up one copy and get it copied and stapled at a local print shop.”

    Are you really that unaware of copyright laws?

  3. Hans is right about the copyright law bit. You just take chapters from textbooks and distribute photocopies to your students?

  4. IANAL, but copying a single chapter from a book falls under the fair use doctrine and therefore does not have hard and fast rules for what is legal or not. If it is a single time, for educational purposes, non-fiction, and limited to a single class of 20 students, it is likely to be considered fair use and therefore legal. If it is as spontaneous as it sounds in the post, that weighs even more in favor of it being fair use. It is not as simple as “copying is illegal.”

  5. Yes, it’s either a single chapter from a book, or a percentage of the size, to avoid using a whole book which consists of a single chapter. At least that’s the licence we have at our university (in the UK).

  6. It’s a bit late to tell you now, but: the chapter is online here:

    http://www.red-bean.com/kfogel/beautiful-code/bc-chapter-02.html

    It’s under a Creative Commons Attribution license, so basically you can do whatever you want with it.

    (I think I have a PDF somewhere too; I could dig it up if you want.)

    I’m so glad you liked the chapter. And I completely agree with you about the book as a whole — it’s one of my favorite programming books now!

  7. Hans and Matte,

    You might be interested in http://questioncopyright.org/. Not everyone — not even every author! — agrees that restricting the spread of knowledge and culture is a good business model, let alone good social policy. Historically, the business copyright was designed to support was actually *distribution*, that is, publishing. It wasn’t invented for artists or by artists. It was designed by the publishing industry to support the inherently high up-front costs and the risk structure of publishing. It probably made some sense, too, back when applying ink to dead tree pulp (or cutting grooves in vinyl platters) was the only way to distribute information.

    Now we have a way of distributing information that is essentially zero-cost and 100% reliable (the copies are indistinguishable from the masters). Copyright law is beginning to look rather silly.

    Remember, it’s not about attribution: artists deserve credit for their work, but that can be protected by separate laws. Also, having lots of copies spread around the Internet actually protects attribution more effectively than any law could. That’s why we don’t have attribution problems in the open source world, for example. Quite the opposite: open source is one of the most conscientious crediting communities ever.

    Justin’s point about “fair use” is good, though the law is unfortunately quite fuzzy on exactly what constitutes fair use. But it would be even nicer for that phrase to go away entirely, and for all uses to be fair.

  8. This is why all requirements really need a section “Why?”. Why is this thing needed in the first place? Especially urgent if requirements are handled the brain dead way of throwing a bunch of documents over the wall to the programmer.

    The requirements are rarely “done” when they get to the developers. That should be the beginning of the next part of the conversation, when more consequences are revealed as things turn into design and code.

    In a tech design meeting last month our team rejected the whole premise of a feature when we realized it would result in a really complicated and awkward solution. We could only do that once we realized that the “why?” was actually that the user wanted a _simpler_ and _more_convenient_ way to use our service. So whatever we did would in fact not lead to that goal.

    Without the “why?”, we would have wasted a lot of time creating something useless.

  9. Hi, I’m the author of this blog.

    @Foo, you wrote, “You had time to write a whole blog post, but not enough time to click ‘Print Pages’ a second time?” Well, first of all, if you had read the article, you’d note that clicking “Print Pages” again would have resulted in the same exact problem 🙂 But aside from that, the real issue is that the print shop was closed; it was too late to get them recopied.

    @Karl, thanks for posting here. It’s a small internet world! 🙂 I knew your article was available online, but it wouldn’t print very nicely from my browser — too many page breaks at the wrong places. Well, it probably would have printed better than it ended up printing from acroread!

    I did a little research about Fair Use, and since the article content itself was available via Creative Commons Attribution, I figured printing out the chapter from the ebook itself wouldn’t be an infringement. Even aside from that, Fair Use generally allows for distribution of multiple copies of a portion of a book for classroom use. The rule of thumb is that one chapter or 10% of the book is OK, whichever is less. This is just a rule of thumb I had from my university days. In this case, given that the article is actually specifically distributed under the CC license, I doubt my fair use of the ebook would be pursued by O’Reilly as an infringement. I’m glad Karl agrees.

    @Johan, that’s an interesting point. I think one of the problems is that in some development shops, requirements gathering / design is done in an ivory tower, and then the artifacts that are the result of that work are “thrown over the wall” to developers, who are mere implementors. It’s this kind of separation that often leads to bugs like the one described. That’s why I called this bug “anti-social”. I believe bugs like these are the result of a lack of collaboration among developers. I’m just speculating, but my feeling is that no group of developers — like Karl, Ben and Jim, discussing and debating the SVN delta editor — would have let this bug enter the mainline of acroread’s source code.
