Cloud GNU: where are you?

This continues an article I wrote nearly three years ago, Common Criticisms of Linux, parsed and analyzed.

In the three years since I wrote that original piece, Linux has grown — albeit slowly — in desktop usage. After nearly 2 years of no growth (2008-2010, lingering around 1% of market), in 2011 Linux saw a significant uptick in desktop adoption (+64% from May 2011 to January 2012). However, Linux’s desktop share still about 1/5 of the share of Apple OS X and 1/50 the share of Microsoft Windows. This despite the fact that Linux continues to dominate Microsoft in the server market.

The proprietary software industry may be filled with vaporware, mediocre software, and heavyweight kludges, but there is certainly also a lot of good stuff that keeps users coming back.

However, I believe the 2011/2012 up-tick in Linux desktop usage reflects a different trend: the increasingly commoditized role that desktop operating systems (and by extension, desktop software) play in an omni-connected world of cloud software.

Why doesn’t Linux run software application X or Y?

For end users, the above was a core complaint for many years (approx. 2000-2009) when evaluating Linux. However, this complaint has faded in the last two years. Let’s reflect on the most common and useful pieces of software on desktop operating systems these days:

Continue reading Cloud GNU: where are you?

The Debian Manifesto

The Debian design process is open to ensure that the system is of the highest quality and that it reflects the needs of the user community. By involving others with a wide range of abilities and backgrounds, Debian is able to be developed in a modular fashion. Its components are of high quality because those with expertise in a certain area are given the opportunity to construct or maintain the individual components of Debian involving that area. Involving others also ensures that valuable suggestions for improvement can be incorporated into the distribution during its development; thus, a distribution is created based on the needs and wants of the users rather than the needs and wants of the constructor. It is very difficult for one individual or small group to anticipate these needs and wants in advance without direct input from others.

This amazing quote from 1994 (!!!) actually models the way I think about software engineering at Parse.ly.

A nice piece of nostalgia on Debian’s 19th birthday.

See also: A Brief History of Debian, The Debian Policy Manual, & The Debian Developers Map.

Build a web app fast: Python, HTML & JavaScript resources

Wanna build a web app fast? Know a little bit about programming but want to build a modern web app using two well-supported, well-documented, and universally accessible languages? You’ll love these Python, HTML/CSS, and JavaScript resources.

I’ve been sharing these documents with friends who ask me, “I want to start programming and build a web app, where do I start?”. These resources have also been useful to existing programmers who know C, C++ or Java, but who want to embrace dynamic and web-based programming.

Python Resources

Python is the core programming language used at Parse.ly. It also happens to be a quickly-growing language with wide adoption in the open source community, and it is a very popular choice for web startups.

I’ve written a blog post with some original materials for learning Python, import this — learning the Zen of Python with code and slides.

This is a good starting point, but you may also find these resources very helpful:

  • For absolute beginners, “Learn Python the Hard Way”. This teaches Python using a series of programming examples, but it really assumes you have no programming background whatsoever. After going through the examples in LPTHW, it may be a good idea to supplement your understanding with Think Python.
  • For existing programmers, “Dive into Python 3”. This teaches Python from the starting point that you have already programmed in a mainstream language like C or Java, and want to know what makes Python really cool/good. Similar audience to my “Zen of Python” slides. Note that this tutorial teaches Python 3, but most people still use Python 2.7. See Python2orPython3 on Python wiki to see the differences.
  • For advanced programmers, “Python Essential Reference, 4th Edition”. Unfortunately, this book costs money, but it’s basically the best book on Python on the market, and it’s very up-to-date. It’s very dense and weighs in at 717 pages, so this is only for those who want to go deep on Python.
  • For cheap advanced programmers, “Official Python Tutorial”. Though the Python tutorial doesn’t have the best narrative style nor the best real-world examples, for advanced programmers, it will teach the reality of the language in a comprehensible way. And, it’s free.

HTML/CSS Resources

In order to build up web applications, you’ll need to write your front-ends in HTML and CSS. These technologies have evolved over the years, but the basic principles remain from when they emerged nearly a decade ago. HTML is the markup language of the web, and you’ll see a lot of tutorials refer to HTML4, which is basically the markup standard all web browsers and websites work off. Don’t be confused by the HTML5 moniker, which often refers to much more than simply the markup — usually, it’s referring to a set of JavaScript APIs that are becoming standard in browsers, along with enhanced audio/video support and a few new “semantic markup” tags that have been added.

Since HTML is basically useless without CSS, you can get by with a short tutorial on HTML and then more advanced tutorials on CSS styling. Here’s what I recommend.

Learn the basics of HTML from MDC’s Introduction to HTML and Wikipedia’s page on HTML. This is a rare case where using Wikipedia is actually a perfect way to get the right background because half the battle with understanding HTML is understanding its history.

An excellent new guide to HTML & CSS together has been published by Shay Howe in 2013.

These look like a great first stop.

You can also use these dedicated resources for CSS specifically:

  • For absolute beginners: Use W3C’s official tutorial on Starting with HTML + CSS. This was written all the way back in 2004, but provides the basics with screenshots and real code examples, so is a great way to get started.
  • For existing programmers: Mozilla has done a great job putting together a quick and readable tutorial that gives you the basics at a glance.
  • For advanced programmers: You’ll want to buy the best book on the subject, CSS Mastery. It has the best explanation of the box model and browser rendering engine’s that I’ve seen, and covers all the edge cases nicely.
  • For cheap advanced programmers: You’ll need to look over the MDC (Mozilla) CSS Reference. Pay particularly close to articles on the Box Model and the Visual Formatting Model.

JavaScript Resources

Aside from Python, every Parse.ly engineer also knows JavaScript, even if it is only begrudgingly. For better or for worse, JavaScript has become the world’s most popular programming language.

JavaScript is definitely the language of the web. It is also a language that has, over the last few years, developed a nice bit of great documentation for learning the language. Here are some resources you can use to get up to speed:

  • For absolute beginners: “Eloquent JavaScript” introduces you to both modern programming techniques and JavaScript at the same time. It is thus a great book for beginners. There is also a print version available.
  • For existing programmers: The Mozilla Developer Network (MDN) contains the web’s best and most official documentation of HTML, CSS, and JavaScript. This guide, “A Re-Introduction to JavaScript”, presents the language to an audience that already knows how to program, and focuses specifically on the “gotcha” parts of the language.
  • For advanced programmers: A must-read is the short (but costly) “JavaScript: The Good Parts”. Douglas Crockford basically reintroduced the world to JavaScript as a modern programming language. He is a bit of a curmudgeon when it comes to programming style, but this makes sense since he is also the author of JSLint, an important tool used in JS development for static code checking.
  • For cheap advanced programmers: Douglas Crockford, author of the above “Good Parts” book, has also given a series of public video lectures on JavaScript at Yahoo! headquarters. These are freely available online and actually present much of the same content in “Good Parts”, just in a condensed form. Warning for the cheap: though the videos are very good, the book goes into more depth and spends less time on the history of the language. Also, Matt Might’s JavaScript, Warts and workarounds is an excellent summary to some of the most important “bad parts” of JavaScript.

JavaScript “frameworks”

Though knowing JS is important to do anything web-facing, you can also leverage some frameworks to help you out. The ones I recommend are the venerable jQuery JavaScript library and the Twitter Bootstrap HTML/CSS/JavaScript components. See:

jQuery adds common utilities for DOM manipulation, HTTP/JSON server requests, basic animations, and dynamic CSS.

Bootstrap builds on jQuery and adds a common, simple UI component library using pure HTML, CSS and JavaScript. This provides a grid system for layout; nicely-designed stylesheets for typography, tables, lists, and buttons, plus JavaScript components that add dynamic behavior such as tabs, dropdowns, modal dialogs, navigation bars, and more.

2018 note on JavaScript frameworks

This post was originally written in 2012 and updated over the years. I have made a strong effort over the years to ensure the suggestions in this post are timeless and work well regardless of when you are beginning your journey of learning web development.

2017-2018 was an inflection point for JavaScript frameworks in the open source community. You will read a lot in modern tutorials saying that, instead of jQuery and Bootstrap, you should use something like React or Vue.js. If you are an HTML, CSS, and JavaScript beginner, I strongly recommend against this.

These frameworks are complex, even for advanced programmers like me. They are primarily built to support a style of web development known as Single-Page Applications. This is an important frontend development practice and will be increasingly important as more and more web applications are built, but it won’t replace web sites. SPAs are overkill for building simple web sites and web applications. And, they are hard to learn for beginners.

If you move on from making websites to making full-fledged web applications, there will be a straightforward level-up path, and you won’t have wasted time learning about how HTML, CSS, and JavaScript work. Especially for making basic websites with dynamic behavior, it’s still extremely common to use “simple” JavaScript with jQuery and Bootstrap, and those sites are easier to reason about, to test, and to deploy. Don’t be confused by the modern tutorials that send you down the SPA rabbit hole.

One other 2017-2018 concern related to this regards which version of JavaScript to use on modern websites, to ensure good browser compatibility. The 2018 browser compatibility picture is much better than the 2012 one. However, the 2018 recommendation is to use a version of JavaScript known as “ECMAScript 5” (ES5), whose browser compatibility is summarized here. Essentially, ES5 is supported in Microsoft Internet Explorer 11 and later, as well as every other major browser platform (Chrome, Safari, Firefox, etc.) Thanks to a patch that Microsoft released in 2016 that urged IE8, IE9, and IE10 users to upgrade to IE11, that is the only “old browser” you likely need to worry about. Meanwhile, Microsoft is also pushing its users in the direction of its more modern browser engine, Microsoft Edge, which is the default browser of Windows 10.

Newer versions of JavaScript — which go by the monikers “ES6”, “ES2016”, and “ES2017” — should only be used if you use a Babel build toolchain, which is beyond the scope of this article. This is because JavaScript, as a language, is now evolving with annual releases, without any consideration for backwards compatibility to older browsers, except insofar that those browsers can be targeted via a code translation technique known as transpilation. However, if you use the new features without this toolchain in place, you will write code that will break on a large percentage of your users. So, stick with ES5 while you’re still in learning mode.

Continue reading Build a web app fast: Python, HTML & JavaScript resources

Fully Distributed Teams: are they viable?

It has become increasingly common for technology companies to run as Fully Distributed teams. That is, teams that collaborate primarily over the web rather than using informal, face-to-face communication as the main means of collaborating.

This has only become viable recently due to a mix of factors, including:

  • the rise of “cloud” collaboration services (aka “web 2.0” software) as exemplified by Google Apps, Dropbox, and SalesForce
  • the wide availability of high-speed broadband in homes that rivals office Internet connections (e.g. home cable and fiber)
  • real-time text, audio and video communication platforms such as IRC, Google Talk, and Skype

Thanks to these factors, we can now run Fully Distributed teams without a loss in general productivity for many (though not all) roles.

In my mind, there are three models for scaling number of employees in a growing company in the wild today. These are:

Continue reading Fully Distributed Teams: are they viable?

Speed and lightness

Last week, I decided to give myself the present of a 512GB SSD drive, which was available at a nearly 25% discount on NewEgg for a limited time.

The price-per-gigabyte for SSDs has finally fallen to nearly $1/GB, and the rewrite cycle problems that used to afflict these drives is now becoming a non-issue with the Linux kernel’s TRIM implementation and the updated firmware on these drives.

So, I took the plunge. My main development workstation was a Thinkpad T400, maxed out to 8GB of RAM, and with dual 500GB platter drives (via Thinkpad’s excellent Ultrabay extension). I was running Ubuntu 10.10 for a long time. I timed the SSD purchase with the release of Ubuntu 12.04 LTS — 10.10 no longer being supported, I figured I’d do a clean install on the new SSD and clean up my development workstation for the first time in a couple of years.

A couple of things occurred to me in this process. First of all, since 2009, I have moved more and more of my data into cloud services. I have moved the lion’s share of my “business and personal documents”, including photos, into Dropbox with my 50GB account. And I have moved my truly old files and digital keepsakes into NAS drives that I host in my little server room at home.

Continue reading Speed and lightness

On multi-form data

I read an excellent debrief on a startup’s experience with MongoDB, called “A Year with MongoDB”.

It was excellent due to its level of detail. Some of its points are important — particularly global write lock and uncompressed field names, both issues that needlessly afflict large MongoDB clusters and will likely be fixed eventually.

However, it’s also pretty clear from this post that they were not using MongoDB in the best way. For example, in a small part of their criticism of “safe off by default”, they write:

We lost a sizable amount of data at Kiip for some time before realizing what was happening and using safe saves where they made sense (user accounts, billing, etc.).

You shouldn’t be storing user accounts and billing information in MongoDB. Perhaps MongoDB’s marketing made you believe you should store everything in MongoDB, but you should know better.

In addition to that data being highly relational, it also requires the transactional semantics present in mature relational databases. When I read “user accounts, billing” here, I cringed.

Continue reading On multi-form data

XDDs: stay healthily skeptical and don’t drink the kool-aid

On my LinkedIn profile, I list one of my skills as “thought-driven development”. This is a little tongue-in-cheek; software engineering over the last few years has developed a lot of “XDDs,” such as test-driven development, behavior-driven development, model-driven development, etc. etc.

“Thought-driven development” doesn’t actually exist, but by it, I simply mean: perhaps we should think about what we’re doing, rather than reaching for a nearby methodology du jour.

In my last job, a colleague of mine used to also joke about “design-driven design” — perhaps the ultimate play on the XDDs since it is also a strange loop.

All this is not to say the XDDs aren’t useful — they definitely are. A lot of them have spawned entire groups of cross-platform open source projects. I am all for anything that makes the adoption of XYZ best practice easier for my team. But these techniques often require some lateral thinking to get to any real benefit.

When evaluating technologies like this, you have to take each little community with a grain of salt. Almost every programming framework / methodology / etc. that exists portends to offer some order of magnitude increase in software reliability / developer productivity / whatever else. And almost all, if not all, fail to do so, in practice.

Here is one anecdote to illustrate the point. From 2006-2008 at Morgan Stanley, the entire corporation was obsessed with Java’s Spring framework and its core “architectural pattern”, Inversion of Control. I can’t even begin to explain to you the number of man-hours that were wasted re-architecting existing, working software to meet this chimerical conception of component decoupling. I even contributed to this, urged by the zealots and their blind faithful. All of the reasons seemed great: decoupling code, using more interfaces, allowing for easier unit testing, being able to “rewire dependencies” and use fancy technologies like “aspect-oriented programming”.

Even Google got swept up in the madness and developed their own, competing framework called Guice. And in the end — after all that work — my diagnosis is that IoC is basically a non-starter, a complete waste of time.

A complicated framework that morphed into a programming methodology, developed exclusively to work around some annoying limitations of the Java language. Since it was applied without thinking, now everyone’s Java code has to suffer, and you can hardly pick up a Java web application today without being crushed by the weight of its IoC container’s XML configuration files. (Nevermind that most other communities, such as Python’s and Ruby’s, have hardly a clue what IoC is all about — a good enough indication that it is a waste of time.)

Every framework and approach should be judged on its true merits, that is the true cost/benefit analysis of applying that particular technology. Will it save us time? Will it simplify — not complicate — our code? Will it make our code more flexible / adaptable? Will it let us serve our users and customers better?

I regularly go back to old classics like the Mythical Man-Month to remember that nothing we do in software is truly new. I highly recommend you read it, and also its most famous essay, “No Silver Bullet”.

tl;dr stay healthily skeptical, don’t drink the kool-aid.

import this: learning the Zen of Python with code and slides

It’s hard to find me gushing more unapologetically than when I talk about the virtues of my favorite programming language, Python.

Indeed, my life for the last 3 years has been dominated by the language. In many ways, pursuing a startup and enduring the associated financial hardship was partially because I had become frustrated with using Java in my full-time work and wanted to convert hobby projects I was building outside of work hours into full-fledged projects.

Continue reading import this: learning the Zen of Python with code and slides

Groovy, the Python of Java

I was a bona fide Java programmer for 5 years before I started working on Aleph Point and Parse.ly. I truly believe that Python and JavaScript are fundamentally better languages than Java for a variety of reasons born out of experience with each of them. (Note: Before this gets marked as flamebait, please notice that not only was I Java programmer for more than 5 years, but I was also a Java open source contributor!) I have enormous respect for the Java open source community, which has produced some of the highest quality modules available anywhere.

Now, don’t get me wrong — Python also has batteries included, and usually, when I think that I’m missing a great module I used to use in Java, it already exists in a much more powerful form in Python’s Standard Library or the wealth of modules on PyPI, GitHub, and Bitbucket. However, I believe in not reinventing the wheel, and so if a great open source tool exists in Java, I will want to interact with it.

One of these modules which we use extensively at Parse.ly is Apache Solr, and its surrounding Lucene project modules. Lucene is an extremely mature framework for document indexing, and Solr is a powerful server-ization of that technology that fits well into complex, mixed language distributed systems. I know there are efforts — like Whoosh — to build fast search engines atop the Python language. And I applaud these efforts — more projects means more competition, and more competition means better products. However, I still believe that you go with the best of breed tools available for production software, and you try not to let religious arguments about programming language get in the way.

Lately, I have come across more and more Java open source projects that have no equivalent in Python, and which I would like to access. Knowing that I wanted to feel comfortable incorporating Java open source projects — beyond Solr, which was already nicely wrapped as a web service — I, at first, thought that I’d be forced to still live among the weeds of complex class and interface definitions, cumbersome Java IDEs, XML configuration files, and (IMO) time-wasting rabbit holes like dependency injection, configuration management, and classpath hell. And then I found Groovy.

Continue reading Groovy, the Python of Java

Persistent Folders: Or, why ideas don’t matter, and execution does

I’ll start off this post with a somewhat controversial claim: I invented Dropbox.

I’ll show why this claim doesn’t matter later, but for now, I’ll assure you that it’s true.

How many of you out there use Dropbox? If you don’t, you should — it’s an excellent tool. In its free version, it provides you with 2GB of storage “in the cloud”, using a new kind of folder called a “Dropbox”. What distinguishes a Dropbox from other folders on your computer? The following:

  • Every file put in your Dropbox is automatically (and securely) uploaded to Dropbox’s servers, ensuring you have an offsite backup of all data therein.
  • Multiple computers can gain access to a Dropbox, ensuring files are automatically synchronized across computers without having to use complication version control systems.
  • All files in your Dropbox are versioned, ensuring you can always recover an older version of a file in case you accidentally overwrite a good version.

Dropbox is supported on Windows, Mac OS X, and Linux, and now even has mobile applications, as well. Further, I have a special place in my heart for this service because I started using it almost 2 years ago, and it has acted as a file sharing and project management tool for my own startup’s internal operations at Parse.ly. I was therefore more than ecstatic to discover that this excellent tool and its smart founders had also made it through all of the hurdles necessary to get an early-stage company the financing it needs: they’ve raised over $7 million in financing and have over 3 million users.

But there is another reason I absolutely love Dropbox: because it was my idea. I invented it.

Continue reading Persistent Folders: Or, why ideas don’t matter, and execution does