Simple Lego Blocks for Big Data

Data engineers should abstract their code in the most lightweight way possible to facilitate downstream integration in a large-scale data system.

You want lego blocks, not puzzle pieces.

Stephen C. Johnson and Brian W. Kernighan, writing about the C programming language, once famously said, “first make it work, then make it right, and, finally, make it fast.” This adage still applies today.

The difference is that we now have tools to take working code and validate it against reams of data. Many of these tools can also make the working, correct code run fast across a cluster of machines, possibly even in real time, as the data comes in.

But making code work, then right, then fast requires some discipline.
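
To make the "lego block" idea concrete, here is a minimal sketch in Python. The names `extract_url` and `count_urls`, and the JSON log format with a `url` field, are assumptions made up for illustration; they are not from any particular system. The point is only that a small function with no I/O and no shared state can be checked against a tiny sample first, then handed unchanged to whatever runs it over a larger dataset or a live stream.

```python
import json
from collections import Counter


def extract_url(line):
    """Lego block: one log line in, one URL (or None) out. No I/O, no shared state."""
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed input is skipped rather than crashing the whole job.
        return None
    return record.get("url") if isinstance(record, dict) else None


def count_urls(lines):
    """Accepts any iterable of lines: a list in a test, a file, or a generator."""
    return Counter(url for url in map(extract_url, lines) if url is not None)


# Make it work, then right: validate against a small in-memory sample
# (hypothetical data).
sample = ['{"url": "/a"}', '{"url": "/b"}', "not json", '{"url": "/a"}']
counts = count_urls(sample)
print(counts["/a"], counts["/b"])
```

Because `count_urls` only asks for an iterable, the same block snaps onto a list, an open file, or a generator fed by a queue. Making it fast then becomes a question of what drives the iteration, not of rewriting the logic.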

Continue reading Simple Lego Blocks for Big Data

Idiomatic Python Resources

Let’s say you’ve just joined my team and want to become an idiomatic Python programmer. Where do you begin?

Well, you can move up the learning curve quickly using resources from this blog:

I also have some good resources on web development with Python:

And on more advanced Python concepts, like dunders and functional programming:

Continue reading Idiomatic Python Resources

Programming: it’s weird

I read the Bloomberg piece, What Is Code?, an explanation of code artistry and programmer/hacker culture in 2015. I love this paragraph about “languages as liquid infrastructure”:

The point is that things are fluid in the world of programming, fluid in a way that other industries don’t seem to be. Languages are liquid infrastructure. You download a few programs and, whoa, suddenly you have a working Clojure environment. Which is actually the Java Runtime Environment. You grab an old PC that’s outlived its usefulness, put Linux on it, and suddenly you have a powerful Web server. Now you can participate in whole new cultures. There are meetups, gatherings, conferences, blogs, and people chatting on Twitter. And you are welcomed. They are glad for the new blood.

Java was supposed to supplant C and run on smart jewelry. Now it runs application servers, hosts Lisplike languages, and is the core language of the Android operating system. It runs on billions of things. It won. C and C++, which it was designed to supplant, also won. A lot of things keep winning because computers keep getting more plentiful. It’s weird.


Worse is better, is worse, is better, is worse, is better…

The 3 Best Python Books for Your Team

Python is the core programming language used at Parse.ly. It also happens to be a fast-growing language with wide adoption among open source projects. It’s no wonder it’s quickly becoming the leading language for software teams.

I’ve written a couple of blog posts with original material for learning Python, including “import this: learning the Zen of Python with code and slides” and “Build a web app fast”.

Newcomers to Python are often overwhelmed by the wealth of information, available online and in print, for the language. I am often asked by others, “What are the best books for my Python team?” I plan to answer that question with this post, by highlighting what I consider to be the three best Python books on the market today.

Continue reading The 3 Best Python Books for Your Team

Picking tech stacks

I realize now that one of the hardest parts of running a successful startup is “betting” on tech stacks that, 3 years out, will have a groundswell of community support around them.

It’s still shocking to me that when I chose each of the following technologies as a central part of Parse.ly, they were so new/immature as to not even show up on a Google search trends box, but are now very popular technologies.

Continue reading Picking tech stacks

What entrepreneurship really looks like

In 2009, Jack & Russ hacked on an early prototype of SeatGeek for the Dreamit Ventures summer class in Philadelphia. The initial prototype came together in the last two weeks before demo day. I remember that Russ hadn’t shaved in weeks because they were spending every night hacking.

You see, before that, the founding pair knew they wanted to start a company, but they weren’t sure about the idea. They had brainstormed ideas ranging from “WebMD for pets” to “amateur art marketplaces”, finally landing at “Yelp for Bloggers”, an idea they called Scribnia. This got them into Dreamit Ventures.

Continue reading What entrepreneurship really looks like

The New Republic as a product

From a poster at Hacker News commenting on The New Yorker article, Inside the Collapse of The New Republic:

I think I’m exactly the audience that TNR wants. I’m well-educated, make a good living, largely agree with them politically, enjoy long-form journalism, and am familiar with the brand and its history.

Yet I don’t think I would ever subscribe to TNR. I just see a magazine as something that’s going to pile up in my house. I can read more than enough great content online for free. If I was going to subscribe to a magazine, I think that The New Yorker is a lot more interesting than The New Republic.

Take note, journalistas. This is how your readers view your stuff — not as a “public trust”, “a voice”, or “a cause”, as TNR was described by the exiting editors in their resignation letter.

For better or worse, readers view your stuff as a product. And a product, to be bought, let alone used, needs to be useful.

Continue reading The New Republic as a product

Solving problems with startups

Interesting insider Q&A with Paul Sutter, co-founder of Quantcast. Via Hacker News:

Q: What methodical process did you follow for your startup? Did you first test the market using tactics similar to the lean startup approach?

A: Basically, make a list of known problems that you’re well suited to solving, rank them by criteria, fail a lot, bang your head against the wall, and eventually things start to stick.

Continue reading Solving problems with startups

Web interest in Apache Storm, Kafka, Spark in the Python community

Apache Storm, Kafka, and Spark are gaining a lot of momentum in the data analysis and processing communities. I was curious whether the interest in using these technologies with Python, in particular, is growing. Based on these Google Trends reports, it seems like it is.

Continue reading Web interest in Apache Storm, Kafka, Spark in the Python community

Clojonic: Pythonic Clojure

In June 2012, I promised myself that I’d learn Clojure “as a mind expander”. As a long-time Python programmer who has been using Python full-time in my work at Parse.ly, I wanted to explore. I wrote then:

I don’t know whether Clojure programs will be better or worse than equivalent Python programs. But I know they will be different.

It took me a while, but in January of this year, I started teaching myself the language.

Rich Hickey, and the “Cult of Personality”

My approach was to first learn the underpinnings of the language from books and online videos. If you embark on this for Clojure, you will inevitably run into the copious publicly available material from the language’s creator, Rich Hickey.

In stark contrast to Guido van Rossum in the Python community, Rich Hickey is undeniably not just the Clojure language’s creator, but also a kind of spokesperson for a functional programming renaissance. Guido van Rossum generally lies low, lets the Python language and community speak for themselves, and tries to avoid controversy. To him, Python is just a popular tool he happened to create, and it doesn’t represent any major paradigm shift in programming. It’s a positive evolutionary improvement supported by a great open source ecosystem and community. To Hickey, however, “traditional” programming languages (especially popular ones with an object-oriented focus, such as Java and C++) are just plain wrong. He proposes Clojure as an antidote of sorts.

You can get the gist of this from his motivating videos, such as Hammock-Driven Development, Are We There Yet?, and Simple Made Easy. For a thorough overview of Clojure as a language, you can also get a walkthrough by Hickey, given to a room full of Java developers, in Clojure for Java Programmers Part I and Part II.

Here is a summary of the viewpoint. Most languages are missing some important attributes that can help us tackle the most complex issues in programming projects:

Continue reading Clojonic: Pythonic Clojure