11 - 2015

Data engineers should abstract their code in the most lightweight way possible to facilitate downstream integration in a large-scale data system.

You want Lego blocks, not puzzle pieces.

The creators of the C programming language once famously said, “first make it work, then make it right, and, finally, make it fast.” This adage still applies today.

The difference is that we now have tools to take working code and validate that it is right against reams of data. Many of these tools can also make that working, correct code run really fast across a cluster of machines, possibly even in real time as the data comes in.

But making code work, then right, then fast requires some discipline.
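
To make the work/right/fast progression concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the `tokenize` and `tokenize_all` names, the sample data, and the use of a thread pool as a stand-in for a real cluster framework are all assumptions, not a description of any particular system. The idea is only that a small, pure function can be checked against sample data first and then handed, unchanged, to something that runs it in parallel.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def tokenize(line):
    """Hypothetical 'Lego block': a pure function from one line to a list of words."""
    # Lowercase the line and keep runs of letters and digits as words.
    return re.findall(r"[a-z0-9]+", line.lower())


# Make it work: the function handles a single input.
first_try = tokenize("Hello, World")

# Make it right: validate the same function against sample data (hypothetical here).
samples = {
    "Hello, World": ["hello", "world"],
    "big  DATA 2015": ["big", "data", "2015"],
    "": [],
}
for raw, expected in samples.items():
    assert tokenize(raw) == expected


def tokenize_all(lines, workers=4):
    """Make it fast: map the unchanged function over many inputs in parallel.

    A thread pool is only a stand-in; a real system would likely swap in a
    distributed runner, but the block being run stays the same.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input lines.
        return list(pool.map(tokenize, lines))
```

Because `tokenize` takes one record and returns one result without touching any shared state, the parallel step needs no changes to the function itself, which is the kind of lightweight abstraction the Lego analogy points at.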

Continue reading Simple Lego Blocks for Big Data

Let’s say you’ve just joined my team and want to become an idiomatic Python programmer. Where do you begin?

Well, you can move up the learning curve quickly using resources from this blog:

I also have some good resources on web development with Python:

And on more advanced Python concepts, like dunders and functional programming:

Continue reading Idiomatic Python Resources