5 years ago, I was bored

I wrote this to a friend five years ago, a few weeks after I had quit my job to embark on the crazy ride that has been Parse.ly’s founding story.

You said to me, “I am glad that you left because you sounded unhappy there.”

But you know, I wasn’t exactly unhappy.

I was just bored.

I’m eager to work on my own stuff. I had a good work environment and I learned a lot. I was making money, had flexibility about hours and work from home, and was respected on my team.

But I had a couple of realizations. First, I didn’t see a future for myself in financial firms. I just don’t like their core business enough; in fact, I think their core business is somewhat superfluous and that financial firms should be way, way smaller than they are. They should make less money, have less power, etc.

Second, my specific project had this split personality. On the one hand, it wanted to be this cutting edge framework to really empower application developers throughout the company. On the other, it was a lost project — lots of code, lots of ideas, but no solid product and no real customer.


Disable Google Hangout’s auto-mute on typing

Note: This post is a few years old and I’m not sure if the advice still applies. But, if you’re interested in tips about how to optimize your work-from-home setup, I’ve written an extensive guide: Best remote work equipment in 2020.

Damnit, Google. Sometimes, you make product improvements that are awesome. Other times, you make “improvements” that are downright depressing regressions.

In an effort to stop the annoyance of being on a Google Hangout video conference and hearing nothing but your colleagues’ “tap-tap-tap” on their loud programmer keyboards, Google added a feature that automatically detects when someone is typing and auto-mutes them.

This is a nice idea, but what about when talking while typing is exactly what you want to do? In that case, Google provides no recourse. Recently I gave my team a walkthrough of a new code project, and whenever I showcased ideas in code (or even simply navigated code with my keyboard using vim), Google would mute me and my audio would cut out. Damnit, Google! You suck!

Well, Internet users unite! We have a working fix for this “feature”.


Truth on tap

Some people have put together an alternative to Wikipedia called Conservapedia. But I won’t grace it with a link. I’d rather not help the Internet become more dangerous as a form of mind control.

The site is meant to provide explanations of world-wide phenomena in conservative terms. This brings full circle the blurring notion of truth in the Internet Era, as was described quite well by Clay Shirky in his essay, “Truth without scarcity, ethics without force.”

For example, the many-thousand word article on “Public Schools” includes a section entitled “Gender Disparity”. It explains that “Public schools as of late have seen girls’ scores soar above boys’ because schools have been geared toward the needs of girls”. It goes on:

Schools seek to emasculate boys by preventing healthy roughhousing and having psychologists put boys on drugs such as Ritalin. Then boys often come to hate school because radical feminists seek to prevent men from being men and forcing males to go through counseling to “discuss their feelings” and other liberal hogwash treating all students as if they were female. Colleges, because of this trend, see a trend of 60/40 female to male ratio because of feminist drivel such as romance novels in literature and ineffective therapy and attempts to push feminine traits on boys and young men making them frustrated and fed up with the system unless they agree to the school’s desire to become effeminate.

Now, certainly, there are valid conservative arguments against public schools. You don’t have to look far to find them. You might feel that a public school is a poor use of taxpayer dollars, is a violation of parental child-rearing rights, or is a form of mass indoctrination.

But, a feminist conspiracy?


streamparse: Python + Apache Storm for real-time stream processing

Parse.ly released streamparse today, which lets you run Python code against real-time streams of data by integrating with Apache Storm.

We released it for our talk, “Real-time streams & logs with Apache Kafka and Storm” at PyData Silicon Valley 2014.

The initial release (0.0.5) includes a command-line tool, sparse, that can set up and run local Storm-friendly Python projects.


The Log: a building block for large-scale data systems

A software engineer at LinkedIn has written a monster of a blog post about “The Log”, a building block for large-scale data systems. The concepts in this post are near and dear to my heart due to my work on precisely these kinds of problems at Parse.ly.

What is “a log”?

The log is similar to the list of all credits and debits and bank processes; a table is all the current account balances. If you have a log of changes, you can apply these changes in order to create the table capturing the current state. This table will record the latest state for each key (as of a particular log time). There is a sense in which the log is the more fundamental data structure: in addition to creating the original table you can also transform it to create all kinds of derived tables.
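To make the log/table relationship concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from the LinkedIn post: the `replay` function, the `upto` parameter, and the sample account data are all hypothetical. It replays an ordered log of changes to build a table of current balances, either up to the end of the log or as of an earlier log offset.

```python
def replay(log, upto=None):
    """Apply log entries in order to build a table of balances.

    `log` is a list of (key, change) pairs; the list index serves as
    the log offset. `upto` (hypothetical parameter) stops the replay
    before that offset, giving the table "as of" an earlier log time.
    """
    table = {}
    for offset, (key, change) in enumerate(log):
        if upto is not None and offset >= upto:
            break
        table[key] = table.get(key, 0) + change
    return table


# Made-up credits and debits, in the order they happened.
log = [("alice", +100), ("bob", +50), ("alice", -30), ("bob", +20)]

current = replay(log)          # {"alice": 70, "bob": 70}
as_of_2 = replay(log, upto=2)  # {"alice": 100, "bob": 50}
```

Because the log is kept, the same entries can be replayed into a different derived table later (say, a count of transactions per account) without touching the original data.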

At Parse.ly, we recently adopted Kafka widely in our backend to address exactly these needs: data integration and real-time/historical analysis for large-scale web analytics. Previously, we were using ZeroMQ, which is good, but Kafka is a better fit here.

We have always had a log-centric infrastructure, not born out of any understanding of theory, but simply of requirements. We knew that as a data analysis company, we needed to keep data as raw as possible in order to do derived analysis, and we knew that we needed to harden our data collection services and make it easy to prototype data aggregates atop them.

I also recently read the book by Nathan Marz (creator of Apache Storm), which proposes a similar “log-centric” architecture, though Marz calls it a “master dataset” and uses the fanciful term, “Lambda Architecture”. He describes how, atop a “timestamped set of facts” (essentially, a log), you can build any historical / real-time aggregates of your data via dedicated “batch” and “speed” layers. There is a lot of overlap of thinking in that book and in this article.
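As a rough illustration of the batch/speed split, here is a toy sketch. It is not Marz’s implementation or anything from the book: the function names, the `cutoff` parameter, and the pageview data are assumptions of mine. The batch view aggregates facts older than a cutoff, the speed view covers the recent ones, and a query merges the two.

```python
from collections import Counter


def batch_view(facts, cutoff):
    """Aggregate facts older than `cutoff` (the slow, complete layer)."""
    return Counter(url for ts, url in facts if ts < cutoff)


def speed_view(facts, cutoff):
    """Aggregate facts at or after `cutoff` (the fast, recent layer)."""
    return Counter(url for ts, url in facts if ts >= cutoff)


def query(batch, speed, url):
    """Answer a query by merging both views."""
    return batch[url] + speed[url]


# Hypothetical timestamped facts: (timestamp, url) pageview events.
facts = [(1, "/a"), (2, "/b"), (3, "/a"), (11, "/a"), (12, "/b")]

batch = batch_view(facts, cutoff=10)
speed = speed_view(facts, cutoff=10)
total_a = query(batch, speed, "/a")  # 3
```

In this sketch both views are derived from the same list of facts, so the merged answer matches what a single pass over the whole log would produce.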

LinkedIn’s log-centric stack, visualized.


Functional dynamic dispatch with Python’s new singledispatch decorator in functools

I just read Python 3.4’s release notes and found a nice little gem.

I didn’t know what “Single Dispatch Functions” were all about. Sounded very abstract. But it’s actually pretty cool, and covered in PEP 443.

What’s going on here is that Python has added support for another kind of polymorphism known as “single dispatch”. This allows you to write a function with several implementations, each registered for one or more argument types. The “dispatcher” (called singledispatch and implemented as a Python function decorator) figures out which implementation to choose based on the type of the first argument. It also maintains a registry of types to function implementations.

This is not technically “multimethods” — which can also be implemented as a decorator, as GvR did in 2005 — but it’s related. See the Wikipedia article on Dynamic Dispatch for more information.

Also, the other interesting thing about this change is that the library is already on Bitbucket and PyPI and has been tested to work as a backport with Python 2.6+. So you can start using this today, even if you’re not on 3.x!


How investors play the option

Paul Graham put up a new essay, one of his longest, called “How to raise money”. It gives a good glimpse into the mind game that is startup financing.

I thought it was particularly interesting how he documented three different ways in which investors say “no”, without really saying no.


Parse.ly: brand hacking

There’s some hoopla lately about “weird” startup names in the Wall Street Journal, with specific coverage of “.ly” domains in The Atlantic Wire:

The latest start-up boom has led to the creation of at least 161 companies that end in “ly,” “lee,” and “li,” which is, naming consultants tell us, 160 too many. There’s feedly, bitly, contactually, cloudly, along with a bunch of other company-LYS […] and all but the first ever “ly” name are “just lazy,” Nancy Friedman, a naming consultant, told The Atlantic Wire.
