Open Source

Linux is free and your mind is valuable

One of my favorite old Linux jokes is, “Linux is free… if your time is worthless.” This quote is possibly adapted from a jwz interview dating back to 1998. In it, he said:

I think Linux is a great thing, because Linux is an alternative to [major operating systems], and because, of all the operating systems that are at all relevant today, Unix is the best of a bad lot. […] As we all know, Linux is only free if your time has no value, and I find that my time is better spent doing things other than the endless moving-target-upgrade dance.

That said, I’ve run Linux dating back to my earliest days studying formal operating systems, when I did it mainly so that I’d have access to vim, shells, a solid gcc toolchain, and good x86 hardware emulation and virtualization tools. Somewhere along the way, though, I made it a point to make my day-to-day computing, if not completely imbued by the open source movement, at least “keyhole accessible” to its thriving core, which continues to be Linux. So, for me, as a point of pride, it has been Linux on the desktop, Linux on my smartphones (that is, Android Linux), and Linux on my smart TVs (that is, webOS). As well as Linux in the obvious places, like my cloud VMs, my Raspberry Pi, my physical server, etc.

I’m not a purist, though. I still use plenty of proprietary software in my life. And plenty of proprietary hardware, too. To stay connected to the world of Apple, I keep an iPad running iPadOS and a Mac Mini running macOS. To stay connected to the world of Microsoft, I keep a dual-boot’able partition for Windows 11 Pro. But I’d say 90% of my computing happens on some Linux variant, and that makes me feel good about having a more direct relationship with computing, where I can always peel back some layers of the onion as needed.

My latest was to shift from a very-very old laptop (my trusty Lenovo X1 Carbon Gen 8) to a not-quite-so-old laptop (a Lenovo X1 Carbon Gen 11). I’ve been sticking with Lenovo as my Linux laptop for awhile now, covered in past posts like “Speed & lightness” and “Lenovo and the new Linux desktop experience”.

Linux is free to run on your laptop

By doing a little research beforehand, I got a model off the Lenovo outlet store with a screen, webcam, and Intel architecture that is well-supported by Linux (summary: avoid MIPI webcam and avoid mobile WWAN). Specifically, it runs perfect with Ubuntu 24.04 LTS out-of-the-box. The only “tweak” I had to make was a workaround for thermald which was erroneously refusing to start on my CPU architecture — the fix was a rather easy systemd invocation and is described here.

In typical Linux “rabbit hole” style, learning a little bit about thermald (with the help of the source code reading and summarizing features of ChatGPT4 projects, incidentally) made me better understand how my laptop approaches battery and cooling performance.

Another change that is more from the world of security is learning about SecureBoot + TPM, two semi-controversial changes to the laptop hardware landscape that Linux didn’t support well until somewhat recently. More on this available in this Twitter/X thread. So, yet again, in fortuitous Linux rabbit hole style, I learned something about modern “security” hardware I wasn’t paying close attention to beforehand.

Continue reading Linux is free and your mind is valuable

Linux backup workflow for hackers with restic, rclone, Backblaze B2

In 2017, CrashPlan was one of the most popular full-computer offsite/cloud backup tools for consumers. It had millions of paid users, usually paying around $10/month for a few terabytes of offsite storage.

But then… “On August 22, 2017, Code42 announced they were shutting down CrashPlan for Home, effective in October 2018. They were not accepting new subscriptions but would maintain existing subscriptions until the end of their existing subscription period, at which point the backups would be purged.”

Picking a backup tool is hard. If you outsource your backups to a commercial entity, you have to be convinced that entity will stand the test of time — and won’t undergo dramatic business model shifts — since, after all, your backup scheme is supposed to follow you around for life.

This is an ideal software category in which to choose open source software — plus a highly durable, interoperable, and financially-well-supported cloud storage option.

Thankfully, as of 2018 or so, I have this open source software + interoperable cloud storage solution working on my main Linux development machine. I’ve been using it for 5+ years and since I’m very happy with it, I’d like to share it with you all here.

The restic logo sets the tone for how you should think about backups!

As a hacker (that is, as a playful programmer), you inevitably have important files on your desktop that don’t get automatically backed up some other way. Yes, you probably have your source repos backed up in GitHub or Gitlab, and you probably have your phone backed up in Apple iCloud or Google One. Maybe you are even organized enough to have digital copies of your personal records, usually in DOCX or PDF format, in Dropbox or Google Drive, assuming you trust the data privacy policies of these providers.

But you still have millions of other files on your desktop computers that include: artifacts not checked into source control (or not yet pushed to remote); operating system and application configuration; photographs and videos from your non-phone camera gear; screencasts and Zoom/GMeet video recordings; paranoia-driven backups of data exfiltrated from cloud providers like Gmail and Google Drive; and so on. Perhaps you even have sensitive/important medical or tax/financial records that you’ve been nervous to stick in a cloud data store.

This post will cover a setup that works well in practice, while also having some interesting technical properties worth discussing.

Continue reading Linux backup workflow for hackers with restic, rclone, Backblaze B2

Good Python Software

I’m glad to say that the last few months have been a return to the world of day-to-day coding and software craftsmanship for me.

To give a taste of what I’ve been working on, I’m going to take you on a tour through some damn good Python software I’ve been using day-to-day lately.

Python 3 and `subprocess.run()`

I have a long relationship with Python, and a lot of trust in the Python community. Python 3 continues to impress me with useful ergonomic improvements that come up in real-world day-to-day programming.

One such improvement is in the subprocess module in Python’s standard library. Sure, this module can can show its age — it was originally written over 20 years ago. But as someone who has been doing a lot of work at the level of UNIX processes lately, I’ve been enjoying how much it can abstract away, especially on Linux systems.

Here’s some code that calls the UNIX command exiftool to strip away EXIF metadata from an image, while suppressing stdout and stderr via redirection to /dev/null.

Continue reading Good Python Software

Core Python

When I describe my programming background these days, I say that I code “primarily in Python, JavaScript, Clojure, C… and Zig!” I put Python first in that list for good reason.

This is a post about the core Python language, but also the ways in which Python is evolving its single-core and multi-core CPU performance.

Python has been my go-to programming tool for a long time. When I started to build out my last company and shipped the production core of its product, Python 2.7 had just stabilized, creating an excellent “core language.” This is a language that I truly respected, as evidenced by my style guide. And, as I discussed in my Python technical book review round-up, this core language was best described by David Beazley in the first half of his Python Essential Reference book, which was also turned into an excellent standalone volume (which includes Python 3.x coverage), Python Distilled.

Many, many useful open source projects, companies, and projects were built atop that Python 2.7 core foundation of a language. Its community truly flourished.

Continue reading Core Python

Dependency rejection

Sam Altman once said: “Minimize your own cognitive load from distracting things that don’t really matter. It’s hard to overstate how important this is, and how bad most are at it. Get rid of distractions in your life. Develop very strong ways to avoid letting crap pile up.”

In programming, there is a technique called “dependency injection.” It’s a way of worrying, up front, about how to split the modules in your program from other code. You aim to give yourself the future benefit of being able to swap out a module dependency later.

In some communities, this little technique led to a temporary fad of constructing programs within a baroque superstructure via “dependency injection frameworks,” sometimes called “inversion of control frameworks.” With these, programmers fretted about a future that usually never arrived, and their worries expressed themselves as defensive code, all in the name of future-proofing. This meant in addition to spending time adopting a dependency, they also had a meta-distraction: working that dependency into their “management framework,” alongside the others. This sort of thing took some programmers very far away from the core “essential complexity” at the heart of their code.¹

Dependencies seem to be all around us, both in the real world, and in programming. And they are perniciously distracting in just this way. Have you ever noticed how rare it is for you to just do something?

If so, you might have been worrying, up front, about dependencies.²

In the back of your mind, you wonder: has this been solved? The world is full of solutions in search of problems, and the internet is always nearby. You convince yourself to research if some of these happen to apply to your problem.

Even if you don’t find something off-the-shelf, you still might think to yourself: I could just do this, but is this a good use of my time? Lacking a ready-made solution, you then turn to delegation: whom shall I ask to solve this? After all, the market usually obliges with eager participants willing to sell their labor for a price.

But whether your proposed solution is an object or a person (and whether it is free or paid), there’s one thing it always is: a dependency.

Continue reading Dependency rejection

How Python programmers can uncontroversially approach build, dependency, and packaging tooling (+ a note on Zig)

A few years back, I published The Elements of Python Style, a popular Python code style guide. Since publishing it, friends of mine in the Python community have wondered if I might consider adding a section about package installation, dependency management, and other similar “standard tooling” recommendations.

This is a reasonable request, since Python lacks much in the way of standard tooling. I took a stab at this in a pull request here, but then abandoned my attempt. The length of this “section” started to approach the length of the overall style guide itself! So, I gave up on that. I decided to turn the section into this blog post here, instead. Then, I’ll share one thought about how emerging programming language communities, such as Zig’s, could learn from the Python experience.

On “Standard” Tools

There’s a zoo of tooling options out there, and no “standard” Python tooling beyond the python executable, and, perhaps, pip (for installing packages, which was semi-formalized in Python 3.x with PEP 453 and PEP 508). Here, we’ll discuss an opinionated (yet uncontroversial) approach to “standard” tooling with Python.

Continue reading How Python programmers can uncontroversially approach build, dependency, and packaging tooling (+ a note on Zig)

Turning n/2 + 1

When I turned 27, I wrote the following in my birthday post:

I don’t need stuff. I just need time. Of course, that’s the bittersweet part of one’s birthday. That even as you come to realize the importance of time, the day acts as a reminder of how our time on this earth is limited. 1 day passes, and only n-1 left to make a difference.

The average life expectancy for a US male born in 1984 is 75. I just turned 38 today. Therefore, it’s fair to say, I just turned n/2 + 1.

That is perhaps a bit too fatalistic and reductive. The number n is not guaranteed to be 75. “Don’t be so morbid!” someone might exclaim to me. “After all, many people live to 80, 90, even 100. And medicine improves all the time.”

Well, yes, this is true. But, it’s also likely — and increasingly so — that I might die any minute. Freak accidents, a late-discovered birth defect. Or, just losing the medical lottery in middle age. So, I return to the wisdom of my youth: “I don’t need stuff. I just need time.”

But, toward what end? That has been the interesting riddle of approaching n/2 with the following undeniable privileges:

good health
professional satisfaction
financial security
confidence in my irreversible life choices

Many approach this same milestone with none of the above, and many would love for any one of them to be squared away.

The honest truth is, I find myself heavy with the weight of these privileges. History is, as Harold Bloom once put it when describing literature, “a conflict between past genius and present aspiration, in which the prize is […] survival.” Here, he was referring to well-crafted stories. “Survival” meant their perpetuation through the ages via timeless literary relevance, something he referred to as “canonical inclusion”.

But what of my field, software?

Continue reading Turning n/2 + 1

Parse.ly, Automattic: the long view

In 2009, I quit my first programming job after college to work on a startup. That startup eventually became Parse.ly. I’ve written about Parse.ly’s startup beginnings and evolution elsewhere on this blog, including:

It is 2021 now, more than 12 years since the company’s original founding. And much has changed.

Parse.ly “the startup” was a rollercoaster, like all startups are, but it was, ultimately, a success. In 2009, we were a tiny 3-person team hacking away on prototypes at a startup accelarator in Philadelphia. In 2012, Parse.ly had its first handful of customers for the content analytics system that became our core product, and shifted into enterprise SaaS as a business model. In 2013, we raised “Series A” style financing to pursue the ambition of defining and leading the content analytics category.

By 2017, it was clear that Parse.ly had done just that: we had built a valuable product in the market and a beautiful SaaS business model, where our R&D efforts were aligned with our customer needs. There was only a question of total market size. As a result, in 2018 we shifted our efforts to expanding Parse.ly’s market — acting as a content measurement layer not just for major media and entertainment companies, but for all content management systems and all content-rich websites.

By the end of 2019 and heading into early 2020, it was clear that Parse.ly was succeeding in this new vision, and was going to be a SaaS company “in it for the long-term”, serving our customers for years to come.

Continue reading Parse.ly, Automattic: the long view

GNU parallel is underrated

I’m always surprised to learn that a friend who has used Linux for a long time, in both server and desktop contexts, might not have heard of GNU parallel.

If you use GNU parallel together with pv (pipe viewer), UNIX shell pipelines, and Python fileinput module, you get a pretty powerful parallel job running framework with testable independent pieces. It’s one of my favorite tricks when writing tools for my own benefit.

Let’s break it down.

Continue reading GNU parallel is underrated

Learning about babashka (bb), a minimalist Clojure for building CLI tools

A few years back, I wrote Clojonic: Pythonic Clojure, which compares Clojure to Python, and concluded:

My exploration of Clojure so far has made me realize that the languages share surprisingly more in common than I originally thought as an outside observer. Indeed, I think Clojure may be the most “Pythonic” language running on the JVM today (short of Jython, of course).

That said, as that article discussed, Clojure is a very different language than Python. As Rich Hickey, the creator of Clojure, put it in his “A History of Clojure”:

Most developers come to Clojure from Java, JavaScript, Python, Ruby and other OO languages. [… T]he most significant […] problem [in adopting Clojure] is learning functional programming. Clojure is not multiparadigm, it is FP or nothing. None of the imperative techniques they are used to are available. That said, the language is small and the data structure set evident. Clojure has a reputation for being opinionated, opinionated languages being those that somewhat force a particular development style or strategy, which I will graciously accept as meaning the idioms are clear, and somewhat inescapable.

There is one area in which Clojure and Python seem to have a gulf between them, for a seemingly minor (but, in practice, major) technical reason. Clojure, being a JVM language, inherits the JVM’s slow start-up time, especially for short-lived scripts, as is common for UNIX CLI tools and scripts.

As a result, though Clojure is a relatively popular general purpose programming language — and, indeed, one of the most popular dynamic functional programming languages in existence — it is still notably unpopular for writing quick scripts and commonly-used CLI tools. But, in theory, this needn’t be the case!

Continue reading Learning about babashka (bb), a minimalist Clojure for building CLI tools

Python 3 and subprocess.run()

On “Standard” Tools

Python 3 and `subprocess.run()`