Learning about babashka (bb), a minimalist Clojure for building CLI tools

A few years back, I wrote Clojonic: Pythonic Clojure, which compares Clojure to Python, and concluded:

My exploration of Clojure so far has made me realize that the languages share surprisingly more in common than I originally thought as an outside observer. Indeed, I think Clojure may be the most “Pythonic” language running on the JVM today (short of Jython, of course).

That said, as that article discussed, Clojure is a very different language than Python. As Rich Hickey, the creator of Clojure, put it in his “A History of Clojure”:

Most developers come to Clojure from Java, JavaScript, Python, Ruby and other OO languages. [… T]he most significant […] problem [in adopting Clojure] is learning functional programming. Clojure is not multiparadigm, it is FP or nothing. None of the imperative techniques they are used to are available. That said, the language is small and the data structure set evident. Clojure has a reputation for being opinionated, opinionated languages being those that somewhat force a particular development style or strategy, which I will graciously accept as meaning the idioms are clear, and somewhat inescapable.

There is one area in which Clojure and Python have a gulf between them, for a seemingly minor (but, in practice, major) technical reason. Clojure runs on the JVM and is slow to start, and that cost is paid on every launch. It hurts most in short-lived programs such as UNIX CLI tools and scripts.

As a result, though Clojure is a relatively popular general purpose programming language — and, indeed, one of the most popular dynamic functional programming languages in existence — it is still notably unpopular for writing quick scripts and commonly-used CLI tools. But, in theory, this needn’t be the case!

If you’re a regular UNIX user, you probably have come across hundreds of scripts with a “shebang”, e.g. something like #!/usr/bin/env python3 at the top of Python 3 scripts or #!/bin/bash for bash scripts. But I bet you have rarely, perhaps never, come across something like #!/usr/bin/env java or #!/usr/bin/env clojure. It’s not that either of these is impossible or unworkable. No, they are simply unergonomic. Thus, they aren’t preferred.

The poor ergonomics stem from two things that come with running on the JVM: slow start-up time and complex system-level classpath and dependency management.

Given Clojure’s concision, readability, and dynamism, it might be a nice language for scripting and CLI tools, if we could only get around that slow start-up time problem. Could we somehow leverage the Clojure standard library and a subset of the Java standard library as a “batteries included” default environment, and have it all compiled into a fast-launching native binary?

Well, it turns out, someone else had this idea, and went ahead and implemented it. Enter babashka.

babashka

To quote the README:

Babashka is implemented using the Small Clojure Interpreter. This means that a snippet or script is not compiled to JVM bytecode, but executed form by form by a runtime which implements a sufficiently large subset of Clojure. Babashka is compiled to a native binary using GraalVM. It comes with a selection of built-in namespaces and functions from Clojure and other useful libraries. The data types (numbers, strings, persistent collections) are the same. Multi-threading is supported (pmap, future). Babashka includes a pre-selected set of Java classes; you cannot add Java classes at runtime.

Wow! That’s a pretty neat trick. If you install babashka — which is available as a native binary for Windows, macOS, and Linux — you’ll be able to run bb to try it out. For example:

$ bb
Babashka v0.1.3 REPL.
Use :repl/quit or :repl/exit to quit the REPL.
Clojure rocks, Bash reaches.
 
user=> (+ 2 2)
4
user=> (println (range 5))
(0 1 2 3 4)
nil
user=> :repl/quit
$
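
The README's points about data types and multi-threading are easy to try in the same REPL, or in a script. Here is a minimal sketch; `slow-square` is a hypothetical helper invented for illustration and is not part of babashka:

```clojure
;; Hypothetical helper: pretend each call does 50 ms of work.
(defn slow-square [n]
  (Thread/sleep 50)
  (* n n))

;; Persistent collections behave as in JVM Clojure: conj returns a new
;; vector and leaves the original untouched.
(def v [1 2 3])
(def v2 (conj v 4))

;; pmap applies slow-square to the elements in parallel;
;; doall forces the lazy result right away.
(def squares (doall (pmap slow-square v2)))

;; future runs its body on another thread; deref (@) waits for the result.
(def total (future (reduce + squares)))

(println v v2 squares @total)
;; prints: [1 2 3] [1 2 3 4] (1 4 9 16) 30
```

Nothing here is bb-specific, which is rather the point: the same forms should behave the same way under the JVM `clojure` command.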

And, the fast start-up time is legit. For example, here’s a simple “Hello, world!” in Clojure stored in hello.clj:

(println "Hello, world!")

Now compare:

$ multitime -n 10 -s 1 clojure hello.clj
...
        Mean        Std.Dev.    Min         Median      Max
user    1.753       0.090       1.613       1.740       1.954       
...
$ multitime -n 10 -s 1 bb hello.clj
...
        Mean        Std.Dev.    Min         Median      Max
user    0.004       0.005       0.000       0.004       0.012       
...

That’s a pretty big difference on my modern machine! The JVM version has a median time (the user row above) of about 1.7 seconds, while bb, the Babashka version, has a median of 0.004 seconds: that is, four one-thousandths of a second, or 4 milliseconds. By these numbers, the JVM version is more than 400x slower!
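
The ratio can be checked directly from the two medians in the multitime output. This is just a worked calculation; the var names are mine, and since bb's times sit near the timer's resolution (its minimum is reported as 0.000), the result is best read as a rough figure:

```clojure
;; Median user times in seconds, copied from the multitime output above.
(def jvm-median 1.740)
(def bb-median 0.004)

;; How many times slower the JVM run is, rounded to a whole number.
(def slowdown (Math/round (/ jvm-median bb-median)))

(println (str "clojure is about " slowdown "x slower than bb"))
;; prints: clojure is about 435x slower than bb
```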

How does this compare to Python?

$ multitime -n 10 -s 1 python3 hello.py
...
        Mean        Std.Dev.    Min         Median      Max
user    0.012       0.004       0.006       0.011       0.018       
...

So, bb’s start-up is at least as fast as Python 3’s, and by these medians (0.004 vs. 0.011 seconds) perhaps even a little faster. Pretty cool!
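
Start-up this fast is what makes the shebang style from earlier practical for Clojure. Below is a minimal sketch of such a script. The file name `hello-cli.clj` and the `greetings` function are hypothetical, made up for this example, and it assumes `bb` is on your PATH:

```clojure
#!/usr/bin/env bb
;; hello-cli.clj: a hypothetical example script.
;; The Clojure reader treats the #! line above as a comment.

(require '[clojure.string :as string])

;; Build a greeting for each name, defaulting to "world" when none are given.
(defn greetings [names]
  (map #(str "Hello, " % "!")
       (if (seq names) names ["world"])))

;; *command-line-args* holds the arguments that follow the script name.
(println (string/join "\n" (greetings *command-line-args*)))
```

After `chmod +x hello-cli.clj`, running `./hello-cli.clj Rich Michiel` should print one greeting per name, and running it with no arguments should fall back to "Hello, world!".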

All that said, the creator of Babashka has said, publicly:

It’s not targeted at Python programmers or Go programmers. I just want to use Clojure. The target audience for Babashka is people who want to use Clojure to build scripts and CLI tools.

Fair enough. But, as Rich Hickey said, there can be really good reasons for Python, Ruby, and Go programmers to take a peek at Clojure. There are some situations in which it could really simplify your code or approach. Not always, but there are certainly some strengths. Here’s what Hickey had to say about it:

[New Clojure users often] find the amount of code they have to write is significantly reduced, 2—5x or more. A much higher percentage of the code they are writing is related to their problem domain.

Aside from being a useful tool for this niche, bb is also just a fascinating F/OSS research project. For example, the way it manages to pull off native binaries across platforms is via the GraalVM native-image facility. Studying GraalVM native-image is interesting in itself, but bb makes use of this facility and makes its benefit accessible to Clojure programmers without resorting to complex build toolchains.

With bb now stable, its creator took a stab at rewriting the clojure wrapper script itself in Babashka. That is, Clojure programmers may not have realized that when they invoke clojure on Linux, what’s really happening is that they are calling out to a bash script that then detects the local JVM and classpath, and then execs out to the java CLI for the JVM itself. On Windows, that same clojure wrapper script is implemented in PowerShell, pretty much by necessity, and serves the same purpose as the Linux bash script, but is totally different code. Well, now there’s something called deps.clj, which eliminates the need to use bash and PowerShell here, and uses Babashka-flavored Clojure code instead. See the deps.clj rationale in the README for more on that.
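
To make the wrapper's job concrete, here is a rough sketch of the "find Java, build a classpath, launch the JVM" step written as plain Clojure. This is not deps.clj's actual code: the function names `java-bin` and `launch-command` and the jar names are hypothetical, and a real wrapper does much more, such as resolving dependencies. The one real name is `clojure.main`, Clojure's standard entry point.

```clojure
(require '[clojure.string :as string])

;; Pick the java executable: prefer $JAVA_HOME/bin/java, else rely on PATH.
(defn java-bin [env]
  (if-let [home (get env "JAVA_HOME")]
    (str home "/bin/java")
    "java"))

;; Assemble the command line that launches clojure.main.
;; `jars` is a seq of jar paths (placeholders here); `sep` is the classpath
;; separator (":" on Linux and macOS, ";" on Windows).
(defn launch-command [env jars sep main-args]
  (into [(java-bin env) "-cp" (string/join sep jars) "clojure.main"]
        main-args))

;; Example with made-up paths; a real wrapper would read the environment
;; and then hand this vector to the operating system to execute.
(println (launch-command {"JAVA_HOME" "/opt/jdk"}
                         ["clojure.jar" "spec.alpha.jar"]
                         ":"
                         ["hello.clj"]))
```

Because the logic is ordinary data manipulation, one Clojure file can cover the platform differences (here, just the separator) that otherwise require separate bash and PowerShell scripts.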

If you want a real-world example of a Clojure program by the same author that does something useful at the command line, take a look at clj-kondo, a command-line Clojure linter (akin to pyflakes or flake8 in the Python community). Note that clj-kondo is not itself a bb script: like bb, it is Clojure code compiled to a native binary with GraalVM native-image, which is why it starts so quickly.

Overall, Babashka is not just a really cool hack, but also a very useful tool in the Clojurist’s toolbelt. I’ve become a convert and evangelist, as well as a happy user. Congrats to Michiel Borkent on a very interesting and powerful piece of open source software!


Note: Some of my understanding of Babashka solidified when hearing Michiel describe his project at the Clojure NYC virtual meetup. The meeting was recorded, so I’ll update this blog post when the talk is available.

3 thoughts on “Learning about babashka (bb), a minimalist Clojure for building CLI tools”

  1. Clojure is not slow due to the JVM. Here’s a simple “Hello, world” in Java:

    real    0m0.097s
    user    0m0.075s
    sys     0m0.038s
    
  2. I think what you’re illustrating there is how long it takes to run “Hello, world” in Java if you pre-compile the .java file into a .class file and run that pre-compiled class file. And it’s true, in this case, the startup time can be fast. A nice 2018 analysis, complete with flame graphs, entitled “Clojure’s slow start — what’s inside?” goes into depth on the “true” reasons for slow “bootstrap” times for Clojure, which are the effective startup times when using Clojure as a scripting language or when merely trying to open up a REPL.
