Clojonic: Pythonic Clojure¶
Andrew Montalenti, CTO
Pythonic iteration¶
nums = [45, 23, 51, 32, 5]
for idx, num in enumerate(nums):
    print(idx, num)
# 0 45
# 1 23
# 2 51
# 3 32
# 4 5
Clojonic iteration (1)¶
(let [nums [45 23 51 32 5]]
  (for [[idx num] (map-indexed vector nums)]
    (println idx num)))
; 0 45
; 1 23
; 2 51
; 3 32
; 4 5
Clojonic iteration (2)¶
Clojonic iteration (3)¶
(defn enumerate [coll]
  (map-indexed vector coll))

(let [nums [45 23 51 32 5]]
  (for [[idx num] (enumerate nums)]
    (println idx num)))
; 0 45
; 1 23
; 2 51
; 3 32
; 4 5
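For comparison, Python's built-in `enumerate` can be sketched as a small generator, in the same spirit as the Clojure `enumerate` function above. The name `enumerate_sketch` is made up for illustration; this is not how CPython implements the built-in.

```python
def enumerate_sketch(coll, start=0):
    """Yield (index, item) pairs, mimicking the built-in enumerate."""
    idx = start
    for item in coll:
        yield idx, item
        idx += 1


nums = [45, 23, 51, 32, 5]
# Realize the lazy generator into a list of (index, item) tuples.
pairs = list(enumerate_sketch(nums))
```

Like `map-indexed`, the generator is lazy: nothing is computed until the pairs are consumed.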
Clojonic iteration (4)¶
(defmacro enumerate [coll]
  `(map-indexed vector ~coll))

(let [nums [45 23 51 32 5]]
  (for [[idx num] (enumerate nums)]
    (println idx num)))
; ---
(macroexpand '(enumerate [1 2 3]))
; => (clojure.core/map-indexed clojure.core/vector [1 2 3])
; Translation: the code is *replaced* with:
(map-indexed vector [1 2 3])
defn... is a macro!¶
;; expand:
(macroexpand
  '(defn named-function [some-value]
     (println some-value)))
;; generated code:
(def named-function (clojure.core/fn ([some-value]
  (println some-value))))
;; simplified sketch of the macro (the real definition lives in clojure.core):
(defmacro defn [name & fdecl]
  (list 'def name (cons `fn fdecl)))
Sample Python program¶
"""twitter.py"""
import json

def with_twitter_data(rdr_fn):
    with open("data/tweets.log") as rdr:
        return list(rdr_fn(rdr))

def read_tweets(rdr):
    for line in rdr:
        apikey, timestamp, entry = line.split("|", 2)
        yield apikey, timestamp, json.loads(entry)

with_twitter_data(read_tweets)
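To see what `read_tweets` yields without a log file on disk, here is a minimal sketch that feeds it an in-memory reader. The contents of `data/tweets.log` are not shown in these slides, so the two sample lines (and the `SAMPLE_LOG` name) are invented for illustration.

```python
import io
import json


def read_tweets(rdr):
    for line in rdr:
        # maxsplit=2 keeps any "|" inside the JSON payload intact.
        apikey, timestamp, entry = line.split("|", 2)
        yield apikey, timestamp, json.loads(entry)


# Hypothetical sample data: apikey|timestamp|json-entry
SAMPLE_LOG = (
    'key1|2014-01-01T00:00:00|{"text": "a|b", "user": "alice"}\n'
    'key2|2014-01-01T00:00:05|{"text": "hello", "user": "bob"}\n'
)

tweets = list(read_tweets(io.StringIO(SAMPLE_LOG)))
```

Each element of `tweets` is an `(apikey, timestamp, dict)` tuple, the same shape the Clojure version returns as a vector.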
Clojure “syntax”¶
;; twitter.clj
(ns twitter
  (:require [clojure.data.json :as json]
            [clojure.java.io :as io]
            [clojure.string :as str]))

(defn with-twitter-data [rdr-fn]
  (with-open [rdr (io/reader "data/tweets.log")]
    (doall (rdr-fn rdr))))

(defn read-tweets [rdr]
  (for [line (line-seq rdr)]
    (let [[apikey timestamp entry] (str/split line #"\|" 3)]
      [apikey timestamp (json/read-str entry)])))

(with-twitter-data read-tweets)
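The `doall` in `with-twitter-data` plays the same role as `list(...)` in the Python version: both force a lazy sequence while the reader is still open. A minimal Python sketch of what goes wrong otherwise, using an in-memory reader; the function names `lazy_read` and `eager_read` are made up for illustration.

```python
import io


def read_lines(rdr):
    for line in rdr:
        yield line.rstrip("\n")


def lazy_read(text):
    with io.StringIO(text) as rdr:
        # The generator escapes the with-block before anything is read.
        return read_lines(rdr)


def eager_read(text):
    with io.StringIO(text) as rdr:
        # list() consumes the generator while rdr is still open.
        return list(read_lines(rdr))


eager = eager_read("a\nb\n")

try:
    list(lazy_read("a\nb\n"))
    lazy_error = None
except ValueError as exc:
    # Reading from a closed reader fails.
    lazy_error = exc
```

The Clojure code would hit the analogous problem if `doall` were dropped, since `for` returns a lazy sequence that would only be realized after `with-open` has closed the reader.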
Quick comparison¶
Idea | Python | Clojure |
---|---|---|
Binding | label = val | (let [label val]) |
Unpacking | val1, val2 = | Destructuring Form |
Iteration | for | (for) macro |
Functions | def | (defn) macro |
File Open | open() | (io/reader) |
String Split | str.split() | (str/split) |
JSON Parse | json.loads() | (json/read-str) |
Namespaces | Modules | (ns) macro |
Imports | import | (ns (:require)) |
Data Structs | {} [] (,) | {} [] () |
Clojure unique stuff¶
Idea | Python | Clojure |
---|---|---|
Macros | N/A | Built-in |
Immutable Data Structs | N/A | Built-in |
Keywords | N/A | Built-in |
Lambdas / Blocks | Crippled | Built-in |
DSLs | With Metaclasses | With Macros |
Lazy Eval | Opt-in (generators) | Opt-out (doall) |
Code as Data | import ast | Lisp forms |
Platform Interop | via C, Cython, etc. | via JVM |
Concurrency | Trad’l, processes | STM Impl |
Object Orientation | Trad’l, class-based | Multi dispatch |
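To make the "Code as Data" row concrete on the Python side, here is a rough sketch using the standard `ast` module: parse source into a tree, rewrite the tree, then compile and run it. The `AddToMult` transformer is a toy invented for this example; it only hints at what a Clojure macro does far more directly on Lisp forms.

```python
import ast


class AddToMult(ast.NodeTransformer):
    """Toy rewrite: turn every `a + b` into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


source = "result = 3 + 4"
tree = ast.parse(source)                      # code -> data
tree = ast.fix_missing_locations(AddToMult().visit(tree))
namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)  # data -> code
```

After running, `namespace["result"]` holds the value of `3 * 4` rather than `3 + 4`.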
Other examples¶
EMR cluster (lemur)¶
(defcluster pig-cluster
  :master-instance-type "m1.large"
  :slave-instance-type "m1.large"
  :num-instances 2
  :keypair "emr_jobs"
  :enable-debugging? false
  :bootstrap-action.1 [
    "install-pig"
    (s3-libs "/pig/pig-script")
    ["--base-path" (s3-libs "/pig/")
     "--install-pig" "--pig-versions" "latest"]
  ]
  :runtime-jar (s3-libs "/script-runner/script-runner.jar")
)
EMR Pig steps (lemur)¶
(defstep twitter-count-step
  :args.positional [
    (s3-libs "/pig/pig-script")
    "--base-path" (s3-libs "/pig/")
    "--pig-versions" "latest"
    "--run-pig-script" "--args"
    "-f" "s3://pystorm/url_counts.pig"
  ]
)

(fire! pig-cluster twitter-count-step)
Twitter Click Spout (Storm)¶
{"twitter-click-spout"
  (shell-spout-spec
    ;; Python Spout implementation:
    ;; - fetches tweets (e.g. from Kafka)
    ;; - emits (urlref, url, ts) tuples
    ["python" "spouts_twitter_click.py"]
    ;; Stream declaration:
    ["urlref" "url" "ts"]
  )
}
Twitter Count Bolt (Storm)¶
{"twitter-count-bolt"
  (shell-bolt-spec
    ;; Bolt input: Spout and field grouping on urlref
    {"twitter-click-spout" ["urlref"]}
    ;; Python Bolt implementation:
    ;; - maintains a Counter of urlref
    ;; - increments as new clicks arrive
    ["python" "bolts_twitter_count.py"]
    ;; Emits latest click count for each tweet as new Stream
    ["twitter_link" "clicks"]
    :p 4
  )
}
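The counting logic described in the bolt comments can be sketched in plain Python. The source of `bolts_twitter_count.py` is not shown in these slides, so the `ClickCounter` class and the sample clicks below are assumptions, and the Storm multilang plumbing is deliberately left out.

```python
from collections import Counter


class ClickCounter:
    """Hypothetical core of the count bolt: one Counter keyed by urlref."""

    def __init__(self):
        self.counts = Counter()

    def process(self, urlref, url, ts):
        # Increment as each click arrives, then emit the latest count
        # as a (twitter_link, clicks) pair.
        self.counts[urlref] += 1
        return urlref, self.counts[urlref]


# Invented sample tuples in the spout's (urlref, url, ts) shape.
clicks = [
    ("t.co/a", "http://example.com/1", 1),
    ("t.co/b", "http://example.com/2", 2),
    ("t.co/a", "http://example.com/1", 3),
]
counter = ClickCounter()
emitted = [counter.process(*click) for click in clicks]
```

Because the bolt uses a field grouping on `urlref`, every click for a given link is routed to the same bolt task, so a per-process `Counter` stays consistent even with `:p 4` parallelism.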
Running local Storm cluster¶
(defn run-local! []
  (let [cluster (LocalCluster.)]
    ;; submit the topology to the local cluster
    (.submitTopology cluster
      ;; topology name
      "test-topology"
      ;; topology configuration
      {TOPOLOGY-DEBUG true}
      ;; the topology itself (spouts and bolts)
      (mk-topology))
    ;; sleep for 5 seconds before...
    (Thread/sleep 5000)
    ;; shutting down the cluster
    (.shutdown cluster)
  )
)
Command line parsing¶
(defn -main [& args]
  (let [[opts args banner]
        (cli args
          ["-h" "--help" "Show help"
           :flag true :default false]
          ["-v" "--verbose" "Verbose output"
           :flag true :default false]
          ["-s" "--spec" "Storm Topology spec .clj file"]
          ["-j" "--jar" "Storm Topology code .jar file"]
          ["-c" "--config" "Storm Environment config file"
           :default "config.json"]
          ["-d" "--debug" "Enable Storm Topology debugging"
           :default true]
          ["-e" "--env" "Environment, e.g. prod or local"
           :default "local"]
        )]
    (println opts args banner)))
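For Python readers, roughly the same option spec can be sketched with the standard `argparse` module. This is a loose translation, not an exact one: `argparse` adds `-h/--help` by itself, and the `parse_bool` helper is an assumption about how `--debug` values should be read.

```python
import argparse


def parse_bool(text):
    # Hypothetical helper: accept a few common spellings of "true".
    return text.strip().lower() in ("1", "true", "yes", "on")


def build_parser():
    parser = argparse.ArgumentParser(description="Storm topology runner (sketch)")
    parser.add_argument("-v", "--verbose", action="store_true", default=False,
                        help="Verbose output")
    parser.add_argument("-s", "--spec", help="Storm Topology spec .clj file")
    parser.add_argument("-j", "--jar", help="Storm Topology code .jar file")
    parser.add_argument("-c", "--config", default="config.json",
                        help="Storm Environment config file")
    parser.add_argument("-d", "--debug", type=parse_bool, default=True,
                        help="Enable Storm Topology debugging")
    parser.add_argument("-e", "--env", default="local",
                        help="Environment, e.g. prod or local")
    return parser


# Parse an explicit argument list instead of sys.argv for the demo.
opts = build_parser().parse_args(["-v", "--spec", "topology.clj", "-d", "false"])
```

Both versions are declarative: each option is a short name, a long name, a help string, and optional settings such as a default.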
SQL table creation¶
(ns ring-sample.sqlite
  (:require [clojure.java.jdbc :as jdbc]))

(def db {:classname "org.sqlite.JDBC"
         :subprotocol "sqlite"
         :subname "data/database.db"})

(defn make-table! []
  (try
    (jdbc/with-connection db
      (jdbc/create-table
        :accounts
        [:id :integer]
        [:apikey :text]
        [:name :text]
        [:seen :datetime]))
    (catch Exception exc (println exc))))

(make-table!)
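The Python counterpart would likely use the standard `sqlite3` module. A minimal sketch, assuming an in-memory database instead of `data/database.db` so it runs anywhere; the `make_table` name simply mirrors `make-table!`.

```python
import sqlite3


def make_table(conn):
    """Create the accounts table; print the error instead of raising."""
    try:
        conn.execute(
            "CREATE TABLE accounts ("
            "id integer, apikey text, name text, seen datetime)"
        )
        conn.commit()
    except sqlite3.Error as exc:
        print(exc)


conn = sqlite3.connect(":memory:")
make_table(conn)
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(accounts)")]
```

The Clojure version describes the columns as data (vectors of keywords), where the Python version passes SQL as a string.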
lein¶
lein is like venv, pip, ipython, setuptools, and buildout all combined in one tool.
Through plugins, it can also embed build tools that you'd normally put in a Makefile.
It also includes “project quickstarts”, similar to django quickstart and cookiecutter.
lein is written in Clojure, but you also install Clojure itself using lein by declaring a dependency on Clojure in your project.clj file.
project.clj example¶
;;          project           version
;;          ^                 ^
(defproject parsely-stormtest "0.0.1-SNAPSHOT"
  ;; code locations
  :source-paths ["src/clj"]
  :resource-paths ["multilang", "data"]
  ;; project dependencies
  :dependencies [[org.apache.storm/storm-core "0.9.1"]
                 [org.clojure/clojure "1.5.1"]
                 [org.clojure/data.json "0.2.4"]
                 [org.clojure/tools.namespace "0.2.4"]]
  ;; invoked by lein run
  :main parsely.stormtest
  ;; lein compile options
  :min-lein-version "2.0.0"
  :aot :all
)
Running¶
Functionality | How |
---|---|
Just the Clojure REPL | java -jar clojure.jar |
Clojure REPL w/ dependencies | lein repl |
CLI entry-point w/ dependencies | lein run |
Using¶
Functionality | How |
---|---|
Eval code in editor | nREPL plugins for vim / emacs / SublimeText |
Run-debug loop | Repeated lein run calls at CLI, or tests |
Interact with code | LightTable embeds Clojure really nicely |
Code notebook | session is a project akin to IPython Notebook |
Testing¶
Functionality | How |
---|---|
Unit test framework | Use clojure.test, then lein test |
Test-driven / BDD | midje, speclj, expectations |
Parametric | simulant, test.check |
Assertions | (is (.startsWith "abcde" "ab")) |
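For reference, the assertion in the last row maps onto Python's standard `unittest` roughly as follows. This is a sketch for comparison only; the `StartsWithTest` class name is made up.

```python
import unittest


class StartsWithTest(unittest.TestCase):
    def test_prefix(self):
        # Python analogue of (is (.startsWith "abcde" "ab"))
        self.assertTrue("abcde".startswith("ab"))


# Run the single test case programmatically rather than via the CLI.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StartsWithTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Where Python would normally run this with `python -m unittest`, the Clojure equivalent is `lein test`.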