==========================
Clojonic: Pythonic Clojure
==========================
Andrew Montalenti, CTO

.. rst-class:: logo

.. image:: ./_static/parsely.png
    :width: 40%
    :align: right

Pythonic iteration
==================
.. sourcecode:: python

    nums = [45, 23, 51, 32, 5]
    for idx, num in enumerate(nums):
        print(idx, num)
    # 0 45
    # 1 23
    # 2 51
    # 3 32
    # 4 5

Clojonic iteration (1)
======================
.. sourcecode:: clojure

    ;; note: for is lazy; the lines print when the REPL realizes the result
    (let [nums [45 23 51 32 5]]
      (for [[idx num] (map-indexed vector nums)]
        (println idx num)))
    ; 0 45
    ; 1 23
    ; 2 51
    ; 3 32
    ; 4 5

Clojonic iteration (2)
======================
.. image:: ./_static/clojure_syntax.png
    :width: 100%
    :align: center

Clojonic iteration (3)
======================
.. sourcecode:: clojure

    (defn enumerate [coll]
      (map-indexed vector coll))

    (let [nums [45 23 51 32 5]]
      (for [[idx num] (enumerate nums)]
        (println idx num)))
    ; 0 45
    ; 1 23
    ; 2 51
    ; 3 32
    ; 4 5

Clojonic iteration (4)
======================
.. sourcecode:: clojure

    (defmacro enumerate [coll]
      `(map-indexed vector ~coll))

    (let [nums [45 23 51 32 5]]
      (for [[idx num] (enumerate nums)]
        (println idx num)))

    ; ---

    (macroexpand '(enumerate [1 2 3]))
    ;; => (clojure.core/map-indexed clojure.core/vector [1 2 3])

    ; Translation; code is *replaced* with:
    (map-indexed vector [1 2 3])

defn... is a macro!
===================
.. sourcecode:: clojure

    ;; expand:
    (macroexpand
      '(defn named-function [some-value]
         (println some-value)))

    ;; generated code:
    (def named-function (clojure.core/fn ([some-value]
      (println some-value))))

    ;; clojure.core source is something like:
    (defmacro defn [name & fdecl]
      (list 'def name (cons `fn fdecl)))

Sample Python program
=====================
.. sourcecode:: python

    """twitter.py"""
    import json

    def with_twitter_data(rdr_fn):
        with open("data/tweets.log") as rdr:
            return list(rdr_fn(rdr))

    def read_tweets(rdr):
        for line in rdr:
            apikey, timestamp, entry = line.split("|", 2)
            yield apikey, timestamp, json.loads(entry)

    with_twitter_data(read_tweets)

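
As a rough illustration of what ``read_tweets`` expects, the sketch below feeds
it two made-up log lines from an in-memory buffer instead of
``data/tweets.log``. The records, ``SAMPLE_LOG``, and the two ``with_sample_*``
helpers are assumptions for this example, not real data or part of the original
program. It shows why the split is capped at two (a ``|`` inside the JSON
payload stays in the third field) and why the generator is wrapped in
``list()`` before the file is closed.

.. sourcecode:: python

    import io
    import json

    # Hypothetical sample records in the "apikey|timestamp|json" layout
    SAMPLE_LOG = (
        'example.com|1400000000|{"text": "a|b", "clicks": 3}\n'
        'example.org|1400000060|{"text": "hello", "clicks": 1}\n'
    )

    def read_tweets(rdr):
        for line in rdr:
            # maxsplit=2 leaves any "|" inside the JSON payload untouched
            apikey, timestamp, entry = line.split("|", 2)
            yield apikey, timestamp, json.loads(entry)

    def with_sample_data(rdr_fn):
        # Stand-in for with_twitter_data: same shape, in-memory "file"
        with io.StringIO(SAMPLE_LOG) as rdr:
            return list(rdr_fn(rdr))  # drain the generator before close

    def with_sample_data_lazy(rdr_fn):
        with io.StringIO(SAMPLE_LOG) as rdr:
            return rdr_fn(rdr)  # generator escapes; rdr is closed on exit

    tweets = with_sample_data(read_tweets)
    print(tweets[0])

    try:
        next(with_sample_data_lazy(read_tweets))
    except ValueError as exc:
        print("lazy read failed:", exc)

The ``list()`` call plays the same role here that ``doall`` plays inside
``with-open`` on the Clojure side.
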
Clojure "syntax"
================
.. sourcecode:: clojure

    ;; twitter.clj
    (ns twitter
      (:require [clojure.data.json :as json]
                [clojure.java.io :as io]
                [clojure.string :as str]))

    (defn with-twitter-data [rdr-fn]
      (with-open [rdr (io/reader "data/tweets.log")]
        (doall (rdr-fn rdr))))

    (defn read-tweets [rdr]
      (for [line (line-seq rdr)]
        (let [[apikey timestamp entry] (str/split line #"\|" 3)]
          (vec [apikey timestamp (json/read-str entry)]))))

    (with-twitter-data read-tweets)

Quick comparison
================
==================  ===================  =====================
Idea                Python               Clojure
==================  ===================  =====================
Binding             ``label = val``      ``(let [label val])``
Unpacking           ``val1, val2 =``     Destructuring Form
Iteration           ``for``              ``(for)`` macro
Functions           ``def``              ``(defn)`` macro
File Open           ``open()``           ``(io/reader)``
String Split        ``str.split()``      ``(str/split)``
JSON Parse          ``json.loads()``     ``(json/read-str)``
Namespaces          Modules              ``(ns)`` macro
Imports             ``import``           ``(ns (:require))``
Data Structs        ``{} [] (,)``        ``{} [] ()``
==================  ===================  =====================

Clojure unique stuff
====================
+------------------------+---------------------+-----------------+
| Idea                   | Python              | Clojure         |
+========================+=====================+=================+
| Macros                 | N/A                 | Built-in        |
+------------------------+---------------------+-----------------+
| Immutable Data Structs | N/A                 | Built-in        |
+------------------------+---------------------+-----------------+
| Keywords               | N/A                 | Built-in        |
+------------------------+---------------------+-----------------+
| Lambdas / Blocks       | Crippled            | Built-in        |
+------------------------+---------------------+-----------------+
| DSLs                   | With Metaclasses    | With Macros     |
+------------------------+---------------------+-----------------+
| Lazy Eval              | Opt-in (generators) | Opt-out (doall) |
+------------------------+---------------------+-----------------+
| Code as Data           | ``import ast``      | Lisp forms      |
+------------------------+---------------------+-----------------+
| Platform Interop       | via C, Cython, etc. | via JVM         |
+------------------------+---------------------+-----------------+
| Concurrency            | Trad'l, processes   | STM Impl        |
+------------------------+---------------------+-----------------+
| Object Orientation     | Trad'l, class-based | Multi dispatch  |
+------------------------+---------------------+-----------------+

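
Two of the Python cells in this table can be made concrete with the standard
library alone. The sketch below is illustrative only: ``squares`` and ``calls``
are made-up names showing that laziness in Python is opt-in, and the ``ast``
calls show roughly the closest built-in analogue to treating code as data.

.. sourcecode:: python

    import ast

    calls = []

    def squares(nums):
        # Generator: the body runs only as values are requested
        for n in nums:
            calls.append(n)
            yield n * n

    lazy = squares([1, 2, 3])
    assert calls == []       # nothing has run yet (lazy)
    realized = list(lazy)    # list() forces it, like Clojure's doall
    print(realized, calls)

    # Code as data: parse source into a tree, rewrite it, then evaluate it
    tree = ast.parse("1 + 2", mode="eval")
    tree.body.op = ast.Mult()    # swap the "+" node for "*"
    result = eval(compile(tree, "<ast>", "eval"))
    print(result)

In Clojure the rewrite step would be ordinary list manipulation inside a macro,
with no separate parse step.
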
==============
Other examples
==============
EMR cluster (lemur)
===================
.. sourcecode:: clojure

    (defcluster pig-cluster
        :master-instance-type "m1.large"
        :slave-instance-type "m1.large"
        :num-instances 2
        :keypair "emr_jobs"
        :enable-debugging? false
        :bootstrap-action.1 [
            "install-pig"
            (s3-libs "/pig/pig-script")
            ["--base-path" (s3-libs "/pig/")
             "--install-pig" "--pig-versions" "latest"]
        ]
        :runtime-jar (s3-libs "/script-runner/script-runner.jar")
    )

EMR Pig steps (lemur)
=====================
.. sourcecode:: clojure

    (defstep twitter-count-step
        :args.positional [
            (s3-libs "/pig/pig-script")
            "--base-path" (s3-libs "/pig/")
            "--pig-versions" "latest"
            "--run-pig-script" "--args"
            "-f" "s3://pystorm/url_counts.pig"
        ]
    )

    (fire! pig-cluster twitter-count-step)

Twitter Click Spout (Storm)
===========================
.. sourcecode:: clojure

    {"twitter-click-spout"
        (shell-spout-spec
            ;; Python Spout implementation:
            ;; - fetches tweets (e.g. from Kafka)
            ;; - emits (urlref, url, ts) tuples
            ["python" "spouts_twitter_click.py"]
            ;; Stream declaration:
            ["urlref" "url" "ts"]
        )
    }

Twitter Count Bolt (Storm)
==========================
.. sourcecode:: clojure

    {"twitter-count-bolt"
        (shell-bolt-spec
            ;; Bolt input: Spout and field grouping on urlref
            {"twitter-click-spout" ["urlref"]}
            ;; Python Bolt implementation:
            ;; - maintains a Counter of urlref
            ;; - increments as new clicks arrive
            ["python" "bolts_twitter_count.py"]
            ;; Emits latest click count for each tweet as new Stream
            ["twitter_link" "clicks"]
            :p 4
        )
    }

Running local Storm cluster
===========================
.. sourcecode:: clojure

    (defn run-local! []
        (let [cluster (LocalCluster.)]
            ;; submit the topology configured above
            (.submitTopology cluster
                ;; topology name
                "test-topology"
                ;; topology settings
                {TOPOLOGY-DEBUG true}
                ;; topology configuration
                (mk-topology))
            ;; sleep for 5 seconds before...
            (Thread/sleep 5000)
            ;; shutting down the cluster
            (.shutdown cluster)
        )
    )

Command line parsing
====================
.. sourcecode:: clojure

    (defn -main [& args]
        (let [[opts args banner]
              (cli args
                   ["-h" "--help" "Show help"
                    :flag true :default false]
                   ["-v" "--verbose" "Verbose output"
                    :flag true :default false]
                   ["-s" "--spec" "Storm Topology spec .clj file"]
                   ["-j" "--jar" "Storm Topology code .jar file"]
                   ["-c" "--config" "Storm Environment config file"
                    :default "config.json"]
                   ["-d" "--debug" "Enable Storm Topology debugging"
                    :default true]
                   ["-e" "--env" "Environment, e.g. prod or local"
                    :default "local"]
              )]
            (println opts args banner)))

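
For Python readers, the option spec above maps fairly directly onto
``argparse``. The following is a rough equivalent under stated assumptions, not
a faithful port of ``cli``: ``--debug`` is not marked ``:flag`` in the Clojure
spec, so it is modelled as a value-taking option through a hypothetical
``parse_bool`` helper, and the sample arguments (``topology.clj``,
``leftover``) are invented for the demonstration.

.. sourcecode:: python

    import argparse

    def parse_bool(text):
        # Hypothetical helper: accept "true"/"false" style values
        return text.lower() in ("1", "true", "yes")

    def build_parser():
        # -h/--help is added by argparse automatically
        parser = argparse.ArgumentParser()
        parser.add_argument("-v", "--verbose", action="store_true",
                            help="Verbose output")
        parser.add_argument("-s", "--spec",
                            help="Storm Topology spec .clj file")
        parser.add_argument("-j", "--jar",
                            help="Storm Topology code .jar file")
        parser.add_argument("-c", "--config", default="config.json",
                            help="Storm Environment config file")
        parser.add_argument("-d", "--debug", type=parse_bool, default=True,
                            help="Enable Storm Topology debugging")
        parser.add_argument("-e", "--env", default="local",
                            help="Environment, e.g. prod or local")
        return parser

    # parse_known_args returns (options, leftover args), loosely similar to
    # the opts and args destructured in the Clojure version
    opts, extra = build_parser().parse_known_args(
        ["--spec", "topology.clj", "-e", "prod", "leftover"])
    print(vars(opts), extra)
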
SQL table creation
==================
.. sourcecode:: clojure

    (ns ring-sample.sqlite
      (:require [clojure.java.jdbc :as jdbc]))

    (def db {:classname "org.sqlite.JDBC"
             :subprotocol "sqlite"
             :subname "data/database.db"})

    (defn make-table! []
      (try
        (jdbc/with-connection db
          (jdbc/create-table
            :accounts
            [:id :integer]
            [:apikey :text]
            [:name :text]
            [:seen :datetime]))
        (catch Exception exc (println exc))))

    (make-table!)

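
For comparison, a minimal Python sketch of the same table using the standard
``sqlite3`` module. It assumes an in-memory database rather than
``data/database.db`` so that it runs anywhere, and ``make_table`` is a name
chosen for this example.

.. sourcecode:: python

    import sqlite3

    def make_table(conn):
        # Same four columns as the Clojure create-table call above
        conn.execute(
            "CREATE TABLE accounts ("
            "id INTEGER, apikey TEXT, name TEXT, seen DATETIME)"
        )

    # ":memory:" is an assumption for the sketch, not the slide's database
    conn = sqlite3.connect(":memory:")
    try:
        make_table(conn)
        # PRAGMA table_info yields one row per column; index 1 is the name
        columns = [row[1] for row in
                   conn.execute("PRAGMA table_info(accounts)")]
        print(columns)
    except sqlite3.Error as exc:
        print(exc)
    finally:
        conn.close()
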
lein
====
``lein`` is like ``venv``, ``pip``, ``ipython``, ``setuptools``, and
``buildout`` all combined in one tool.

Through plugins, it can also embed build tools that you'd normally put in a
Makefile.

It also includes "project quickstarts", similar to
``django-admin startproject`` and ``cookiecutter``.

``lein`` is written in Clojure, but you also install Clojure itself using
``lein`` by declaring a dependency on Clojure in your ``project.clj`` file.

project.clj example
===================
.. sourcecode:: clojure

    ;;          project           version
    ;;          ^                 ^
    (defproject parsely-stormtest "0.0.1-SNAPSHOT"
        ;; code locations
        :source-paths ["src/clj"]
        :resource-paths ["multilang", "data"]
        ;; project dependencies
        :dependencies [[org.apache.storm/storm-core "0.9.1"]
                       [org.clojure/clojure "1.5.1"]
                       [org.clojure/data.json "0.2.4"]
                       [org.clojure/tools.namespace "0.2.4"]]
        ;; invoked by lein run
        :main parsely.stormtest
        ;; lein compile options
        :min-lein-version "2.0.0"
        :aot :all
    )

Running
=======
+---------------------------------+---------------------------+
| Functionality                   | How                       |
+=================================+===========================+
| Just the Clojure REPL           | ``java -jar clojure.jar`` |
+---------------------------------+---------------------------+
| Clojure REPL w/ dependencies    | ``lein repl``             |
+---------------------------------+---------------------------+
| CLI entry-point w/ dependencies | ``lein run``              |
+---------------------------------+---------------------------+

Using
=====
+---------------------+---------------------------------------------------+
| Functionality       | How                                               |
+=====================+===================================================+
| Eval code in editor | nREPL plugins for vim / emacs / SublimeText       |
+---------------------+---------------------------------------------------+
| Run-debug loop      | Repeated ``lein run`` calls at CLI, or tests      |
+---------------------+---------------------------------------------------+
| Interact with code  | ``LightTable`` embeds Clojure really nicely       |
+---------------------+---------------------------------------------------+
| Code notebook       | ``session`` is a project akin to IPython Notebook |
+---------------------+---------------------------------------------------+

Testing
=======
+---------------------+--------------------------------------------------+
| Functionality       | How                                              |
+=====================+==================================================+
| Unit test framework | Use ``clojure.test``, then ``lein test``         |
+---------------------+--------------------------------------------------+
| Test-driven / BDD   | ``midje``, ``speclj``, ``expectations``          |
+---------------------+--------------------------------------------------+
| Parametric          | ``simulant``, ``test.check``                     |
+---------------------+--------------------------------------------------+
| Assertions          | ``(is (.startsWith "abcde" "ab"))``              |
+---------------------+--------------------------------------------------+

Go forth!
=========
Install ``lein``:

* https://github.com/technomancy/leiningen#installation

Read a little bit more about Clojure:

* `Clojure tutorial for the non-Lisp programmer`_
* `Clojure is functional programming for the JVM`_
* `Clojure and Python side-by-side Sudoku`_

.. _Clojure tutorial for the non-Lisp programmer: http://moxleystratton.com/clojure/clojure-tutorial-for-the-non-lisp-programmer
.. _Clojure is functional programming for the JVM: http://java.ociweb.com/mark/clojure/article.html
.. _Clojure and Python side-by-side Sudoku: http://jkkramer.com/sudoku.html
