Dueling Rhetoric of Clojure and Haskell
by @eborden on November 01, 2017
Recently Lispcast wrote a post interpreting Rich Hickey’s controversial statements on static types. This post had some very interesting perspectives and some unfortunate misinformation. Within the post the author proposed a challenge to static typing communities.
Is it possible with our current understanding to build a statically typed language that rivals Clojure for its intended purpose? What would such a language look like? What types would it have? Who will build it?
This is an interesting question, and modern type systems are addressing this. These systems exist and are being improved daily. Row types in PureScript are one example.
I’m not going to answer the question directly, but abundant resources are available. You may find these concepts interesting: row type polymorphism, data generic programming, and extensible records. Instead, I’m going to respond to another challenge.
Of course you could build a type with all those properties and more in Haskell. A universal information model…That model would be something like the JSON datatype or a richer one like an EDN…Your library of functions…would look something like Clojure’s standard library. At some point you’re just re-implementing Clojure.
Screw it, let’s re-implement1 Clojure in Haskell.
Write you a Clojure for questionable good
To write a Clojure first we need to define our EDN
data type. EDN is
technically a subset of Clojure, but it’ll
do. It has the following types:
- nil
- booleans
- strings
- characters
- symbols
- keywords
- integers
- floating point numbers
- lists
- vectors
- maps
- sets
- tags
The EDN
data type looks like:
data EDN
= Nil
| Boolean Bool
| String Text
| Character Char
| Symbol Text
| Keyword Text
| Integer Int
| EdnFloat Float
| List [EDN]
| EdnVector (Vector EDN)
| Map (HashMap EDN EDN)
| EdnSet (HashSet EDN)
| Tag Text EDN
deriving (Eq, Hashable)
That is a pretty expansive sum type, which is our first sign that EDN
might be
complicated.
Std liberate me
Now let’s write a standard library function. How about map
? We can do that by
pattern matching on our input and handling the cases.
clmap :: (EDN -> EDN) -> EDN -> EDN
clmap f edn =
case edn of
Nil -> Nil
List xs -> List $ map f xs
EdnVector xs -> List . map f $ toList xs
EdnSet xs -> List . map f $ toList xs
Awesome, we can map
over our data, but what about all the other cases? You
can’t map
over many Clojure data types. Our clmap
is “partial”. Typically
we’d change the type of clmap
to return a Maybe EDN
, but Rich isn’t fond of
Maybe
, so let’s do something closer to Clojure’s semantics.
clmap :: (EDN -> EDN) -> EDN -> EDN
clmap f edn =
case edn of
Nil -> Nil
List xs -> List $ map f xs
EdnVector xs -> List . map f $ toList xs
EdnSet xs -> List . map f $ toList xs
Boolean{} -> error "can't clmap a boolean"
String{} -> error "guess you can't do this either"
Character{} -> error "this is starting to get silly"
Symbol{} -> error "you see where I'm headed?"
Keyword{} -> error "YOLO"
Integer{} -> error "everything is awful"
EdnFloat{} -> error "we are floating in a pit of despair"
Map{} -> error "if only there was a safer way to do this"
Tag{} -> error "please save me"
That is a lot of failure, our function is now matching all our cases, but it blows up when we make a mistake.
What about a simple function on one of the author’s favorite data types Map
?
clget :: EDN -> EDN -> EDN
clget key edn =
case edn of
Map ms -> case HashMap.lookup key ms of
Just x -> x
Nothing -> error "crap we didn't find it"
List{} -> error "oh no"
Nil -> error "NilPointerException, I guess?"
EdnVector{} -> error "here we go again."
EdnSet{} -> error "bear with me"
Boolean{} -> error "this will only take a second"
String{} -> error "we're half way there"
Character{} -> error "still feeling pretty depressing"
Symbol{} -> error "we can keep it together"
Keyword{} -> error "we are strong and disciplined"
Integer{} -> error "our resolve isn't cracking..."
EdnFloat{} -> error "we are floating in a pit of despair!!!!!!"
List{} -> error "whhyyyyyyyyyyyyyyy"
Tag{} -> error "time is a flat circle"
That was even rougher than the last time. That function fails in way more cases
than it succeeds. So maybe, just maybe, Maybe
isn’t that bad. Let’s refactor.
clmap :: (EDN -> EDN) -> EDN -> Maybe EDN
clmap f edn =
case edn of
Nil -> Just Nil
List xs -> Just . List $ fmap f xs
EdnVector xs -> Just . List . toList $ fmap f xs
EdnSet xs -> Just . List . fmap f $ toList xs
-- we are going to use a shortcut and utilize wild card pattern matching
_ -> Nothing
clget :: EDN -> EDN -> Maybe EDN
clget key edn =
case edn of
Map ms -> HashMap.lookup key ms
_ -> Nothing
Okay, we are no longer completely exploding. Our semantics have diverged from Clojure proper, but they are safer, and more visible. So what does this look like when we use it?
let
mkMap pairs = Map $ HashMap.fromList pairs
-- clj: (def cl-map {:foo (list "bar")})
clMap = mkMap [(Symbol "foo", List [String "bar"])]
-- clj: (map str/upper-case (get cl-map :foo))
clmap clToUpper =<< clget (Symbol "foo") clMap
-- Just (List [String "BAR"])
It isn’t terribly pretty, but it does what we want. We could write some utility functions to clean it up, but our proof of concept works, and we’ve made our failure cases more visible.
What have we learned?
The previous author raises some very valid concerns with utilizing Haskell for
“situated” programs. In Haskell we typically “concrete” data with record types,
but we don’t have to. With only a few lines of code we’ve encoded EDN
’s open
dynamic information model and shown that we can compositionally de-construct it
with a few functions. This lets us avoid having to pattern match over and over
again, but we still have to implement a lot of functions to get there. We
can probably do better, but we have shown we can quickly encode Clojure in Haskell.
Encoding Haskell in Clojure would very likely be a much more difficult task.
In which we do better
Let’s take this further.
Can we set keys?
Our MVP is pretty ugly and it quickly falls apart when we attempt to set
something.
We’d have to wrap and unwrap tons of types to do something trivial. This is the
experience that our friends at Lispcast had and it is unfortunate. Luckily the
Haskell ecosystem has a solution to getting/setting in nested types, it is
called lens
.
Lenses give us a way to talk about getters and setters, compose them, nest them
and they also allow us to talk about partiality in a composable way. They are
conceptually similar to Clojure’s transducers. Our EDN -> Maybe EDN
types
fall into the realm of Traversal
s. These are getters and setters over values
that have a failure case. Let’s write our clget
function as a Traversal
and
rename it clkey
since we’ll be getting and setting.
clkey :: EDN -> Traversal' EDN EDN
clkey key f edn =
case edn of
Map ms -> case HashMap.lookup key ms of
Nothing -> pure edn
Just val ->
let optionalInsert v = Map $ HashMap.insert key v ms
in optionalInsert <$> f val
_ -> pure edn
That wasn’t so hard. If the key exists we apply a function and wrap an insertion over it. Otherwise we return our original structure. It may seem a bit magical, but that function allows us to get or set the value if it exists. It took a few people solving “puzzles” to create this solution, but now we have its ambitious power.
Do we need clmap
?
It turns out we don’t need to reinvent the wheel here.
Haskell has its own map
function and we can leverage it. We are utilizing
Haskell data types under the hood; let’s create Traversal
s that let us peek
into our data type and modify things in regular old Haskell. These are more
powerful Traversal
s called Prism
s and their creation is even simpler.
_List :: Prism' EDN [EDN]
_List = prism' List $ \case
List xs -> Just xs
_ -> Nothing
_String :: Prism' EDN Text
_String = prism' String $ \case
String xs -> Just xs
_ -> Nothing
Cool, that pattern is pretty simple and now we can utilize Clojure’s information model while safely leveraging Haskell’s robust and reliable standard library.
let
-- clj: (def cl-map {:foo (list "bar")})
clMap = mkMap [(Symbol "foo", List [String "bar"])]
foo = clkey (Symbol "foo") . _List . ix 0 . _String
-- clj: (nth 0 (get cl-map :foo))
preview foo clMap
-- Just "bar"
-- clj specter: (setval [:foo (nthpath 0)] "baz" cl-map)
set foo "baz" clMap
-- Map (HashMap.fromList [(Symbol "foo", List [String "baz"])])
-- clj specter: (transform [:foo (nthpath 0)] str/upper-case cl-map)
over foo Text.toUpper clMap
-- Map (HashMap.fromList [(Symbol "foo", List [String "BAR"])])
This is starting to look a lot more enjoyable and in the real world we can use
Haskell’s type safe “macro” system to generate all these Prism
s for us,
saving us even more time and boilerplate.
What about being generic?
Clojure map
works on sequence like structures and our existing Prism
s
force us to be overly specific. Well there is a simple solution, write a new
Prism
for sequence like structures. We’ll use Haskell’s list as a canonical
sequence, we could probably be fancier, but let’s keep it simple like Clojure.
_Seq :: Prism' EDN [EDN]
_Seq = prism' List $ \case
List xs -> Just xs
EdnVector xs -> Just $ toList xs
EdnSet xs -> Just $ toList xs
Map ms -> Just $ toList ms
_ -> Nothing
let
clList = List [String "bar"]
clVector = EdnVector $ fromList [String "foo"]
clSet = EdnSet $ fromList [String "baz"]
clMap = Map $ HashMap.fromList [(Symbol "biz", String "bats")]
foo = _Seq . traverse . _String
over foo Text.toUpper clList
-- List [String "BAR"]
over foo Text.toUpper clVector
-- List [String "FOO"]
over foo Text.toUpper clSet
-- List [String "BAZ"]
over foo Text.toUpper clMap
-- List [String "BATS"]
Awesome, not only can we encode Clojure’s information model, we can also elegantly support its generic programming model with a higher level of safety and visible failure semantics. With a bit more work we could clean this up and have Clojure’s information model available as a DSL in a compact form, leveraging our existing ecosystem, with very little cost and high assurance. In fact I’ve done just that and so have others. Still others have endeavored to encode Haskell in a lisp like Racket.
What does it mean?
Much of the rhetoric that is currently flying around is a false dichotomy. The
Clojure ecosystem is interesting and its information model is trivial to
compose and use in a modern static type system. We can now utilize EDN
as
opposed to JSON
while still enjoying strong guarantees of the Haskell ecosystem.
But lets ask an important question. What do we gain with EDN
?
If EDN
is an improvement over JSON
, then it is marginal at best. At worst it is an
explosion in complexity and failure cases that provides no implicit method to
tame it. We can tame these in our DSL, but the Clojure ecosystem has no visible
recourse. Utilizing EDN
also promotes a lot of invisible coupling. Some
may tell you that dynamic types don’t couple, but that is incorrect and
shows a lack of understanding of coupling itself. Many functions over Map
exhibit external and stamp coupling. Even generic functions produce
their own form of coupling. With an explicit type system we don’t couple more,
but we make that coupling visible, understandable, and maintainable, even
addressable. Those types help us grow and evolve with our understanding of our
data instead of living in the opaque world of EDN -> EDN -> EDN
where failure
is implicit.
If we decide to “concrete” a world view where everything is awful and only marginal improvements are possible then I’d rather take an optimistic path; even if it comes at the cost of some navel gazing. I’d rather find boundless possibilities in my bellybutton lint than stare into the universe and ignore the majestic chance of a better world around every star.
But Seriously
This post is inflammatory and somewhat hyperbolic, but the truth is I don’t want to pick a fight with Clojurians. They have interesting ideas, cool technology and a lovely ecosystem. Let’s share the joy of discovery and creativity that made us all fall in love with programming in the first place.
Notes
-
No irony is lost here in fulfilling Greenspun’s 10th rule. As well we recognize the existence of Greenspun’s 10th Dual:
Any sufficiently complicated dynamically typed program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a type system.