learn the ultimate language and become a better programmer

Be a pal and buy print and ebooks from No Starch or Amazon

Follow @nonrecursive to hear about new content or subscribe:

Chapter 13

Creating and Extending Abstractions with Multimethods, Protocols, and Records

Take a minute to contemplate how great it is to be one of Mother Nature’s top-of-the-line products: a human. As a human, you get to gossip on social media, play Dungeons and Dragons, and wear hats. Perhaps more important, you get to think and communicate in terms of abstractions.

The ability to think in terms of abstractions is truly one of the best human features. It lets you circumvent your cognitive limits by tying together disparate details into a neat conceptual package that you can hold in your working memory. Instead of having to think the clunky thought “squeezable honking red ball nose adornment,” you only need the concept “clown nose.”

In Clojure, an abstraction is a collection of operations, and data types implement abstractions. For example, the seq abstraction consists of operations like first and rest, and the vector data type is an implementation of that abstraction; it responds to all of the seq operations. A specific vector like [:seltzer :water] is an instance of that data type.

The more a programming language lets you think and write in terms of abstractions, the more productive you will be. For example, if you learn that a data structure is an instance of the seq abstraction, you can instantly call forth a large web of knowledge about what functions will work with the data structure. As a result, you spend time actually using the data structure instead of constantly looking up documentation on how it works. By the same token, if you extend a data structure to work with the seq abstraction, you can use the extensive library of seq functions on it.

In Chapter 4, you learned that Clojure is written in terms of abstractions. This is powerful because in Clojure you can focus on what you can actually do with data structures and not worry about the nitty-gritty of implementation. This chapter introduces you to the world of creating and implementing your own abstractions. You’ll learn the basics of multimethods, protocols, and records.

Polymorphism

The main way we achieve abstraction in Clojure is by associating an operation name with more than one algorithm. This technique is called poly­morphism. For example, the algorithm for performing conj on a list is different from the one for vectors, but we unify them under the same name to indicate that they implement the same concept, namely, add an element to this data structure.

Because Clojure relies on Java’s standard library for many of its data types, a little Java is used in this chapter. For example, Clojure strings are just Java strings, instances of the Java class java.lang.String. To define your own data types in Java, you use classes. Clojure provides additional type constructs: records and types. This book only covers records.

Before we learn about records, though, let’s look at multimethods, our first tool for defining polymorphic behavior.

Multimethods

Multimethods give you a direct, flexible way to introduce polymorphism into your code. Using multimethods, you associate a name with multiple implementations by defining a dispatching function, which produces dispatching values that are used to determine which method to use. The dispatching function is like the host at a restaurant. The host will ask you questions like “Do you have a reservation?” and “Party size?” and then seat you accordingly. Similarly, when you call a multimethod, the dispatching function will interrogate the arguments and send them to the right method, as this example shows:

(ns were-creatures)
 (defmulti full-moon-behavior (fn [were-creature] (:were-type were-creature)))
 (defmethod full-moon-behavior :wolf
  [were-creature]
  (str (:name were-creature) " will howl and murder"))
 (defmethod full-moon-behavior :simmons
  [were-creature]
  (str (:name were-creature) " will encourage people and sweat to the oldies"))

(full-moon-behavior {:were-type :wolf
                      :name "Rachel from next door"})
; => "Rachel from next door will howl and murder"

(full-moon-behavior {:name "Andy the baker"
                      :were-type :simmons})
; => "Andy the baker will encourage people and sweat to the oldies"

This multimethod shows how you might define the full moon behavior of different kinds of were-creatures. Everyone knows that a werewolf turns into a wolf and runs around howling and murdering people. A lesser-known species of were-creature, the were-Simmons, turns into Richard Simmons, power perm and all, and runs around encouraging people to be their best and sweat to the oldies. You do not want to get bitten by either, lest you turn into one.

We create the multimethod at . This tells Clojure, “Hey, create a new multimethod named full-moon-behavior. Whenever someone calls full-moon-behavior, run the dispatching function (fn [were-creature] (:were-type were-creature)) on the arguments. Use the result of that function, aka the dispatching value, to decide which specific method to use!”

Next, we define two methods, one for when the value returned by the dispatching function is :wolf at ➋, and one for when it’s :simmons at ➌. Method definitions look a lot like function definitions, but the major difference is that the method name is immediately followed by the dispatch value. :wolf and :simmons are both dispatch values. This is different from a dispatching value, which is what the dispatching function returns. The full dispatch sequence goes like this:

  1. The form (full-moon-behavior {:were-type :wolf :name "Rachel from next door"}) is evaluated.
  2. full-moon-behavior’s dispatching function runs, returning :wolf as the dispatching value.
  3. Clojure compares the dispatching value :wolf to the dispatch values of all the methods defined for full-moon-behavior. The dispatch values are :wolf and :simmons.
  4. Because the dispatching value :wolf is equal to the dispatch value :wolf, the algorithm for :wolf runs.

Don’t let the terminology trip you up! The main idea is that the dispatching function returns some value, and this value is used to determine which method definition to use.

Back to our example! Next we call the method twice. At ➍, the dispatching function returns the value :wolf and the corresponding method is used, informing you that "Rachel from next door will howl and murder". At ➏, the function behaves similarly, except :simmons is the dispatching value.

You can define a method with nil as the dispatch value:

(defmethod full-moon-behavior nil
  [were-creature]
  (str (:name were-creature) " will stay at home and eat ice cream"))

(full-moon-behavior {:were-type nil
                     :name "Martin the nurse"})
; => "Martin the nurse will stay at home and eat ice cream"

When you call full-moon-behavior this time, the argument you give it has nil for its :were-type, so the method corresponding to nil gets evaluated and you’re informed that "Martin the nurse will stay at home and eat ice cream".

You can also define a default method to use if no other methods match by specifying :default as the dispatch value. In this example, the :were-type of the argument given doesn’t match any of the previously defined methods, so the default method is used:

(defmethod full-moon-behavior :default
  [were-creature]
  (str (:name were-creature) " will stay up all night fantasy footballing"))

(full-moon-behavior {:were-type :office-worker
                     :name "Jimmy from sales"})
; => "Jimmy from sales will stay up all night fantasy footballing"

One cool thing about multimethods is that you can always add new methods. If you publish a library that includes the were-creatures namespace, other people can continue extending the multimethod to handle new dispatch values. This example shows that you’re creating your own random namespace and including the were-creatures namespace, and then defining another method for the full-moon-behavior multimethod:

(ns random-namespace
  (:require [were-creatures]))
(defmethod were-creatures/full-moon-behavior :bill-murray
  [were-creature]
  (str (:name were-creature) " will be the most likeable celebrity"))
(were-creatures/full-moon-behavior {:name "Laura the intern" 
                                    :were-type :bill-murray})
; => "Laura the intern will be the most likeable celebrity"

Your dispatching function can return arbitrary values using any or all of its arguments. The next example defines a multimethod that takes two arguments and returns a vector containing the type of each argument. It also defines an implementation of that method, which will be called when each argument is a string:

(ns user)
(defmulti types (fn [x y] [(class x) (class y)]))
(defmethod types [java.lang.String java.lang.String]
  [x y]
  "Two strings!")

(types "String 1" "String 2")
; => "Two strings!"

Incidentally, this is why they’re called multimethods: they allow dispatch on multiple arguments. I haven’t used this feature very often, but I could see it being used in a role-playing game to write methods that are dispatched according to, say, a mage’s major school of magic and his magic specialization. Either way, it’s better to have it and not need it than need it and not have it.

Note Multimethods also allow hierarchical dispatching. Clojure lets you build custom hierarchies, which I won’t cover, but you can learn about them by reading the documentation at http://clojure.org/multimethods/.

Protocols

Approximately 93.58 percent of the time, you’ll want to dispatch to methods according to an argument’s type. For example, count needs to use a different method for vectors than it does for maps or for lists. Although it’s possible to perform type dispatch with multimethods, protocols are optimized for type dispatch. They’re more efficient than multimethods, and Clojure makes it easy for you to succinctly specify protocol implementations.

A multimethod is just one polymorphic operation, whereas a protocol is a collection of one or more polymorphic operations. Protocol operations are called methods, just like multimethod operations. Unlike multimethods, which perform dispatch on arbitrary values returned by a dispatching function, protocol methods are dispatched based on the type of the first argument, as shown in this example:

(ns data-psychology)
(defprotocol Psychodynamics
  "Plumb the inner depths of your data types"
  (thoughts [x] "The data type's innermost thoughts")
  (feelings-about [x] [x y] "Feelings about self or other"))

First, there’s defprotocol at ➊. This takes a name, Psychodynamics ➋, and an optional docstring, "Plumb the inner depths of your data types" ➌. Next are the method signatures. A method signature consists of a name, an argument specification, and an optional docstring. The first method signature is named thoughts ➍ and can take only one argument. The second is named feelings-about ➎ and can take one or two arguments. Protocols do have one limitation: the methods can’t have rest arguments. So a line like the following isn’t allowed:

(feelings-about [x] [x & others])

By defining a protocol, you’re defining an abstraction, but you haven’t yet defined how that abstraction is implemented. It’s like you’re reserving names for behavior (in this example, you’re reserving thoughts and feelings-about), but you haven’t defined what exactly the behavior should be. If you were to evaluate (thoughts "blorb"), you would get an exception that reads, “No implementation of method: thoughts of protocol: data-psychology/Psychodynamics found for class: java.lang.String.” Protocols dispatch on the first argument’s type, so when you call (thoughts "blorb"), Clojure tries to look up the implementation of the thoughts method for strings, and fails.

You can fix this sorry state of affairs by extending the string data type to implement the Psychodynamics protocol:

 (extend-type java.lang.String
   Psychodynamics
   (thoughts [x] (str x " thinks, 'Truly, the character defines the data type'")
   (feelings-about
    ([x] (str x " is longing for a simpler way of life"))
    ([x y] (str x " is envious of " y "'s simpler way of life"))))

(thoughts "blorb")
 ; => "blorb thinks, 'Truly, the character defines the data type'"

(feelings-about "schmorb")
; => "schmorb is longing for a simpler way of life"

(feelings-about "schmorb" 2)
; => "schmorb is envious of 2's simpler way of life"

extend-type is followed by the name of the class or type you want to extend and the protocol you want it to support—in this case, you specify the class java.lang.String at ➊ and the protocol you want it to support, Psychodynamics, at ➋. After that, you provide an implementation for both the thoughts method at ➌ and the feelings-about method at ➍. If you’re extending a type to implement a protocol, you have to implement every method in the protocol or Clojure will throw an exception. In this case, you can’t implement just thoughts or just feelings; you have to implement both.

Notice that these method implementations don’t begin with defmethod like multimethods do. In fact, they look similar to function definitions, except without defn. To define a method implementation, you write a form that starts with the method’s name, like thoughts, then supply a vector of parameters and the method’s body. These methods also allow arity overloading, just like functions, and you define multiple-arity method implementations similarly to multiple-arity functions. You can see this in the feelings-about implementation at ➍.

After you’ve extended the java.lang.String type to implement the Psychodynamics protocol, Clojure knows how to dispatch the call (thoughts "blorb"), and you get the string "blorb thinks, 'Truly, the character defines the data type'" at ➎.

What if you want to provide a default implementation, like you did with multimethods? To do that, you can extend java.lang.Object. This works because every type in Java (and hence, Clojure) is a descendant of java.lang.Object. If that doesn’t quite make sense (perhaps because you’re not familiar with object-oriented programming), don’t worry about it—just know that it works. Here’s how you would use this technique to provide a default implementation for the Psychodynamics protocol:

(extend-type java.lang.Object
  Psychodynamics
  (thoughts [x] "Maybe the Internet is just a vector for toxoplasmosis")
  (feelings-about
    ([x] "meh")
    ([x y] (str "meh about " y))))
  
(thoughts 3)
; => "Maybe the Internet is just a vector for toxoplasmosis"

(feelings-about 3)
; => "meh"

(feelings-about 3 "blorb")
; => "meh about blorb"

Because we haven’t defined a Psychodynamics implementation for numbers, Clojure dispatches calls to thoughts and feelings-about to the implementation defined for java.lang.Object.

Instead of making multiple calls to extend-type to extend multiple types, you can use extend-protocol, which lets you define protocol implementations for multiple types at once. Here’s how you’d define the preceding protocol implementations:

(extend-protocol Psychodynamics
  java.lang.String
  (thoughts [x] "Truly, the character defines the data type")
  (feelings-about
    ([x] "longing for a simpler way of life")
    ([x y] (str "envious of " y "'s simpler way of life")))
  
  java.lang.Object
  (thoughts [x] "Maybe the Internet is just a vector for toxoplasmosis")
  (feelings-about
    ([x] "meh")
    ([x y] (str "meh about " y))))

You might find this technique more convenient than using extend-type. Then again, you might not. How does extend-type make you feel? How about extend-protocol? Come sit down on this couch and tell me all about it.

It’s important to note that a protocol’s methods “belong” to the namespace that they’re defined in. In these examples, the fully qualified names of the Psychodynamics methods are data-psychology/thoughts and data-psychology/feelings-about. If you have an object-oriented background, this might seem weird because methods belong to data types in OOP. But don’t freak out! It’s just another way that Clojure gives primacy to abstractions. One consequence of this fact is that, if you want two different protocols to include methods with the same name, you’ll need to put the protocols in different namespaces.

Records

Clojure allows you to create records, which are custom, maplike data types. They’re maplike in that they associate keys with values, you can look up their values the same way you can with maps, and they’re immutable like maps. They’re different in that you specify fields for records. Fields are slots for data; using them is like specifying which keys a data structure should have. Records are also different from maps in that you can extend them to implement protocols.

To create a record, you use defrecord to specify its name and fields:

(ns were-records)
(defrecord WereWolf [name title])

This record’s name is WereWolf, and its two fields are name and title. You can create an instance of this record in three ways:

 (WereWolf. "David" "London Tourist")
; => #were_records.WereWolf{:name "David", :title "London Tourist"}

 (->WereWolf "Jacob" "Lead Shirt Discarder")
; => #were_records.WereWolf{:name "Jacob", :title "Lead Shirt Discarder"}

 (map->WereWolf {:name "Lucian" :title "CEO of Melodrama"})
; => #were_records.WereWolf{:name "Lucian", :title "CEO of Melodrama"}

At ➊, we create an instance the same way we’d create a Java object, using the class instantiation interop call. (Interop refers to the ability to interact with native Java constructs within Clojure.) Notice that the arguments must follow the same order as the field definition. This works because records are actually Java classes under the covers.

The instance at ➋ looks nearly identical to the one at ➊, but the key difference is that ->WereWolf is a function. When you create a record, the factory functions ->RecordName and map->RecordName are created automatically. At ➌, map->WereWolf takes a map as an argument with keywords that correspond to the record type’s fields and returns a record.

If you want to use a record type in another namespace, you’ll have to import it, just like you did with the Java classes in Chapter 12. Be careful to replace all dashes in the namespace with underscores. This brief example shows how you’d import the WereWolf record type in another namespace:

(ns monster-mash
  (:import [were_records WereWolf]))
(WereWolf. "David" "London Tourist")
; => #were_records.WereWolf{:name "David", :title "London Tourist"}

Notice that were_records has an underscore, not a dash.

You can look up record values in the same way you look up map values, and you can also use Java field access interop:

(def jacob (->WereWolf "Jacob" "Lead Shirt Discarder"))
 (.name jacob) 
; => "Jacob"

 (:name jacob) 
; => "Jacob"

 (get jacob :name) 
; => "Jacob"

The first example, (.name jacob) at ➊, uses Java interop, and the examples at ➋ and ➌ access :name the same way you would with a map.

When testing for equality, Clojure will check that all fields are equal and that the two comparands have the same type:

 (= jacob (->WereWolf "Jacob" "Lead Shirt Discarder"))
; => true

 (= jacob (WereWolf. "David" "London Tourist"))
; => false

 (= jacob {:name "Jacob" :title "Lead Shirt Discarder"})
; => false

The test at ➊ returns true because jacob and the newly created record are of the same type and their fields are equal. The test at ➋ returns false because the fields aren’t equal. The final test at ➌ returns false because the two comparands don’t have the same type: jacob is a WereWolf record, and the other argument is a map.

Any function you can use on a map, you can also use on a record:

(assoc jacob :title "Lead Third Wheel")
; => #were_records.WereWolf{:name "Jacob", :title "Lead Third Wheel"}

However, if you dissoc a field, the result’s type will be a plain ol’ Clojure map; it won’t have the same data type as the original record:

(dissoc jacob :title)
; => {:name "Jacob"} <- that's not a were_records.WereWolf

This matters for at least two reasons: first, accessing map values is slower than accessing record values, so watch out if you’re building a high-performance program. Second, when you create a new record type, you can extend it to implement a protocol, similar to how you extended a type using extend-type earlier. If you dissoc a record and then try to call a protocol method on the result, the record’s protocol method won’t be called.

Here’s how you would extend a protocol when defining a record:

 (defprotocol WereCreature
   (full-moon-behavior [x]))

 (defrecord WereWolf [name title]
  WereCreature
  (full-moon-behavior [x]
    (str name " will howl and murder")))

(full-moon-behavior (map->WereWolf {:name "Lucian" :title "CEO of Melodrama"}))
; => "Lucian will howl and murder"

We’ve created a new protocol, WereCreature , with one method, full-moon-behavior ➋. At ➌, defrecord implements WereCreature for WereWolf. The most interesting part of the full-moon-behavior implementation is that you have access to name. You also have access to title and any other fields that might be defined for your record. You can also extend records using extend-type and extend-protocol.

When should you use records, and when should you use maps? In general, you should consider using records if you find yourself creating maps with the same fields over and over. This tells you that that set of data represents information in your application’s domain, and your code will communicate its purpose better if you provide a name based on the concept you’re trying to model. Not only that, but record access is more performant than map access, so your program will become a bit more efficient. Finally, if you want to use protocols, you’ll need to create a record.

Further Study

Clojure offers other tools for working with abstractions and data types. These tools, which I consider advanced, include deftype, reify, and proxy. If you’re interested in learning more, check out the documentation on data types at http://clojure.org/datatypes/.

Summary

One of Clojure’s design principles is to write to abstractions. In this chapter, you learned how to define your own abstractions using multimethods and prototypes. These constructs provide polymorphism, allowing the same operation to behave differently based on the arguments it’s given. You also learned how to create and use your own associative data types with defrecord and how to extend records to implement protocols.

When I first started learning Clojure, I was pretty shy about using multi­methods, protocols, and records. However, they are used often in Clojure libraries, so it’s good to know how they work. Once you get the hang of them, they’ll help you write cleaner code.

Exercises

  1. Extend the full-moon-behavior multimethod to add behavior for your own kind of were-creature.
  2. Create a WereSimmons record type, and then extend the WereCreature protocol.
  3. Create your own protocol, and then extend it using extend-type and extend-protocol.
  4. Create a role-playing game that implements behavior using multiple dispatch.