Skip to content

Guide & Tutorial

davidnolen edited this page Jul 22, 2024 · 36 revisions

Lost in Translation

Writing software often feels like an act of translation. Stories, requirements, and specifications must be translated into working code. Some programming languages have types to add a bit of "crispness" (thanks Simon-Peyton Jones) to the translation. While a variety of interesting properties can captured in types, often types are used in a provincial manner - they provide confidence to the programmer but the value to other stakeholders remains curiously difficult to prove. Dynamic languages like Clojure may offer a different approach in the form of tools like clojure.spec, but somehow an artisanal odor persists.

Irrespective of their professed beliefs in typing disciplines, a good number of programmers embrace some form of unit testing. The dogmatic spirit of Test Driven Development contains a pearl of common sense - it's probably a good idea to think about what we want before we go asking for trouble. Yet, as software systems grow in complexity and age inexorably into some high level enemy in Elden Ring, what value do unit tests give us in following the larger plot? Picking any old random unit test does not offer a fractal view, rather only the end state of some Game of Life. Like a monthly receipt of JS Bach's expenses, it gives a sense how the person lived, but not much clarification into how the results were achieved.

Documentation and diagrams fulfill important roles as satellite photographs, but given must of us are employed as architects and engineers and not Cold War spies, we need something a bit more tangible.

A Motivating Example

Often stateful property based testing tutorials use data structures as an initial motivating example. Unfortunately this is akin teaching folks to write via the poems of Emily Dickinson.

A great many of us are building stateful systems designed around largely uncoordinated users. A user may take various actions but often those actions are limited or expanded based on previous actions taken. This might seem like an ambitious place to start, but if we focus on the fact that we are intentionally constructing a toy, we can be productive in short order.

Finally, it might appear that property based testing would require some careful thought about specific properties one might want to verify, but in practice we don't really need much more than equality to get great results.

Thinking about models

Imagine we have a long-lived production system and that we would like to gain some reasoning power over this system. From this vantage point, details like programming language, database technology, user interface matter very little. We simply want to model how we believe a portion of the system works.

To keep the tutorial short let's imagine the following set of requirements:

  1. There are two users, User A and User B
  2. There are two rooms, Room 1 and Room 2
  3. There is a key
  4. Between the two rooms there is a door
  5. The door may be open, closed, or locked
  6. Either user can take the key if they are in the room where it lies
  7. Once the key is taken by a user it cannot be taken by the other user
  8. A user in possession of the key may drop the key
  9. A user in possession of the key can unlock the door
  10. Any user can open the unlocked door regardless of the room they are in
  11. Any user can close the open door regardless of the room they are in
  12. Any user in possession of the key can lock the closed door

While this may seem like a fair number of requirements, we'll see how to encode this in a little over a hundred lines, and then see how we can generate as many valid sequences of steps for the model as we like. Finally we can apply this tool to a faux legacy system and the model can give us a minimal sequence for which some requirement is violated.

Defining an initial state

Since we are in a near Platonic realm we can define the state naively. The point is to capture our understanding of the system. Implementation details be damned, we have a tool for thinking!

Here is a very naively designed state:

(def init-state
  {:user-a  #{}
   :user-b  #{}
   :door    :locked
   :room-1  #{:key :user-a :user-b}
   :room-2  #{}})

This emphasizes legibility above other concerns - it's easy to see who has the key, what room the users are in, and the door state.

The next step is to define the model.

Defining the model

A model is just a Clojure map of Clojure functions. The keys of the map are just keywords of the command names, so pick sensible names:

(def model
  {:open-door   open-spec
   :close-door  close-spec
   :lock-door   lock-spec
   :unlock-door unlock-spec
   :take-key    take-key-spec
   :drop-key    drop-key-spec
   :move        move-spec})

We are not going to examine all of the command specs, just a few to get a feel for the practice of writing models.

Defining a command spec

Given the above initial state (where the door is locked and the key is on the ground in Room A), it should be clear only one action is possible - someone has to take the key. So let's start with take-key-spec:

(def take-key-spec
  {:run?       (fn [state] (user-and-key-in-same-room? state))

   :args       (fn [state]
                 (gen/tuple
                   (gen/elements
                     (disj (get state (room-with-key state)) :key))))

   :next-state (fn [state {[user] :args :as command}]
                 (-> state
                   (update user conj :key)
                   (update (user->room state user) disj :key)))

   :valid?     (fn [state {[user] :args :as command}]
                 (= (user->room state user)
                    (room-with-key state)))})

The first question to ask is whether a command is available to be chosen. This is the job of :run?. This function will take the current value of the state and examine it and either return true or false. This command can run if there is some user in the same room where the key is on the floor.

We will talk about :args last, it involves clojure.test.check.generators which are much less scary than they sound.

So let's move on to :next-state. This function takes the current state and the command that was generated along with the generated :args. This is the information we need to update the state. In this case we move the key to the chosen user and remove the key from the room.

:valid? at first glance might seem strange. If we use a command sequence in a test, we may fail a assertion. In that case shrinking begins. Shrinking is just a funny way to describe automatically searching for a minimal case. It's much easier to understand the two or three commands required for a failure rather than the original thirty that triggered it. It may seem seem that the search for a minimal command might take a long time, but recall that many later commands rely on previous commands in order to be chosen.

Imagine User A dropped the key at some point but during shrinking this command gets discarded. Then User B can no longer pick it up. Oh, and User B can no longer unlock the door. All these invalidated commands get pruned, accelerating the shrinking process.

Generating :args

Finally :args. We use clojure.test.check.generators to create arguments, this is required if we want shrinking and replayablity to work as expected.

If we must choose between a few options we use gen/elements. We wrap this in gen/tuple because that's natural - arguments are sequential, and later this will be convenient.

We don't need it in this particular example but in some cases you will need gen/bind. This covers the case where say some User A has abilities that other users do not. That is, the arguments for the command depend on each other, they cannot be made independently. More about that in a bit.

Again you only need to understand generators when dealing with :args and after a little bit of practice, it becomes quite natural.

Let's look at drop-key-spec next.

Defining another command spec

(def drop-key-spec
  {:run?       (fn [state] (some-user-with-key? state))

   :args       (fn [state] (gen/tuple (gen/return (user-with-key state))))

   :next-state (fn [state {[user] :args :as command}]
                 (-> state
                   (update user disj :key)
                   (update (user->room state user) conj :key)))

   :valid?     (fn [state {[user] :args}]
                 (= user (user-with-key state)))})

Take a moment to read this. Note gen/return is used if there is only a single value to choose from. In this case, only one person can be in posession of the key, so gen/return is exactly what we need. The rest of the commands look the same, we'll look at one more spec to understand gen/bind.

Using gen/bind

Here is move-spec:

(def move-spec
  {:freq       2

   :run?       (fn [state] (= :open (:door state)))

   :args       (fn [state]
                 (gen/bind (gen/elements [:user-a :user-b])
                   (fn [user]
                     (gen/tuple
                       (gen/return user)
                       (gen/return (next-room (user->room state user)))))))

   :next-state (fn [state {[user] :args :as command}]
                 (let [prev-room (user->room state user)
                       room (prev-room next-room)]
                   (-> state
                     (update prev-room disj user)
                     (update room conj user))))

   :valid?     (fn [state] (door-open? state))})

We specify :freq 2 to make room moves a little more likely to be chosen when possible. Everything but :args should be self explanatory now. In truth, move does not really need to specify the room since there are only two possiblities, but it makes the generated command sequences much easier to read and motivates using gen/bind.

As we said above, we use gen/bind because the room we will choose depends on the user. An interesting excercise for the reader would be to add three rooms, where in some cases a user can have two choices.

Generating commands

Now let's generate some commands:

(require '[clojure.test.check.generators :as gen])
(gen/generate (fugato/commands model init-state))

We pass a model, an initial state and the output might look something like this:

({:command :take-key, :args [:user-b]}
 {:command :unlock-door, :args [:user-b]}
 {:command :drop-key, :args [:user-b]}
 {:command :take-key, :args [:user-a]}
 {:command :open-door, :args [:user-b]}
 {:command :move, :args [:user-b :room-2]}
 {:command :close-door, :args [:user-a]})

It feels a bit magical!

Testing it against real world code

We're going to write the simplest possible property check - equality:

(defspec model-eq-reality 10
  (prop/for-all [commands (fugato/commands model world 10 1)]
    (= (run world commands) (-> commands last meta :after))))

We use defspec to define a property based test. prop/for-all takes a generator and we make an assertion. We specify that we want at least 10 commands and we don't want to shrink beyond a single command. How run works isn't all that important, you can choose to implement this in whatever way you see fit. You run the commands against your code base and you return a simplified state of the system that matches the structure of the "toy" modeled state.

fugato conveniently provides the :before and :after model state on each command as Clojure metadata. This makes the equality check really easy to do.

Shrinking in Action

What happens if we introduce a bug into our production code?

For example imagine we change our real world take-key to not remove the key from the room when the user takes it. Running the test will then output:

{:shrunk {:total-nodes-visited 7, 
          :depth 2, 
          :pass? false, 
          :result false, 
          :result-data nil, 
          :time-shrinking-ms 10, 
          :smallest [[{:command :take-key, :args [:user-a]}]]},           
 :failed-after-ms 13, 
 :num-tests 1, 
 :seed 1721143868810, 
 :fail [({:command :take-key, :args [:user-a]} 
         {:command :drop-key, :args [:user-a]} 
         {:command :take-key, :args [:user-a]} 
         {:command :drop-key, :args [:user-a]} 
         {:command :take-key, :args [:user-a]}
         {:command :unlock-door, :args [:user-a]}
         {:command :lock-door, :args [:user-a]})], 
 :result false, 
 :result-data nil, 
 :failing-size 0, 
 :pass? false, 
 :test-var "model-eq-reality"}

Ok now imagine we messed up a user privileges. In this case we wrote code to only allow :user-a to move between rooms.

{:shrunk
 {:total-nodes-visited 10,
  :depth 3,
  :pass? false,
  :result false,
  :result-data nil,
  :time-shrinking-ms 45,
  :smallest
  [[{:command :take-key, :args [:user-a]}
    {:command :unlock-door, :args [:user-a]}
    {:command :open-door, :args [:user-b]}
    {:command :move, :args [:user-b :room-1]}]]},
 :failed-after-ms 90,
 :num-tests 5,
 :seed 1721578242532,
 :fail
 [({:command :take-key, :args [:user-a]}
   {:command :unlock-door, :args [:user-a]}
   {:command :open-door, :args [:user-a]}
   {:command :move, :args [:user-a :room-2]}
   {:command :move, :args [:user-b :room-1]}
   {:command :move, :args [:user-b :room-2]}
   {:command :close-door, :args [:user-a]}
   {:command :drop-key, :args [:user-a]}
   {:command :take-key, :args [:user-b]}
   {:command :open-door, :args [:user-b]}
   {:command :drop-key, :args [:user-b]}
   {:command :move, :args [:user-b :room-1]}
   {:command :take-key, :args [:user-a]})],
 :result false,
 :result-data nil,
 :failing-size 4,
 :pass? false,
 :test-var "model-eq-reality"}

Even though the failing sequence is 13 steps long, we can shrink to the basic 4 steps. All of this is possible just by making the most elementary assertion - that our modeled state matches the state of the code running under test. While evolving a code base, this kind of safety delivers outsized value along with traditional unit tests.

Ok, great! But ... it's not that obvious what went wrong here from the sequence alone. How do we go about debugging this?

Let's imagine our initial state looks like the following:

(def init-state
  {:user-a #{}
   :user-b #{}
   :door   :locked
   :room-1 #{:user-a :key}
   :room-2 #{:user-b}})

We can easily write a quick snippet using the :smallest case from the report:

(require '[clojure.data :refer [diff]])

(let [xs [{:command :take-key, :args [:user-a]}
          {:command :unlock-door, :args [:user-a]}
          {:command :open-door, :args [:user-b]}
          {:command :move, :args [:user-b :room-1]}]]
  (clojure.data/diff
    (fugato/execute model init-state xs) ;; A 
    (run world xs))) ;; B

This will output:

({:room-2 nil,         ;; only in A
  :room-1 #{:user-b}}
 {:room-2 #{:user-b}}  ;; only in B
 {:room-1 #{:user-a}, 
  :door :open, 
  :user-b #{}, 
  :user-a #{:key}})

Here we see clearly that our model expects :user-b to be in :room-1. But our actual code has :user-b still in :room-2 even after a :move was issued.

No Silver Bullet

As powerful as this seems there are tradeoffs. You must be confident that your model itself is correct, and for trivial programs it's hard to justify doing the work twice. However, a significant portion of software development is software maintenance. As time passes, it becomes more and more challenging to keep all the details at hand, or that some seemingly innocuous change won't have ripple effects even when you appear to have decent test coverage.

fugato provides a nimble way to think about a software system at a higher level and execute that understanding without getting bogged down in the details.

We've had have a fair amount of success in using it to reason about global properties of our production systems. Hopefully you can too!