-
Notifications
You must be signed in to change notification settings - Fork 0
Guide & Tutorial
Writing software often feels like an act of translation. Stories,
requirements, and specifications must be translated into working
code. Some programming languages have types to add a bit of
"crispness" (thanks Simon-Peyton Jones) to the translation. While a
variety of interesting properties can captured in types, often types
are used in a provincial manner - they provide confidence to the
programmer but the value to other stakeholders remains curiously
difficult to prove. Dynamic languages like Clojure may offer a
different approach in the form of tools like clojure.spec
, but
somehow an artisanal odor persists.
Irrespective of their professed beliefs in typing disciplines, a good number of programmers embrace some form of unit testing. The dogmatic spirit of Test Driven Development contains a pearl of common sense - it's probably a good idea to think about what we want before we go asking for trouble. Yet, as software systems grow in complexity and age inexorably into some high level enemy in Elden Ring, what value do unit tests give us in following the larger plot? Picking any old random unit test does not offer a fractal view, rather only the end state of some Game of Life. Like a monthly receipt of JS Bach's expenses, it gives a sense how the person lived, but not much clarification into how the results were achieved.
Documentation and diagrams fulfill important roles as satellite photographs, but given must of us are employed as architects and engineers and not Cold War spies, we need something a bit more tangible.
Often stateful property based testing tutorials use data structures as an initial motivating example. Unfortunately this is akin teaching folks to write via the poems of Emily Dickinson.
A great many of us are building stateful systems designed around largely uncoordinated users. A user may take various actions but often those actions are limited or expanded based on previous actions taken. This might seem like an ambitious place to start, but if we focus on the fact that we are intentionally constructing a toy, we can be productive in short order.
Finally, it might appear that property based testing would require some careful thought about specific properties one might want to verify, but in practice we don't really need much more than equality to get great results.
Imagine we have a long-lived production system and that we would like to gain some reasoning power over this system. From this vantage point, details like programming language, database technology, user interface matter very little. We simply want to model how we believe a portion of the system works.
To keep the tutorial short let's imagine the following set of requirements:
- There are two users, User A and User B
- There are two rooms, Room 1 and Room 2
- There is a key
- Between the two rooms there is a door
- The door may be open, closed, or locked
- Either user can take the key if they are in the room where it lies
- Once the key is taken by a user it cannot be taken by the other user
- A user in possession of the key may drop the key
- A user in possession of the key can unlock the door
- Any user can open the unlocked door regardless of the room they are in
- Any user can close the open door regardless of the room they are in
- Any user in possession of the key can lock the closed door
While this may seem like a fair number of requirements, we'll see how to encode this in a little over a hundred lines, and then see how we can generate as many valid sequences of steps for the model as we like. Finally we can apply this tool to a faux legacy system and the model can give us a minimal sequence for which some requirement is violated.
Since we are in a near Platonic realm we can define the state naively. The point is to capture our understanding of the system. Implementation details be damned, we have a tool for thinking!
Here is a very naively designed state:
(def init-state
{:user-a #{}
:user-b #{}
:door :locked
:room-1 #{:key :user-a :user-b}
:room-2 #{}})
This emphasizes legibility above other concerns - it's easy to see who has the key, what room the users are in, and the door state.
The next step is to define the model.
A model is just a Clojure map of Clojure functions. The keys of the map are just keywords of the command names, so pick sensible names:
(def model
{:open-door open-spec
:close-door close-spec
:lock-door lock-spec
:unlock-door unlock-spec
:take-key take-key-spec
:drop-key drop-key-spec
:move move-spec})
We are not going to examine all of the command specs, just a few to get a feel for the practice of writing models.
Given the above initial state (where the door is locked and the key is
on the ground in Room A), it should be clear only one action is
possible - someone has to take the key. So let's start with
take-key-spec
:
(def take-key-spec
{:run? (fn [state] (user-and-key-in-same-room? state))
:args (fn [state]
(gen/tuple
(gen/elements
(disj (get state (room-with-key state)) :key))))
:next-state (fn [state {[user] :args :as command}]
(-> state
(update user conj :key)
(update (user->room state user) disj :key)))
:valid? (fn [state {[user] :args :as command}]
(= (user->room state user)
(room-with-key state)))})
The first question to ask is whether a command is available to be chosen.
This is the job of :run?
. This function will take the current value
of the state and examine it and either return true
or false
. This
command can run if there is some user in the same room where the key is on the
floor.
We will talk about :args
last, it involves
clojure.test.check.generators
which are much less scary than they sound.
So let's move on to :next-state
. This function takes the current
state and the command that was generated along with the generated
:args
. This is the information we need to update the state. In this
case we move the key to the chosen user and remove the key from the room.
:valid?
at first glance might seem strange. If we use a command sequence in a
test, we may fail a assertion. In that case shrinking begins. Shrinking is just
a funny way to describe automatically searching for a minimal case. It's
much easier to understand the two or three commands required for a failure
rather than the original thirty that triggered it. It may seem seem that the
search for a minimal command might take a long time, but recall that many later
commands rely on previous commands in order to be chosen.
Imagine User A dropped the key at some point but during shrinking this command gets discarded. Then User B can no longer pick it up. Oh, and User B can no longer unlock the door. All these invalidated commands get pruned, accelerating the shrinking process.
Finally :args
. We use clojure.test.check.generators
to create
arguments, this is required if we want shrinking and replayablity to
work as expected.
If we must choose between a few options we use gen/elements
. We wrap
this in gen/tuple
because that's natural - arguments are
sequential, and later this will be convenient.
We don't need it in this particular example but in some cases you will
need gen/bind
. This covers the case where say some User A has abilities
that other users do not. That is, the arguments for the command depend
on each other, they cannot be made independently. More about that in a
bit.
Again you only need to understand generators when dealing with :args
and after a little bit of practice, it becomes quite natural.
Let's look at drop-key-spec
next.
(def drop-key-spec
{:run? (fn [state] (some-user-with-key? state))
:args (fn [state] (gen/tuple (gen/return (user-with-key state))))
:next-state (fn [state {[user] :args :as command}]
(-> state
(update user disj :key)
(update (user->room state user) conj :key)))
:valid? (fn [state {[user] :args}]
(= user (user-with-key state)))})
Take a moment to read this. Note gen/return
is used if there is only
a single value to choose from. In this case, only one person can be in
posession of the key, so gen/return
is exactly what we need. The
rest of the commands look the same, we'll look at one more spec to
understand gen/bind
.
Here is move-spec
:
(def move-spec
{:freq 2
:run? (fn [state] (= :open (:door state)))
:args (fn [state]
(gen/bind (gen/elements [:user-a :user-b])
(fn [user]
(gen/tuple
(gen/return user)
(gen/return (next-room (user->room state user)))))))
:next-state (fn [state {[user] :args :as command}]
(let [prev-room (user->room state user)
room (prev-room next-room)]
(-> state
(update prev-room disj user)
(update room conj user))))
:valid? (fn [state] (door-open? state))})
We specify :freq
2 to make room moves a little more likely to be
chosen when possible. Everything but :args
should be self explanatory
now. In truth, move does not really need to specify the room since there
are only two possiblities, but it makes the generated command
sequences much easier to read and motivates using gen/bind
.
As we said above, we use gen/bind
because the room we will choose
depends on the user. An interesting excercise for the reader would be
to add three rooms, where in some cases a user can have two choices.
Now let's generate some commands:
(require '[clojure.test.check.generators :as gen])
(gen/generate (fugato/commands model init-state))
We pass a model, an initial state and the output might look something like this:
({:command :take-key, :args [:user-b]}
{:command :unlock-door, :args [:user-b]}
{:command :drop-key, :args [:user-b]}
{:command :take-key, :args [:user-a]}
{:command :open-door, :args [:user-b]}
{:command :move, :args [:user-b :room-2]}
{:command :close-door, :args [:user-a]})
It feels a bit magical!
We're going to write the simplest possible property check - equality:
(defspec model-eq-reality 10
(prop/for-all [commands (fugato/commands model world 10 1)]
(= (run world commands) (-> commands last meta :after))))
We use defspec
to define a property based test. prop/for-all
takes a generator and we make an assertion. We specify that we want at least
10 commands and we don't want to shrink beyond a single command. How run
works
isn't all that important, you can choose to implement this
in whatever way you see fit. You run the commands against your
code base and you return a simplified state of the system
that matches the structure of the "toy" modeled state.
fugato conveniently provides the :before
and :after
model state
on each command as Clojure metadata. This makes the equality check really easy
to do.
What happens if we introduce a bug into our production code?
For example imagine we change our real world take-key
to
not remove the key from the room when the user takes it. Running
the test will then output:
{:shrunk {:total-nodes-visited 7,
:depth 2,
:pass? false,
:result false,
:result-data nil,
:time-shrinking-ms 10,
:smallest [[{:command :take-key, :args [:user-a]}]]},
:failed-after-ms 13,
:num-tests 1,
:seed 1721143868810,
:fail [({:command :take-key, :args [:user-a]}
{:command :drop-key, :args [:user-a]}
{:command :take-key, :args [:user-a]}
{:command :drop-key, :args [:user-a]}
{:command :take-key, :args [:user-a]}
{:command :unlock-door, :args [:user-a]}
{:command :lock-door, :args [:user-a]})],
:result false,
:result-data nil,
:failing-size 0,
:pass? false,
:test-var "model-eq-reality"}
Ok now imagine we messed up a user privileges. In this case we wrote
code to only allow :user-a
to move between rooms.
{:shrunk
{:total-nodes-visited 10,
:depth 3,
:pass? false,
:result false,
:result-data nil,
:time-shrinking-ms 45,
:smallest
[[{:command :take-key, :args [:user-a]}
{:command :unlock-door, :args [:user-a]}
{:command :open-door, :args [:user-b]}
{:command :move, :args [:user-b :room-1]}]]},
:failed-after-ms 90,
:num-tests 5,
:seed 1721578242532,
:fail
[({:command :take-key, :args [:user-a]}
{:command :unlock-door, :args [:user-a]}
{:command :open-door, :args [:user-a]}
{:command :move, :args [:user-a :room-2]}
{:command :move, :args [:user-b :room-1]}
{:command :move, :args [:user-b :room-2]}
{:command :close-door, :args [:user-a]}
{:command :drop-key, :args [:user-a]}
{:command :take-key, :args [:user-b]}
{:command :open-door, :args [:user-b]}
{:command :drop-key, :args [:user-b]}
{:command :move, :args [:user-b :room-1]}
{:command :take-key, :args [:user-a]})],
:result false,
:result-data nil,
:failing-size 4,
:pass? false,
:test-var "model-eq-reality"}
Even though the failing sequence is 13 steps long, we can shrink to the basic 4 steps. All of this is possible just by making the most elementary assertion - that our modeled state matches the state of the code running under test. While evolving a code base, this kind of safety delivers outsized value along with traditional unit tests.
Ok, great! But ... it's not that obvious what went wrong here from the sequence alone. How do we go about debugging this?
Let's imagine our initial state looks like the following:
(def init-state
{:user-a #{}
:user-b #{}
:door :locked
:room-1 #{:user-a :key}
:room-2 #{:user-b}})
We can easily write a quick snippet using the :smallest
case from the report:
(require '[clojure.data :refer [diff]])
(let [xs [{:command :take-key, :args [:user-a]}
{:command :unlock-door, :args [:user-a]}
{:command :open-door, :args [:user-b]}
{:command :move, :args [:user-b :room-1]}]]
(clojure.data/diff
(fugato/execute model init-state xs) ;; A
(run world xs))) ;; B
This will output:
({:room-2 nil, ;; only in A
:room-1 #{:user-b}}
{:room-2 #{:user-b}} ;; only in B
{:room-1 #{:user-a},
:door :open,
:user-b #{},
:user-a #{:key}})
Here we see clearly that our model expects :user-b
to be in :room-1
. But
our actual code has :user-b
still in :room-2
even after a :move
was issued.
As powerful as this seems there are tradeoffs. You must be confident that your model itself is correct, and for trivial programs it's hard to justify doing the work twice. However, a significant portion of software development is software maintenance. As time passes, it becomes more and more challenging to keep all the details at hand, or that some seemingly innocuous change won't have ripple effects even when you appear to have decent test coverage.
fugato provides a nimble way to think about a software system at a higher level and execute that understanding without getting bogged down in the details.
We've had have a fair amount of success in using it to reason about global properties of our production systems. Hopefully you can too!