Skip to content

fosskers/cl-transducers

Repository files navigation

Transducers: Ergonomic, efficient data processing

I think Transducers are a fundamental primitive that decouples critical logic from list/sequence processing, and if I had to do Clojure all over I would put them at the bottom.

– Rich Hickey

Transducers are an ergonomic and extremely memory-efficient way to process a data source. Here “data source” means simple collections like Lists or Vectors, but also potentially large files or generators of infinite data.

Transducers…

  • allow the chaining of operations like map and filter without allocating memory between each step.
  • aren’t tied to any specific data type; they need only be implemented once.
  • vastly simplify “data transformation code”.
  • have nothing to do with “lazy evaluation”.
  • are a joy to use!

Example: While skipping every second line of a file, sum the lengths of only evenly-lengthed lines.

(in-package :transducers)

(transduce
  ;; How do we want to process each element?
  (comp (step 2) (map #'length) (filter #'evenp))
  ;; How do we want to combine all the elements together?
  #'+
  ;; What's our original data source?
  #p"README.org")
7026

This library has been confirmed to work with SBCL, CCL, ECL, Clasp and LispWorks 8.

⚠ ABCL cannot be used due to lack of support for tail-call elimination within labels.

⚠ Allegro support is partial for the same reason; large sources blow the stack.

Looking for Transducers in other Lisps? Check out the Emacs Lisp and Fennel implementations!

History and Motivation

Originally invented in Clojure and adapted to Scheme as SRFI-171, Transducers are an excellent way to think about - and efficiently operate on - collections or streams of data. Transduction operations are strict and don’t involve “laziness” or “thunking” in any way, yet only process the exact amount of data you ask them to.

This library draws inspiration from both the original Clojure and SRFI-171, while adding many other convenient operations commonly found in other languages.

Installation

This library is available on Quicklisp and Ultralisp. To download the main system:

(ql:quickload :transducers)

For the JSON extensions:

(ql:quickload :transducers/jzon)

Usage and Theory

Importing

Since this library reuses some symbol names also found in :cl, it is expected that you import transducers as follows in your defpackage:

(defpackage foo
  (:use :cl)
  (:local-nicknames (:t :transducers)))

You can then make relatively clean calls like:

(t:transduce (t:map #'1+) #'t:vector '(1 2 3))
;; => #(2 3 4)

However, many of the examples below use (in-package :transducers) for brevity in the actual function calls. You should still use a nickname in your own code.

Transducers, Reducers, and Sources

;; The fundamental pattern.
(transduce <transducer-chain> <reducer> <source>)

Data processing largely has three concerns:

  1. Where is my data coming from? (sources)
  2. What do I want to do to each element? (transducers)
  3. How do I want to collect the results? (reducers)

Each full “transduction” requires all three. We pass one of each to the transduce function, which drives the process. It knows how to pull values from the source, feed them through the transducer chain, and wrap everything together via the reducer.

  • Typical transducers are map, filter, and take.
  • Typical reducers are +, count, t:cons, and fold.
  • Typical sources are lists, vectors, strings, hash tables, and files.

Generators are a special kind of source that yield infinite data. Typical generators are repeat, cycle, and ints.

Let’s sum the squares of the first 1000 odd integers:

(in-package :transducers)

(transduce
 (comp (filter #'oddp)             ;; (2) Keep only odd numbers.
       (take 1000)                 ;; (3) Keep the first 1000 filtered odds.
       (map (lambda (n) (* n n)))) ;; (4) Square those 1000.
 #'+       ;; (5) Reducer: Add up all the squares.
 (ints 1)) ;; (1) Source: Generate all positive integers.
1333333000

Two things of note here:

  1. comp is used here to chain together different transducer steps. Notice that the order appears “backwards” from usual function composition. It may help to imagine that comp is acting like the ->> macro here. comp is supplied here as a convenience; you’re free to use alexandria:compose if you wish.
  2. The reduction via + is listed as Step 5, but really it’s occuring throughout the transduction process. Each value that makes it through the composed transducer chain is immediately added to an internal accumulator.

Explore the other transducers and reducers to see what’s possible! You’ll never write a loop again.

Processing JSON Data

The system transducers/jzon provides automatic JSON streaming support via the jzon library. Like transducers itself, it is expected that you import this system with a nickname:

(:local-nicknames (#:j #:transducers/jzon))

Only two functions are exposed: read and write.

  • read is a source that accepts a pathname, open stream, or a string. It produces parsed JSON values as Lisp types. JSON Objects become Hash Tables.
  • write is a reducer that expects an open stream. It writes the stream of Lisp types into their logical JSON equivalents.

Here is a simple example of reading some JSON data from a string, doing nothing to it, and outputting it again to a new string:

(in-package :transducers)

(with-output-to-string (stream)
  (transduce #'pass
             (transducers/jzon:write stream)
             (transducers/jzon:read "[{\"name\": \"A\"}, {\"name\": \"B\"}]")))
[{"name":"A"},{"name":"B"}]

Note that the JSON data must be a JSON array. There is otherwise no size limit; the library can handle any amount of JSON input.

For more examples, see the Gallery below.

Fset: Immutable Collections

The system transducers/fset provides support for the Fset library of immutable collections. It’s expected that you import this system with a nickname:

(:local-nicknames (#:s #:transducers/fset))

Reducers are provided for each of its main types: set, map, seq, and bag.

(in-package :transducers)

(transduce (map #'1+) #'transducers/fset:set (fset:set 1 2 3 1))
#{ 2 3 4 }

API

The examples here use (in-package :transducers) for brevity in the actual function calls and to allow them to be runnable directly in this README, but as mentioned above it’s recommended to nickname the library to :t due to some overlap with :cl.

Transducers

Transducers describe how to alter the items of some stream of values. Some transducers, like take, can short-circuit.

Multiple transducer functions can be chained together with comp.

pass, map

Just pass along each value of the transduction.

(in-package :transducers)
(transduce #'pass #'cons '(1 2 3))
(1 2 3)

Apply a function F to all elements of the transduction.

(in-package :transducers)
(transduce (map #'1+) #'cons '(1 2 3))
(2 3 4)

filter, filter-map, unique, dedup

Only keep elements from the transduction that satisfy PRED.

(in-package :transducers)
(transduce (filter #'evenp) #'cons '(1 2 3 4 5))
(2 4)

Apply a function F to the elements of the transduction, but only keep results that are non-nil.

(in-package :transducers)
(transduce (filter-map #'cl:first) #'cons '(() (2 3) () (5 6) () (8 9)))
(2 5 8)

Only allow values to pass through the transduction once each. Stateful; this uses a set internally so could get quite heavy if you’re not careful.

(in-package :transducers)
(transduce #'unique #'cons '(1 2 1 3 2 1 2 "abc"))
(1 2 3 "abc")

Remove adjacent duplicates from the transduction.

(in-package :transducers)
(transduce #'dedup #'cons '(1 1 1 2 2 2 3 3 3 4 3 3))
(1 2 3 4 3)

drop, drop-while, take, take-while

Drop the first N elements of the transduction.

(in-package :transducers)
(transduce (drop 3) #'cons '(1 2 3 4 5))
(4 5)

Drop elements from the front of the transduction that satisfy PRED.

(in-package :transducers)
(transduce (drop-while #'evenp) #'cons '(2 4 6 7 8 9))
(7 8 9)

Keep only the first N elements of the transduction.

(in-package :transducers)
(transduce (take 3) #'cons '(1 2 3 4 5))
(1 2 3)

Keep only elements which satisfy a given PRED, and stop the transduction as soon as any element fails the test.

(in-package :transducers)
(transduce (take-while #'evenp) #'cons '(2 4 6 8 9 2))
(2 4 6 8)

uncons, concatenate, flatten

Split up a transduction of cons cells.

(in-package :transducers)
(transduce #'uncons #'cons '((:a . 1) (:b . 2) (:c . 3)))
(:A 1 :B 2 :C 3)

Concatenate all the sublists, subvectors, or substrings in the transduction.

(in-package :transducers)
(transduce #'concatenate #'cons '((1 2 3) (4 5 6) (7 8 9)))
(1 2 3 4 5 6 7 8 9)
(in-package :transducers)
(transduce (comp #'concatenate (intersperse #\!))
           #'string '("hello" "there"))
h!e!l!l!o!t!h!e!r!e

Entirely flatten all lists in the transduction, regardless of nesting.

(in-package :transducers)
(transduce #'flatten #'cons '((1 2 3) 0 (4 (5) 6) 0 (7 8 9) 0))
(1 2 3 0 4 5 6 0 7 8 9 0)

segment, window, group-by

Partition the input into lists of N items. If the input stops, flush any accumulated state, which may be shorter than N.

(in-package :transducers)
(transduce (segment 3) #'cons '(1 2 3 4 5))
((1 2 3) (4 5))

Yield N-length windows of overlapping values. This is different from segment which yields non-overlapping windows. If there were fewer items in the input than N, then this yields nothing.

(in-package :transducers)
(transduce (window 3) #'cons '(1 2 3 4 5))
((1 2 3) (2 3 4) (3 4 5))

Group the input stream into sublists via some function F. The cutoff criterion is whether the return value of F changes between two consecutive elements of the transduction.

(in-package :transducers)
(transduce (group-by #'evenp) #'cons '(2 4 6 7 9 1 2 4 6 3))
((2 4 6) (7 9 1) (2 4 6) (3))

intersperse, enumerate, step, scan

Insert an ELEM between each value of the transduction.

(in-package :transducers)
(transduce (intersperse 0) #'cons '(1 2 3))
(1 0 2 0 3)

Index every value passed through the transduction into a cons pair. Starts at 0.

(in-package :transducers)
(transduce #'enumerate #'cons '("a" "b" "c"))
((0 . "a") (1 . "b") (2 . "c"))

Only yield every Nth element of the transduction. The first element of the transduction is always included.

(in-package :transducers)
(transduce (step 2) #'cons '(1 2 3 4 5 6 7 8 9))
(1 3 5 7 9)

Build up successsive values from the results of previous applications of a given function F.

(in-package :transducers)
(transduce (scan #'+ 0) #'cons '(1 2 3 4))
(0 1 3 6 10)

once

Inject some ITEM onto the front of the transduction.

(in-package :transducers)
(transduce (comp (filter (lambda (n) (> n 10)))
                 (once 'hello)
                 (take 3))
           #'cons (ints 1))
(HELLO 11 12)

log

Call some LOGGER function for each step of the transduction. The LOGGER must accept the running results and the current element as input. The original items of the transduction are passed through as-is.

(in-package :transducers)
(transduce (log (lambda (_ n) (format t "Got: ~a~%" n))) #'cons '(1 2 3 4 5))
Got: 1
Got: 2
Got: 3
Got: 4
Got: 5

These are STDOUT results. The actual return value is the result of the reducer, in this case cons, thus a list.

from-csv, into-csv

Interpret the data stream as CSV data.

The first item found is assumed to be the header list, and it will be used to construct useable hashtables for all subsequent items.

Note: This function makes no attempt to convert types from the original parsed strings. If you want numbers, you will need to further parse them yourself.

(in-package :transducers)
(transduce (comp #'from-csv
                 (map (lambda (hm) (gethash "Name" hm))))
           #'cons '("Name,Age" "Alice,35" "Bob,26"))
("Alice" "Bob")

Given a sequence of HEADERS, rerender each item in the data stream into a CSV string. It’s assumed that each item in the transduction is a hash table whose keys are strings that match the values found in HEADERS.

(in-package :transducers)
(transduce (comp #'from-csv
                 (into-csv '("Name" "Age")))
           #'cons '("Name,Age,Hair" "Alice,35,Blond" "Bob,26,Black"))
("Name,Age" "Alice,35" "Bob,26")

Reducers

Reducers describe how to fold the stream of items down into a single result, be it either a new collection or a scalar.

Some reducers, like first, can also force the entire transduction to short-circuit.

cons, snoc, vector, string, hash-table

Collect all results as a list.

(in-package :transducers)
(transduce #'pass #'cons '(1 2 3))
(1 2 3)

Collect all results as a list, but results are reversed. In theory, slightly more performant than cons since it performs no final reversal.

(in-package :transducers)
(transduce #'pass #'snoc '(1 2 3))
(3 2 1)

Collect a stream of values into a vector.

(in-package :transducers)
(transduce #'pass #'vector '(1 2 3))
#(1 2 3)

Collect a stream of characters into to a single string.

(in-package :transducers)
(transduce (map #'char-upcase) #'string "hello")
HELLO

Collect a stream of key-value cons pairs into a hash table.

(in-package :transducers)
(transduce #'enumerate #'hash-table '("a" "b" "c"))
#<COMMON-LISP:HASH-TABLE :TEST EQUAL :COUNT 3 {1004E83BF3}>

count, average, median

Count the number of elements that made it through the transduction.

(in-package :transducers)
(transduce #'pass #'count '(1 2 3 4 5))
5

Calculate the average value of all numeric elements in a transduction.

(in-package :transducers)
(transduce #'pass #'average '(1 2 3 4 5 6))
7/2

Calculate the median value of all elements in a transduction, provided that they are numbers, strings, or characters. The elements are sorted once before the median is extracted.

(in-package :transducers)
(transduce #'pass #'median '(1 1 1 0 2 4 1 4 9))
1

anyp, allp

Yield t if any element in the transduction satisfies PRED. Short-circuits the transduction as soon as the condition is met.

(in-package :transducers)
(transduce #'pass (anyp #'evenp) '(1 3 5 7 9 2))
T

Yield t if all elements of the transduction satisfy PRED. Short-circuits with NIL if any element fails the test.

(in-package :transducers)
(transduce #'pass (allp #'oddp) '(1 3 5 7 9))
T

first, last, find

Yield the first value of the transduction. As soon as this first value is yielded, the entire transduction stops.

(in-package :transducers)
(transduce (filter #'oddp) #'first '(2 4 6 7 10))
7

Yield the last value of the transduction.

(in-package :transducers)
(transduce #'pass #'last '(2 4 6 7 10))
10

Find the first element in the transduction that satisfies a given PRED. Yields NIL if no such element were found.

(in-package :transducers)
(transduce #'pass (find #'evenp) '(1 3 5 6 9))
6

fold

fold is the fundamental reducer. fold creates an ad-hoc reducer based on a given 2-argument function. An optional SEED value can also be given as the initial accumulator value, which also becomes the return value in case there were no input left in the transduction.

Functions like + and * are automatically valid reducers, because they yield sane values even when given 0 or 1 arguments. Other functions like cl:max cannot be used as-is as reducers since they can’t be called without arguments. For functions like this, fold is appropriate.

(in-package :transducers)
(transduce #'pass (fold #'cl:max) '(1 2 3 4 1000 5 6))
1000

With a seed:

(in-package :transducers)
(transduce #'pass (fold #'cl:max 0) '())
0

In Clojure this function is called completing.

for-each

Run through every item in a transduction for their side effects. Throws away all results and yields t.

(in-package :transducers)
(transduce (map (lambda (n) (format t "~a~%" n))) #'for-each #(1 2 3 4))
T

Sources

Data is pulled in an on-demand fashion from Sources. They can be either finite or infinite in length. A list is an example of a simple Source, but you can also pull from files and endless number generators.

ints, random

Yield all integers, beginning with START and advancing by an optional STEP value which can be positive or negative. If you only want a specific range within the transduction, then use take-while within your transducer chain.

(in-package :transducers)
(transduce (take 10) #'cons (ints 0 :step 2))
(0 2 4 6 8 10 12 14 16 18)

Yield an endless stream of random numbers, based on a given LIMIT.

(in-package :transducers)
(transduce (take 20) #'cons (random 10))
(8 0 5 6 6 2 2 4 2 7 9 2 0 0 2 4 4 9 9 9)
(in-package :transducers)
(transduce (take 5) #'cons (random 1.0))
(0.4115485 0.35940528 0.0056368113 0.31019592 0.4214077)

cycle, repeat, shuffle

Yield the values of a given SEQ endlessly.

(in-package :transducers)
(transduce (take 10) #'cons (cycle '(1 2 3)))
(1 2 3 1 2 3 1 2 3 1)

Endlessly yield a given ITEM.

(in-package :transducers)
(transduce (take 4) #'cons (repeat 9))
(9 9 9 9)

Endlessly yield random elements from a given vector.

(in-package :transducers)
(transduce (take 5) #'cons (shuffle #("Alice" "Bob" "Dennis")))
("Alice" "Bob" "Alice" "Dennis" "Bob")

Recall also that strings are vectors too:

(in-package :transducers)
(transduce (take 15) #'string (shuffle "Númenor"))
eeúúrúmnnremmno

plist

Yield key-value pairs from a Property List, usually known as a ‘plist’. The pairs are passed as a cons cell.

(in-package :transducers)
(transduce (map #'cdr) #'+ (plist '(:a 1 :b 2 :c 3)))
6

See also the uncons transducer for another way to handle incoming cons cells.

reversed

Yield a vector’s elements in reverse order.

(in-package :transducers)
(transduce (take 2) #'cons (reversed #(1 2 3 4)))
(4 3)

Recall that strings are also vectors.

(in-package :transducers)
(transduce #'pass #'string (reversed "Hello"))
olleH

Utilities

comp, const

Function composition. You can pass as many functions as you like and they are applied from right to left.

(in-package :transducers)
(funcall (comp #'length #'reverse) #(1 2 3))
3

For transducer functions specifically, they are composed from right to left, but their effects are applied from left to right. This is due to how the reducer function is chained through them all internally via transduce.

Notice here how drop is clearly applied first:

(in-package :transducers)
(transduce (comp (drop 3) (take 2)) #'cons '(1 2 3 4 5 6))
(4 5)

Return a function that ignores its argument and returns ITEM instead.

(in-package :transducers)
(funcall (comp (const 108) (lambda (n) (* 2 n)) #'1+) 1)
108

reduced, reduced-p, reduced-val

When writing your own transducers and reducers, these functions allow you to short-circuit the entire operation.

Here is a simplified definition of first:

(in-package :transducers)
(defun first (&optional (acc nil a-p) (input nil i-p))
  (cond ((and a-p i-p) (reduced input))
        ((and a-p (not i-p)) acc)
        (t acc)))

You can see reduced being used to wrap the return value. transduce sees this wrapping and immediately halts further processing.

reduced-p and reduced-val can similarly be used (mostly within transducer functions) to check if some lower transducer (or the reducer) has signaled a short-circuit, and if so potentially perform some clean-up. This is important for transducers that carry internal state.

Example Gallery

Reading lines from a File

Pathnames can be passed as-is as a Source. This yields their lines one by one.

Counting words:

(in-package :transducers)
(transduce (comp (map #'str:words)
                 #'concatenate)
           #'count #p"README.org")
3661

Reducing into Property Lists and Assocation Lists

There is no special reducer function for plists, because none is needed. If you have a stream of cons cells, you can break it up with uncons and then collect with cons as usual:

(in-package :transducers)
(transduce (comp (map (lambda (pair) (cl:cons (car pair) (1+ (cdr pair)))))
                 #'uncons)
           #'cons (plist '(:a 1 :b 2 :c 3)))
(:A 2 :B 3 :C 4)

Likewise, Association Lists are already lists-of-cons-cells, so no special treatment is needed:

(in-package :transducers)
(transduce #'pass #'cons '((:a . 1) (:b . 2) (:c . 3)))
((:A . 1) (:B . 2) (:C . 3))

JSON: Calculating average age

Since JSON Objects are parsed as Hash Tables, we use the usual functions to retrieve fields we want.

(in-package :transducers)
(transduce (filter-map (lambda (ht) (gethash "age" ht)))
           #'average
           (transducers/jzon:read "[{\"age\": 34}, {\"age\": 25}]"))
59/2

Sieve of Eratosthenes

An ancient method of calculating Prime Numbers.

(in-package :transducers)
(let ((xf (comp (inject (lambda (prime) (filter (lambda (n) (/= 0 (mod n prime))))))
                (take 10))))
  (cl:cons 2 (transduce xf #'cons (ints 3 :step 2))))
(2 3 5 7 11 13 17 19 23 29 31)

Writing your own Primitives

One of the advantages of the Transducers pattern is that there is no “magic”. As you’ll see below, it’s all just function composition.

Transducers

A Transducer is a function that continues the stream. It operates on one element at a time. It receives input, optionally does something to it, and then optionally continues by calling the next function in the chain, or it ignores the current input, or it short-circuits the stream entirely. We’ll see examples of all of these below.

map - A simple transformation

Here is how map is implemented in the library. Let’s study it to learn the overall structure of transducers in general.

(defun map (f)  ;; (1) Top-level arguments needed throughout.
  (lambda (reducer)  ;; (2) The rest of the composed function chain.
    (lambda (result &optional (input nil i-p))  ;; (3) The main body of the transducer.
      (if i-p
          (funcall reducer result (funcall f input))  ;; (4) The primary logic and a call to the next stage.
          (funcall reducer result)))))  ;; (5) The finalisation pass.

Recall that map would be called like:

(in-package :transducers)
(transduce (map #'1+) #'cons '(1 2 3 4 5))
(2 3 4 5 6)

So we can see at (1) that the f corresponds to the function we’re passing in, which we expect to be applied to all elements of the stream.

(2) might be a surprise. What is reducer and where does it come from? Is it the cons call seen above? Well, it’s actually the transducer chain (possibly combined via comp), followed by the reducer. Like this:

transducers.png

It is the call to transduce that puts this all together for you.

(3) is what actually gets called during the iteration. The &optional (input nil i-p) may be new to you; this is how Common Lisp handles the potential lack of optional arguments.

(input nil i-p)
 ^     ^   ^---- Was the optional argument actually given? nil or non-nil.
 |     `---- The default value if the optional arg was missing. Can be anything.
 `---- The name of the optional arg. Either what the user passed, or the default.

Unfortunately due to “nil punning”, testing the input for nil is not enough to determine if the argument was given or not, since they may have legitimately passed nil. Hence we need a second signal, named i-p here, to do that test. Clojure and Scheme can pattern match on the number of arguments directly, but Common Lisp cannot. If the i-p test fails, then we know the transduction is over.

(4) should be clear; apply f and then call the reducer the continue the chain.

(5) will become clearer once we’ve learned about the structure of Reducers. For now, just know that this is the last thing that the top-level transduce call attempts as it is finalising the result.

filter - Ignoring input

With map fresh in your mind, now stare at this:

(defun filter (pred)
  (lambda (reducer)
    (lambda (result &optional (input nil i-p))
      (if i-p
          ;; vvv (4) vvv
          (if (funcall pred input)
              (funcall reducer result input) ;; (4a)
              result) ;; (4b)
          ;; ^^^ (4) ^^^
          (funcall reducer result)))))

Point (4) in the previous example was the “meat”, the actual logic of the transducer. Here we see it expanded a bit. Notice that we only continue the chain at (4a) if the predicate passed. Otherwise, we yield the result we were given and directly return, going no further for this particular input element. Then, transduce will supply the next one. The effect is what we’d expect of filter; some elements make it through the stream and some don’t.

take-while - Short-circuiting

Similar to filter is take-while, except that the latter halts the stream entirely as soon as an element fails the predicate.

(defun take-while (pred)
  (lambda (reducer)
    (lambda (result &optional (input nil i-p))
      (if i-p
          ;; vvv (4) vvv
          (if (not (funcall pred input))
              (reduced result)
              (funcall reducer result input))
          ;; ^^^ (4) ^^^
          (funcall reducer result)))))

Here reduced makes its debut. This wraps the given value in a special type that signals to transduce that the transduction has been short-circuited and must end. Nothing further will be pulled from the Source.

unique - Stateful transduction

Despite just being a group of composed functions, individual transducers can hold state. Consider unique, which is called like:

(in-package :transducers)
(transduce #'unique #'cons '(1 2 1 3 2 1 2 "abc"))
(1 2 3 "abc")

Here’s its definition:

(defun unique (reducer)
  (let ((seen (make-hash-table :test #'equal)))
    (lambda (result &optional (input nil i-p)) ;; (3)
      (if i-p
          ;; vvv (4) vvv
          (if (gethash input seen)
              result
              (progn (setf (gethash input seen) t)
                     (funcall reducer result input)))
          ;; ^^^ (4) ^^^
          (funcall reducer result)))))

There are two immediate differences here:

  1. Since unique requires no top-level argument (like map or filter), it is passed directly to transduce as #'unique. This means we don’t need another inner lambda and can accept the reducer directly.
  2. We can open a let before Point (3), and the seen Hash Table is then captured by the lambda. This has the effect of persisting it between every call of the lambda on each element.

Once again we notice a bare result being returned if we’ve seen the current element already.

Reducers

A Reducer is a function that consumes a stream. It accepts two, one, or no arguments.

count - Simple cumulative state

An example of count being called:

(in-package :transducers)
(transduce #'pass #'count '(1 2 3 4 5))
5

Here is its definition:

(defun count (&optional (acc 0 a-p) (input nil i-p))
  (cond ((and a-p i-p) (1+ acc))   ;; (I) Iterative case. The stream is still running.
        ((and a-p (not i-p)) acc)  ;; (D) We're done! Do any post-processing here.
        (t 0)))  ;; (M) "Monoidal" / base case.

Similar to the Transducer functions, we use the &optional trick to test how many arguments we were given. Let’s start from the bottom with the (M) base case. transduce calls this internally in order to generate an initial value. This corresponds to the result seen in the Transducer examples. Since each Reducer behaves differently and we are not using a static type system, we must define the Reducer’s unique “zero value” here.

(D) is what was hinted at before - this case is called last by transduce in order to allow the Reducer to do any post-processing before the final value is yielded to the user. This is necessary as occasionally the acc value grown by the Reducer can be a complicated structure and we may want to sort it, unwrap it, etc.

(I) is the usual case and corresponds to some Transducer calling down into the Reducer with the cumulative state thusfar and the current stream element. The Reducer then decides what to do with them. In the case of count, the element itself is ignored and we just add 1 to our growing acc.

cons - Some post-processing

(defun cons (&optional (acc nil a-p) (input nil i-p))
  (cond ((and a-p i-p) (cl:cons input acc))   ;; (I)
        ((and a-p (not i-p)) (nreverse acc))  ;; (D)
        (t '()))) ;; (M)

Here in (I) we see the input actually being saved. This then loops back around within transduce, which pulls the next value from the Source and calls the Transducer chain again.

In (D) we see some realistic post-processing. Since (I) was naively consing, the order of our elements is backwards from what we intend. Thus they must be reversed once before being yielded to the user.

In (M) our “zero value” is the empty list. Otherwise, what would we be consing onto on the first pass of (I)?

anyp - Short-circuiting

anyp stops as soon as anything satisfies its predicate.

(in-package :transducers)
(transduce #'pass (anyp #'evenp) '(1 3 5 2 7 9))
T

Usage of reduced isn’t limited to Transducers; Reducers can short-circuit the stream as well.

(defun anyp (pred)
  (lambda (&optional (acc nil a-p) (input nil i-p))
    (cond ((and a-p i-p)
           ;; vvv (I) vvv
           (if (funcall pred input)
               (reduced t)
               nil))
           ;; ^^^ (I) ^^^
          ((and a-p (not i-p)) acc) ;; (D)
          (t nil))))

Like with filter, this Reducer requires a top-level predicate, so we add an inner lambda.

Within (I) we can see reduced employed. Seeing this, transduce will not continue and will instead go right to the (D) case. The final acc is T.

Sources

transduce is a defgeneric, and so can be called on anything that has a corresponding defmethod for it. There are many “natural” Sources like lists and vectors, but we can easily define our own and then supply a transduce method to add it to the family of things we can iterate over.

Let’s review the reversed Source, a means by which to iterate over a vector in reverse order.

(in-package :transducers)
(transduce #'pass #'string (reversed "Hello"))
olleH

In order to have a distinct type to associate a transduce method with, we need a wrapper type:

(defstruct reversed
  (vector #() :type cl:vector))

We also supply a prettier constructor:

(defun reversed (vector)
  "Source: Yield a VECTOR's elements in reverse order."
  (make-reversed :vector vector))

Now come a trio of functions that drive the iteration:

(defmethod transduce (xform f (source reversed))
  (reversed-transduce xform f source))

(defun reversed-transduce (xform f coll)
  (let* ((init   (funcall f))  ;; (1) The (M) case of the Reducer.
         (xf     (funcall xform f))  ;; (2) Putting the transducer/reducer chain together.
         (result (reversed-reduce xf init coll)))  ;; (3) The work.
    (funcall xf result)))  ;; (7) The (D) case of the Reducer.

(defun reversed-reduce (f identity rev)
  (let* ((vec (reversed-vector rev))
         (len (length vec)))
    ;; Simple recursion to drive the iteration.
    (labels ((recurse (acc i)
               (if (< i 0)
                   acc  ;; (4) We're done.
                   (let ((acc (safe-call f acc (aref vec i)))) ;; (5) "Safe" application of the transducer chain.
                     (if (reduced-p acc)  ;; (6a) Short-circuiting occured. Time to go home.
                         (reduced-val acc)
                         (recurse acc (1- i))))))) ;; (6b) Otherwise, keep going.
      (recurse identity (1- len)))))

All types follow this pattern. In (1) and (2) we do initial setup. In (4) we’ve reached the natural end of the Source (e.g. the end of the vector).

At (5) you’re free to do a funcall on f, however using safe-call instead provides you with protection against a number of common failure cases and offers a variety of handy restarts. Who wants to bail halfway through processing a multi-gigabyte data file? No, just skip the bad line and keep going! Etc, etc.

At (6a) we see that we always need to check if the result of the current call was “reduced”, i.e. short-circuited.

That’s it! The beauty of defgeneric is that its methods can be defined in other systems. This is precisely how the extensions for jzon, fset, and nonempty are done.

Limitations

  1. This library is generally portable, but assumes your CL implementation supports tail-call elimination within labels.
  2. A way to model the common zip function has not yet been found, but I suspect the answer lies in being able to pass multiple sources as &rest arguments.

Resources