Skip to content

benjamin-asdf/vsa-binary-sparse-distributed-segments-clj

Repository files navigation

High-dimensional computing is a great approach to neuromorphic, neurosymbolic computing.

This software licensed as GNU GPLv3.

Sparse Binary Vectors

Brain networks exhibit sparse pattern activity, which is the motivation to explore high-dimensional computing with sparse vectors. Sparse VSA architectures have been shown to have competitive capacity and desirable HDC properties.[fn:1] At the same time, they can be represented with drastically less memory consumption.

Blog post

HDV/VSA implementation

Following the paper High-Dimensional Computing with Sparse Vectors [Laiho,Poikonen,Kanerva,Lehtonen 2020].

binary_sparse_segmented.clj contains an implementation of ‘Binary Sparse Distributed Representation with segments’ (naming from *2)

Datastructures / “Programming in Superposition”

Following Vector Symbolic Architectures as a Computing Framework for Emerging Hardware, I made some example complex data structures.

/src/bennischwerdtner/hd/data.clj

Sets, Sequences, Stacks, Graphs, Trees, Finite State Automaton.

Perhaps HDC/VSA’s most distinguished aspect is ’programming in superposition’. The HDC/VSA set is a bloom filter, but it is not limited to sets. Via VSA we can represent structured data.

(I highly recommend the literature if you are interested, this is not a tutorial, but exploring a software-level perspective on existing work).

Here, finite state automata are represented as a superposition of transitions. One can query with a superposition state, making it a nondeterministic finite-state automaton. In effect, running all its possible transitions in parallel.

examples contains examples and expirements.

What Is The Dollar In Mexico?

The implementation succesfully represents the analogy-reasoning example from Kanerva (*3):

  (def mexico-record
    (h/thin
     (h/bundle
      (h/bind (symbol->hv :capital) (symbol->hv 'mxc))
      (h/bind (symbol->hv :currency) (symbol->hv 'peso))
      (h/bind (symbol->hv :name) (symbol->hv 'mex)))))

  (def usa-record
    (h/thin (h/bundle (h/bind (symbol->hv :capital)
                              (symbol->hv 'wdc))
                      (h/bind (symbol->hv :currency)
                              (symbol->hv 'dollar))
                      (h/bind (symbol->hv :name)
                              (symbol->hv 'usa)))))


  (let [result
        (h/unbind mexico-record
                  ;; this represents the query
                  (h/unbind usa-record (symbol->hv 'dollar)))]

    (cleanup-lookup-value result))
;;  => peso

What Is The Bread For Lava?

examples/what_is_the_bread_for_lava.clj

What Is the Moon of Saturn?

examples/what_is_the_moon_of_saturn.clj

(def earth
  (hdd/->directed-edge (hdd/clj->vsa* [:moon :luna])))

(def saturn
  (hdd/directed-graph (hdd/clj->vsa*
                       [[:adjective :saturnian]
                        [:adjective :cronian]
                        [:adjective :kronian]
                        [:rings true]
                        [:moon :mimas]
                        [:moon :enceladus]
                        [:moon :tethys]
                        [:moon :dione]
                        [:moon :rhea]
                        [:moon :titan]
                        [:moon :iapetus]])))

;; let's say you know luna and you want to know what that is in saturn domain

(hdd/cleanup*
 (hdd/edge->destination
  saturn
  ;; ~ :moon
  (hdd/edge->source earth (hdd/clj->vsa :luna))))
'(:rhea :iapetus :tethys :dione :mimas :titan :enceladus)

;; 0. "What is the dollar in mexico?" kind of things work in general.
;; 1. The moon of saturn is a superposition 7 things
;; 2. Saturn is a composite datastructure, yet we pretend it's one element 'edge->destination'
;;    works on an edge element, and the superposition of elements

Nondeterministic Finite-State Automaton

;; --------------------
;; Nondeterministic finite-state automaton
;; --------------------
;; - it can be in several states at once
;; - there can be several valid transitions from a given current state and input symbol
;; - It can assume a so-called generalized state,
;;   defined as a set of the automaton's states that are simultaneously active
;; - a generalized state corresponds to a hypervector representing the set of the currenlty active states
;; - query the same way, is like executing the automaton in parallel (in superposition)
;; - cleanup will have to search for several nearest neighbors
;;

;; automaton in superposition (i.e. just query with states that are in superposition)
;;

(def water-domain
  (apply
   finite-state-automaton
   (clj->vsa*
    [[:frozen :heat :liquid]
     [:liquid :heat :gas]
     [:liquid :cool :frozen]
     [:gas :cool :liquid]
     [:gas :heat :gas]
     [:frozen :cool :frozen]])))

(cleanup*
 (automaton-destination water-domain
                        (hd/superposition
                         (clj->vsa :liquid)
                         (clj->vsa :frozen))
                        (clj->vsa :cool)))
'(:frozen)

;; if your state is the superposition of liquid and frozen

(cleanup* (automaton-destination water-domain
                                 (hd/superposition
                                  (clj->vsa :liquid)
                                  (clj->vsa :frozen))
                                 (clj->vsa :heat)))
'(:liquid :gas)

;; I mean, there is something else that is even crazier (or am I missing something?)
;; that is this:

(def water-bender-domain
  (apply finite-state-automaton
         (map #(map clj->vsa %)
              [[:frozen :heat :shards]
               [:liquid :heat :bubbles]
               [:liquid :cool :absolute-zero]])))

;; now I have 2 automatons,

(cleanup* (automaton-destination
           ;; ... superimpose them
           (hd/superposition water-domain water-bender-domain)
           (hd/superposition
            (clj->vsa :liquid)
            (clj->vsa :frozen))
           (clj->vsa :heat)))

'(:liquid :gas :shards :bubbles)

;; and we just run them in parallel, lol
;; stuff like that.

The point I was missing was that superimposing 2 automatons (union) is equivalent to making 1 large one in the first place. It is somewhat suggestive though, the primitives of a hyper interpreter might have this ‘mixing’ at the core.

Fun With Trees

file: examples/fun_with_trees.clj contains a bit of a walkthrough of some ‘programming in superposition’ concepts.

Hyper-If

(this was a very early idea)

./examples/hyper_if.clj

  ;; Idea 1:
  ;;
  ;; A hyper if
  ;; In high dimensional computing, the outcome of a calculation could represent
  ;; the combination of all 'possible' outcomes.
  ;;
  ;; Interesting here to note is that 'what is possible?' is defined by the threshold, too.
  ;;
  ;; We can imagine dynamically lowering and increasing the threshold.
  ;; (Would model something like 'fast' and 'slow' thinking perhaps).
  ;;


;; a hyper-if evaluates to the information mix of all 'possible' branches.

(def both-true-and-false
  (hd/thin
   (hd/bundle
    (->prototype true)
    (->prototype false))))

(defn coin
  []
  (hyper-if both-true-and-false
            (->prototype :heads)
            (->prototype :tails)))

;; all the bookeeping can go away ofc
(map :k (cleanup-lookup-verbose (coin)))

;; => (:heads :tails)

We can envisage a programming paradigm that models something like a multiverse, where multiple things are true. (This is probably very close to probabilistic programming, I know little of that).

Similarly, a multi-symbol could resolve to either a list of things, or to a thing representing the set of things.

Such explorations are found in

/examples/sequence_processor.clj, which I consider ‘attic’, ‘on the shelf’.

But making some kind of Lisp interpreter gave me at least training with using hdvs.

Sparse Distributed Memory

src/bennischwerdtner/sdm/sdm.clj contains a sparse distributed memory implementation using

libpythonclj, numpy + torch.

This was a quick way for me to implement a gpu version, making this reasonably fast.

Python setup

  • Set up a Python env, and run Clojure using this env.
  • requirements: PyTorch NumPy
  • Here is how I do that:
  • `python -m venv venv`
  • `. ./activate.sh`
  • `pip install PyTorch numpy`
  • start cider via dev.el, or start Clojure via run.sh, or tell your tooling to use run.sh as Clojure program

Fun With SDM And Analogies, How To Use ‘Known Worlds’

Series

Copycat

Copycat [Mitchel and Hofstadter 1988] is an analogical reasoning software solving the copycat domain. A world where strings of letters of the alphabet exist.

Suppose ‘abc’ was changed to ‘abd’, what happened to ‘jkl’ that makes it say “the same happened to me!”?

One of my goals is a hyper cat, a hyper-dimensional copy of the copycat.

This is very early and tiny at the moment. So far, I got to use my SDM, programming in superposition, and what ‘analogical’ programming could be. It’s fun to me.

Acknowledgements

Thanks to Carin Meier’s intro to VSA.

Thanks to Chris Nuernberger’s dtype next for high performance and linear algebra stuff at the clojure repl.

Future

  • Dynamic sparsity mechanisms could perhaps represent different levels of detail, or parallel search processes[fn:2].
  • Develop datastructures, languages, tools, and a software philosophy for /Programming in superposition/.

Literatrue

./lit.org

Footnotes

[fn:1]

Schlegel et.al. 2021 A comparison of Vector Symbolic Architectures

http://www.arxiv.org/abs/2001.11797 arXiv:2001.11797

[fn:2]

G. Palm Neural Assemblies: An Alternative Approach to Artificial Intelligence, (first edition: 1982, 2nd ed.: 2022)

About

HDC/VSA: Binary Sparse Distributed Representation with segments

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages