2  HD/VSA Intro

(ns intro2
  (:require
   [wolframite.api.v1 :as wl]
   ;; [wolframite.lib.helpers :as h]
   [wolframite.runtime.defaults]
   [wolframite.tools.hiccup :as wh]
   [wolframite.wolfram :as w]
   ;; [scicloj.clay.v2.api :as clay]
   ))

https://github.com/benjamin-asdf/hdc-wolfram/issues/4

This follows the paper Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors, P. Kanerva, 2009.

2.0.1 Neural Computing à la John von Neumann

  • A very humble initial observation: brains have a high count of elements (large circuits).
  • Let’s see what happens if you build a computer with very wide words.
  • Instead of 64-bit words, use 10,000-bit words, hypervectors (HDVs).
  • This datatype, together with a set of operations, is a Vector Symbolic Architecture.

2.1 Multiply-Add-Permute (MAP)

The VSA flavor used on this page: superposition is elementwise addition, bind is elementwise multiplication, and permute is a rotation of the elements.

3 The Hypervector

   ----------------------------
   | -1 1 -1 1 1 1 -1 ...     |
   ----------------------------
   -------------------------->  d

dimensions = d = 10,000

Using -1 and 1 as elements, we can denote

H = {-1, 1}^10,000

as the 10,000-dimensional hypervector space.

  • The hypervector is a point in hyperspace.

  • There are 2^10,000 points in the hypervector space.

  • Two random hypervectors agree in roughly half of the bits.

  • So they have roughly 50% similarity (in terms of matching bits).

  • The points are not clustered; but looking from any one point, the distances are highly concentrated midway into the space.

It is easy to see that half the space is closer to a point than 0.5 and the other half is further away, but it is somewhat surprising that less than a millionth of the space is closer than 0.476 and less than a thousand-millionth is closer than 0.47; similarly, less than a millionth is further than 0.524 away and less than a thousand-millionth is further than 0.53. (Kanerva 2009)

  • Like picking a point, and flipping a coin for each bit.
  • The chance that substantially more than half of the coin flips match becomes vanishingly low very fast.
  • Robustness: Even when something like 40% of the bits are missing, a hypervector can be recognized.

There are different flavors, including sparse hypervectors. See Schlegel et al. 2021, A comparison of Vector Symbolic Architectures, http://www.arxiv.org/abs/2001.11797

10,000 dimensions is a common choice.

(def dimensions (long 1e4))
(wl/! (w/_= 'seed
            (w/fn []
              (w/Table (w/RandomChoice [-1 1])
                       [dimensions]))))
nil
(wl/! (w/_= 'seedb
            (w/fn [n]
              (w/Table (w/RandomChoice [-1 1]) ['i n] ['k dimensions]))))
nil
(defn seed
  "Returns a fresh hypervector.
  With `n` return a batch of `n` hypervectors.

  Alias `random`.
  "
  ([n] (list 'seedb n))
  ([] (list 'seed)))
(count (wl/! (seed)))
10000
(take 10 (wl/! (seed)))
(-1 -1 -1 -1 1 -1 1 1 -1 1)
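
The batch variant (a quick check):

;; `seed` with `n` returns a batch of n hypervectors
(map count (wl/! (seed 3)))
;; => (10000 10000 10000)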

A fresh, random hypervector is also called a seed, I think because you pick a fresh one when you encounter a new entity.

Subsequently, we will define operations that set these seeds into relationships.


3.1 Visualizing a hypervector

Making a QR code kinda thing out of a hypervector:

(defn plot
  [hd]
  (w/ArrayPlot (w/Plus 1
                       (w/ArrayReshape hd
                                       [(w/Sqrt dimensions)
                                        (w/Sqrt
                                         dimensions)]))))
(wh/view-no-summary
 (wl/! (w/Block [(w/= 'x (plot (seed)))] 'x)))


3.2 Normalize

After superposition, the hypervector contains 0 (ties), -2, and 2 as elements. Sometimes you want to normalize the hypervector back to -1 and 1. (Sign maps -2 and 2 back to -1 and 1; ties stay at 0.) In other VSAs, you might also thin a vector to a target sparsity, etc.

(wl/! (w/_= 'normalize w/Sign))
nil
(defn normalize
  "Returns a normalized hypervector."
  [hdv]
  `(~'normalize ~hdv))
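
A quick sketch of what normalize does to a two-input superposition (using w/Plus directly, since the superposition operation is only defined further below):

(frequencies (wl/! (normalize (w/Plus (seed) (seed)))))
;; => roughly {-1 2500, 0 5000, 1 2500}; the ties stay at 0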

3.3 Similarity

Hamming distance counts the bits that are different.

(wl/! (w/HammingDistance (seed) (seed)))
5002

Divided by the number of dimensions, it is always roughly 0.5 for two random hypervectors.

  • Imagine picking a hypervector, then flipping a coin for each bit.
  • The differences between all other hypervectors and the picked one follow a binomial distribution; a quick check follows below.
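
The concentration claims from the Kanerva quote can be sanity-checked with the binomial distribution directly; a sketch, assuming Wolfram’s NProbability and BinomialDistribution:

;; Probability that two random hypervectors differ in fewer than 4,760
;; of the 10,000 positions, i.e. normalized distance < 0.476:
(wl/! (w/NProbability
       (w/Less 'x 4760)
       (w/Distributed 'x (w/BinomialDistribution dimensions 0.5))))
;; => roughly 8e-7, 'less than a millionth', matching the quote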

3.4 Dot similarity

I use ‘dot similarity’ for the rest of this page.

  • Counts where the vectors agree minus where they disagree
  • Divide by dimensions
  • -1 for the negative (opposite) vector
  • ~0 for unrelated vectors
  • 1 for the equal vector
  • >1 if one of the vectors has values higher than 1 and they agree
(wl/!
 (w/_= 'similarity
       (w/fn [a b]
         (w/Divide (w/Dot a b) dimensions))))
nil
(defn similarity
  "Returns the dot similarity of the inputs.
  Keeps batch dim."
  [a b]
  `(~'similarity ~a ~b))
(map float (wl/! (similarity (seed 10) (seed))))
(0.0184 0.0114 0.0044 0.013 0.003 -0.0168 0.0058 -0.0158 0.0034 -0.002)

3.4.1 Experiment: picking unrelated hypervectors

  1. Pick a random hypervector a
  2. Pick 10,000 random hypervectors (n)
  3. Find the similarity between i = 0…n and a
(def a (wl/! (seed)))
(wh/view-no-summary
 (w/ListPlot
  (w/Table
   `(~'similarity ~a ~'(seed))
   [10000])
  '(->
    AxesLabel ["n" "Similarity"])))

  • In fact, you can pick and pick and keep picking, and the similarity will stay around 0 (0.5 normalized Hamming distance).
  • The universe will run out of time before it runs out of hypervectors.
  • Hyperspace is ‘seemingly infinite’; this property alone is interesting for a cognitive datatype.

3.4.2 Robustness

Experiment: TODO

  • I can recommend Kanerva 2009 and ‘Sparse Distributed Memory’ (1998) for more.

(wh/view-no-summary*
 (w/ListPlot (w/Table (similarity (seed) (seed)) 10))
 false)
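
The TODO above could look something like this minimal sketch, where ‘missing bits’ are modeled by zeroing out 40% of the positions (an assumption about what ‘missing’ means):

(let [a (wl/! (seed))
      damaged (vec (map-indexed (fn [i x] (if (< (mod i 10) 4) 0 x)) a))]
  (float (wl/! (similarity a damaged))))
;; => 0.6: the 6,000 surviving positions all agree, far above the
;; ~0.15 'related' threshold used below, so `a` is still recognized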


4 Item Memory

Bookkeeping to go between HD/VSA domain and symbolic domain.

Similarity above 0.15 is almost certainly ‘related’.

(def default-threshold 0.15)
(defn ->item-memory [] {})
(defn cleanup-1
  [mem x threshold]
  (let [items (into [] mem)
        hdvs (into [] (map val) items)
        ;; perf bottleneck, guessing it's the data transfers
        similarities (wl/! (similarity hdvs x))]
    (->> (map-indexed vector similarities)
         (filter (comp #(<= threshold %) second))
         (sort-by second (fn [a b] (compare b a)))
         (map (fn [[index similarity]]
                {:hdv (hdvs index)
                 :obj (key (items index))
                 :similarity similarity})))))
(defn cleanup*
  ([mem x] (cleanup* mem x default-threshold))
  ([mem x threshold] (map :obj (cleanup-1 mem x threshold))))
(defn store-item [mem x]
  (assoc mem x (wl/! (seed))))

(defonce item-memory (atom (->item-memory)))
(defn book-keep!
  [x]
  (or (@item-memory x)
      (get (swap! item-memory store-item x) x)))
(defn item-cleanup [hdv]
  (cleanup* @item-memory hdv default-threshold))
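
A quick usage sketch (:apple is just an example key):

;; book-keep! draws a fresh seed for :apple and remembers it,
;; item-cleanup maps the hdv back to the symbolic key.
(item-cleanup (book-keep! :apple))
;; => (:apple)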

5 Vector Symbolic Arithmetic (VSA)

VSA sometimes stands for Vector Symbolic Architecture, sometimes for Vector Symbolic Arithmetic, or also Algebra.

Arithmetic: the use of numbers in counting and calculation.

I am somewhat hesitant to call a set of operations an ‘architecture’. I go with ‘arithmetic’.

We ask: what could the arithmetic logic unit (ALU) operations of this hyperdimensional computer be?

These operations return the same datatype as the inputs: hypervectors of the same length. In programming terms, we say the hypervectors are closed under the VSA operations.

5.1 Superposition

A binary operation that returns a hypervector that is similar to its inputs. It is the collection of the inputs.

I imagine overlapping two see-through pieces of paper; the outcome is the superposition.

Since neurons have sparse activation, the superposition of 2 neuronal ensembles could just be the combined activation.

(wl/! (w/_= 'superposition 'Plus))
nil
(defn superposition
  "`superposition` returns a hdv that is the multiset
  of the input `hdvs`.

  The resulting hdv is similar in equal measure to all its inputs.

  Set membership can be tested with [[similarity]].

  Like `+`, superposition is associative and commutative.

  Superposition distributes over permutation.

  Alias `bundle`.
  "
  [& hdvs]
  `(~'superposition ~@hdvs))
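
A quick membership check along the lines of the docstring (a sketch):

(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  [;; `a` is in the bundle: similarity ≈ 1
   (float (wl/! (similarity a (superposition a b))))
   ;; `c` is not: similarity ≈ 0
   (float (wl/! (similarity c (superposition a b))))])
;; => [~1.0 ~0.0]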

5.1.1 Null

The null element of the superposition operation is a zero hypervector.

(wl/!
 (w/_= 'zeros
       (w/fn [] (w/Table 0 [dimensions]))))
nil
(defn zeros
  "Returns a zero hypervector.
   This is the null element of the superposition operation."
  []
  (list 'zeros))
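
A quick check that superposing with the zero hypervector leaves a hdv unchanged:

(let [a (wl/! (seed))]
  (wl/! (w/== a (superposition a (zeros)))))
;; => true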

5.1.2 Negative

The reverse of superposition is the superposition with the negative hdv (subtraction).

(wl/! (w/_= 'negative 'Minus))
nil
(defn negative
  "returns a hdv that is the negative of the input hdv."
  [hdv]
  `(~'negative ~hdv))
(let [a (wl/! (seed))]
  (wl/! (similarity a (negative a))))
-1
(let [a (wl/! (seed))
      b (wl/! (seed))]
  (wl/!
   (w/==
    a
    (superposition
     (superposition a b)
     (negative b)))))
true

5.2 Bind

  • The other binary operation, it returns a hypervector that is dissimilar to its inputs.
  • It is possible to recover the other input, given one input and the output.
  • Therefore, ‘bind’ can be used to represent a key-value pair.
(wl/! (w/_= 'bind 'Times))
nil
(defn bind
  "returns a hdv that is different from the inputs.

  The inverse of bind is binding with the 'inverse' vector.
  In MAP, this is the same as the original vector.
  (In MAP, bind is the inverse of itself.)

  The bind represents a `key-value-pair` of the inputs.

  Bind is associative and commutative.
  Bind distributes over superposition and permutation.
  Bind preserves similarity.
  "
  [& hdvs]
  `(~'bind ~@hdvs))
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  [
   ;; Recover input
   (wl/! (w/== a (bind (bind a b) b)))

   ;; Commutative
   (wl/! (w/== (bind a b) (bind b a)))

   ;; Associative
   (wl/! (w/== (bind (bind a b) c) (bind a (bind b c))))

   ;; Distributive over superposition
   (wl/! (w/== (superposition (bind a c) (bind b c))
               (bind c (superposition a b))))

   ;; preserve similarity
   ;; It is like `a`, `b` and their relationship are perfectly mirrored in the `c` domain.
   (wl/!
    (w/==
     (similarity a b)
     (similarity (bind a c) (bind b c))))])
[true true true true true]

5.2.1 Identity

The identity element of the bind operation is the identity hypervector.

For MAP, this is a hdv where all bits are 1.

(wl/! (w/_= 'identityHdv
            (w/fn [] (w/Table 1 [dimensions]))))
nil
(defn identity-hdv []
  (list 'identityHdv))

binding a hdv with itself yields the identity hdv.

(wl/! (w/== (identity-hdv) (bind a a)))
true

binding a hdv with the identity hdv yields the hdv itself.

(wl/! (w/== a (bind a (identity-hdv))))
true

{H, bind, identity-hdv} and {H, superposition, zeros} are commutative monoids.

The whole thing, together with negative, forms something close to a commutative ring (over the unnormalized integer vectors).

5.2.2 Inverse

(wl/! (w/_= 'inverse w/Identity))
nil
(defn inverse
  "`inverse` returns the inverse of the input hdv.

  Bind with the inverse hdv is equivalent to unbind with the hdv.

  (this is trivial in MAP: bind is its own inverse, and a hdv is its own inverse)."
  [hdv]
  `(~'inverse ~hdv))
(let [a (wl/! (seed))
      b (wl/! (seed))]
  (= b (wl/! (bind (bind a b) (inverse a)))))
true
(def kvp
  (wl/! (bind (book-keep! :a)
              (book-keep! :b))))
;; Binding the pair with :a recovers :b exactly:
(wl/!
 (similarity
  (book-keep! :b)
  (bind kvp (book-keep! :a))))
1
;; ... while :a itself is dissimilar to what was recovered:
(wl/!
 (similarity
  (book-keep! :a)
  (bind kvp (book-keep! :a))))
81/5000

Bind can be thought of as returning a point ‘in the domain of’ one of its inputs.

(let [m (wl/!
         (bind
          (book-keep! :green-domain)
          (superposition
           (book-keep! :a)
           (book-keep! :b)
           (book-keep! :c))))]
  [(float (wl/! (similarity m (book-keep! :c))))

   ;; Ask: given :c ‘in the green-domain’, is it contained in m?
   (float
    (wl/! (similarity m
                      (bind
                       (book-keep! :green-domain)
                       (book-keep! :c)))))])
[-0.0134 0.9938]

5.3 Permute

shift / roll the bits of a hdv

(wl/! (w/_= 'permute (w/fn [hdv n] (w/RotateLeft hdv n))))
nil
(defn permute
  "`permute` returns a hdv that is the input hdv rotated by n bits.
  With second arg `n`, permute this number of times.

  This is used to create a point from `hdv` in an unrelated domain.

  - to create a quotation
  - to use as position marker in a sequence
  - to implement a non-commutative bind

  Alias `protect` / `unprotect`.

  Inverse [[permute-inverse]].
  "
  ([hdv]
   (permute hdv 1))
  ([hdv n]
   `(~'permute ~hdv ~n)))
(defn permute-inverse
  "`permute-inverse` returns a hdv that is the input hdv rotated by -n bits.
  See [[permute]].
  "
  ([hdv]
   (permute-inverse hdv 1))
  ([hdv n]
   (permute hdv (- n))))

This is sort of subtle but powerful.

A permuted hdv is unrelated to the original hdv. We say permute is ‘randomizing’ the hdv, making it unrelated to hdvs of the input domain.

(wl/! (similarity a (permute a)))
-3/500
(wl/! (similarity a (permute-inverse (permute a))))
1
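
One of the permute uses listed in the docstring is as a position marker for sequences. A minimal sketch, with hypothetical helpers ->sequence and sequence-nth (not part of the notebook’s API): mark the item at position i by permuting it i times, superpose everything, and read position i back by inverse permuting i times.

(defn ->sequence
  "Hypothetical helper: superpose items, each permuted by its position."
  [& hdvs]
  (apply superposition (map-indexed (fn [i hdv] (permute hdv i)) hdvs)))
(defn sequence-nth
  "Hypothetical helper: undo the position marker of position `i`."
  [s i]
  (permute-inverse s i))
(let [s (wl/! (->sequence (book-keep! :x) (book-keep! :y) (book-keep! :z)))]
  (item-cleanup (sequence-nth s 1)))
;; => (:y)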

Using non-commutative bind, we can represent directed edges for a directed graph.

(defn non-commutative-bind
  ([a b]
   (bind a (permute b))))
(defn directed-edge
  "Given a tail and a head, return a directed edge."
  [tail head]
  (non-commutative-bind tail head))
(defn edge-head
  "Given a directed edge and its tail, return the head of the edge."
  [edge tail]
  (permute-inverse (bind edge tail)))
(defn edge-tail
  "Given a directed edge and its head, return the tail of the edge."
  [edge head]
  (bind edge (permute head)))
(defn directed-graph
  [edges]
  (apply superposition
         (map (fn [[tail head]] (directed-edge tail head)) edges)))
(let [g (directed-graph [[(book-keep! :a) (book-keep! :b)]
                         [(book-keep! :b) (book-keep! :c)]
                         [(book-keep! :c) (book-keep! :a)]])]
  ;; Which node does :a point to? (the head of the edge with tail :a)
  (item-cleanup
   (edge-head g (book-keep! :a))))
(:b)
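
The complementary query (a sketch): given the head :a, edge-tail recovers :c, since the graph contains the edge :c → :a.

(let [g (directed-graph [[(book-keep! :a) (book-keep! :b)]
                         [(book-keep! :b) (book-keep! :c)]
                         [(book-keep! :c) (book-keep! :a)]])]
  (item-cleanup (edge-tail g (book-keep! :a))))
;; => (:c)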

6 Examples with Cognitive Connotations

6.1 What is the Dollar in Mexico?

(defn record [m]
  (apply superposition
         (map (fn [[k v]] (bind k v)) m)))
(defn ->record
  [m]
  (wl/! (record (map #(mapv book-keep! %) m))))
(def usa-record
  (->record
   {:country :usa
    :capital :washington
    :currency :dollar}))
(def mexico-record
  (->record
   {:capital :mexico-city
    :country :mexico
    :currency :peso}))
(def query-outcome
  (wl/!
   ;; ~ :peso
   (bind
    mexico-record
    ;; ~ :currency
    (bind usa-record (book-keep! :dollar)))))
(take 10 query-outcome)
(-3 -3 -3 -3 1 3 1 -1 -3 -3)
(map
 #(select-keys % [:obj :similarity])
 (cleanup-1 @item-memory query-outcome 0.2))
({:obj :peso, :similarity 2573/2500})
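
Unpacking the query: the inner bind of usa-record with :dollar is approximately :currency plus noise; a quick check of that intermediate step:

(item-cleanup (bind usa-record (book-keep! :dollar)))
;; => (:currency)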

Superposition enables ‘programming in superposition’.

(def record-xyz (wl/!
                 (record
                  [[(book-keep! :x) (book-keep! 1)]
                   [(book-keep! :y) (book-keep! 2)]
                   [(book-keep! :z) (superposition
                                     (book-keep! 1)
                                     (book-keep! 2)
                                     (book-keep! 3))]])))

the value in the :x domain is 1

(item-cleanup (bind record-xyz (book-keep! :x)))
(1)

the value in the :y domain is 2

(item-cleanup (bind record-xyz (book-keep! :y)))
(2)

the value in the :z domain is the superposition of 1,2,3

(item-cleanup (bind record-xyz (book-keep! :z)))
(1 2 3)

Programming in superposition…

  • A collection can be treated like an element.
  • Everything is a set.
  • Ambiguity / continuity is first class.

Dot similarity explorations:

(let [a (wl/! (seed))]
  (wl/! (similarity a a)))
1
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity a (w/Plus a b c)))))
0.9924
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity
     (w/Plus a b c)
     (w/Plus a b c)))))
2.9816
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity
     (normalize (w/Plus a b c))
     (normalize (w/Plus a b c))))))
1.0
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity
     a
     (normalize (w/Plus a b c))))))
0.5
source: notebooks/intro2.clj