2 HD/VSA Intro
(ns intro2
  (:require
   [wolframite.api.v1 :as wl]
   ;; [wolframite.lib.helpers :as h]
   [wolframite.runtime.defaults]
   [wolframite.tools.hiccup :as wh]
   [wolframite.wolfram :as w]
   ;; [scicloj.clay.v2.api :as clay]
   ))
https://github.com/benjamin-asdf/hdc-wolfram/issues/4
This follows the paper Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors, P. Kanerva, 2009.
2.0.1 Neural Computing à la John Von Neumann
- A very humble initial observation: brains have a high count of elements (large circuits).
- Let's see what happens if you build a computer with very wide words.
- Instead of 64-bit words, use 10,000-bit words: hypervectors (HDVs).
- This datatype, together with a set of operations, is a Vector Symbolic Architecture (VSA).
2.1 Multiply-Add-Permute (MAP)
This page uses the MAP architecture: bind is elementwise multiplication, superposition is elementwise addition, and permute is a rotation of the vector (all defined below).
3 The Hypervector
----------------------------
| -1 1 -1 1 1 1 -1 |
----------------------------
----------------------------> d
dimensions = d = 10,000
Using -1 and 1 as elements, we can denote
H = {-1, 1}^10,000
as the 10,000-dimensional hypervector space.
The hypervector is a point in hyperspace.
There are 2^10,000 points in the hypervector space.
Two random hypervectors agree in roughly half of their bits,
so they have roughly 50% similarity.
The points are not clustered, but seen from any one point, the distances to all other points are highly concentrated midway into the space.
It is easy to see that half the space is closer to a point than 0.5 and the other half is further away, but it is somewhat surprising that less than a millionth of the space is closer than 0.476 and less than a thousand-millionth is closer than 0.47; similarly, less than a millionth is further than 0.524 away and less than a thousand-millionth is further than 0.53. (Kanerva 2009)
- It is like picking a point, and then flipping a coin for each bit.
- The chance that the coin flips match in much more (or much less) than half of the positions becomes vanishingly small very quickly (see the sketch below).
- Robustness: even when something like 40% of the bits are missing, a hypervector can still be recognized.
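A quick sanity check in plain Clojure (a minimal sketch, independent of the Wolfram-backed seed defined below): sample two random ±1 vectors and count the fraction of positions where they agree.

(let [d 10000
      rand-hdv (fn [] (vec (repeatedly d #(rand-nth [-1 1]))))
      a (rand-hdv)
      b (rand-hdv)]
  ;; fraction of agreeing positions, hovers around 1/2
  (float (/ (count (filter true? (map = a b))) d)))
;; => roughly 0.5, e.g. 0.5023 on one run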
There are different flavors, including sparse hypervectors. See Schlegel et al. 2021, A Comparison of Vector Symbolic Architectures, http://www.arxiv.org/abs/2001.11797.
10,000 dimensions is a common choice.
(def dimensions (long 1e4))

(wl/!
 (w/_= 'seed
       (w/fn [] (w/Table (w/RandomChoice [-1 1]) [dimensions]))))
nil
(wl/!
 (w/_= 'seedb
       (w/fn [n] (w/Table (w/RandomChoice [-1 1]) ['i n] ['k dimensions]))))
nil
(defn seed
  "Returns a fresh hypervector.
  With `n` return a batch of `n` hypervectors.
  Alias `random`."
  ([] (list 'seed))
  ([n] (list 'seedb n)))
(count (wl/! (seed)))
10000

(take 10 (wl/! (seed)))
(-1 -1 -1 -1 1 -1 1 1 -1 1)
A fresh, random hypervector is also called a seed, I think because you pick a fresh one when you encounter a new entity.
Subsequently, we will introduce operations that set these seeds into relationships.
3.1 Visualizing a hypervector
Making a QR code kinda thing out of a hypervector:
(defn plot
  [hd]
  (w/ArrayPlot
   (w/Plus 1
           (w/ArrayReshape hd
                           [(w/Sqrt dimensions)
                            (w/Sqrt dimensions)]))))

(wh/view-no-summary
 (wl/! (w/Block [(w/= 'x (plot (seed)))]
                'x)))
3.2 Normalize
After superposition, the hypervector contains -2, 0 (ties), and 2 as elements. Sometimes you want to normalize the hypervector back to -1 and 1. In other VSAs, you might also thin a vector to a target sparsity, etc.
(wl/! (w/_= 'normalize w/Sign))
nil

(defn normalize
  "Returns a normalized hypervector."
  [hdv]
  `(~'normalize ~hdv))
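A small sketch of the values involved (using w/Plus directly, since superposition is only defined further below): summing two hdvs gives elements in {-2, 0, 2}, and normalize maps them back to {-1, 0, 1}.

(let [a (wl/! (seed))
      b (wl/! (seed))]
  ;; element values of a sum of two hdvs, before and after normalize
  [(sort (distinct (wl/! (w/Plus a b))))
   (sort (distinct (wl/! (normalize (w/Plus a b)))))])
;; => [(-2 0 2) (-1 0 1)]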
3.3 Similarity
Hamming distance counts the bits that are different.
(wl/! (w/HammingDistance (seed) (seed)))
5002
Normalized by the dimension, it is always roughly 0.5 for two random hypervectors.
- Imagine picking a hypervector, then flipping a coin for each bit.
- The differences between all other hypervectors and the picked one follow a binomial distribution (see the sketch after this list).
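To put the 5002 above in perspective: the Hamming distance of two random hypervectors follows Binomial(d, 1/2), with mean d/2 and standard deviation sqrt(d)/2.

(let [d 10000]
  ;; mean and standard deviation of the Hamming distance distribution
  {:mean (/ d 2)
   :sd (/ (Math/sqrt d) 2)})
;; => {:mean 5000, :sd 50.0}
;; 5002 is well within one standard deviation of 5000.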
3.4 Dot similarity
I use ‘dot similarity’ for the rest of this page:
- Count where the vectors agree minus where they disagree
- Divide by the number of dimensions
It evaluates to:
- -1 for the negative (opposite) vector
- ~0 for unrelated vectors
- 1 for the equal vector
- >1 if one of the vectors has values higher than 1 and they agree
(wl/!
 (w/_= 'similarity
       (w/fn [a b] (w/Divide (w/Dot a b) dimensions))))
nil
(defn similarity
  "Returns the dot similarity of the inputs.
  Keeps batch dim."
  [a b]
  `(~'similarity ~a ~b))
(map float (wl/! (similarity (seed 10) (seed))))
(0.0184 0.0114 0.0044 0.013 0.003 -0.0168 0.0058 -0.0158 0.0034 -0.002)
3.4.2 Robustness
Experiment: TODO
- I can recommend Kanerva 2009 and ‘Sparse Distributed Memory’ (1988) for more.
(wh/view-no-summary
 (w/ListPlot (w/Table (similarity (seed) (seed)) (* 10 10)))
 false)
4 Item Memory
Bookkeeping to go between HD/VSA domain and symbolic domain.
A similarity above 0.15 almost certainly means ‘related’.
(def default-threshold 0.15)

(defn ->item-memory [] {})
(defn cleanup-1
  [mem x threshold]
  (let [items (into [] mem)
        hdvs (into [] (map val) items)
        ;; perf bottleneck, guessing the data transfers
        similarities (wl/! (similarity hdvs x))]
    (->> (map-indexed vector similarities)
         (filter (comp #(<= threshold %) second))
         (sort-by second (fn [a b] (compare b a)))
         (map (fn [[index similarity]]
                {:obj (key (items index))
                 :hdv (hdvs index)
                 :similarity similarity})))))
(defn cleanup*
  ([mem x] (cleanup* mem x default-threshold))
  ([mem x threshold] (map :obj (cleanup-1 mem x threshold))))
(defn store-item [mem x]
  (assoc mem x (wl/! (seed))))

(defonce item-memory (atom (->item-memory)))
(defn book-keep!
  [x]
  (or (@item-memory x)
      (get (swap! item-memory store-item x) x)))

(defn item-cleanup [hdv]
  (cleanup* @item-memory hdv default-threshold))
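A minimal usage sketch (the :apple item is illustrative, not part of the original notebook): store a fresh seed under a symbol, then recover the symbol from the hdv.

;; book-keep! stores a fresh seed for an unseen item and is idempotent.
(book-keep! :apple)
;; item-cleanup maps a (possibly noisy) hdv back to the stored item(s).
(item-cleanup (book-keep! :apple))
;; => (:apple)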
5 Vector Symbolic Arithmetic (VSA)
VSA sometimes stands for Vector Symbolic Architecture, sometimes for Vector Symbolic Arithmetic, or also Algebra.
Arithmetic: the use of numbers in counting and calculation.
I am sort of hesitant to call a set of operations an ‘architecture’, so I go with ‘arithmetic’.
Ask: what could the arithmetic logic unit (ALU) operations of this hyperdimensional computer be?
These operations return the same datatype as the inputs - hypervectors of the same length. In programming, we say the set of hypervectors is closed under the VSA operations.
5.1 Superposition
A binary operation that returns a hypervector that is similar to its inputs. It is the collection of the inputs.
I imagine overlapping 2 see-through pieces of paper; the outcome is the superposition.
Since neurons have sparse activation, the superposition of 2 neuronal ensembles could just be the combined activation.
(wl/! (w/_= 'superposition 'Plus))
nil
(defn superposition
  "`superposition` returns a hdv that is the multiset
  of the input `hdvs`.
  The resulting hdv is similar in equal measure to all its inputs.
  Set membership can be tested with [[similarity]].
  Like `+`, superposition is associative and commutative.
  Superposition distributes over permutation.
  Alias `bundle`."
  [& hdvs]
  `(~'superposition ~@hdvs))
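A quick membership check, as the docstring suggests (a sketch; the exact values vary per run): the superposition is similar to each of its members and dissimilar to everything else.

(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  ;; a is a member of the superposition, c is not
  [(float (wl/! (similarity a (superposition a b))))
   (float (wl/! (similarity c (superposition a b))))])
;; => first value ~1.0, second value ~0.0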
5.1.1 Null
The null element of the superposition operation is a zero hypervector.
(wl/!
 (w/_= 'zeros
       (w/fn [] (w/Table 0 [dimensions]))))
nil

(defn zeros
  "Returns a zero hypervector.
  This is the null element of the superposition operation."
  []
  (list 'zeros))
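Checking the null element claim:

;; Superposing with zeros leaves a hdv unchanged.
(let [a (wl/! (seed))]
  (wl/! (w/== a (superposition a (zeros)))))
;; => true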
5.1.2 Negative
The reverse of superposition is superposition with the negative hdv (subtraction).

(wl/! (w/_= 'negative 'Minus))
nil

(defn negative
  "Returns a hdv that is the negative of the input hdv."
  [hdv]
  `(~'negative ~hdv))
(let [a (wl/! (seed))]
  (wl/! (similarity a (negative a))))
-1
(let [a (wl/! (seed))
      b (wl/! (seed))]
  (wl/!
   (w/==
    a
    (superposition
     (superposition a b) (negative b)))))
true
5.2 Bind
- The other binary operation: it returns a hypervector that is dissimilar to its inputs.
- It is possible to recover the other input, given one input and the output.
- Therefore, ‘bind’ can be used to represent a key-value pair.
(wl/! (w/_= 'bind 'Times))
nil
(defn bind
  "Returns a hdv that is different from the inputs.
  The inverse of bind is binding with the 'inverse' vector.
  In MAP, this is the same as the original vector.
  (In MAP, bind is the inverse of itself.)
  The bind represents a `key-value-pair` of the inputs.
  Bind is associative and commutative.
  Bind distributes over superposition and permutation.
  Bind preserves similarity."
  [& hdvs]
  `(~'bind ~@hdvs))
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  [;; Recover an input
   (wl/! (w/== a (bind (bind a b) b)))
   ;; Commutative
   (wl/! (w/== (bind a b) (bind b a)))
   ;; Associative
   (wl/! (w/== (bind (bind a b) c) (bind a (bind b c))))
   ;; Distributive over superposition
   (wl/! (w/== (superposition (bind a c) (bind b c))
               (bind c (superposition a b))))
   ;; Preserves similarity
   ;; It is like `a`, `b` and their relationship are perfectly mirrored in the `c` domain.
   (wl/!
    (w/==
     (similarity a b) (similarity (bind a c) (bind b c))))])
[true true true true true]
5.2.1 Identity
The identity element of the bind operation is the identity hypervector.
For MAP, this is an hdv where all bits are 1.
(wl/!
 (w/_= 'identityHdv
       (w/fn [] (w/Table 1 [dimensions]))))
nil

(defn identity-hdv []
  (list 'identityHdv))
Binding a hdv with itself yields the identity hdv:

(let [a (wl/! (seed))]
  (wl/! (w/== (identity-hdv) (bind a a))))
true
Binding a hdv with the identity hdv yields the hdv itself:

(let [a (wl/! (seed))]
  (wl/! (w/== a (bind a (identity-hdv)))))
true
{H, bind, identity-hdv} and {H, superposition, zeros} are commutative monoids.
The whole thing together with inverse is also a ring or something.
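For the superposition monoid, negative supplies the additive inverses, which we can check directly:

;; a and (negative a) cancel out to the zero hypervector.
(let [a (wl/! (seed))]
  (wl/! (w/== (zeros) (superposition a (negative a)))))
;; => true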
5.2.2 Inverse
(wl/! (w/_= 'inverse w/Identity))
nil

(defn inverse
  "`inverse` returns the inverse of the input hdv.
  Bind with the inverse hdv is equivalent to unbind with the hdv.
  (This is trivial in MAP: bind is its own inverse, and a hdv is its own inverse.)"
  [hdv]
  `(~'inverse ~hdv))
(let [a (wl/! (seed))
      b (wl/! (seed))]
  (= b (wl/! (bind (bind a b) (inverse a)))))
true
(def kvp
  (wl/! (bind (book-keep! :a) (book-keep! :b))))
(wl/!
 (similarity
  (bind kvp (book-keep! :b))
  (book-keep! :a)))
1
(wl/!
 (similarity
  (bind kvp (book-keep! :a))
  (book-keep! :a)))
81/5000
Bind can be thought of as returning a point b ‘in the domain of’ the input a.
(let [m (wl/!
         (bind
          (book-keep! :green-domain)
          (superposition
           (book-keep! :a)
           (book-keep! :b)
           (book-keep! :c))))]
  [(float (wl/! (similarity m (book-keep! :c))))
   ;; Ask: given :c 'in the green-domain', is it contained in m?
   (float
    (wl/! (similarity m
                      (bind
                       (book-keep! :green-domain)
                       (book-keep! :c)))))])
[-0.0134 0.9938]
5.3 Permute
Shift / roll the bits of a hdv.

(wl/!
 (w/_= 'permute
       (w/fn [hdv n] (w/RotateLeft hdv n))))
nil
(defn permute
  "`permute` returns a hdv that is the input hdv rotated by n bits.
  With second arg `n`, permute this number of times.
  This is used to create a point from `hdv` in an unrelated domain.
  - to create a quotation
  - to use as position marker in a sequence
  - to implement a non-commutative bind
  Alias `protect` / `unprotect`.
  Inverse [[permute-inverse]]."
  ([hdv] (permute hdv 1))
  ([hdv n] `(~'permute ~hdv ~n)))
(defn permute-inverse
  "`permute-inverse` returns a hdv that is the input hdv rotated by -n bits.
  See [[permute]]."
  ([hdv] (permute-inverse hdv 1))
  ([hdv n] (permute hdv (- n))))
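The distributivity claims from the docstrings above can be checked directly (a sketch):

;; Permutation distributes over both superposition and bind.
(let [a (wl/! (seed))
      b (wl/! (seed))]
  [(wl/! (w/== (permute (superposition a b))
               (superposition (permute a) (permute b))))
   (wl/! (w/== (permute (bind a b))
               (bind (permute a) (permute b))))])
;; => [true true]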
This is sort of subtle but powerful.
A permuted hdv is unrelated to the original hdv. We say permute is ‘randomizing’ the hdv, making it unrelated to hdvs of the input domain.
(let [a (wl/! (seed))]
  (wl/! (similarity a (permute a))))
-3/500
(let [a (wl/! (seed))]
  (wl/! (similarity a (permute-inverse (permute a)))))
1
Using non-commutative bind, we can represent directed edges for a directed graph.
(defn non-commutative-bind
  ([a b] (bind a (permute b))))
(defn directed-edge
  "Given a tail and a head, return a directed edge."
  [tail head]
  (non-commutative-bind tail head))

(defn edge-head
  "Given the tail and a directed edge, return the head of the edge."
  [edge tail]
  (permute-inverse (bind edge tail)))

(defn edge-tail
  "Given the head and a directed edge, return the tail of the edge."
  [edge head]
  (bind edge (permute head)))
(defn directed-graph
  [edges]
  (apply superposition
         (map (fn [[tail head]] (directed-edge tail head)) edges)))
(let [g (directed-graph [[(book-keep! :a) (book-keep! :b)]
                         [(book-keep! :b) (book-keep! :c)]
                         [(book-keep! :c) (book-keep! :a)]])]
  (item-cleanup (edge-head g (book-keep! :a))))
(:b)
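The reverse lookup works the same way (a sketch; this run is not from the original notebook). Given the head :a, the only edge ending in :a is :c -> :a, so we expect :c.

(let [g (directed-graph [[(book-keep! :a) (book-keep! :b)]
                         [(book-keep! :b) (book-keep! :c)]
                         [(book-keep! :c) (book-keep! :a)]])]
  (item-cleanup (edge-tail g (book-keep! :a))))
;; expected => (:c)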
6 Examples with Cognitive Connotations
6.1 What is the Dollar in Mexico?
(defn record [m]
  (apply superposition
         (map (fn [[k v]] (bind k v)) m)))
(defn ->record
  [m]
  (wl/! (record (map #(mapv book-keep! %) m))))
(def usa-record
  (->record
   {:country :usa
    :capital :washington
    :currency :dollar}))

(def mexico-record
  (->record
   {:country :mexico
    :capital :mexico-city
    :currency :peso}))
(def query-outcome
  (wl/!
   ;; ~ :peso
   (bind
    mexico-record
    ;; ~ :currency
    (bind usa-record (book-keep! :dollar)))))

(take 10 query-outcome)
(-3 -3 -3 -3 1 3 1 -1 -3 -3)
(map
 #(select-keys % [:obj :similarity])
 (cleanup-1 @item-memory query-outcome 0.2))
({:obj :peso, :similarity 2573/2500})

This works because unbinding :dollar from usa-record yields approximately :currency (plus noise), and unbinding that from mexico-record yields approximately :peso.
Superposition enables ‘programming in superposition’.
(def record-xyz
  (wl/!
   (record
    [[(book-keep! :x) (book-keep! 1)]
     [(book-keep! :y) (book-keep! 2)]
     [(book-keep! :z) (superposition
                       (book-keep! 1)
                       (book-keep! 2)
                       (book-keep! 3))]])))
The value in the :x domain is 1:

(item-cleanup (bind record-xyz (book-keep! :x)))
(1)

The value in the :y domain is 2:

(item-cleanup (bind record-xyz (book-keep! :y)))
(2)

The value in the :z domain is the superposition of 1, 2, 3:

(item-cleanup (bind record-xyz (book-keep! :z)))
(1 2 3)
Programming in superposition…
- A collection can be treated like an element.
- Everything is a set.
- Ambiguity / continuity is first class.
Dot similarity explorations:
(let [a (wl/! (seed))]
  (wl/! (similarity a a)))
1
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/! (similarity a (w/Plus a b c)))))
0.9924
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity
     (w/Plus a b c) (w/Plus a b c)))))
2.9816
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity
     (normalize (w/Plus a b c)) (normalize (w/Plus a b c))))))
1.0
(let [a (wl/! (seed))
      b (wl/! (seed))
      c (wl/! (seed))]
  (float
   (wl/!
    (similarity
     a (normalize (w/Plus a b c))))))
0.5

Why 0.5? The sign of (a + b + c) takes a majority vote of three ±1 values; it matches a whenever at least one of b, c matches a (probability 3/4), and 2 * 3/4 - 1 = 0.5.
source: notebooks/intro2.clj