Separated and serializable encoders (htm-community/comportex)

Comments (13)

cogmission commented on August 30, 2024

Isn't an Encoder a stateless entity? Why bother serializing it (if you're talking about state)? Seems the only state is configuration variables which should be "packaged" externally so that they are "re-appliable", that way you can just instantiate an encoder at the endpoint and serialize the parameters and just send those - re-applying the parameters locally to the instantiated Encoder?

Notice the above are all questions. I'm testing my logic, not really declaring a "should" :-)


On Jun 24, 2015, at 2:48 AM, Felix Andrews wrote:

Encoders are the only things in Comportex HTM values that are not serializable. We should fix that so that models can be saved or sent over the wire.

Currently they are created using (reify) which makes a closure. But the main problem is how they get their particular data inputs out of the general, amorphous input-value provided to htm-step. That is using pre-transform which applies some arbitrary function to do the extraction.

Proposal:

Shift the task of formatting the data for encoding outside to whatever is creating the input values. An input-value as provided to htm-step would then be a structured value directly providing the data required by each input's encoder. Each input already has a keyword id specified in core/RegionNetwork so these keys could specify the input data for each.

Example. Suppose there are 2 inputs (feeding into one or more regions), called :main-input and :motor. The former is a concatenation of a category and a coordinate encoding. The latter is a single linear number encoding.

The input-value might then look like:

{:motor 42
 :main-input [:red {:coord [5.0 10.0], :radius 2.5}]}
Each input encoder will be passed just its sub value.

The various encoder types can be made into Records or Types.


floybix commented on August 30, 2024

Hi David, interesting question.

So the information, as you rightly point out, is more like configuration than mutable state. It defines which encoders are used and how they are combined. Then each encoder will have parameters such as width/onbits/min/max in a linear scalar encoder.

In Clojure, any object, such as an encoder, is just a map with keyword keys, and tagged with its type, so when you serialize it (which is the same as printing it) you get something that looks like configuration data.

E.g. in the above example the serialized data for the inputs (as nodes in an htm network, like regions) might look like

:inputs {
  :motor #LinearEncoder {
    :width 127, :on-bits 21, :min 0, :max 100
  },
  :main-input #ConcatEncoder {
    :encoders [
      #CategoryEncoder {
        :width 100, :values [:red :green :blue]
      }
      #CoordinateEncoder {
        :width 180, :on-bits 20
      }
    ]
  }
}

So it kinda is configuration data. I guess an argument for inventing and handling a separate configuration format is that it would be less tied to the current implementation details. But a really nice thing about this is there is no special serialization mechanism at all. You just print out the whole htm object, and then read it in at the other end like any other value.
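The print-and-read round trip described here can be sketched for a single encoder on JVM Clojure. This is only an illustration under assumptions: the `LinearEncoder` record and its fields are hypothetical stand-ins taken from the example above, not the Comportex source, and `clojure.edn` is used for reading so that no code is evaluated on the receiving end.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical record standing in for a linear scalar encoder.
;; Field names follow the example in this thread, not the real library.
(defrecord LinearEncoder [width on-bits min max])

(def encoder (->LinearEncoder 127 21 0 100))

;; Printing a record produces a tagged literal whose tag is the record's
;; class name (the prefix depends on the current namespace).
(def printed (pr-str encoder))

;; The EDN reader must be told how to rebuild each tag. Deriving the tag
;; from the class avoids hard-coding the namespace.
(def readers {(symbol (.getName LinearEncoder)) map->LinearEncoder})

(def restored (edn/read-string {:readers readers} printed))

(println printed)
(println (= encoder restored)) ; prints true
```

One consequence of this approach is that the receiving side needs the same record definitions loaded and registered as readers, which is the coupling to implementation details mentioned above.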

I don't have any real experience with serialization so I may also be missing something!

Cheers


floybix commented on August 30, 2024

Encoders should probably be decomplected from the core functioning of a HTM network.

Let's call a network node representing encoded input a Sensor. It doesn't need to know about an encoder, only its resulting bit-set. Sensor nodes in a network could simply be containers holding bit-sets. We could define p/htm-activate-raw taking bit-sets to assign to each Sensor node, rather than a domain-related input value.

However, we would want to record the original input value, as it is used:

- in visualizing the input data for a model (world-pane in ComportexViz)
- in labelling or grouping states (state transition diagram in ComportexViz, or assessing classification or temporal pooling). Note this could be extra data beyond the values passed to encoders.

So a network (SensingNetwork?) would store (a) a map of encoders, corresponding to the Sensor nodes and (b) its input value. A function htm-activate would take a domain input value, use the encoders to generate bit-sets, and pass them to htm-activate-raw. And record the input value.

Does that sound reasonable?


floybix commented on August 30, 2024

On second thought, the term "Sensor" wrongly suggests that it does its own encoding. A better name is "Sense".

Another question that arises is: should Senses define in themselves whether they provide proximal or distal inputs (as currently), or alternatively should Regions declare each of their inputs to be either proximal or distal (and Sense outputs be available for either purpose)? In other words, should the Senses or the Regions decide?

This would matter if Senses were used by some regions as proximal input and others as distal input. Or if a region needed to receive the same Sense as both proximal and distal input (as was suggested for Layer 4 at one point). However a work-around would be to make duplicate Sense nodes for proximal vs distal. I think I would prefer this to complicating the network connection definitions. So, staying with the current approach for now.


mrcslws commented on August 30, 2024

Just so I'm clear... is the SensingNetwork separate from the RegionNetwork? Would it be something like

(let [[htm sensors] (htm-activate htm sensors input)])

or are the inputs being recorded in an atom somewhere? I suspect I'm not interpreting this correctly.

I like "Sense". That name resonates with me. And removing encoders from the RegionNetwork definitely sounds like a good idea since we'll probably never stop creating encoders via higher-order functions, e.g. with pre-transform, so they'll never be serializable.


floybix commented on August 30, 2024

Interesting that you say "we'll probably never stop creating encoders via higher-order functions"... as I was just about to have a go at doing that (as in original description of this issue). Do you see a problem?

My current idea is not to have a SensingNetwork, rather encoders would be stored in a RegionNetwork but treated separately. Like this:

(defprotocol PHTM
  (sense [this in-value])
  (htm-activate-raw [this in-bits])
  (htm-learn [this])
  (htm-depolarise [this])
  ...)

(defn htm-activate
  [this in-value]
  (-> this
      (assoc :input-value in-value)
      (htm-activate-raw (sense this in-value))))

A RegionNetwork would gain keys :encoders and :senses instead of :inputs.
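A runnable toy version of this flow may make the data movement easier to follow. Everything in it is an assumption made for illustration: the network is a plain map rather than a record implementing PHTM, `toy-encode` is a made-up encoder, and each sense is assumed to find its sub-value under its own keyword in the input value, as in the original proposal.

```clojure
;; Toy stand-in for an encoder: maps a number to a set of active bit indices.
(defn toy-encode [{:keys [on-bits]} x]
  (set (range x (+ x on-bits))))

(defn sense
  "Encode each sense's sub-value; returns a map of sense id -> bit-set."
  [network in-value]
  (into {}
        (for [[id encoder] (:encoders network)]
          [id (toy-encode encoder (get in-value id))])))

(defn htm-activate-raw
  "Store the bit-sets on the network. A real network would activate regions here."
  [network in-bits]
  (assoc network :senses in-bits))

(defn htm-activate
  "Record the domain input value, then activate from the encoded bits."
  [network in-value]
  (-> network
      (assoc :input-value in-value)
      (htm-activate-raw (sense network in-value))))

(def network {:encoders {:motor {:on-bits 3}}})
(def stepped (htm-activate network {:motor 42}))

(println (:input-value stepped)) ; {:motor 42}
(println (:senses stepped))      ; {:motor #{...}} holding bits 42, 43 and 44
```

The point of the split is that `htm-activate-raw` never sees a domain value, only bit-sets, while the original input is still kept on the network for visualization.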


mrcslws commented on August 30, 2024

Good call, I should have scrolled up. Ok, so no more passing arbitrary values as inputs. Only those that our defrecord encoders can handle. I can live with that.


floybix commented on August 30, 2024

I started implementing the proposal above in the demos but it felt wrong. It makes the representation of the world be twisted into a shape required by particular encoders.

For example, in demo coordinates-2d the world is represented like

{:x -10
 :y -20
 :vx 1
 :vy 1
 :ax 1
 :ay 1}

i.e. position, velocity, acceleration. And the world is updated each time step by a function that operates on this in a natural way. But with the new proposal we would need a transformation step before it can be used as an input value:

(fn [world-value]
  (let [{:keys [x y]} world-value]
    {:input {:coord [x y]
             :radii [radius radius]}
     :world world-value}))

This feels awkward since

- the :input-value stored in the model, used for viz or analysis, is then not the original world value, even if maybe it includes the original value somehow (here a magical key :world, yuck);
- that transformation step needs to know about the names (here :input) and types of encoders. So if you wanted to switch from one encoder to another you would have to change not only the encoder passed to region-network but also the shape of the input value.

It seems better to have encoders encapsulate how to extract their data from the world, as in pre-transform... So we are left with the serialization problem.

It turns out that almost all uses of pre-transform so far are just selecting a value under a particular key. Obviously keywords are serializable so such selector use cases are easy. Even in the coordinates-2d example above, the world value could just as well be represented with {:coord [-10 -20], ...} such that a selector :coord could be used; the radius doesn't actually depend on the input value so could instead be a parameter of the CoordinateEncoder. If we wanted to switch from a CoordinateEncoder to a pair of LinearEncoders, they could select paths [:coord 0] and [:coord 1] without changing the input value - good.
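As a minimal sketch of the selector idea, assuming a selector is either a single keyword or a vector path (the helper name `select-value` is hypothetical, not part of Comportex):

```clojure
(require '[clojure.edn :as edn])

(defn select-value
  "Extract a value from the world using a keyword or a vector path."
  [world selector]
  (if (vector? selector)
    (get-in world selector)
    (get world selector)))

;; World value reshaped so that the position lives under :coord.
(def world {:coord [-10 -20] :vx 1 :vy 1 :ax 1 :ay 1})

(select-value world :coord)     ; => [-10 -20]
(select-value world [:coord 0]) ; => -10
(select-value world [:coord 1]) ; => -20

;; Selectors are plain data, so they survive printing and reading unchanged.
(def selector-roundtrip (edn/read-string (pr-str [:coord 0])))
```

Swapping a coordinate encoder for two linear encoders then only changes the selectors, not the shape of the world value.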

There may be more complex transformations or selections required in other cases. For example NuPIC's GeospatialEncoder has a radius that varies with input, so it would need to operate on a value like {:coord [x y], :radii [rx ry]}.

Or imagine a world represented by objects in a space, but encoders need to pull out a visual perspective on those objects from a particular point in space. A complex transformation. In this case it seems reasonable to first transform the world into a sensed / sensible value.

So maybe a good compromise is to attach selectors to encoders, which gives some degree of independence between encoders and the shape of the input value, while relying on the user to handle any more complex transformations required to actually compute the data.


mrcslws commented on August 30, 2024

Would the model hold on to the selector and the encoder separately? Or would we use something like a higher-level encoder to get-in the value?

I'm thinking we should keep them separate. Otherwise using p/decode within the model would become weird.

So it'd be in the region-network params, either a sense->path or maybe making sensory-encoders use a [sense [path encoder]] format.


floybix commented on August 30, 2024

Ah, you mean it would be weird because what you pass to p/encode would be a different shape to what you get out of p/decode. Yes I see.

I kind of like having an encoder as a self-contained object though. And you couldn't in general switch an encoder without also changing its selector.

What if the PEncoder protocol had a method extract which returns the encodable value from the world value, while its encode method is passed just that extracted value? That would also leave the door open to other more complex extraction functions beyond get-in, like a juxt-alike for example...
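One way the extract idea could look, as a sketch only. The protocol name `PToyEncoder`, the record `PathLinearEncoder` and the helper `encode-world` are all hypothetical, and the encoding itself is a toy that turns a number into a run of consecutive bit indices.

```clojure
(defprotocol PToyEncoder
  (extract [this world] "Pull the encodable value out of the world value.")
  (encode [this x] "Encode an already extracted value as a set of bits."))

;; The record holds only data (a get-in path and a bit count), so it can be
;; printed and read back like any other record.
(defrecord PathLinearEncoder [path on-bits]
  PToyEncoder
  (extract [_ world] (get-in world path))
  (encode [_ x] (set (range x (+ x on-bits)))))

(defn encode-world
  "Extract from the world, then encode just the extracted value."
  [encoder world]
  (encode encoder (extract encoder world)))

(def x-encoder (->PathLinearEncoder [:coord 0] 2))

(extract x-encoder {:coord [5 10]})      ; => 5
(encode-world x-encoder {:coord [5 10]}) ; a set holding bits 5 and 6
```

Because `encode` still receives only the extracted value, a matching decode method would return a value of that same shape.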

I don't know, maybe I am complecting.


floybix commented on August 30, 2024

Names help. If I think of a sensor as a [selector encoder] then I'm willing to have them defined in user code and passed as arguments.

Whether the selector should satisfy a protocol or just be a get-in path, I'm not sure.


floybix commented on August 30, 2024

Actually this separation makes it awkward to combine sensors. Say we have {:position 42, :colour "red"} and we want to concatenate an encoding of the number from :position and the category from :colour.

Individually they would be

(def posn-sensor [(e/select :position) (e/linear-encoder 100 20 [0 99])])
(def colr-sensor [(e/select :colour) (e/unique-encoder [100] 20)])

But to concatenate them?

(def both-sensor
  [(e/juxt (e/select :position) (e/select :colour))
   (e/encat
     (e/linear-encoder 100 20 [0 99])
     (e/unique-encoder [100] 20))])

Whereas if the selector was encapsulated in an encoder one could write

(def both-encoder
  (e/encat
     (e/select :position
        (e/linear-encoder 100 20 [0 99]))
     (e/select :colour
        (e/unique-encoder [100] 20))))

That said, I'm not sure there is any reason to concatenate encoders into one sense, rather than treating them as separate senses.


floybix commented on August 30, 2024

My last complaint can be disregarded. In practice one would want to concatenate sensors, so we can have a function (e/sensor-cat) which just applies e/juxt and e/encat as above.
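A rough sketch of what such a sensor-cat could do, under several assumptions: a sensor is a `[selector encoder]` pair, a selector is any function of the world value (keywords qualify), and an encoder is a plain map with a `:width` and an `:encode` function. The toy encoders below are invented for the example, they do not clamp to their width, and `clojure.core/juxt` stands in for e/juxt. Since these toy encoders hold closures, the sketch addresses only the combining step, not serialization.

```clojure
(defn linear-encoder [width on-bits]
  {:width width
   :encode (fn [x] (set (range x (+ x on-bits))))})

(defn category-encoder [values on-bits]
  {:width (* on-bits (count values))
   :encode (fn [x]
             (let [i (.indexOf values x)]
               (set (range (* i on-bits) (* (inc i) on-bits)))))})

(defn sensor-cat
  "Combine sensors into one: juxtapose the selectors and concatenate the
  encoders by shifting each encoder's bits past the widths before it."
  [& sensors]
  (let [selectors (map first sensors)
        encoders  (map second sensors)
        offsets   (reductions + 0 (map :width encoders))]
    [(apply juxt selectors)
     {:width  (reduce + (map :width encoders))
      :encode (fn [xs]
                (set (mapcat (fn [enc offset x]
                               (map #(+ offset %) ((:encode enc) x)))
                             encoders offsets xs)))}]))

(defn sense
  "Apply a sensor to a world value, giving a set of active bits."
  [[selector encoder] world]
  ((:encode encoder) (selector world)))

(def both-sensor
  (sensor-cat [:position (linear-encoder 100 3)]
              [:colour (category-encoder ["red" "green" "blue"] 2)]))

;; Position bits 42, 43, 44 plus colour bits 2, 3 shifted by 100.
(def both-bits (sense both-sensor {:position 42, :colour "green"}))
```

The combined sensor is itself a `[selector encoder]` pair, so it can be passed wherever a single sensor is expected.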

You've convinced me of the need for a PExtractor protocol. Thanks!


separated and serializable encoders about comportex HOT 13 CLOSED
