GRID CELL PATH INTEGRATION FOR MOVEMENT-BASED VISUAL OBJECT RECOGNITION

Created on 2024-01-24T08:27:49-06:00

Composition of a Grid Cell/Neuron: A regular pattern of dots with an orientation and a scale. The dots are evenly spaced, for example on the points of a hexagonal lattice.

Use of Grid Cells/Neurons: A moving entity is tracked as a dot on an infinite 2D plane, and the dot translates as the entity moves. Each grid cell projects its own dots onto that plane, spaced according to the cell's orientation and scale. When the entity's dot overlaps one of a cell's dots, that cell fires. Because cells with different orientations and scales fire at different positions, the set of cells firing together forms a sparse representation of where something is located in the world.
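To make the tracking idea concrete for myself, here is a minimal sketch in Python. It is not the paper's implementation. The names GridModule, cells_per_axis and location_code are my own, and grouping cells into "modules" that share one scale and orientation is an assumption of the sketch. Each module tiles the plane with a hexagonal lattice, keeps the entity's position only as a phase within one tile, and updates that phase from movement alone.

```python
import numpy as np


class GridModule:
    """Hypothetical group of grid cells sharing one scale and orientation."""

    def __init__(self, scale, orientation, cells_per_axis=5):
        # Two lattice vectors 60 degrees apart give a hexagonal dot pattern.
        a1 = scale * np.array([np.cos(orientation), np.sin(orientation)])
        a2 = scale * np.array([np.cos(orientation + np.pi / 3),
                               np.sin(orientation + np.pi / 3)])
        # Maps a 2D displacement into lattice coordinates (units of one tile).
        self.to_phase = np.linalg.inv(np.column_stack([a1, a2]))
        self.n = cells_per_axis
        # Start in the middle of a tile (arbitrary choice for this sketch).
        self.phase = np.array([0.5, 0.5])

    def move(self, displacement):
        """Path integration: update the phase from a movement vector only."""
        step = self.to_phase @ np.asarray(displacement, dtype=float)
        self.phase = (self.phase + step) % 1.0

    def active_cell(self):
        """Index of the cell whose dots currently overlap the entity."""
        i, j = np.floor(self.phase * self.n).astype(int) % self.n
        return int(i * self.n + j)


def location_code(modules):
    """Sparse location code: one active cell per module."""
    return tuple(m.active_cell() for m in modules)
```

Moving by exactly one lattice vector brings a module back to the same active cell, which is why a single module is ambiguous. Combining modules with different scales and orientations is what makes the overall code specific to one location.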

A “k-Winner Take All” layer is used to get a binary output from a convolutional neural network.
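A minimal sketch of how a k-winner-take-all step can turn real-valued activations into a binary vector. The function name k_winner_take_all is my own, and applying it over the whole flattened array is an assumption. My notes do not record whether the paper applies it per channel or over the whole layer.

```python
import numpy as np


def k_winner_take_all(activations, k):
    """Return a binary array with 1 at the k largest activations, 0 elsewhere.

    Ties at the cutoff are broken arbitrarily.
    """
    activations = np.asarray(activations, dtype=float)
    flat = activations.ravel()
    if not 0 < k <= flat.size:
        raise ValueError("k must be between 1 and the number of activations")
    # argpartition places the indices of the k largest values in the last k slots.
    winners = np.argpartition(flat, -k)[-k:]
    binary = np.zeros(flat.size, dtype=np.uint8)
    binary[winners] = 1
    return binary.reshape(activations.shape)
```

For example, with activations [0.1, 0.9, 0.3, 0.7] and k = 2, only the second and fourth positions are set to 1.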

The sensorimotor network activates output columns for features that occur in several known objects. The example given: the top arch of a 9 and of a 0 is seen as the same feature (a top arch), so it activates the possible outputs for both the 9 and the 0.

A second layer is used to indicate which area of the input field the feature was seen in.

Mutual strengthening between feature space and location space is used to help identify different objects.
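A toy sketch of how pairing features with locations can narrow down the object, using the 9 versus 0 example. This is my simplification, not the paper's mechanism: the learned mutual strengthening is replaced by a plain set intersection, and the KNOWN_OBJECTS table, its feature names and its location names are all hypothetical.

```python
# Hypothetical object memory: each object is a set of (location, feature) pairs.
KNOWN_OBJECTS = {
    "0": {("top", "arch"), ("bottom", "arch")},
    "9": {("top", "arch"), ("bottom", "tail")},
}


def recognize(glances, known_objects=KNOWN_OBJECTS):
    """Keep only the objects consistent with every (location, feature) glance."""
    candidates = set(known_objects)
    for location, feature in glances:
        candidates = {
            name for name in candidates
            if (location, feature) in known_objects[name]
        }
    return candidates
```

A single glance at the top arch leaves both "0" and "9" as candidates. Adding a second glance that sees a tail at the bottom leaves only "9".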

It looks like the network is allowed to “glance” at different parts of an input; reference is made to

Questions

Does it actually choose which section to look at next, or is it fed them randomly? There is mention of a movement vector, but I didn’t quite understand whether the observed target is moving or the retina is moving.