GATO (DeepMind Generalist Agent)

Created on 2022-08-23T04:31:11-05:00

Basically proves that "intelligent" problems can be solved with a sufficiently large linear map.
All inputs are encoded as a feature vector with a tag for the problem domain and gradient descent trained with the solution.
The network is then trained on all problems at once and is able to use the same model to serve dozens of functions.

Quinn: If a sufficiently large linear map is all that is needed; is it possible to train on butterfly matrices instead? they are embarassingly more efficient.

Butterfly matrices