Scott Reed and Nando de Freitas, developed at Google DeepMind
We propose the neural programmer-interpreter (NPI): a recurrent and compositional neural network that learns to represent and execute programs. NPI has three learnable components: a task-agnostic recurrent core, a persistent key-value program memory, and domain-specific encoders that enable a single NPI to operate in multiple perceptually diverse environments with distinct affordances.
Addition
In this environment the model has access to four read-write pointers on a scratch pad. The task is to write down the solution to the addition problem. The first two pointers read the digits of the two input numbers, the third pointer records carries, and the fourth pointer writes the output.
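The scratch-pad layout described here can be sketched as ordinary code. The following is a minimal illustration, not the NPI model itself: it hard-codes the column-by-column procedure rather than learning it, and the function name `scratchpad_add`, the row ordering, and the choice to move all pointers in lockstep are assumptions made for this example.

```python
def scratchpad_add(a: int, b: int) -> int:
    """Add two non-negative integers column by column on a 4-row scratch pad."""
    digits_a = [int(d) for d in str(a)]
    digits_b = [int(d) for d in str(b)]
    # One extra column on the left leaves room for a final carry.
    width = max(len(digits_a), len(digits_b)) + 1

    # Assumed row layout: 0 = first input, 1 = second input, 2 = carries, 3 = output.
    pad = [[0] * width for _ in range(4)]
    pad[0][width - len(digits_a):] = digits_a  # right-align the first input
    pad[1][width - len(digits_b):] = digits_b  # right-align the second input

    # All four pointers start at the rightmost column and step left together.
    for col in range(width - 1, -1, -1):
        total = pad[0][col] + pad[1][col] + pad[2][col]
        pad[3][col] = total % 10      # output pointer writes the result digit
        if total >= 10:
            pad[2][col - 1] = 1       # carry pointer records a carry one column left

    return int("".join(str(d) for d in pad[3]))
```

For example, `scratchpad_add(96, 125)` fills the output row with the digits 0, 2, 2, 1 and returns 221. The extra leading column guarantees that the leftmost column never produces a carry, so the write to `pad[2][col - 1]` is never reached at `col == 0`.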
Canonicalizing 3D car models
Given a rendering of a 3D car, we would like to learn a visual program that “canonicalizes” the model with respect to its pose. Whatever the starting position, the program should generate a trajectory of actions that delivers the camera to the target view, e.g. frontal pose at a 15 degree elevation. For training data, we used renderings of the 3D car CAD models from (Fidler et al., 2012).
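To make the idea of an action trajectory concrete, here is a small sketch under strong simplifying assumptions: the current pose is given as numbers (the model described above sees only a rendering), each action rotates the camera by a fixed 15 degrees, and "frontal" corresponds to an azimuth of 0. The action names, the step size, the convention that `RIGHT` increases azimuth, and the function `canonicalize` are all hypothetical.

```python
STEP = 15  # assumed rotation per action, in degrees


def canonicalize(azimuth: int, elevation: int,
                 target_azimuth: int = 0, target_elevation: int = 15) -> list[str]:
    """Return a list of camera actions that moves the pose to the target view.

    Angles are assumed to be multiples of STEP.
    """
    actions = []

    # Shortest signed horizontal rotation, in the range [-180, 180).
    delta = (target_azimuth - azimuth + 180) % 360 - 180
    horizontal = "RIGHT" if delta > 0 else "LEFT"
    actions += [horizontal] * (abs(delta) // STEP)

    # Vertical rotation toward the target elevation.
    diff = target_elevation - elevation
    vertical = "UP" if diff > 0 else "DOWN"
    actions += [vertical] * (abs(diff) // STEP)

    return actions
```

Starting from an azimuth of 30 degrees and an elevation of 45 degrees, this sketch yields two `LEFT` actions followed by two `DOWN` actions; a camera already at the target view yields an empty trajectory.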
Sorting
In this environment the model has access to read-only pointers on an array, with the ability to swap the values at the first two pointers. The third pointer is used as a counter. The task is to sort the array in ascending order. We trained the model to perform bubble sort.
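The pointer-based view of bubble sort can be sketched as follows. This is an illustrative hand-written version, not the learned program: the function name `pointer_bubble_sort`, the use of the counter to track completed passes, and the pointer reset after each pass are assumptions made for this example.

```python
def pointer_bubble_sort(values):
    """Bubble sort expressed with two adjacent pointers and a pass counter."""
    arr = list(values)   # work on a copy so the input is left unchanged
    n = len(arr)
    counter = 0          # third pointer: number of completed passes

    while counter < n - 1:
        p1, p2 = 0, 1    # first two pointers start at the left end of the array
        while p2 < n:
            if arr[p1] > arr[p2]:
                # The only write operation: swap the values at the two pointers.
                arr[p1], arr[p2] = arr[p2], arr[p1]
            p1 += 1      # move both pointers one step to the right
            p2 += 1
        counter += 1     # one full pass done; the pointers reset on the next loop

    return arr
```

Calling `pointer_bubble_sort([3, 1, 2])` returns `[1, 2, 3]`. Each pass moves the largest remaining value to the right, so `n - 1` passes are enough to sort an array of length `n`.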