DEC Colloquium

Metacontrol of reinforcement learning

Practical information
17 September 2019

ENS, room Jaurès, 24 rue Lhomond, 75005 Paris


Modern theories of reinforcement learning posit two systems competing for control of behavior: a "model-free" or "habitual" system that learns cached state-action values, and a "model-based" or "goal-directed" system that learns a world model which is then used to plan actions. I will argue that humans can adaptively invoke model-based computation when its benefits outweigh its costs. A simple meta-control learning rule can capture the dynamics of this cost-benefit analysis. Neuroimaging evidence points to the role of cognitive control regions in this computation. The theory also resolves a number of puzzling observations about controller arbitration in the brain.

To meet Sam Gershman, please contact Srdjan Ostojic.

Sam Gershman