Dopamine and the need for alternative theories

Submitted by Style Pass, 2024-09-30 23:30:05

A major question in systems neuroscience concerns dopamine’s computational function during learning. The most influential framework addressing this question is the reward prediction error (RPE) model of dopamine function, which is based on temporal difference reinforcement learning (TDRL). RPE is defined as received minus predicted “value,” where the value of each moment is, roughly, the total expected future reward. The core idea of TDRL is that this RPE signal enables the brain to progressively make better value estimates.
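In standard TDRL notation (the symbols below are the conventional ones from the reinforcement learning literature, not taken from this article), the moment-to-moment prediction error is:

```latex
\delta_t = r_t + \gamma\, V(s_{t+1}) - V(s_t)
```

Here \(r_t\) is the reward received at time \(t\), \(V(s)\) is the current value estimate of state \(s\), and \(\gamma\) is a discount factor. Learning nudges each value estimate in the direction of the error: \(V(s_t) \leftarrow V(s_t) + \alpha\,\delta_t\), with learning rate \(\alpha\).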

For example, imagine a child learning that the sound of an ice cream truck predicts ice cream, a highly valuable reward. The value of the sound grows as the child learns it predicts ice cream. According to the TDRL model, this increase in value is driven by the RPE associated with an unexpected ice cream. After learning, the sound of the truck acquires the high value of the ice cream, and the ice cream RPE drops to zero because it is now fully predicted by the sound.
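The dynamics described above can be reproduced with a minimal tabular TD(0) sketch. The state layout, parameter values, and function name here are illustrative assumptions, not taken from the article: the truck sound is one state, ice cream delivery the next, and we track the prediction error at each step across trials.

```python
# Minimal tabular TD(0) sketch of the ice cream truck example.
# States (illustrative layout): 0 = truck sound (cue), 1 = ice cream
# delivery, 2 = terminal. Parameters alpha, gamma, and reward size
# are arbitrary choices for the demo.

def td_learning(n_trials=200, alpha=0.1, gamma=1.0, reward=1.0):
    V = [0.0, 0.0, 0.0]      # value estimate for each state
    rpe_history = []         # (cue RPE, reward RPE) recorded per trial
    for _ in range(n_trials):
        # RPE at the cue: no reward arrives at the sound itself,
        # so the error is driven entirely by the upcoming state's value.
        rpe_cue = 0.0 + gamma * V[1] - V[0]
        V[0] += alpha * rpe_cue
        # RPE at ice cream delivery: reward received plus (zero)
        # terminal value, minus the current prediction.
        rpe_reward = reward + gamma * V[2] - V[1]
        V[1] += alpha * rpe_reward
        rpe_history.append((rpe_cue, rpe_reward))
    return V, rpe_history

V, history = td_learning()
print(f"early trial RPEs (cue, reward): {history[0]}")
print(f"late trial RPEs  (cue, reward): {history[-1]}")
print(f"learned cue value: {V[0]:.3f}")
```

On the first trial the ice cream is fully unexpected (reward RPE equals the full reward), while after many trials the reward RPE shrinks toward zero and the cue's value climbs toward the value of the ice cream itself, matching the narrative above.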

The original conception of TDRL is elegant and principled, so the observed similarity between the TDRL RPE signal and mesolimbic dopamine signaling is every theorist’s dream. Indeed, considerable data on dopamine signaling are consistent with RPE signaling. However, here I argue that a critical reevaluation of this established dogma is necessary for progress in the field, and my lab recently proposed a promising alternative.
