The credit assignment problem in cortico-basal ganglia-thalamic networks: a review, a problem, and a possible solution.

2020 
The question of how cortico-basal ganglia-thalamic (CBGT) pathways use dopaminergic feedback signals to modify future decisions has challenged computational neuroscientists for decades. Reviewing the literature on computational representations of dopaminergic corticostriatal plasticity, we show how the field is converging on a normative, synaptic-level learning algorithm that elegantly captures both neurophysiological properties of CBGT circuits and behavioral dynamics during reinforcement learning. Unfortunately, the computational studies that have led to this normative algorithmic model have all relied on simplified circuits that use abstracted action-selection rules. As a result, the application of this corticostriatal plasticity algorithm to a full model of the CBGT pathways immediately fails because the spatiotemporal distance between integration (corticostriatal circuits), action selection (thalamocortical loops), and learning (nigrostriatal circuits) means that the network does not know which synapses should be reinforced to favor previously rewarding actions. We show how observations from neurophysiology, in particular the sustained activation of selected action representations, can provide a simple means of resolving this credit assignment problem in models of CBGT learning. Using a biologically realistic spiking model of the full CBGT circuit, we demonstrate how this solution can allow a network to learn to select optimal targets and to relearn action-outcome contingencies when the environment changes. This simple illustration highlights how the normative framework for corticostriatal plasticity can be expanded to capture macroscopic network dynamics during learning and decision-making.