Dissociating the Contributions of Reward-Prediction Errors to Trial-Level Adaptation and Long-Term Learning

2019 
Abstract Reward positivity (RewP) is an EEG component reflecting reward-prediction errors. Using multilevel models, we measured single-trial RewP amplitude from trial to trial while reward and prediction varied during learning. Sixty participants completed a category-learning task in either engaging or sterile conditions, with the RewP time-locked to feedback. Sequential analysis of single-trial RewP showed its relationship to current and previous accuracy and to the probability of changing one’s response to subsequent stimuli. Simulations show these effects can be explained in detail by the dynamics of participants’ expectations according to principles of reinforcement learning. The single-trial RewP findings were consistent with previous literature linking RewP to reward-prediction error under reinforcement-learning theory. In contrast, the aggregate RewP was unrelated to the engagement manipulation or to delayed retention performance. Thus, the present results provide a detailed computational account of how RewP relates to acute adaptation, but suggest RewP plays little role in long-term learning.
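
As a rough illustration of what a reward-prediction error is under reinforcement-learning theory, the sketch below applies a generic delta-rule (Rescorla-Wagner style) update to a sequence of feedback outcomes. It is not the simulation used in the paper, whose model is not specified in the abstract; the function name `prediction_errors`, the learning rate `alpha`, and the starting expectation `v0` are assumptions chosen for illustration.

```python
def prediction_errors(rewards, alpha=0.2, v0=0.5):
    """Return per-trial reward-prediction errors under a simple delta rule.

    rewards: sequence of outcomes (1 = correct feedback, 0 = error feedback)
    alpha:   assumed learning rate (hypothetical value)
    v0:      assumed initial reward expectation (hypothetical value)
    """
    v = v0
    errors = []
    for r in rewards:
        delta = r - v        # prediction error: outcome minus expectation
        errors.append(delta)
        v += alpha * delta   # move the expectation toward the outcome
    return errors


# Example: two correct trials, one error, then another correct trial.
rewards = [1, 1, 0, 1]
rpes = prediction_errors(rewards)
```

In this sketch the prediction error shrinks across consecutive correct trials as the expectation rises, and turns negative on the unexpected error, which is the kind of trial-to-trial dynamic the abstract attributes to participants' expectations.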