Supervised Learning in SNN via Reward-Modulated Spike-Timing-Dependent Plasticity for a Target Reaching Vehicle

2019 
Spiking neural networks (SNNs) offer several advantages over traditional artificial neural networks (ANNs), such as biological plausibility, fast information processing, and energy efficiency. Although SNNs have been used to solve a variety of control tasks with the modulated spike-timing-dependent plasticity (STDP) learning rule, existing solutions usually rely on network architectures hard-coded for a specific task rather than solving tasks in the general-purpose manner of traditional ANNs. This neglects one of the biggest advantages of ANNs: being general-purpose and easy to use thanks to their simple architecture, which usually consists of an input layer, one or more hidden layers, and an output layer. This paper addresses the problem by introducing an end-to-end learning approach for spiking neural networks built with one hidden layer and all-to-all R-STDP synapses. We use the supervised reward-modulated spike-timing-dependent plasticity (R-STDP) learning rule to train two SNN-based sub-controllers to replicate desired obstacle-avoidance and goal-approaching behaviors, provided by pre-generated datasets. Together they form a target-reaching controller that steers a simulated mobile robot to a target area while avoiding obstacles along the way. We demonstrate the performance and effectiveness of the trained SNNs on target-reaching tasks in different unknown scenarios.
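For illustration, the core mechanism of reward-modulated STDP can be sketched as follows: each synapse accumulates an eligibility trace driven by the standard STDP window, and a reward signal (in a supervised setting, derived from the mismatch between actual and desired output) gates whether that trace is consolidated into a weight change. The sketch below is a minimal, hypothetical Python implementation; all parameter values and function names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

# Assumed hyperparameters (illustrative, not from the paper)
A_PLUS, A_MINUS = 1.0, 1.0      # STDP potentiation/depression amplitudes
TAU_PLUS = TAU_MINUS = 20.0     # STDP time constants (ms)
TAU_E = 1000.0                  # eligibility-trace decay constant (ms)
ETA = 0.01                      # learning rate

def stdp_window(dt):
    """Classic exponential STDP window; dt = t_post - t_pre in ms."""
    if dt >= 0:
        return A_PLUS * np.exp(-dt / TAU_PLUS)    # causal pair: potentiate
    return -A_MINUS * np.exp(dt / TAU_MINUS)      # anti-causal pair: depress

def r_stdp_step(w, elig, spike_pairs, reward, dt_ms=1.0):
    """One simulation step of R-STDP.

    w, elig      : (n_pre, n_post) weight and eligibility-trace matrices
    spike_pairs  : list of (i, j, dt) pre/post spike pairings this step
    reward       : scalar reward gating plasticity (e.g., supervision error)
    """
    elig *= np.exp(-dt_ms / TAU_E)                # traces decay over time
    for i, j, dt in spike_pairs:                  # STDP events feed the trace
        elig[i, j] += stdp_window(dt)
    w = np.clip(w + ETA * reward * elig, 0.0, 1.0)  # reward gates the update
    return w, elig

# Usage: 2 pre-neurons, 1 post-neuron, one causal spike pair, positive reward
w = np.full((2, 1), 0.5)
e = np.zeros((2, 1))
w, e = r_stdp_step(w, e, [(0, 0, 5.0)], reward=1.0)
```

With reward fixed at zero no learning occurs regardless of spike activity, which is what distinguishes R-STDP from unsupervised STDP and lets a supervision signal shape the learned behavior.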