Efficient Sampling With Q-Learning to Guide Rapidly Exploring Random Trees

2018 
This letter presents a novel approach for efficient sampling of Rapidly-exploring Random Trees (RRTs) based upon learning a state-action value function (Q-function). Our sampling method selects the optimal node to extend in the search tree via the learned state value computed from the node feature representation. Our softmax node selection procedure avoids becoming stuck at local minima and maintains the asymptotic completeness property of RRTs. We employ several features in learning the Q-function, including radial basis function (RBF) scoring of collision and collision-free regions in the configuration space. Since this approach allows the RRT to explore efficiently while avoiding obstacles via the Q-function, the RRT planner is continually adapted to the surrounding environment in an online manner. We compare our proposed method with traditional sampling-based planning algorithms in a number of robot arm planning scenarios and demonstrate the utility and effectiveness of our approach.
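The abstract does not reproduce the paper's training or selection details. The sketch below is a minimal illustration of the two ideas it names: RBF features scored against known collision and collision-free samples, and softmax selection of which tree node to extend. The linear Q-function, the feature layout, and all names (`rbf_features`, `softmax_select`, the `temperature` and `gamma` parameters) are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_features(q, centers, gamma=5.0):
    """RBF similarity of configuration q to a set of labeled region centers."""
    d2 = np.sum((centers - q) ** 2, axis=1)
    return np.exp(-gamma * d2)

def node_features(q, free_centers, obs_centers):
    """Feature vector: RBF scores against collision-free and colliding samples
    (hypothetical layout; the paper's exact features are not given here)."""
    return np.concatenate([rbf_features(q, free_centers),
                           rbf_features(q, obs_centers)])

def softmax_select(nodes, w, free_centers, obs_centers, temperature=1.0):
    """Pick a tree node to extend with probability proportional to exp(Q/T).

    Every node keeps a nonzero selection probability, which is the property
    the abstract credits with avoiding local minima while preserving the
    completeness of the underlying RRT.
    """
    q_vals = np.array([w @ node_features(n, free_centers, obs_centers)
                       for n in nodes])
    logits = (q_vals - q_vals.max()) / temperature  # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return nodes[rng.choice(len(nodes), p=probs)]

# Toy usage: a 2-D configuration space, a few labeled region centers, and an
# arbitrary (untrained) weight vector standing in for the learned Q-function.
free_centers = rng.uniform(0, 1, size=(4, 2))  # samples known collision-free
obs_centers = rng.uniform(0, 1, size=(3, 2))   # samples known to collide
w = rng.normal(size=7)                         # weights for 4 + 3 features
tree_nodes = [rng.uniform(0, 1, size=2) for _ in range(10)]
print("node selected for extension:", softmax_select(tree_nodes, w, free_centers, obs_centers))
```

In this sketch the `temperature` parameter trades off greediness against exploration: as it grows, selection approaches uniform sampling over tree nodes (plain RRT behavior), and as it shrinks, selection concentrates on the highest-valued node.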