Learning Geometric Reasoning and Control for Long-Horizon Tasks from Visual Input

2021 
Long-horizon manipulation tasks require joint reasoning over a sequence of discrete actions and their associated continuous control parameters. While Task and Motion Planning (TAMP) approaches can generate motion plans that account for this joint reasoning, they typically assume full knowledge of the environment (e.g., the shapes and poses of objects) and often require computation times unsuitable for real-time control. To overcome this, we propose a learning framework in which a high-level reasoning network predicts, from an image of the scene, a sequence of discrete actions and the parameter values of their associated low-level controllers. These controllers are parameterized in terms of a learned energy function, yielding a time-invariant controller for each phase. We train the whole framework end-to-end on a dataset of TAMP solutions computed with Logic Geometric Programming. A key feature is that the reasoning network determines the parameters of the controllers jointly, such that the overall task can be solved. Despite having no explicit representation of the geometry or poses of the objects in the scene, our network is able to accomplish geometrically precise manipulation tasks, including handovers and an accurate pointing task in which the parameters of early actions are tightly coupled with those of later actions. Video: https://youtu.be/AcPWRTkr3_g
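The abstract describes two components that a short sketch can make concrete: a reasoning network that maps a scene image to a sequence of discrete actions plus the continuous parameters of their controllers, and a per-phase controller that is time-invariant because it depends only on the current state through a learned energy function. The sketch below is a minimal, hypothetical PyTorch rendering under assumed names and dimensions (`ReasoningNet`, `EnergyController`, `NUM_ACTIONS`, `PARAM_DIM`, `MAX_PHASES`, and the negative-energy-gradient control law are all illustrative assumptions); the paper's actual architecture and energy parameterization are not specified here.

```python
# Hypothetical sketch of the two components the abstract describes.
# All names, dimensions, and the control law are assumptions, not the
# paper's actual design.
import torch
import torch.nn as nn

NUM_ACTIONS = 4   # assumed size of the discrete action vocabulary
PARAM_DIM = 3     # assumed dimension of each controller's parameters
MAX_PHASES = 5    # assumed maximum plan length

class ReasoningNet(nn.Module):
    """Maps a scene image to per-phase action logits and controller
    parameters, predicted jointly from one shared image encoding."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.action_head = nn.Linear(64, MAX_PHASES * NUM_ACTIONS)
        self.param_head = nn.Linear(64, MAX_PHASES * PARAM_DIM)

    def forward(self, image):
        z = self.encoder(image)
        logits = self.action_head(z).view(-1, MAX_PHASES, NUM_ACTIONS)
        params = self.param_head(z).view(-1, MAX_PHASES, PARAM_DIM)
        return logits, params

class EnergyController(nn.Module):
    """Time-invariant controller: the commanded velocity is the negative
    gradient of a learned energy of the current state and phase params,
    so the control law has no explicit dependence on time."""
    def __init__(self, state_dim=7):
        super().__init__()
        self.energy = nn.Sequential(
            nn.Linear(state_dim + PARAM_DIM, 64), nn.Softplus(),
            nn.Linear(64, 1),
        )

    def forward(self, state, params):
        state = state.requires_grad_(True)
        e = self.energy(torch.cat([state, params], dim=-1)).sum()
        # create_graph=True keeps the graph so gradients can also flow
        # back into params during end-to-end training.
        grad, = torch.autograd.grad(e, state, create_graph=True)
        return -grad

# Usage: one forward pass on a dummy scene image and robot state.
net, ctrl = ReasoningNet(), EnergyController()
logits, params = net(torch.randn(1, 3, 128, 128))
velocity = ctrl(torch.randn(1, 7), params[:, 0])  # phase-0 controller
print(logits.shape, params.shape, velocity.shape)
```

Reading the control law as an energy gradient is one way to realize the time-invariance the abstract mentions: the command depends only on the current state and the phase's parameters, and all phase parameters come from a single joint prediction, which is what allows early-action parameters to be coupled with later ones.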