Neural Network Heuristic Functions for Classical Planning: Reinforcement Learning and Comparison to Other Methods

2021 
How can we train neural network (NN) heuristic functions for classical planning, using only states as the NN input? Prior work addressed this question by (a) supervised learning and/or (b) per-domain learning generalizing over problem instances. The former limits the approach to instances small enough for training data generation, the latter to domains and instance distributions where the necessary knowledge generalizes across instances. Clearly, reinforcement learning (RL) on large instances can potentially avoid both difficulties. We explore this here in terms of three methods drawing on previous ideas relating to bootstrapping and approximate value iteration, including a new bootstrapping variant that estimates search effort instead of goal distance. We empirically compare these methods to (a) and (b), aligning three different NN heuristic function learning architectures for cross-comparison in an experiment of unprecedented breadth in this context. Key lessons from this experiment are that our methods and supervised learning are highly complementary; that per-instance learning often yields stronger heuristics than per-domain learning; and that LAMA is still dominant but is outperformed by our methods in one benchmark domain.
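To make the bootstrapping idea concrete, here is a minimal sketch (not the authors' code) of how such a loop can be set up: a learned heuristic guides greedy best-first search under an expansion budget, solved instances yield training labels, and the network is retrained on those labels. The toy grid-navigation domain, network sizes, budgets, and all names below are illustrative assumptions; it assumes PyTorch and labels states with remaining plan length (classical goal distance), whereas the paper's new variant would label search effort instead.

```python
# Minimal bootstrapping sketch on a toy grid task (illustrative, not the
# paper's implementation). Assumes PyTorch is installed.
import heapq
import random
import torch
import torch.nn as nn

SIZE = 10
GOAL = (SIZE - 1, SIZE - 1)

def successors(s):
    x, y = s
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny)

# Small MLP heuristic over raw state coordinates (sizes are arbitrary).
net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def h(s):
    with torch.no_grad():
        return net(torch.tensor(s, dtype=torch.float32)).item()

def gbfs(start, budget):
    """Greedy best-first search guided by the learned heuristic.
    Returns the solution path (or None) and the number of expansions."""
    frontier = [(h(start), start)]
    parent, expansions = {start: None}, 0
    while frontier and expansions < budget:
        _, s = heapq.heappop(frontier)
        expansions += 1
        if s == GOAL:
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1], expansions
        for t in successors(s):
            if t not in parent:
                parent[t] = s
                heapq.heappush(frontier, (h(t), t))
    return None, expansions

# Bootstrapping loop: solve instances with the current heuristic, label
# states on solution paths with remaining cost, retrain, and grow the
# search budget whenever nothing is solved.
budget = 50
for iteration in range(20):
    states, targets = [], []
    for _ in range(16):
        start = (random.randrange(SIZE), random.randrange(SIZE))
        path, _ = gbfs(start, budget)
        if path is None:
            continue
        for i, s in enumerate(path):
            states.append(s)
            targets.append(len(path) - 1 - i)  # remaining cost to goal
    if not states:
        budget *= 2  # nothing solved: relax the expansion budget
        continue
    x = torch.tensor(states, dtype=torch.float32)
    y = torch.tensor(targets, dtype=torch.float32).unsqueeze(1)
    for _ in range(100):  # regression on the bootstrapped labels
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()
    print(f"iter {iteration}: {len(states)} labels, loss {loss.item():.3f}")
```

The search-effort variant described in the abstract would replace the remaining-cost target with a measure of the effort the search spends below each state; everything else in the loop stays structurally the same.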