Sparse online maximum entropy inverse reinforcement learning via proximal optimization and truncated gradient

2022 
-regularization and adaptive per-state learning rates, the proposed algorithm can select features and correct the update direction of the reward weights, reducing model complexity and avoiding overfitting while also speeding up convergence. During each iteration, the truncated gradient (TG) method is applied to ME-FTPRL IRL (the resulting algorithm is named ME-TFTPRL IRL) to update the reward weights, which avoids the floating-point problem of the FTPRL method.
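The truncated-gradient update referred to in the abstract is, in the standard formulation of Langford, Li and Zhang (2009), a plain gradient step followed by shrinking small weights toward zero to induce sparsity. The sketch below is an illustration of that generic TG step, not the paper's ME-TFTPRL IRL algorithm; the function name and the parameters `eta`, `lam`, and `theta` are assumptions chosen for the example.

```python
import numpy as np

def truncated_gradient_step(w, grad, eta, lam, theta=np.inf):
    """One truncated-gradient (TG) update (hypothetical sketch).

    w     : current weight vector
    grad  : gradient of the loss at w
    eta   : learning rate
    lam   : regularization strength (shrinkage per step is eta * lam)
    theta : truncation threshold; weights with |v| > theta are not shrunk
    """
    v = w - eta * grad                       # ordinary gradient step
    shrink = eta * lam                       # amount to pull toward zero
    out = v.copy()
    pos = (v >= 0) & (v <= theta)            # small non-negative weights
    neg = (v < 0) & (v >= -theta)            # small negative weights
    out[pos] = np.maximum(0.0, v[pos] - shrink)   # shrink, clip at zero
    out[neg] = np.minimum(0.0, v[neg] + shrink)   # shrink, clip at zero
    return out

# Small weights are driven exactly to zero (sparsity), large ones survive.
w = np.array([0.05, -0.5, 2.0])
print(truncated_gradient_step(w, np.zeros(3), eta=0.1, lam=1.0, theta=1.0))
```

Because weights inside `[-theta, theta]` are clipped at zero rather than merely shrunk, repeated TG steps produce exactly sparse weight vectors, which is the feature-selection effect the abstract describes.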