Adversarial reinforcement learning for dynamic treatment regimes

Zhaohong Sun, Wei Dong, Haomin Li, Zhengxing Huang

2023

Treatment recommendation, the task of delivering effective interventions according to patient state and expected outcome, plays a vital role in precision medicine and healthcare management. Reinforcement learning is well suited to learning optimal policies for recommender systems and is therefore a promising way to address this task. However, most existing solutions require frequent interactions between the treatment recommender system and the clinical environment, which are expensive, time-consuming, and even infeasible in clinical practice. In this study, we present a novel model-based offline reinforcement learning approach that optimizes a treatment policy using patient treatment trajectories in Electronic Health Records (EHRs). Specifically, a patient treatment trajectory simulator is first constructed from the ground-truth trajectories in EHRs. The simulator is then used to model the online interactions between the treatment recommender system and the clinical environment, so that counterfactual trajectories can be generated. To alleviate the bias arising from the ground-truth and counterfactual trajectories, an adversarial network is incorporated into the proposed model, so that a large space of treatment actions can be explored with scaled rewards. The proposed model is evaluated on a simulated dataset and a real-world dataset. The experimental results show that the proposed model outperforms the compared methods, offering a new solution for dynamic treatment regimes and beyond.
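The pipeline described above (fit a simulator on logged trajectories, roll out counterfactual trajectories under a candidate policy, scale rewards by a realism score) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes discrete states and actions, swaps in a count-based transition table for the learned simulator, and swaps in an empirical action frequency for the adversarial network. All function names here (`fit_simulator`, `rollout`, `realism_score`, `scaled_reward`) are hypothetical.

```python
import random
from collections import defaultdict


def fit_simulator(trajectories):
    """Count-based transition model built from logged (state, action, next_state) triples.

    Assumption: a simple stand-in for the learned patient trajectory simulator.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for s, a, s_next in traj:
            counts[(s, a)][s_next] += 1
    return counts


def simulate_step(counts, s, a, rng):
    """Sample a next state; stay in place for (state, action) pairs never logged."""
    nxt = counts.get((s, a))
    if not nxt:
        return s
    states = list(nxt)
    weights = [nxt[x] for x in states]
    return rng.choices(states, weights=weights, k=1)[0]


def rollout(counts, policy, s0, horizon, rng):
    """Generate one counterfactual trajectory by running `policy` in the simulator."""
    traj, s = [], s0
    for _ in range(horizon):
        a = policy(s)
        s_next = simulate_step(counts, s, a, rng)
        traj.append((s, a, s_next))
        s = s_next
    return traj


def realism_score(counts, s, a):
    """Fraction of logged visits to state s that took action a.

    Assumption: a crude proxy for a discriminator's output in [0, 1].
    """
    n_sa = sum(counts.get((s, a), {}).values())
    n_s = sum(sum(v.values()) for (s2, _), v in counts.items() if s2 == s)
    return n_sa / n_s if n_s else 0.0


def scaled_reward(raw_reward, score):
    """Down-weight rewards for state-action pairs that look unlike the logged data."""
    return raw_reward * score


# Two short logged trajectories over states {0, 1, 2} and actions {"a", "b"}.
logged = [
    [(0, "a", 1), (1, "b", 2)],
    [(0, "a", 1), (1, "a", 1)],
]
sim = fit_simulator(logged)
counterfactual = rollout(sim, lambda s: "a", s0=0, horizon=3, rng=random.Random(0))
```

In this sketch, an action that clinicians rarely or never took in a given state receives a low score, so its simulated reward is shrunk before it reaches the policy update. A learned discriminator would play the same role with continuous patient states.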
