Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network

Ning Liu, Ying Liu, Brent Logan, Zhiyuan Xu, Jian Tang, Yanzhi Wang

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

This paper presents a deep reinforcement learning (DRL) framework for estimating optimal Dynamic Treatment Regimes from observational medical data. The framework handles high-dimensional action and state spaces more flexibly than existing reinforcement learning methods, modeling the real-life complexity of heterogeneous disease progression and treatment choices, with the goal of providing doctors and patients with data-driven, personalized decision recommendations. The proposed DRL framework comprises (i) a supervised learning step to predict expert actions, and (ii) a deep reinforcement learning step to estimate the long-term value function of Dynamic Treatment Regimes. Both steps rely on deep neural networks. As a key motivating example, we implemented the proposed framework on a data set from the Center for International Bone Marrow Transplant Research (CIBMTR) registry database, focusing on the sequence of prevention and treatment choices for acute and chronic graft-versus-host disease after transplantation. The experimental results demonstrate promising accuracy in predicting human experts' decisions, as well as a high expected reward for the DRL-based dynamic treatment regimes.
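The two-step pipeline described in the abstract, supervised prediction of expert actions followed by offline (batch) Q-learning of the treatment-regime value function, can be illustrated with a minimal sketch. The sketch below substitutes linear function approximation and fitted Q-iteration for the paper's deep networks, and runs on synthetic logged data; all variable names and data are hypothetical and are not drawn from the CIBMTR registry.

```python
import numpy as np

# Illustrative sketch only: a two-step pipeline loosely mirroring the
# framework in the abstract, on synthetic data. Step (i): supervised
# prediction of "expert" actions. Step (ii): offline fitted Q-iteration
# with a linear Q-function per action over logged transitions.
rng = np.random.default_rng(0)

n, d, n_actions = 500, 4, 3   # transitions, state dimension, actions
gamma = 0.9                   # discount factor

# Logged data: state, action taken, observed reward, next state.
S = rng.normal(size=(n, d))
A = rng.integers(0, n_actions, size=n)
R = rng.normal(size=n) + (A == 0) * 0.5       # action 0 is slightly better
S_next = S + 0.1 * rng.normal(size=(n, d))

# Step (i): one-vs-rest least-squares classifier for expert actions.
Y = np.eye(n_actions)[A]                      # one-hot expert actions
X = np.hstack([S, np.ones((n, 1))])           # features plus bias column
W_clf = np.linalg.lstsq(X, Y, rcond=None)[0]
acc = np.mean((X @ W_clf).argmax(axis=1) == A)

# Step (ii): fitted Q-iteration, one linear model per action.
X_next = np.hstack([S_next, np.ones((n, 1))])
W_q = np.zeros((d + 1, n_actions))
for _ in range(50):
    q_next = X_next @ W_q                     # Q(s', a') for all a'
    target = R + gamma * q_next.max(axis=1)   # Bellman backup
    for a in range(n_actions):                # regress per observed action
        mask = A == a
        W_q[:, a] = np.linalg.lstsq(X[mask], target[mask], rcond=None)[0]

# Greedy learned regime: recommended action per state.
policy = (X @ W_q).argmax(axis=1)
```

Since the synthetic reward favors action 0 regardless of state, the greedy policy should recommend it for most states; with real registry data, the deep networks of the paper would replace both linear models.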

Original language: English (US)
Article number: 1495
Journal: Scientific Reports
Volume: 9
Issue number: 1
DOIs: 10.1038/s41598-018-37142-0
State: Published - Dec 1 2019


ASJC Scopus subject areas

  • General

Cite this

Liu, N., Liu, Y., Logan, B., Xu, Z., Tang, J., & Wang, Y. (2019). Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network. Scientific Reports, 9(1), 1495. https://doi.org/10.1038/s41598-018-37142-0
@article{4ab8c9d567194f738a32566ae0ea2000,
title = "Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network",
author = "Ning Liu and Ying Liu and Brent Logan and Zhiyuan Xu and Jian Tang and Yanzhi Wang",
year = "2019",
month = "12",
day = "1",
doi = "10.1038/s41598-018-37142-0",
language = "English (US)",
volume = "9",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network

AU - Liu, Ning

AU - Liu, Ying

AU - Logan, Brent

AU - Xu, Zhiyuan

AU - Tang, Jian

AU - Wang, Yanzhi

PY - 2019/12/1

Y1 - 2019/12/1


UR - http://www.scopus.com/inward/record.url?scp=85061117758&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061117758&partnerID=8YFLogxK

U2 - 10.1038/s41598-018-37142-0

DO - 10.1038/s41598-018-37142-0

M3 - Article

C2 - 30728403

AN - SCOPUS:85061117758

VL - 9

JO - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

IS - 1

M1 - 1495

ER -