TY - GEN
T1 - Deep reinforcement learning
T2 - 36th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2017
AU - Li, Hongjia
AU - Wei, Tianshu
AU - Ren, Ao
AU - Zhu, Qi
AU - Wang, Yanzhi
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/13
Y1 - 2017/12/13
N2 - The recent breakthroughs of the deep reinforcement learning (DRL) technique in AlphaGo and in playing Atari games have set a good example in handling the large state and action spaces of complicated control problems. The DRL technique comprises (i) an offline deep neural network (DNN) construction phase, which derives the correlation between each state-action pair of the system and its value function, and (ii) an online deep Q-learning phase, which adaptively derives the optimal action and updates value estimates. In this paper, we first present the general DRL framework, which can be widely utilized in many applications with different optimization objectives. This is followed by the introduction of three specific applications: the cloud computing resource allocation problem, the residential smart grid task scheduling problem, and the building HVAC system optimal control problem. The effectiveness of the DRL technique in these three cyber-physical applications has been validated. Finally, this paper investigates stochastic computing-based hardware implementations of the DRL framework, which achieve a significant improvement in area efficiency and power consumption compared with binary-based implementation counterparts.
AB - The recent breakthroughs of the deep reinforcement learning (DRL) technique in AlphaGo and in playing Atari games have set a good example in handling the large state and action spaces of complicated control problems. The DRL technique comprises (i) an offline deep neural network (DNN) construction phase, which derives the correlation between each state-action pair of the system and its value function, and (ii) an online deep Q-learning phase, which adaptively derives the optimal action and updates value estimates. In this paper, we first present the general DRL framework, which can be widely utilized in many applications with different optimization objectives. This is followed by the introduction of three specific applications: the cloud computing resource allocation problem, the residential smart grid task scheduling problem, and the building HVAC system optimal control problem. The effectiveness of the DRL technique in these three cyber-physical applications has been validated. Finally, this paper investigates stochastic computing-based hardware implementations of the DRL framework, which achieve a significant improvement in area efficiency and power consumption compared with binary-based implementation counterparts.
KW - Cyber-physical systems
KW - Deep reinforcement learning
KW - Optimal control
KW - Stochastic computing
UR - http://www.scopus.com/inward/record.url?scp=85043497204&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85043497204&partnerID=8YFLogxK
U2 - 10.1109/ICCAD.2017.8203866
DO - 10.1109/ICCAD.2017.8203866
M3 - Conference contribution
AN - SCOPUS:85043497204
T3 - IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
SP - 847
EP - 854
BT - 2017 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 November 2017 through 16 November 2017
ER -