Achieving autonomous power management using reinforcement learning

Hao Shen, Ying Tan, Jun Lu, Qing Wu, Qinru Qiu

Research output: Contribution to journal › Article › peer-review

87 Scopus citations


System-level power management must consider the uncertainty and variability that come from the environment, the application, and the hardware. A robust power management technique must be able to learn the optimal decision from past events and improve itself as the environment changes. This article presents a novel online power management technique based on model-free constrained reinforcement learning (Q-learning). The proposed learning algorithm requires no prior information about the workload and dynamically adapts to the environment to achieve autonomous power management. We focus on the power management of the peripheral device and the microprocessor, two of the basic components of a computer. Because their operating behaviors and performance considerations differ, these two types of devices require different Q-learning agent designs. The article discusses system modeling and cost function construction for both types of Q-learning agent. Enhancement techniques are also proposed to speed up convergence and to better maintain the required performance (or power) constraint in a dynamic system with large variations. Compared with existing machine-learning-based power management techniques, Q-learning-based power management adapts more flexibly to different workloads and hardware and provides a wider range of power-performance tradeoffs.
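As a rough illustration of the model-free Q-learning idea named in the abstract, the following minimal sketch trains a tabular agent on a toy two-state device. It is not the authors' design: the states (`busy`, `idle`), the actions (`stay_on`, `sleep`), the power and delay values, the fixed tradeoff `weight`, and the workload probabilities are all hypothetical, and the article's constraint handling and convergence enhancements are not reproduced.

```python
import random
from collections import defaultdict

ACTIONS = ("stay_on", "sleep")  # hypothetical power commands

def q_update(q, state, action, cost, next_state, alpha=0.1, gamma=0.5):
    """One tabular Q-learning step for a cost-minimizing agent."""
    best_next = min(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (cost + gamma * best_next - q[(state, action)])

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore sometimes, else take the cheapest action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return min(ACTIONS, key=lambda a: q[(state, a)])

def step(state, action, rng, weight=2.0):
    """Toy environment (assumed): returns the observed cost and the next state."""
    power = 1.0 if action == "stay_on" else 0.1
    # Sleeping while requests are pending incurs a delay penalty.
    delay = 1.0 if (action == "sleep" and state == "busy") else 0.0
    cost = power + weight * delay
    # Assumed workload: the current state persists with probability 0.8.
    if rng.random() < 0.8:
        next_state = state
    else:
        next_state = "idle" if state == "busy" else "busy"
    return cost, next_state

def train(steps=20000, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-table, all entries start at 0
    state = "idle"
    for _ in range(steps):
        action = choose_action(q, state, 0.1, rng)
        cost, next_state = step(state, action, rng)
        q_update(q, state, action, cost, next_state)
        state = next_state
    return q

q = train()
policy = {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in ("busy", "idle")}
print(policy)
```

The agent never sees the workload probabilities; it learns only from the costs it observes. In this toy setting, the learned policy keeps the device on while it is busy and puts it to sleep while it is idle. Raising or lowering `weight` shifts the balance between power and delay, which is one simple way to explore a power-performance tradeoff.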

Original language: English (US)
Article number: 24
Journal: ACM Transactions on Design Automation of Electronic Systems
Issue number: 2
State: Published - Mar 2013


Keywords

  • Computer
  • Machine learning
  • Power management
  • Thermal management

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering

