Deriving a near-optimal power management policy using model-free reinforcement learning and Bayesian classification

Yanzhi Wang, Qing Xie, Ahmed Ammari, Massoud Pedram

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

41 Scopus citations

Abstract

To cope with the variations and uncertainties that emanate from hardware and application characteristics, dynamic power management (DPM) frameworks must be able to learn about the system inputs and environment and adjust the power management policy on the fly. In this paper we present an online adaptive DPM technique based on model-free reinforcement learning (RL), which is commonly used to control stochastic dynamical systems. In particular, we employ temporal difference (TD) learning for semi-Markov decision processes (SMDPs) as the model-free RL method. In addition, a novel workload predictor based on an online Bayes classifier is presented to provide effective estimates of the workload states for the RL algorithm. In this DPM framework, the power-latency tradeoff can be precisely controlled through a user-defined parameter. Experiments show an average power saving of up to 16.7% (without any increase in latency) compared to a reference expert-based approach; alternatively, the per-request latency reduction without any increase in power consumption is up to 28.6% compared to the same expert-based approach.
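The abstract names two ingredients: TD learning for an SMDP and an online Bayes classifier that predicts workload states. The sketch below is not the authors' code; it is a minimal illustration of those two ideas under stated assumptions (cost minimization with a time-dependent discount `GAMMA ** tau` for the SMDP update, Laplace-smoothed counts for the naive Bayes predictor). All state names, feature encodings, and parameter values are invented for illustration.

```python
import math
from collections import defaultdict

GAMMA = 0.95  # assumed per-time-unit discount factor


class SMDPAgent:
    """TD(0)-style Q-learning for an SMDP, minimizing cost (power + latency)."""

    def __init__(self, actions, alpha=0.1):
        self.q = defaultdict(float)  # Q[(state, action)]
        self.actions = actions
        self.alpha = alpha

    def update(self, s, a, cost, tau, s_next):
        # SMDP temporal-difference update: the discount is GAMMA ** tau
        # because tau time units elapsed while the system stayed in state s.
        best_next = min(self.q[(s_next, b)] for b in self.actions)
        target = cost + (GAMMA ** tau) * best_next
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

    def greedy(self, s):
        # Costs are minimized, so the greedy action has the smallest Q-value.
        return min(self.actions, key=lambda a: self.q[(s, a)])


class OnlineBayesPredictor:
    """Online naive Bayes over discrete workload features, Laplace-smoothed.

    Counts are updated incrementally, so the predictor adapts as requests
    arrive; its output (a predicted workload class) can be folded into the
    RL agent's state.
    """

    def __init__(self, classes):
        self.classes = classes
        self.class_counts = defaultdict(int)
        self.feat_counts = defaultdict(int)  # (cls, feature_idx, value) -> count

    def observe(self, features, cls):
        self.class_counts[cls] += 1
        for i, v in enumerate(features):
            self.feat_counts[(cls, i, v)] += 1

    def predict(self, features):
        total = sum(self.class_counts.values())

        def log_post(c):
            # Log prior with add-one smoothing over classes.
            lp = math.log((self.class_counts[c] + 1) / (total + len(self.classes)))
            # Log likelihood with add-one smoothing per binary feature value.
            for i, v in enumerate(features):
                lp += math.log(
                    (self.feat_counts[(c, i, v)] + 1) / (self.class_counts[c] + 2)
                )
            return lp

        return max(self.classes, key=log_post)
```

A usage sketch: after each idle period, the predictor observes the true workload class and the agent applies one SMDP update with the observed cost and sojourn time, then acts greedily in the state augmented with the predicted class.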

Original language: English (US)
Title of host publication: 2011 48th ACM/EDAC/IEEE Design Automation Conference, DAC 2011
Pages: 41-46
Number of pages: 6
State: Published - 2011
Event: 2011 48th ACM/EDAC/IEEE Design Automation Conference, DAC 2011 - San Diego, CA, United States
Duration: Jun 5, 2011 - Jun 9, 2011

Publication series

Name: Proceedings - Design Automation Conference
ISSN (Print): 0738-100X

Other

Other: 2011 48th ACM/EDAC/IEEE Design Automation Conference, DAC 2011
Country: United States
City: San Diego, CA
Period: 6/5/11 - 6/9/11

Keywords

  • Bayes Classification
  • Dynamic Power Management
  • Reinforcement Learning

ASJC Scopus subject areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modeling and Simulation


Cite this

Wang, Y., Xie, Q., Ammari, A., & Pedram, M. (2011). Deriving a near-optimal power management policy using model-free reinforcement learning and Bayesian classification. In 2011 48th ACM/EDAC/IEEE Design Automation Conference, DAC 2011 (pp. 41-46). [5981919] (Proceedings - Design Automation Conference).