Reinforcement learning algorithms for dynamic power management

Maryam Triki, Ahmed C. Ammari, Yanzhi Wang, Massoud Pedram

Research output: Chapter in Book/Entry/Poem › Conference contribution

4 Scopus citations


In this paper we present a dynamic power management (DPM) framework based on model-free reinforcement learning (RL) techniques. For the RL algorithms, we employ both temporal difference (TD) learning and Q-learning for semi-Markov decision processes in a continuous-time setting. The proposed DPM framework is model-free and does not require any prior information about the workload characteristics. The power manager learns the optimal power management policy, which significantly reduces energy consumption while maintaining an acceptable performance level. Moreover, power-latency tradeoffs can be precisely controlled through a user-defined parameter. In addition, TD learning is compared with the Q-learning approach in terms of both performance and convergence speed. Experiments on network cards show that TD learning achieves better power savings without sacrificing latency and converges faster than Q-learning.
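The abstract describes Q-learning over device power modes with a user-defined parameter controlling the power-latency tradeoff. A minimal sketch of that idea is below; note this is a toy discrete-time Q-learning power manager, not the paper's continuous-time semi-Markov formulation, and all state names, cost values, and workload dynamics are illustrative assumptions.

```python
import random

def q_learning_dpm(episodes=200, alpha=0.1, gamma=0.95, eps=0.1, lam=2.0, seed=0):
    """Toy Q-learning power manager.

    `lam` is the user-defined weight on latency penalties, echoing the
    power-latency tradeoff parameter mentioned in the abstract.
    All dynamics and costs below are illustrative assumptions.
    """
    rng = random.Random(seed)
    states = ["idle", "busy"]      # workload states (assumed)
    actions = ["sleep", "active"]  # device power modes (assumed)
    Q = {(s, a): 0.0 for s in states for a in actions}

    # Assumed costs: the active mode burns more energy;
    # sleeping while the workload is busy incurs a latency penalty.
    energy = {"sleep": 0.1, "active": 1.0}

    def reward(state, action):
        latency = 1.0 if (state == "busy" and action == "sleep") else 0.0
        return -(energy[action] + lam * latency)

    def step(state):
        # Toy two-state workload: flips between idle/busy with prob. 0.3.
        if rng.random() < 0.3:
            return "busy" if state == "idle" else "idle"
        return state

    for _ in range(episodes):
        s = rng.choice(states)
        for _ in range(50):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: Q[(s, x)])
            r = reward(s, a)
            s2 = step(s)
            # Standard Q-learning update.
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

With the assumed costs, the learned policy sleeps when the workload is idle and stays active when it is busy; lowering `lam` biases the policy toward more aggressive power saving at the cost of latency.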

Original language: English (US)
Title of host publication: 2014 World Symposium on Computer Applications and Research, WSCAR 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479928057
State: Published - Oct 3 2014
Externally published: Yes
Event: 2014 World Symposium on Computer Applications and Research, WSCAR 2014 - Sousse, Tunisia
Duration: Jan 18 2014 - Jan 20 2014



ASJC Scopus subject areas

  • Computer Science Applications


