Dynamic power management (DPM) in battery-powered mobile systems seeks higher energy efficiency by selectively putting idle components into a sleep state. However, re-activating these components later consumes a large amount of energy and therefore draws significant power from the system's battery. This cost is known as the energy overhead of the "wakeup" operation. We start from the observation that, because of the rate capacity effect in the Li-ion batteries commonly used to power mobile systems, the actual energy overhead is larger than previously thought. We then present a model-free reinforcement learning (RL) approach for an adaptive DPM framework in systems with bursty workloads, using a hybrid power supply composed of Li-ion batteries and supercapacitors. Simulation results show that our technique improves power efficiency by up to 9% compared to a battery-only power supply. Our RL-based DPM approach also achieves a much lower energy-delay product than a previously reported expert-based learning approach.
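
To make the rate capacity effect concrete, the following minimal sketch estimates how much charge a high-current wakeup burst removes from a battery relative to the charge it actually delivers. It assumes a Peukert-style approximation, which is a common textbook model and not necessarily the battery model used in this work. The function names, the rated current, and the exponent `peukert_k` are hypothetical and chosen only for illustration.

```python
def wakeup_overhead_ratio(wakeup_current_a, rated_current_a=1.0, peukert_k=1.05):
    """Ratio of charge depleted from the battery to charge delivered to the load.

    Assumes a Peukert-style penalty (an illustrative assumption, not the
    paper's model): drawing current above the rated current depletes the
    battery faster than the delivered charge alone would suggest.
    """
    if wakeup_current_a <= 0 or rated_current_a <= 0:
        raise ValueError("currents must be positive")
    if peukert_k < 1.0:
        raise ValueError("Peukert exponent must be >= 1")
    return (wakeup_current_a / rated_current_a) ** (peukert_k - 1.0)


def effective_charge_drawn(current_a, duration_s, rated_current_a=1.0, peukert_k=1.05):
    """Charge in coulombs effectively removed from the battery by one burst."""
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    delivered = current_a * duration_s  # charge the load actually receives
    return delivered * wakeup_overhead_ratio(current_a, rated_current_a, peukert_k)


if __name__ == "__main__":
    # Hypothetical wakeup burst: 4 A for 0.5 s on a battery rated at 1 A.
    delivered = 4.0 * 0.5
    depleted = effective_charge_drawn(4.0, 0.5)
    print(f"delivered {delivered:.3f} C, depleted {depleted:.3f} C")
```

Under this approximation, a burst at the rated current incurs no penalty, while a burst at a higher current depletes more charge than it delivers. That gap is one reason a supercapacitor, which can absorb short current peaks, may reduce the effective wakeup overhead in a hybrid supply.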