Advanced Persistent Threats (APTs) have dramatically changed the cybersecurity landscape. An APT is a stealthy, continuous, sophisticated, and well-funded attack process pursued for long-term malicious gain, and it thwarts most current defense mechanisms. A defense strategy is needed that continuously combats APTs over a long time span under imperfect or incomplete information about the attacker's actions. To address this need, we propose a stochastic evolutionary game model that simulates the dynamic adversary. We add a player rationality parameter c to the Logit Quantal Response Dynamics (LQRD) model to quantify the cognitive differences among real-world players. We derive an optimal decision-making plan by calculating the stable evolutionary equilibrium, which balances defense cost against defense benefit. Case studies conducted on Energy Delivery Systems (EDS) indicate that the proposed method can help the defender predict possible attack actions, select the corresponding optimal cyber defense remediation over time, and gain the maximum defense payoff.
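To make the role of the rationality parameter c concrete, the following minimal sketch shows a standard logit quantal response rule, in which each strategy is chosen with probability proportional to exp(c × expected payoff). It is an illustration under assumptions, not the model as specified in this work: the function name `logit_response`, the payoff values, and the use of c as a direct multiplier on payoffs are all hypothetical choices made for the example.

```python
import math

def logit_response(payoffs, c):
    """Return logit choice probabilities for a list of expected payoffs.

    Assumption: c >= 0 scales payoffs directly (c = 0 gives uniformly
    random choice; large c approaches a strict best response).
    """
    # Subtract the maximum payoff before exponentiating for numerical stability.
    top = max(payoffs)
    weights = [math.exp(c * (u - top)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical expected payoffs for three defense actions (illustrative only).
defense_payoffs = [1.0, 2.0, 0.5]

for c in (0.0, 1.0, 10.0):
    probs = logit_response(defense_payoffs, c)
    print(c, [round(p, 3) for p in probs])
```

With c = 0 every action is equally likely regardless of payoff, and as c grows the probability mass concentrates on the highest-payoff action, which is one common way to represent players with differing degrees of rationality.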