Monte Carlo tree search for dynamic shortest-path interdiction

Alexey A. Bochkarev, J. Cole Smith

Research output: Contribution to journal › Article › peer-review

Abstract

We present a reinforcement learning-based heuristic for a two-player interdiction game called the dynamic shortest path interdiction problem (DSPI). The DSPI involves an evader and an interdictor who take turns: at each step of the game, the interdictor selects a set of arcs to attack and the evader then chooses an arc to traverse. Our model employs the Monte Carlo tree search framework to learn a policy for the players using randomized roll-outs. This policy is stored as an asymmetric game tree and can be further refined as the game unfolds. We leverage alpha–beta pruning and existing bounding schemes from the literature to prune suboptimal branches. Our numerical experiments demonstrate that the prescribed approach yields near-optimal solutions in many cases and allows for flexibility in balancing solution quality and computational effort.
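To make the turn structure and the idea of a randomized roll-out concrete, the following is a minimal sketch and not the authors' implementation. It assumes a cost model the abstract does not spell out: each arc has a base cost plus a penalty that applies once it has been attacked, and the interdictor has a fixed attack budget. The toy graph `ARCS`, the function `rollout`, and all parameters are hypothetical names chosen for illustration.

```python
import random

# Hypothetical toy instance (an assumption for illustration):
# (tail, head) -> (base cost, extra cost once the arc has been attacked).
ARCS = {
    ("s", "a"): (1, 3),
    ("s", "b"): (2, 1),
    ("a", "t"): (1, 4),
    ("b", "t"): (1, 1),
}


def rollout(arcs, source, target, budget, rng):
    """Play one game with uniformly random moves; return the evader's total cost.

    Assumes every node the evader can reach has a path to the target.
    """
    node, attacked, cost = source, set(), 0
    while node != target:
        # Interdictor's turn: attack a random subset of intact arcs,
        # limited by the remaining budget (possibly the empty set).
        intact = [a for a in arcs if a not in attacked]
        k = rng.randint(0, min(budget, len(intact)))
        attacked.update(rng.sample(intact, k))
        budget -= k

        # Evader's turn: traverse a random arc leaving the current node.
        arc = rng.choice([a for a in arcs if a[0] == node])
        base, penalty = arcs[arc]
        cost += base + (penalty if arc in attacked else 0)
        node = arc[1]
    return cost


if __name__ == "__main__":
    rng = random.Random(0)
    costs = [rollout(ARCS, "s", "t", budget=2, rng=rng) for _ in range(1000)]
    print("estimated value under random play:", sum(costs) / len(costs))
```

Averaging many such roll-outs gives a noisy estimate of a game state's value. In Monte Carlo tree search generally, estimates of this kind guide which branches of the game tree to expand next, so the tree grows unevenly toward promising moves.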

Original language: English (US)
Pages (from-to): 398-419
Number of pages: 22
Journal: Networks
Volume: 84
Issue number: 4
State: Published - Dec 2024

Keywords

  • Monte Carlo tree search
  • alpha–beta pruning
  • games
  • heuristics
  • interdiction
  • shortest path

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
