In cognitive radio networks (CRNs), dynamic spectrum access has been proposed to improve spectrum utilization, but it also gives rise to spectrum misuse. A common countermeasure is to deploy monitors that detect misbehavior on specific channels. In multi-channel CRNs, however, deploying a monitor on every channel is very costly. With a limited number of monitors, we must decide which channels to monitor. Because switching channels incurs a cost, we must also determine how long to monitor each channel and in which order. Moreover, no information about the misuse behavior is available a priori. To address these questions, we model spectrum usage monitoring as an adversarial multi-armed bandit problem with switching costs and design two effective online algorithms, SpecWatch and SpecWatch+. SpecWatch selects strategies based on the monitoring history and repeats the same strategy for a number of timeslots to reduce switching costs. We prove that its expected weak regret, i.e., the performance difference between the solution of SpecWatch and the optimal fixed solution, is O(T^{2/3}), where T is the time horizon. SpecWatch+ selects strategies more carefully to improve performance. We show that its actual weak regret is O(T^{2/3}) with probability 1-δ for any δ ∈ (0, 1). Both algorithms are evaluated through extensive simulations.
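To make the idea of repeating one strategy over several timeslots concrete, the following is a minimal sketch of a generic batched Exp3 learner. It is not the SpecWatch or SpecWatch+ algorithm itself. The function name `batched_exp3`, the callback `reward_fn`, the exploration rate `gamma`, and the batch length `batch_len` are all hypothetical names introduced for illustration. Each arm stands for one monitoring strategy, and rewards are assumed to lie in [0, 1]. In the generic batching argument for bandits with switching costs, a batch length on the order of T^{1/3} is the usual choice.

```python
import math
import random


def batched_exp3(reward_fn, num_arms, horizon, batch_len, gamma=0.1, seed=0):
    """Exp3 that redraws its arm only once per batch of `batch_len` timeslots.

    reward_fn(t, arm) is assumed to return a reward in [0, 1].
    Returns (total_reward, number_of_switches).
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    total_reward = 0.0
    switches = 0
    prev_arm = None

    for start in range(0, horizon, batch_len):
        # Mix the weight-based distribution with uniform exploration.
        weight_sum = sum(weights)
        probs = [(1 - gamma) * w / weight_sum + gamma / num_arms for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        if prev_arm is not None and arm != prev_arm:
            switches += 1  # a switch can only happen at a batch boundary
        prev_arm = arm

        # Hold the chosen arm for every timeslot in the batch.
        slots = range(start, min(start + batch_len, horizon))
        batch_reward = sum(reward_fn(t, arm) for t in slots)
        total_reward += batch_reward

        # Importance-weighted update using the batch's average reward.
        avg_reward = batch_reward / len(slots)
        weights[arm] *= math.exp(gamma * (avg_reward / probs[arm]) / num_arms)

        # Rescale so the largest weight is 1, which avoids overflow.
        top = max(weights)
        weights = [w / top for w in weights]

    return total_reward, switches
```

Because the arm is redrawn only at batch boundaries, the number of switches is at most the number of batches minus one. This is the trade-off the abstract describes: longer batches reduce switching costs but slow down learning.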