To address the anomaly detection problem in the presence of noisy sensor observations and probing costs, we propose a soft actor-critic deep reinforcement learning framework. Moreover, to account for adversarial jamming attacks, we design a generative adversarial network (GAN) based framework to identify the jammed sensors. We evaluate the proposed framework in terms of detection accuracy, stopping time, and the total number of samples needed for detection. Via simulation results, we demonstrate that the soft actor-critic algorithm is sensitive to the probing cost and actively adapts to different environment settings. We analyze the impact of jamming attacks and identify the improvements achieved by the GAN-based approach. We further compare the performance of the proposed soft actor-critic algorithm with that of conventional actor-critic algorithms.