Autonomous vehicles often rely on the Global Positioning System (GPS) for navigational guidance, much as people do with their mobile phones or automobile radios. However, GPS is not available or reliable everywhere, so autonomous vehicles need more dependable means of determining where they are and where they should head. Moreover, even when GPS is reliable, autonomous vehicles usually need additional sensors for more precise position estimation. In this work, we propose a localization method for autonomous Unmanned Aerial Vehicles (UAVs) performing infrastructure health monitoring that does not rely on GPS data. The proposed method depends only on depth image frames from a 3D camera (Structure Sensor) and a 3D map of the structure. Captured 3D scenes are projected onto 2D binary images, which serve as templates and are matched against the 2D projection of the relevant facade of the structure. Back-projections of the matching regions are then used to calculate the 3D translation (shift), which gives the estimated position relative to the structure. Our method estimates the position for each frame independently of the others, at a rate of 200 Hz; thus, the error does not accumulate with the traveled distance. The proposed approach provides promising results, with a mean Euclidean distance error of 13.4 cm and a standard deviation of 8.4 cm.
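The project, match, and back-project pipeline summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: it assumes an orthographic projection along the depth axis, a facade aligned with the x-y plane, no rotation between camera and map, and an exhaustive pixel-mismatch search in place of whatever matcher is actually used. The function names (`to_binary_image`, `best_match`, `estimate_translation`) and the grid cell size `cell` are hypothetical.

```python
import numpy as np

def to_binary_image(points, cell, origin, shape):
    """Project Nx3 points onto the x-y plane as a binary occupancy image."""
    idx = np.rint((points[:, :2] - origin) / cell).astype(int)  # (col, row) per point
    keep = ((idx[:, 0] >= 0) & (idx[:, 0] < shape[1]) &
            (idx[:, 1] >= 0) & (idx[:, 1] < shape[0]))
    img = np.zeros(shape, dtype=bool)
    img[idx[keep, 1], idx[keep, 0]] = True
    return img

def best_match(image, template):
    """Exhaustive search for the (row, col) offset with the fewest differing pixels."""
    h, w = template.shape
    best_pos, best_cost = (0, 0), np.inf
    for r in range(image.shape[0] - h + 1):
        for c in range(image.shape[1] - w + 1):
            cost = np.count_nonzero(image[r:r + h, c:c + w] ^ template)
            if cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos

def estimate_translation(scene, facade_map, cell, map_shape):
    """Estimate the shift t such that map point = scene point + t (assumed model)."""
    facade_img = to_binary_image(facade_map, cell, np.zeros(2), map_shape)

    # Template: binary projection of the captured scene, anchored at its min corner.
    lo = scene[:, :2].min(axis=0)
    extent = np.rint((scene[:, :2].max(axis=0) - lo) / cell).astype(int) + 1  # (w, h)
    template = to_binary_image(scene, cell, lo, (extent[1], extent[0]))

    r, c = best_match(facade_img, template)

    # Back-project the matched window: in-plane shift from the pixel offset,
    # depth shift from the median depth of map points inside the window.
    shift_xy = np.array([c, r]) * cell - lo
    cols = np.rint(facade_map[:, 0] / cell)
    rows = np.rint(facade_map[:, 1] / cell)
    in_win = ((rows >= r) & (rows < r + extent[1]) &
              (cols >= c) & (cols < c + extent[0]))
    shift_z = np.median(facade_map[in_win, 2]) - np.median(scene[:, 2])
    return np.array([shift_xy[0], shift_xy[1], shift_z])
```

Because each call uses only the current frame and the stored map, the estimate carries no state from previous frames, which is the property that keeps error from accumulating with traveled distance.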