TY - GEN
T1 - Person detection and re-identification across multiple images and videos obtained via crowdsourcing
AU - Zheng, Yu
AU - Chen, Zhenhua
AU - Velipasalar, Senem
AU - Tang, Jian
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/9/12
Y1 - 2016/9/12
N2 - Person re-identification is indispensable for consistent labeling across different camera views. Most existing studies use static cameras, apply background subtraction to detect moving people, and then focus on the matching of detection results. However, if cameras are mobile or only single image frames (not videos) are available, then background subtraction cannot be used, and human detection needs to be performed on entire images. In this paper, different from most of the existing work, we focus on a crowdsourcing scenario to find and follow person(s) of interest in the collected images/videos. We propose a novel approach combining R-CNN based person detection with the GPU implementation of color histogram and SURF-based re-identification. Moreover, GeoTags are extracted from the EXIF data of videos captured by smart phones, and are displayed on a map together with the time-stamps. All the processing is performed on a GPU, and the average processing time is 5 ms per frame.
AB - Person re-identification is indispensable for consistent labeling across different camera views. Most existing studies use static cameras, apply background subtraction to detect moving people, and then focus on the matching of detection results. However, if cameras are mobile or only single image frames (not videos) are available, then background subtraction cannot be used, and human detection needs to be performed on entire images. In this paper, different from most of the existing work, we focus on a crowdsourcing scenario to find and follow person(s) of interest in the collected images/videos. We propose a novel approach combining R-CNN based person detection with the GPU implementation of color histogram and SURF-based re-identification. Moreover, GeoTags are extracted from the EXIF data of videos captured by smart phones, and are displayed on a map together with the time-stamps. All the processing is performed on a GPU, and the average processing time is 5 ms per frame.
UR - http://www.scopus.com/inward/record.url?scp=84989350500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84989350500&partnerID=8YFLogxK
U2 - 10.1145/2967413.2967421
DO - 10.1145/2967413.2967421
M3 - Conference contribution
AN - SCOPUS:84989350500
T3 - ACM International Conference Proceeding Series
SP - 178
EP - 183
BT - ICDSC 2016 - 10th International Conference on Distributed Smart Cameras
PB - Association for Computing Machinery
T2 - 10th International Conference on Distributed Smart Cameras, ICDSC 2016
Y2 - 12 September 2016 through 15 September 2016
ER -