Efficient Human Activity Classification from Egocentric Videos Incorporating Actor-Critic Reinforcement Learning

Yantao Lu, Yilan Li, Senem Velipasalar

Research output: Chapter in Book/Entry/Poem › Conference contribution

7 Scopus citations

Abstract

In this paper, we introduce a novel framework that significantly reduces the computational cost of human temporal activity recognition from egocentric videos while maintaining accuracy at the same level. We propose applying the actor-critic model of reinforcement learning to optical flow data to locate a bounding box around the region of interest, which is then used to clip a sub-image from a video frame. We also propose using one shallow and one deeper 3D convolutional neural network to process the original image and the clipped image region, respectively. We compared the proposed method with another approach using 3D convolutional networks on the recently released Dataset of Multimodal Semantic Egocentric Video. Experimental results show that the proposed method reduces processing time by 36.4% while providing comparable accuracy.
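As a rough illustration of the clipping step mentioned in the abstract, the sketch below cuts a sub-image out of each frame of a short clip, given a bounding box. It is not the authors' implementation. The `(x, y, width, height)` box layout, the function names `clamp_box` and `clip_frames`, and the list-of-lists frame representation are all assumptions made for illustration. The actor-critic agent that would propose the box from optical flow is not reproduced here.

```python
from typing import List, Tuple

Frame = List[List[float]]          # one grayscale frame, indexed [row][col]
Box = Tuple[int, int, int, int]    # hypothetical (x, y, width, height) layout


def clamp_box(box: Box, frame_w: int, frame_h: int) -> Box:
    """Shrink the box so it lies fully inside a frame_w x frame_h frame."""
    x, y, w, h = box
    x0 = max(0, min(x, frame_w))
    y0 = max(0, min(y, frame_h))
    x1 = max(x0, min(x + w, frame_w))
    y1 = max(y0, min(y + h, frame_h))
    return (x0, y0, x1 - x0, y1 - y0)


def clip_frames(frames: List[Frame], box: Box) -> List[Frame]:
    """Cut the same sub-image out of every frame in a short clip."""
    if not frames:
        return []
    frame_h, frame_w = len(frames[0]), len(frames[0][0])
    x, y, w, h = clamp_box(box, frame_w, frame_h)
    return [[row[x:x + w] for row in frame[y:y + h]] for frame in frames]


if __name__ == "__main__":
    # Two 4x6 frames whose pixel value encodes its (row, col) position.
    clip = [[[10 * r + c for c in range(6)] for r in range(4)] for _ in range(2)]
    region = clip_frames(clip, (4, 1, 5, 2))   # box overhangs the right edge
    print(len(region), len(region[0]), len(region[0][0]))
```

In a two-network setup like the one the abstract describes, the clipped region would presumably be passed to one network and the full frame to the other. The clamping is included only so that a box extending past the frame border still yields a valid, smaller crop.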

Original language: English (US)
Title of host publication: 2019 IEEE International Conference on Image Processing, ICIP 2019 - Proceedings
Publisher: IEEE Computer Society
Pages: 564-568
Number of pages: 5
ISBN (Electronic): 9781538662496
State: Published - Sep 2019
Event: 26th IEEE International Conference on Image Processing, ICIP 2019 - Taipei, Taiwan, Province of China
Duration: Sep 22 2019 → Sep 25 2019

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2019-September
ISSN (Print): 1522-4880

Conference

Conference: 26th IEEE International Conference on Image Processing, ICIP 2019
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 9/22/19 → 9/25/19

Keywords

  • activity classification
  • actor critic
  • reinforcement learning

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing
