Human activity classification incorporating egocentric video and inertial measurement unit data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many methods have been proposed for human activity classification, relying either on Inertial Measurement Unit (IMU) data or on data from static cameras observing subjects. There has been relatively little work using egocentric videos, and even fewer approaches combine egocentric video with IMU data. Systems relying only on IMU data are limited in the complexity of the activities they can detect. In this paper, we present a robust and autonomous method for fine-grained activity classification that leverages data from multiple wearable sensor modalities to differentiate between activities that are similar in nature, with a level of accuracy that would be impossible with each sensor alone. We use both egocentric videos and IMU sensors on the body. We employ Capsule Networks together with Convolutional Long Short-Term Memory (LSTM) networks to analyze egocentric videos, and an LSTM framework to analyze IMU data and capture the temporal aspect of actions. We performed experiments on the CMU-MMAC dataset, achieving overall recall and precision rates of 85.8% and 86.2%, respectively. We also present results of using each sensor modality alone, which show that the proposed approach provides a 19.47% and a 39.34% increase in accuracy compared to using only ego-vision data and only IMU data, respectively.
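The abstract describes combining per-modality classifiers (Capsule Networks with Convolutional LSTM for video, an LSTM for IMU data) but does not spell out how the two streams are combined. As a purely illustrative sketch, not the authors' actual method, one common way to combine modalities is score-level (late) fusion: average the per-class probability vectors of the two classifiers and take the argmax. The function names, fusion weight, and toy scores below are all invented for illustration.

```python
import math

def softmax(logits):
    # Numerically stable conversion of raw scores to probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_predictions(video_logits, imu_logits, w_video=0.5):
    # Hypothetical score-level (late) fusion: weighted average of the
    # two modalities' class-probability vectors, then argmax.
    p_video = softmax(video_logits)
    p_imu = softmax(imu_logits)
    fused = [w_video * v + (1.0 - w_video) * i
             for v, i in zip(p_video, p_imu)]
    return fused.index(max(fused)), fused

# Toy example with 3 activity classes: the video stream is ambiguous,
# the IMU stream strongly favors class 2, so fusion settles on class 2.
label, probs = fuse_predictions([1.0, 1.2, 1.1], [0.2, 0.1, 2.0])
```

This only illustrates why two weak, complementary modality scores can resolve activities that each sensor confuses on its own; the paper's reported gains come from its specific CapsNet/ConvLSTM and LSTM pipelines, not from this sketch.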

Original language: English (US)
Title of host publication: 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 429-433
Number of pages: 5
ISBN (Electronic): 9781728112954
DOI: 10.1109/GlobalSIP.2018.8646367
State: Published - Feb 20 2019
Event: 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Anaheim, United States
Duration: Nov 26 2018 – Nov 29 2018

Publication series

Name: 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings

Conference

Conference: 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018
Country: United States
City: Anaheim
Period: 11/26/18 – 11/29/18

Keywords

  • Activity classification
  • Capsule networks
  • Egocentric video
  • IMU data
  • Multi-modal sensors

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing

Cite this

Lu, Y., & Velipasalar, S. (2019). Human activity classification incorporating egocentric video and inertial measurement unit data. In 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings (pp. 429-433). [8646367] (2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/GlobalSIP.2018.8646367

@inproceedings{cb2777ad4696431ca2d096ffbb957877,
title = "Human activity classification incorporating egocentric video and inertial measurement unit data",
abstract = "Many methods have been proposed for human activity classification, relying either on Inertial Measurement Unit (IMU) data or on data from static cameras observing subjects. There has been relatively little work using egocentric videos, and even fewer approaches combine egocentric video with IMU data. Systems relying only on IMU data are limited in the complexity of the activities they can detect. In this paper, we present a robust and autonomous method for fine-grained activity classification that leverages data from multiple wearable sensor modalities to differentiate between activities that are similar in nature, with a level of accuracy that would be impossible with each sensor alone. We use both egocentric videos and IMU sensors on the body. We employ Capsule Networks together with Convolutional Long Short-Term Memory (LSTM) networks to analyze egocentric videos, and an LSTM framework to analyze IMU data and capture the temporal aspect of actions. We performed experiments on the CMU-MMAC dataset, achieving overall recall and precision rates of 85.8{\%} and 86.2{\%}, respectively. We also present results of using each sensor modality alone, which show that the proposed approach provides a 19.47{\%} and a 39.34{\%} increase in accuracy compared to using only ego-vision data and only IMU data, respectively.",
keywords = "Activity classification, Capsule networks, Egocentric video, IMU data, Multi-modal sensors",
author = "Yantao Lu and Senem Velipasalar",
year = "2019",
month = "2",
day = "20",
doi = "10.1109/GlobalSIP.2018.8646367",
language = "English (US)",
series = "2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "429--433",
booktitle = "2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings",

}

TY - GEN

T1 - Human activity classification incorporating egocentric video and inertial measurement unit data

AU - Lu, Yantao

AU - Velipasalar, Senem

PY - 2019/2/20

Y1 - 2019/2/20

N2 - Many methods have been proposed for human activity classification, relying either on Inertial Measurement Unit (IMU) data or on data from static cameras observing subjects. There has been relatively little work using egocentric videos, and even fewer approaches combine egocentric video with IMU data. Systems relying only on IMU data are limited in the complexity of the activities they can detect. In this paper, we present a robust and autonomous method for fine-grained activity classification that leverages data from multiple wearable sensor modalities to differentiate between activities that are similar in nature, with a level of accuracy that would be impossible with each sensor alone. We use both egocentric videos and IMU sensors on the body. We employ Capsule Networks together with Convolutional Long Short-Term Memory (LSTM) networks to analyze egocentric videos, and an LSTM framework to analyze IMU data and capture the temporal aspect of actions. We performed experiments on the CMU-MMAC dataset, achieving overall recall and precision rates of 85.8% and 86.2%, respectively. We also present results of using each sensor modality alone, which show that the proposed approach provides a 19.47% and a 39.34% increase in accuracy compared to using only ego-vision data and only IMU data, respectively.

AB - Many methods have been proposed for human activity classification, relying either on Inertial Measurement Unit (IMU) data or on data from static cameras observing subjects. There has been relatively little work using egocentric videos, and even fewer approaches combine egocentric video with IMU data. Systems relying only on IMU data are limited in the complexity of the activities they can detect. In this paper, we present a robust and autonomous method for fine-grained activity classification that leverages data from multiple wearable sensor modalities to differentiate between activities that are similar in nature, with a level of accuracy that would be impossible with each sensor alone. We use both egocentric videos and IMU sensors on the body. We employ Capsule Networks together with Convolutional Long Short-Term Memory (LSTM) networks to analyze egocentric videos, and an LSTM framework to analyze IMU data and capture the temporal aspect of actions. We performed experiments on the CMU-MMAC dataset, achieving overall recall and precision rates of 85.8% and 86.2%, respectively. We also present results of using each sensor modality alone, which show that the proposed approach provides a 19.47% and a 39.34% increase in accuracy compared to using only ego-vision data and only IMU data, respectively.

KW - Activity classification

KW - Capsule networks

KW - Egocentric video

KW - IMU data

KW - Multi-modal sensors

UR - http://www.scopus.com/inward/record.url?scp=85063083739&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063083739&partnerID=8YFLogxK

U2 - 10.1109/GlobalSIP.2018.8646367

DO - 10.1109/GlobalSIP.2018.8646367

M3 - Conference contribution

T3 - 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings

SP - 429

EP - 433

BT - 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -