IMPROVED EFFICIENCY BASED ON LEARNED SACCADE AND CONTINUOUS SCENE RECONSTRUCTION FROM FOVEATED VISUAL SAMPLING

Jiayang Liu, Yiming Bu, Daniel Ts'o, Qinru Qiu

Research output: Contribution to conferencePaperpeer-review

Abstract

High accuracy, low latency and high energy efficiency represent a set of conflicting goals when searching for system solutions for image classification and detection. While high-quality images naturally result in more precise detection and classification, they also result in a heavier computational workload for imaging and processing, reduced camera frame rates, and increased data communication between the camera and processor. Taking inspiration from the foveal-peripheral sampling mechanism, and saccade mechanism of the human visual system and the filling-in phenomena of brain, we have developed an active scene reconstruction architecture based on multiple foveal views. This model stitches together information from a sequence of foveal-peripheral views, which are sampled from multiple glances. Assisted by a reinforcement learning-based saccade mechanism, our model reduces the required input pixels by over 90% per frame while maintaining the same level of performance in image recognition as with the original images. We evaluated the effectiveness of our model using the GTSRB dataset and the ImageNet dataset. Using an equal number of input pixels, our model demonstrates a 5% higher image recognition accuracy compared to state-of-the-art foveal-peripheral based vision systems. Furthermore, we demonstrate that our foveal sampling/saccadic scene reconstruction model exhibits significantly lower complexity and higher data efficiency during the training phase compared to existing approaches. Code is available at Github.

Original languageEnglish (US)
StatePublished - 2024
Externally publishedYes
Event12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria
Duration: May 7 2024May 11 2024

Conference

Conference12th International Conference on Learning Representations, ICLR 2024
Country/TerritoryAustria
CityHybrid, Vienna
Period5/7/245/11/24

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'IMPROVED EFFICIENCY BASED ON LEARNED SACCADE AND CONTINUOUS SCENE RECONSTRUCTION FROM FOVEATED VISUAL SAMPLING'. Together they form a unique fingerprint.

Cite this