PointOfView: A Multi-modal Network for Few-shot 3D Point Cloud Classification Fusing Point and Multi-view Image Features

Huantao Ren, Jiyang Wang, Minmin Yang, Senem Velipasalar

Research output: Chapter in Book/Entry/Poem › Conference contribution

2 Scopus citations

Abstract

Most existing 3D point cloud analysis approaches employ traditional supervised methods, which require large amounts of labeled data, and data annotation is labor-intensive and costly. On the other hand, although many existing works use either raw 3D point clouds or multiple 2D depth images, their joint use is relatively under-explored. To address these issues, we propose PointOfView, a novel multi-modal few-shot 3D point cloud classification model that classifies never-before-seen classes with only a few annotated samples. A 2D multi-view learning branch is proposed for processing multiple projection images; it contains two sub-branches that extract information at the individual image level as well as among all six depth images. In addition, we propose a multi-scale 2D pooling layer, which employs various 2D max-pooling and 2D average-pooling operations with different pooling sizes. This allows fusing features at different scales. The second main branch processes raw 3D point clouds by first sorting them and then using DGCNN to extract features. We perform within-dataset and cross-domain experiments on the ModelNet40, ModelNet40-C and ScanObjectNN datasets, and compare with six state-of-the-art baselines. The results show that our approach outperforms all baselines in all experimental settings and achieves state-of-the-art performance.
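The multi-scale 2D pooling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the pooling sizes, the stride, or how the pooled outputs are fused, so the non-overlapping windows, the sizes `(2, 4)`, the fusion by concatenation, and all function names below are assumptions chosen for illustration only.

```python
from typing import Callable, List, Sequence

Grid = List[List[float]]


def pool2d(grid: Grid, size: int, reduce: Callable[[Sequence[float]], float]) -> Grid:
    """Apply non-overlapping size x size pooling with the given reduction.

    Trailing rows/columns that do not fill a whole window are dropped.
    """
    rows, cols = len(grid) // size, len(grid[0]) // size
    return [
        [
            reduce([grid[r * size + i][c * size + j]
                    for i in range(size) for j in range(size)])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def multi_scale_pool(grid: Grid, sizes: Sequence[int] = (2, 4)) -> List[float]:
    """Hypothetical fusion: run max and average pooling at every size,
    flatten each pooled map, and concatenate everything into one vector."""
    features: List[float] = []
    for size in sizes:
        for reduce in (max, mean):
            pooled = pool2d(grid, size, reduce)
            features.extend(v for row in pooled for v in row)
    return features


# A toy 4x4 feature map holding the values 0..15 in row-major order.
feature_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
fused = multi_scale_pool(feature_map, sizes=(2, 4))
```

With these assumed settings, `fused` holds ten values: four max-pooled and four average-pooled values from the 2x2 windows, followed by one global maximum and one global average from the 4x4 window. A real model would apply the same idea per channel to learned feature maps rather than to a single toy grid.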

Original language: English (US)
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Publisher: IEEE Computer Society
Pages: 784-793
Number of pages: 10
ISBN (Electronic): 9798350365474
State: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 - Seattle, United States
Duration: Jun 16 2024 to Jun 22 2024

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Country/Territory: United States
City: Seattle
Period: 6/16/24 to 6/22/24

Keywords

  • classification
  • few-shot learning
  • multi-modal
  • point cloud

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
