TY - GEN
T1 - Letting 3D Guide the Way
T2 - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
AU - Chen, Jiajing
AU - Yang, Minmin
AU - Velipasalar, Senem
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/1/3
Y1 - 2024/1/3
AB - Existing few-shot image classification networks aim to perform prediction on images belonging to classes that were not seen during training, with only a few labeled images, which are randomly picked from the same image pool as the support set. However, this traditional approach has two main issues: (i) in real-world applications, since support images are randomly picked, the angle they were captured from can be very different from that of the query image, causing the images to look very different and making it hard to match them; (ii) since support and query images, for both training and testing, are sampled from the same image pool, models can overfit the dataset, especially if the image pool contains images with similar color, texture or view angle. Thus, good performance on a dataset does not reflect a model's real ability. To address these issues, we propose a novel few-shot learning approach referred to as the 3D guided 2D (3DG2D) few-shot image classification. In our proposed approach, the queries are 2D images, and the support set is composed of 3D mesh data, providing different views of an object, in contrast to randomly picked images providing a single view. From each 3D mesh, 14 projection images are generated from different angles. Thus, these projections have significant variance among themselves. To address this challenge, we also propose the Angle Inference Module (AIM), which is used to infer the view angle of a query image so that more attention is given to projection images corresponding to the same view angle as the query image to achieve better prediction performance. We perform experiments on ModelNet40, Toys4K and ShapeNet datasets with 4-fold cross validation, and show that our 3DG2D few-shot classification approach consistently outperforms the state-of-the-art baselines.
KW - Algorithms: 3D computer vision
KW - Algorithms: Machine learning architectures, formulations, and algorithms
UR - http://www.scopus.com/inward/record.url?scp=85191963789&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85191963789&partnerID=8YFLogxK
U2 - 10.1109/WACV57701.2024.00271
DO - 10.1109/WACV57701.2024.00271
M3 - Conference contribution
AN - SCOPUS:85191963789
T3 - Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
SP - 2720
EP - 2728
BT - Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 4 January 2024 through 8 January 2024
ER -