Numerical Exploration of Training Loss Level-Sets in Deep Neural Networks

Naveed Tahir, Garrett E. Katz

Research output: Conference contribution

Abstract

We present a computational method for empirically characterizing the training loss level-sets of deep neural networks. Our method numerically constructs a path in parameter space that is constrained to a set with fixed, near-zero training loss. By measuring regularization functions and test loss at different points along this path, we examine how points in parameter space with the same training loss compare in terms of generalization ability. We also compare this method for finding regularized points with the more typical approach, which optimizes objective functions given by weighted sums of training loss and regularization terms. Finally, we apply dimensionality reduction to the traversed paths in order to visualize the loss level-sets in a well-regularized region of parameter space. Our results provide new insight into the loss landscape of deep neural networks, as well as a new strategy for reducing test loss.
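
The abstract outlines the core procedure: drive the training loss to near zero, then move through parameter space while holding that loss (approximately) fixed and tracking a regularizer along the way. The paper's exact algorithm is not reproduced here; the sketch below is only one plausible realization of that idea, using projected gradient steps that descend an L2 regularizer along the component orthogonal to the loss gradient, with small corrective loss steps to return to the level set. The toy model, data, step sizes, and helper names (flat_grad, assign_) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' algorithm): traverse a near-zero
# training-loss level set while reducing an L2 regularizer.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data and a small over-parameterized network (assumptions).
X = torch.randn(64, 10)
y = torch.randn(64, 1)
model = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))
loss_fn = nn.MSELoss()
params = list(model.parameters())

def flat_grad(scalar, params):
    """Concatenate gradients of `scalar` w.r.t. `params` into one vector."""
    grads = torch.autograd.grad(scalar, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def assign_(params, vec):
    """Write a flat parameter vector back into the parameter tensors."""
    i = 0
    with torch.no_grad():
        for p in params:
            n = p.numel()
            p.copy_(vec[i:i + n].view_as(p))
            i += n

# Step 1: reach near-zero training loss with ordinary gradient descent.
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

# Step 2: move along the level set. Each step descends the regularizer
# in the direction orthogonal to the loss gradient (so the loss is held
# approximately constant), then takes a few corrective loss steps.
eta, corr_lr = 1e-3, 1e-2  # step sizes (assumed values)
for step in range(500):
    loss = loss_fn(model(X), y)
    reg = sum((p ** 2).sum() for p in params)  # L2 regularizer
    gL = flat_grad(loss, params)
    gR = flat_grad(reg, params)
    # Project out the component of grad(reg) along grad(loss).
    gR_proj = gR - (gR @ gL) / (gL @ gL + 1e-12) * gL
    theta = torch.cat([p.detach().reshape(-1) for p in params])
    assign_(params, theta - eta * gR_proj)
    # Correction: nudge parameters back toward the near-zero-loss set.
    for _ in range(3):
        gL = flat_grad(loss_fn(model(X), y), params)
        theta = torch.cat([p.detach().reshape(-1) for p in params])
        assign_(params, theta - corr_lr * gL)
```

Along such a path one could log the regularizer and a held-out test loss at each step, which is the kind of comparison the abstract describes between equally-well-fit points in parameter space.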

Original language: English (US)
Title of host publication: IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9780738133669
State: Published - Jul 18 2021
Event: 2021 International Joint Conference on Neural Networks, IJCNN 2021 - Virtual, Shenzhen, China
Duration: Jul 18 2021 – Jul 22 2021

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2021-July

Conference

Conference: 2021 International Joint Conference on Neural Networks, IJCNN 2021
Country/Territory: China
City: Virtual, Shenzhen
Period: 7/18/21 – 7/22/21

Keywords

  • deep learning
  • generalization
  • optimization

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
