This repository provides a tool for analyzing the full Hessian spectrum of the loss function of neural networks. For now, only a single-GPU mode is supported, but I plan to implement a distributed version in the future.
I found this article very interesting and wanted to reproduce its results quickly. I also implemented a batch-mode spectrum estimation to get results even faster.
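For reference, spectrum estimators of this kind typically avoid ever materializing the full Hessian and instead rely on Hessian-vector products, which cost only two backward passes each. The sketch below is illustrative only (a minimal version in PyTorch, not necessarily this repository's actual API); the function name and arguments are my own:

```python
# Minimal sketch of a Hessian-vector product, the primitive that
# spectral estimators (e.g. Lanczos-based methods) build on.
# Illustrative only; names are hypothetical, not this repo's API.
import torch

def hessian_vector_product(loss, params, vec):
    """Compute H @ vec without materializing the Hessian H.

    loss:   scalar loss tensor
    params: list of model parameters (requires_grad=True)
    vec:    list of tensors with the same shapes as params
    """
    # First backward pass, keeping the graph so we can
    # differentiate through the gradients a second time.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Dot product g . v; a second backward pass then yields H v.
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)

# Example usage (model, criterion, x, y assumed to exist):
#   params = [p for p in model.parameters() if p.requires_grad]
#   loss = criterion(model(x), y)
#   vec = [torch.randn_like(p) for p in params]
#   hv = hessian_vector_product(loss, params, vec)
```

A batch-mode estimator could, for instance, average such products over several mini-batches to approximate the full-dataset Hessian more cheaply.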
For instance, when analyzing the impact of batchnorm layers in a ResNet-18 architecture, I found the following spectrum on the CIFAR-10 test set:
Note: both architectures (i.e. with and without batchnorm) were trained until nearly reaching a global optimum, i.e. at least 98% accuracy.
The results are quite similar to the article's, and the conclusion is the same: batchnorm layers seem to suppress the large positive eigenvalues, which makes training easier.