Pruning is one of several techniques, alongside weight sharing and quantization, for producing models that are smaller in size, more memory-efficient, more power-efficient, and faster at inference with minimal loss in accuracy. Deep learning draws on many ideas from neuroscience, and pruning is one of them: it is biologically inspired, mirroring the synaptic pruning that occurs in the brain.
- Models are getting larger
- Inference speed, especially on CPUs
- Energy efficiency (memory usage)
- Weight Pruning
- Unit/Neuron Pruning
1) Set individual weights in the weight matrix to zero. This corresponds to deleting connections.
2) To achieve a sparsity of k%, rank the individual weights in the weight matrix W by magnitude, then set the smallest k% to zero.
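The magnitude-based ranking above can be sketched in a few lines of NumPy (an illustrative example, not code from this repo; `prune_weights` is a hypothetical helper name):

```python
import numpy as np

def prune_weights(W, k):
    """Zero out the smallest k% of weights in W, ranked by magnitude."""
    W = W.copy()
    threshold = np.percentile(np.abs(W), k)  # magnitude below which weights are dropped
    W[np.abs(W) < threshold] = 0.0           # delete the corresponding connections
    return W

# Example: prune 50% of a random 4x4 weight matrix
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned = prune_weights(W, 50)
```

In practice this is applied layer by layer, and the network is usually fine-tuned afterwards to recover accuracy.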
Tensors with several values set to zero can be considered sparse. This results in important benefits:
- Compression. Sparse tensors are amenable to compression by only keeping the non-zero values and their corresponding coordinates.
- Speed. Sparse tensors allow us to skip otherwise unnecessary computations involving the zero values.
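The compression benefit follows from storing only the non-zero values and their coordinates, as in a COO-style sparse format. A minimal sketch with NumPy (the helper names `to_sparse`/`from_sparse` are illustrative, not a library API):

```python
import numpy as np

def to_sparse(W):
    """Keep only the non-zero values and their coordinates."""
    coords = np.nonzero(W)        # tuple of index arrays, one per dimension
    values = W[coords]            # the non-zero entries themselves
    return values, coords, W.shape

def from_sparse(values, coords, shape):
    """Reconstruct the dense matrix from the sparse representation."""
    W = np.zeros(shape)
    W[coords] = values
    return W

W = np.array([[0.0, 1.5, 0.0],
              [0.0, 0.0, -2.0]])
values, coords, shape = to_sparse(W)
W_back = from_sparse(values, coords, shape)
```

Here a 6-element matrix is stored as just 2 values plus their coordinates; the sparser the tensor, the larger the saving.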
1) Set entire columns of the weight matrix to zero, in effect deleting the corresponding output neuron.
2) To achieve a sparsity of k%, rank the columns of the weight matrix by their L2 norm and delete the smallest k%.
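The column-norm ranking above can be sketched as follows (an illustrative NumPy example; `prune_neurons` is a hypothetical helper, and columns are assumed to correspond to output neurons):

```python
import numpy as np

def prune_neurons(W, k):
    """Zero the columns of W with the smallest L2 norms (bottom k%)."""
    W = W.copy()
    norms = np.linalg.norm(W, axis=0)      # one L2 norm per output neuron (column)
    n_prune = int(W.shape[1] * k / 100)    # number of columns to delete
    idx = np.argsort(norms)[:n_prune]      # columns with the smallest norms
    W[:, idx] = 0.0                        # delete those output neurons
    return W

# Example: column 1 has by far the smallest norm, so pruning 25% removes it
W = np.array([[1.0, 0.01, 2.0, 3.0],
              [1.0, 0.02, 2.0, 3.0]])
W_pruned = prune_neurons(W, 25)
```

Unlike weight pruning, this removes whole neurons, so the pruned matrix can actually be shrunk rather than merely stored sparsely.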
If you want to do research on or experiment with pruning neural networks, follow the instructions below.
- Fork the Repository
- Clone this repo to your local machine using https://github.com/jinsel/Pruning-the-Neural-Networks
(Requires the latest pip)
$ pip install --upgrade pip
$ pip install tensorflow
(If you want to use GPU then)
$ pip install tensorflow-gpu
$ pip install numpy
Enable the GPU with: Runtime > Change runtime type > Hardware accelerator, and make sure GPU is selected.
- To prune, or not to prune: exploring the efficacy of pruning for model compression, Michael H. Zhu, Suyog Gupta, 2017
- Learning to Prune Filters in Convolutional Neural Networks, Qiangui Huang et al., 2018
- Pruning deep neural networks to make them fast and small: https://jacobgil.github.io/deeplearning/pruning-deep-learning
- Optimize machine learning models with the TensorFlow Model Optimization Toolkit
- Train sparse models: https://www.tensorflow.org/model_optimization/guide/pruning/train_sparse_models
- Pruning Deep Neural Networks: https://towardsdatascience.com/pruning-deep-neural-network-56cae1ec5505
- Toward Efficient Deep Neural Network Deployment: Deep Compression and EIE, Song Han: https://www.youtube.com/watch?v=CrDRr2fxbsg&t=656s
- Deep Compression, DSD Training and EIE: https://www.youtube.com/watch?v=vouEMwDNopQ