This repository is an independent extension of the NIPS'16 paper:
Renjie Liao, Alexander Schwing, Richard S. Zemel, Raquel Urtasun. Learning Deep Parsimonious Representations. Neural Information Processing Systems (NIPS), 2016. https://github.com/lrjconan/deep_parsimonious
The code here applies knowledge distillation to networks trained using the methods described in the paper above, producing smaller networks of comparable accuracy. Further, distillation is combined with the paper's clustering regularization to form a "hybrid" training method.
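As a rough illustration of how the hybrid objective could look, the sketch below combines a standard temperature-scaled distillation loss (Hinton et al.) with a squared-distance clustering penalty in the spirit of the paper's regularizers. This is a minimal NumPy sketch, not the repository's actual implementation; all function names, the temperature `T`, and the weighting `lam` are illustrative assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between softened teacher and student outputs,
    # scaled by T^2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -(p * np.log(q + 1e-12)).sum(axis=1).mean() * T * T

def clustering_reg(weights, centers, assign):
    # Illustrative clustering penalty: mean squared distance of each
    # representation/weight vector to its assigned cluster center.
    diff = weights - centers[assign]
    return 0.5 * (diff ** 2).sum() / weights.shape[0]

def hybrid_loss(student_logits, teacher_logits,
                weights, centers, assign, T=4.0, lam=0.1):
    # Hypothetical "hybrid" objective: distillation + clustering term.
    return (distillation_loss(student_logits, teacher_logits, T)
            + lam * clustering_reg(weights, centers, assign))
```

In practice both terms would be added to the usual hard-label cross-entropy and minimized jointly; the paper proposes several clustering variants (e.g. over samples or channels), any of which could play the role of `clustering_reg` here.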