TY - GEN
T1 - Magnitude and Uncertainty Pruning Criterion for Neural Networks
AU - Ko, Vinnie
AU - Oehmcke, Stefan
AU - Gieseke, Fabian
PY - 2019
Y1 - 2019
AB - Neural networks have achieved dramatic improvements in recent years and now represent the state of the art for many real-world tasks. One drawback, however, is that many of these models are overparameterized, which makes them both computationally and memory intensive. Furthermore, overparameterization can also lead to undesired overfitting. Inspired by recently proposed magnitude-based pruning schemes and the Wald test from statistics, we introduce a novel magnitude and uncertainty (MU) pruning criterion that helps to alleviate these shortcomings. One important advantage of the MU pruning criterion is that it is scale-invariant, in contrast to the magnitude-based pruning criterion, which is sensitive to the scale of the weights. In addition, we present a 'pseudo bootstrap' scheme that efficiently estimates the uncertainty of the weights from their update information during training. Our experimental evaluation, based on various neural network architectures and datasets, shows that the new criterion leads to more compressed models than criteria based solely on magnitude, while at the same time losing less predictive power.
KW - Neural network compression
KW - overparameterization
KW - pruning
KW - Wald test
DO - 10.1109/BigData47090.2019.9005692
M3 - Article in proceedings
AN - SCOPUS:85081327889
T3 - Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
SP - 2317
EP - 2326
BT - 2019 IEEE International Conference on Big Data, Big Data 2019
A2 - Baru, Chaitanya
A2 - Huan, Jun
A2 - Khan, Latifur
A2 - Hu, Xiaohua Tony
A2 - Ak, Ronay
A2 - Tian, Yuanyuan
A2 - Barga, Roger
A2 - Zaniolo, Carlo
A2 - Lee, Kisung
A2 - Ye, Yanfang Fanny
PB - IEEE
T2 - 2019 IEEE International Conference on Big Data, Big Data 2019
Y2 - 9 December 2019 through 12 December 2019
ER -