Accelerate Convolutional Neural Networks
High-Performance Neural Networks for Visual Object Classification
- intro: “reduced network parameters by randomly removing connections before training”
- arxiv:
Predicting Parameters in Deep Learning
- intro: “decomposed the weighting matrix into two low-rank matrices”
- arxiv:
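The "two low-rank matrices" idea in the intro above can be illustrated with a small sketch. It is not the paper's method: the helper names (`matvec`, `matmul`, `param_counts`) and the toy matrices are assumptions made for illustration. It only shows that if a weight matrix W (m×n) equals U·V with U of size m×r and V of size r×n, the layer can store r·(m+n) numbers instead of m·n and still produce the same output.

```python
# Hypothetical helpers, written in plain Python so the sketch has no dependencies.

def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def param_counts(m, n, r):
    """Return (dense parameter count, factored parameter count)."""
    return m * n, r * (m + n)

# Toy factors: U is 4x2, V is 2x5, so W = U @ V is 4x5 with rank at most 2.
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
V = [[1.0, 2.0, 0.0, -1.0, 3.0], [0.0, 1.0, 1.0, 2.0, -2.0]]
W = matmul(U, V)

x = [1.0, -1.0, 2.0, 0.5, 0.0]
dense_out = matvec(W, x)                # uses all 20 entries of W
factored_out = matvec(U, matvec(V, x))  # uses only the 18 entries of U and V

print(dense_out, factored_out)
print(param_counts(4096, 4096, 64))
```

For a 4096×4096 fully-connected layer and an assumed rank of 64, the factored form stores 32 times fewer parameters. In practice a trained matrix is only approximately low-rank, so the factorization trades some accuracy for size.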
Neurons vs Weights Pruning in Artificial Neural Networks
Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
- intro: “presented a series of low-rank decomposition designs for convolutional kernels; singular value decomposition was adopted for the matrix factorization”
- paper:
Efficient and accurate approximations of nonlinear convolutional networks
- intro: “considered the subsequent nonlinear units while learning the low-rank decomposition”
- arxiv:
Flattened Convolutional Neural Networks for Feedforward Acceleration (ICLR 2015)
Compressing Deep Convolutional Networks using Vector Quantization
- intro: “this paper showed that vector quantization had a clear advantage over matrix factorization methods in compressing fully-connected layers.”
- arxiv:
Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition
- intro: “a low-rank CP-decomposition was adopted to transform a convolutional layer into multiple layers of lower complexity”
- arxiv:
Deep Fried Convnets
- intro: “fully-connected layers were replaced by a single ‘Fastfood’ layer for end-to-end training with convolutional layers”
- arxiv:
Distilling the Knowledge in a Neural Network (by Geoffrey Hinton, Oriol Vinyals, Jeff Dean)
- intro: “trained a distilled model to mimic the response of a larger and well-trained network”
- arxiv:
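As a rough illustration of "mimicking the response" of a larger network, the sketch below computes temperature-softened softmax outputs and the cross-entropy between teacher and student distributions. The function names and the example logits are assumptions for illustration; a real training setup would typically combine this term with the usual hard-label loss.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values give softer outputs."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(teacher_logits, student_logits, temperature):
    """Cross-entropy between softened teacher and student distributions."""
    p = softmax_with_temperature(teacher_logits, temperature)
    q = softmax_with_temperature(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Made-up logits for a 3-class problem.
teacher = [5.0, 1.0, -2.0]
good_student = [4.0, 1.5, -1.0]
bad_student = [-2.0, 1.0, 5.0]

print(softmax_with_temperature(teacher, 1.0))
print(softmax_with_temperature(teacher, 5.0))
print(soft_target_loss(teacher, good_student, 2.0))
print(soft_target_loss(teacher, bad_student, 2.0))
```

Raising the temperature flattens the teacher's distribution, so the relative probabilities of the wrong classes carry more weight in the loss than they would at temperature 1.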
Compressing Neural Networks with the Hashing Trick
- intro: “randomly grouped connection weights into hash buckets, and then fine-tuned network parameters with back-propagation”
- arxiv:
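The bucket-sharing idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `zlib.crc32` stands in for whatever hash function the authors use, and `bucket_of`, `virtual_weight`, and `hashed_matvec` are hypothetical names. Each position (row, col) of a virtual weight matrix is mapped to one of a small number of shared parameters, so only the bucket values need to be stored.

```python
import zlib

def bucket_of(row, col, num_buckets, seed=0):
    """Deterministically map a weight position to a bucket index (crc32 is a stand-in hash)."""
    key = f"{seed}:{row}:{col}".encode()
    return zlib.crc32(key) % num_buckets

def virtual_weight(shared, row, col):
    """Look up the shared value that plays the role of W[row][col]."""
    return shared[bucket_of(row, col, len(shared))]

def hashed_matvec(shared, n_rows, n_cols, x):
    """Compute W @ x without ever materializing the n_rows x n_cols matrix W."""
    return [
        sum(virtual_weight(shared, i, j) * x[j] for j in range(n_cols))
        for i in range(n_rows)
    ]

# Three shared parameters stand in for a 4x6 matrix with 24 virtual weights.
shared = [0.5, -1.0, 2.0]
out = hashed_matvec(shared, 4, 6, [1.0] * 6)
print(out)
```

During fine-tuning, the gradient for a bucket would be the sum of the gradients of all virtual weights that map to it, which is how back-propagation updates the shared values.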
Accelerating Very Deep Convolutional Networks for Classification and Detection
- intro: “considered the subsequent nonlinear units while learning the low-rank decomposition”
- arxiv:
Fast ConvNets Using Group-wise Brain Damage
- intro: “applied group-wise pruning to the convolutional tensor to decompose it into the multiplications of thinned dense matrices”
- arxiv:
Learning both Weights and Connections for Efficient Neural Networks
Data-free parameter pruning for Deep Neural Networks
- intro: “proposed to remove redundant neurons instead of network connections”
- arxiv:
Fast Algorithms for Convolutional Neural Networks
- intro: “2.6x as fast as Caffe when comparing CPU implementations”
- arxiv:
- discussion:
Tensorizing Neural Networks
A Deep Neural Network Compression Pipeline: Pruning, Quantization, Huffman Encoding (ICLR 2016)
- intro: “reduced the size of AlexNet by 35x from 240MB to 6.9MB, the size of VGG16 by 49x from 552MB to 11.3MB, with no loss of accuracy”
- arxiv:
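As a quick sanity check on the figures quoted in the intro above, the ratios can be recomputed from the stated model sizes. The helper name `compression_ratio` is an assumption for illustration; the sizes are the ones quoted in the entry.

```python
def compression_ratio(original_mb, compressed_mb):
    """Ratio of original to compressed model size."""
    return original_mb / compressed_mb

# Sizes in MB as quoted in the entry: (before, after).
reported = {"AlexNet": (240.0, 6.9), "VGG16": (552.0, 11.3)}

for name, (before, after) in reported.items():
    print(f"{name}: {compression_ratio(before, after):.1f}x")
```

Both ratios round to the quoted 35x and 49x.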
ZNN - A Fast and Scalable Algorithm for Training 3D Convolutional Networks on Multi-Core and Many-Core Shared Memory Machines
Reducing the Training Time of Neural Networks by Partitioning
Convolutional neural networks with low-rank regularization
Quantized Convolutional Neural Networks for Mobile Devices (Q-CNN)
- intro: “Extensive experiments on the ILSVRC-12 benchmark demonstrate 4 ∼ 6× speed-up and 15 ∼ 20× compression with merely one percentage loss of classification accuracy”
- arxiv:
Convolutional Tables Ensemble: classification in microseconds
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size
- arxiv:
- github:
- note:
Convolutional Neural Networks using Logarithmic Data Representation
DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices
Accelerate Convolutional Neural Networks
- intro: “This tool aims to accelerate the test-time computation and decrease number of parameters of deep CNNs.”
- github:
OptNet - reducing memory usage in torch neural networks
NNPACK: Acceleration package for neural networks on multi-core CPUs
- github:
- comments (Yann LeCun):