Encouraging orthogonality between weight vectors in pretrained deep neural networks

2016 
Deep neural networks have recently shown impressive performance in several machine learning tasks. An important approach to training deep networks, especially useful when labeled data is scarce, relies on unsupervised pretraining of hidden layers followed by supervised fine-tuning. One of the most widely used approaches to unsupervised pretraining is to train each layer with the Contrastive Divergence (CD) algorithm. In this work we present a modification to CD with the goal of learning more diverse sets of features in hidden layers. In particular, we extend the CD learning rule to penalize cosines of the angles between weight vectors, which in turn encourages orthogonality between the learned features. We demonstrate experimentally that this extension to CD improves performance of pretrained deep networks on image recognition and document retrieval tasks.
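
The core idea, a penalty on the pairwise cosines between hidden-unit weight vectors added to the CD weight update, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the squared-cosine form of the penalty, the column-wise layout of W, and the penalty weight lam are assumptions made here for concreteness.

```python
import numpy as np

def cosine_penalty_grad(W, eps=1e-8):
    """Gradient of P(W) = sum_{i<j} cos^2(theta_ij) with respect to W.

    W: array of shape (n_visible, n_hidden); each column is one hidden
    unit's weight vector, and theta_ij is the angle between columns i
    and j. Minimizing P pushes the learned features toward mutual
    orthogonality.
    """
    norms = np.linalg.norm(W, axis=0, keepdims=True) + eps  # (1, H)
    U = W / norms                                           # unit-norm columns
    C = U.T @ U                                             # pairwise cosines, (H, H)
    np.fill_diagonal(C, 0.0)                                # drop self-similarity
    G = U @ C                                               # dP/dU, up to the factor of 2
    # Chain rule through the normalization: project out each column's
    # radial component, then rescale by 1 / ||w_i||.
    G = (G - U * np.sum(U * G, axis=0, keepdims=True)) / norms
    return 2.0 * G

# Hypothetical use inside a CD-1 weight update for an RBM
# (grad_cd, lr, and lam are placeholders, not values from the paper):
#   grad_cd = pos_assoc - neg_assoc                # standard CD statistics
#   W += lr * (grad_cd - lam * cosine_penalty_grad(W))
```

Squaring the cosines keeps the penalty differentiable at zero alignment and discourages both positive and negative alignment equally; since the abstract only states that cosines of the angles are penalized, the exact functional form used here is an assumption.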