Multimodal Tucker Decomposition for Gated RBM Inference

2021 
Gated networks contain gating connections in which the outputs of at least two neurons are multiplied. The basic idea of a gated restricted Boltzmann machine (RBM) is to use binary hidden units to learn the conditional distribution of one image (the output) given another image (the input). The hidden units of a gated RBM can thus model the transformations between two successive images, and inference consists of extracting the transformation encoded by a given image pair. However, a fully connected multiplicative network has cubically many parameters: the interactions form a three-dimensional tensor that demands substantial memory and computation for both inference and training. In this paper, we parameterize the bilinear interactions in the gated RBM through a multimodal Tucker decomposition, which factors a tensor into a set of mode matrices and one (usually much smaller) core tensor. This parameterization reduces the number of model parameters, lowers the computational cost of learning, and strengthens structured feature learning. When trained on affine transformations of still images, the network learns explicit encodings of image transformations in a completely unsupervised way.
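To make the parameter reduction concrete, the sketch below illustrates the general idea of a Tucker-factored three-way interaction tensor. All sizes, variable names, and the NumPy implementation are illustrative assumptions, not the paper's code: the interaction tensor W is never stored densely, and the hidden pre-activations for an image pair are computed factor by factor.

```python
import numpy as np

# Assumed sizes: input image x (I pixels), output image y (J pixels),
# K hidden units. A fully connected gated RBM would need a dense
# three-way interaction tensor W of shape (I, J, K).
I, J, K = 64, 64, 100

# Assumed Tucker ranks for the three modes, much smaller than I, J, K.
R1, R2, R3 = 10, 10, 10

rng = np.random.default_rng(0)

# Tucker factors: one matrix per mode plus a small core tensor G.
U_x = rng.standard_normal((I, R1))       # input-mode factor
U_y = rng.standard_normal((J, R2))       # output-mode factor
U_h = rng.standard_normal((K, R3))       # hidden-mode factor
G   = rng.standard_normal((R1, R2, R3))  # core tensor

# Implicitly, W[i,j,k] = sum_{a,b,c} G[a,b,c] * U_x[i,a] * U_y[j,b] * U_h[k,c].
dense_params  = I * J * K
tucker_params = I * R1 + J * R2 + K * R3 + R1 * R2 * R3
print(f"dense: {dense_params:,} params, Tucker: {tucker_params:,} params")
# dense: 409,600 params, Tucker: 3,280 params

# Inference sketch: hidden pre-activations for a pair (x, y) are the
# bilinear form x^T W y per hidden unit, computed without forming W.
x = rng.standard_normal(I)
y = rng.standard_normal(J)
pre_h = U_h @ np.einsum('abc,a,b->c', G, U_x.T @ x, U_y.T @ y)
print(pre_h.shape)  # (K,)
```

Because each image is first projected into a low-rank subspace before the modes interact through the core tensor, both storage and the cost of computing hidden activations scale with the Tucker ranks rather than with the full image dimensions.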