|Changying Du||Institute of Software, Chinese Academy of Sciences|
|Changde Du||Institute of Automation, Chinese Academy of Sciences|
|Xingyu Xie||Nanjing University of Aeronautics and Astronautics|
|Chen Zhang||Qihoo 360 Search Lab|
|Hao Wang||Qihoo 360 Search Lab|
Many important data mining problems can be modeled as learning a (bidirectional) multidimensional mapping between two data domains. The authors propose a multi-view adversarially learned inference (ALI) model, termed MALI.
Many important data mining problems can be modeled as learning a (bidirectional) multidimensional mapping between two data domains. Cross-domain joint distribution matching based on generative adversarial networks (GANs), particularly conditional ones, has become an increasingly popular approach to such problems. Despite significant advances, existing models still have two main disadvantages: they require a large number of paired training samples, and their training is notoriously unstable. In this paper, we propose a multi-view adversarially learned inference (ALI) model, termed MALI, to address these issues. Unlike the common practice of learning direct domain mappings, our model relies on shared latent representations of both domains and can generate an arbitrary number of paired fake samples. As a result, very few paired samples (together with sufficient unpaired ones) are usually enough for learning good mappings. Extending the vanilla ALI model, we design novel discriminators to judge the quality of generated samples (both paired and unpaired), and provide a theoretical analysis of our new formulation. Experiments on image-to-image translation, image-to-attribute generation (multi-label classification), and attribute-to-image generation tasks demonstrate that our semi-supervised learning framework yields significant performance improvements over existing ones. Results on cross-modality retrieval show that our latent-space-based method achieves competitive similarity search performance at relatively fast speed, compared to methods that compute similarities in the high-dimensional data space.
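The idea of drawing paired fake samples from a shared latent representation can be illustrated with a minimal toy sketch. This is not the MALI architecture: the generators here are untrained random linear maps with a tanh nonlinearity, and every name (`make_toy_generator`, `sample_paired_fakes`, `LATENT_DIM`) and dimension is a hypothetical choice made for illustration only. The sketch shows just one point, namely that decoding the same latent code with two domain-specific generators yields as many paired samples as desired.

```python
import math
import random


def make_toy_generator(latent_dim, data_dim, rng):
    """Return a toy, untrained generator: a random linear map followed by tanh.

    This is an illustrative stand-in for a learned domain generator,
    not the network used in the paper.
    """
    weights = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
               for _ in range(data_dim)]

    def generate(z):
        # One output coordinate per weight row.
        return [math.tanh(sum(w * zi for w, zi in zip(row, z)))
                for row in weights]

    return generate


def sample_paired_fakes(gen_x, gen_y, n, latent_dim, rng):
    """Draw n latent codes and decode each one with both generators.

    Because both views are decoded from the same z, each (x, y) tuple
    is a paired fake sample across the two domains.
    """
    pairs = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        pairs.append((gen_x(z), gen_y(z)))
    return pairs


rng = random.Random(0)
LATENT_DIM = 8  # hypothetical size of the shared latent space

gen_x = make_toy_generator(LATENT_DIM, 64, rng)  # e.g. a flattened image view
gen_y = make_toy_generator(LATENT_DIM, 10, rng)  # e.g. an attribute view

pairs = sample_paired_fakes(gen_x, gen_y, 5, LATENT_DIM, rng)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))  # 5 64 10
```

In an actual adversarial setup, such generated pairs would be passed to discriminators together with the real paired and unpaired samples; that training loop is omitted here.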