Disentangling the Spatial Structure and Style in Conditional VAE

Ziye Zhang, Li Sun, Zhilin Zheng, Qingli Li

2020

This paper proposes a structure for the conditional variational autoencoder (cVAE) that disentangles the latent vector into a spatial structure code and a style code, complementary to each other, with one ($z_s$) being label relevant and the other ($z_u$) label irrelevant. Unlike the traditional cVAE, our network maps the condition label into its relevant code $z_s$ through a separate module. Depending on whether or not the label directly relates to the image spatial structure, the $z_s$ output by the condition mapping module is used either as the spatial structure code, with a single channel, or as the style code, with two spatial dimensions of $1 \times 1$. Based on the input image and its corresponding $z_s$, the encoder provides a posterior distribution close to a common prior regardless of the label, so $z_u$ sampled from it becomes label irrelevant. The decoder employs $z_s$ and $z_u$ through two typical adaptive normalization modules to reconstruct the input image. Results on two datasets with different types of labels show the effectiveness of our method.
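
The two code shapes described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not name the two adaptive normalization modules, so the sketch assumes an AdaIN-like per-channel modulation for the style code and a SPADE-like per-pixel modulation for the spatial structure code. All function names, weight shapes, and tensor sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    """Normalize each channel of x (C, H, W) over its spatial positions."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)


def modulate_with_style(x, z_style, w_gamma, w_beta):
    """AdaIN-like step (assumed): a style code of shape (D, 1, 1) is mapped
    by hypothetical linear weights (C, D) to one scale and one shift per
    channel, shared by every spatial position."""
    z = z_style.reshape(-1)                     # (D,)
    gamma = (w_gamma @ z).reshape(-1, 1, 1)     # (C, 1, 1)
    beta = (w_beta @ z).reshape(-1, 1, 1)       # (C, 1, 1)
    return instance_norm(x) * (1.0 + gamma) + beta


def modulate_with_structure(x, z_struct, w_gamma, w_beta):
    """SPADE-like step (assumed): a single-channel structure code of shape
    (1, H, W) is expanded by hypothetical per-channel weights (C,) to one
    scale and one shift per pixel."""
    gamma = w_gamma.reshape(-1, 1, 1) * z_struct  # (C, H, W)
    beta = w_beta.reshape(-1, 1, 1) * z_struct    # (C, H, W)
    return instance_norm(x) * (1.0 + gamma) + beta


# Illustrative sizes only; none of them come from the paper.
rng = np.random.default_rng(0)
C, H, W, D = 4, 8, 8, 16
x = rng.normal(size=(C, H, W))            # a decoder feature map
z_style = rng.normal(size=(D, 1, 1))      # style code: 1 x 1 spatial size
z_struct = rng.normal(size=(1, H, W))     # structure code: single channel

h = modulate_with_style(x, z_style, rng.normal(size=(C, D)), rng.normal(size=(C, D)))
h = modulate_with_structure(h, z_struct, rng.normal(size=C), rng.normal(size=C))
print(h.shape)  # (4, 8, 8)
```

Whichever of the two codes is label relevant plays the role of $z_s$ and the other the role of $z_u$; the sketch only shows how a $1 \times 1$ code and a single-channel code can both modulate the same feature map.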
