Hierarchical Taxonomy Aware Network Embedding

Authors:
Jianxin Ma Tsinghua University
Peng Cui Tsinghua University
Xiao Wang Tsinghua University
Wenwu Zhu Tsinghua University

Introduction:

This paper studies Network embedding. Incorporating the hierarchical taxonomy into network embedding poses a great challenge. The authors propose NetHiex, a NETwork embedding model that captures the latent HIErarchical taXonomy.The whole model is unified within a nonparametric probabilistic framework.

Abstract:

Network embedding learns the low-dimensional representations for vertices, while preserving the inter-vertex similarity reflected by the network structure. The neighborhood structure of a vertex is usually closely related with an underlying hierarchical taxonomy—-the vertices are associated with successively broader categories that can be organized hierarchically. The categories of different levels reflects similarity of different granularity. The hierarchy of the taxonomy therefore requires that the learned representations support multiple levels of granularity. Moreover, the hierarchical taxonomy enables the information to flow between vertices via their common categories, and thus provides an effective mechanism for alleviating data scarcity. However, incorporating the hierarchical taxonomy into network embedding poses a great challenge (since the taxonomy is generally unknown), and it is neglected by the existing approaches. In this paper, we propose NetHiex, a NETwork embedding model that captures the latent HIErarchical taXonomy. In our model, a vertex representation consists of multiple components that are associated with categories of different granularity. The representations of both the vertices and the categories are co-regularized. We employ the nested Chinese restaurant process to guide the search of the most plausible hierarchical taxonomy. The network structure is then recovered from the latent representations via a Bernoulli distribution. The whole model is unified within a nonparametric probabilistic framework. A scalable expectation-maximization algorithm is derived for optimization. Empirical results demonstrate that NetHiex achieves significant performance gain over the state-of-arts.

You may want to know: