OpenAttHetRL: An Open Source Toolkit for Attributed Heterogeneous Network Representation Learning

Learning latent representations of entities from their relationships and the data associated with them is an essential task in many applications, such as ranking, recommendation systems, graph-based team formation, and keyword search. However, the majority of existing techniques learn latent representations from either network data or textual data, but not both. Structural embedding techniques suffer from the sparsity of real-world networks. Node attributes are a rich source of information that can improve network embedding vectors, yet they are largely overlooked in the literature; as a result, most existing network representation learning tools capture only structural information. This paper introduces an open-source toolkit called OpenAttHetRL that learns latent representations of entities from both their network and textual data in an end-to-end fashion. OpenAttHetRL is easy to employ and adapt for a variety of tasks, including ranking, recommendation systems, and expert finding. It aims to provide a unified toolkit for data pre-processing, building and training models, and performing predictions for a downstream task. It employs a graph convolution network to capture the relationships among entities and a kernel pooling technique to preserve the similarity of their textual data in the embedding space. We use expert finding in community question answering systems to demonstrate how OpenAttHetRL can be trained to obtain latent representations of questions, their askers, tags, and answerers, and to find potential answerers of new questions.
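To make the kernel pooling idea concrete, the following is a minimal NumPy sketch of Gaussian kernel pooling over a word-level cosine similarity matrix, in the style commonly used in neural ranking models. It is an illustration under stated assumptions, not OpenAttHetRL's implementation: the function names, the kernel means and widths, and the random word vectors are all hypothetical, and the toolkit's actual kernels and inputs may differ.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a (n, d) and rows of b (m, d)."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

def kernel_pooling(sim, mus, sigmas):
    """Summarize a similarity matrix (n, m) into one soft-match feature per kernel.

    Each Gaussian kernel k counts, softly, how many entries of a row lie near
    its mean mus[k]; log-sums over rows give a fixed-length feature vector.
    """
    diff = sim[:, :, None] - mus[None, None, :]            # (n, m, k)
    kernels = np.exp(-(diff ** 2) / (2.0 * sigmas ** 2))   # (n, m, k)
    per_row = kernels.sum(axis=1)                          # (n, k)
    # Clip before the log so rows with no nearby matches stay finite.
    return np.log(np.clip(per_row, 1e-10, None)).sum(axis=0)  # (k,)

# Hypothetical inputs: word vectors of a question and of an answerer's past text.
rng = np.random.default_rng(0)
question_words = rng.normal(size=(4, 8))
answerer_words = rng.normal(size=(6, 8))

# Assumed kernel settings: five kernels spread over the cosine range.
mus = np.linspace(-0.9, 0.9, 5)
sigmas = np.full(5, 0.1)

sim = cosine_similarity_matrix(question_words, answerer_words)
features = kernel_pooling(sim, mus, sigmas)
```

The resulting vector has one entry per kernel regardless of text length, which is what allows textual similarity to be fed into a downstream scoring layer alongside structural embeddings.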