Adding Structure: Social Network Inference with Graph Priors

2016 
Adding Structure: Social Network Inference with Graph Priors Han Liu Stratis Ioannidis ECE, UC-Davis bhgliu@ucdavis.edu ECE, Northeastern ioannidis@ece.neu.edu Smriti Bhagat Chen-Nee Chuah Facebook smr@fb.com ECE, UC-Davis chuah@ucdavis.edu ABSTRACT We study the problem of social network graph inference, whereby the topology of user interaction networks, as well as the strength of pairwise influences, are inferred from traces of information cascades. We propose a framework intro- ducing graph structural priors into the above inference pro- cess. This framework allows us to capture different priors on the graph’s degree distribution, including, e.g., stretched ex- ponential, power-law, and an approximation of log-normal, which are important due to their natural prevalence in real social networks and many other complex graphs. We show that network inference under our model is amenable to the so-called majorize-minimize method, and that its implemen- tation is tractable, as each step amounts to solving a con- vex optimization problem. We evaluate our method over synthetic datasets as well as real-world datasets from Twit- ter and a Facebook gifting application. We observe that network inference incorporating our structural priors signif- icantly outperforms state-of-the-art inference. INTRODUCTION Online Social Networks (OSNs) have long been perceived as platforms over which information spreads and, as such, they have been extensively used by marketing companies, grass-roots movements, and political strategists for the im- plementation of “word-of-mouth” campaigns [8]. Follow- ing the seminal work by Kempe et al. [16], there has been an extensive research effort around the optimal design of such campaigns [17, 5, 13, 2]; typically, such efforts rely on parametrized models, such as the independent cascade [9] and the linear threshold model [12, 26]. Motivated by this state of affairs, learning the parameters of these models—or variants thereof—from datasets of information cascades has recently received considerable attention [11, 23, 7, 21, 4]. In short, the above works attempt to solve a problem commonly referred to as the social network inference prob- lem. This amounts to observing a sequence of cascades— e.g., adoptions of a product across a population, the spread of tweets, hashtags or URLs over Twitter, etc.—and infer- ring the underlying user-to-user interaction network struc- ture over which propagations take place and, a fortiori, the relative influence users exert on each other. The latter is model dependent; for example, in the independent cascade model [4, 21, 23], which we also adopt, the influence of user i to user j is captured by a probability that, given that i is “recruited”, it successfully recruits user j. Learning pairwise user influence thus amounts to training these probabilities from cascade traces, while learning the underlying graph amounts to classifying the edges between users as existing or non-existing based on the inferred probability. In this work, we incorporate graph structural priors to the network inference problem. This allows us to capture inherent information that may be known a priori about the underlying network. For example, social networks have been reported to follow power law [29, 1, 18, 20], stretched expo- nential [24, 30, 14], or log-normal [25, 10] degree distribu- tions. Our approach incorporates such information in the inference process, leading to better estimation. Clearly, as the number of graphs grows exponentially with the graph size, introducing prior distributions may lead to a combinatorial explosion. As such, a careful selection of the prior structure is necessary. Moreover, the parameter esti- mation problem that results from the introduction of priors should be tractable. In existing prior-free models [4, 21, 23], parameter estimation (i.e., learning the influence prob- abilities), reduces to solving a convex optimization prob- lem. Standard regularization approaches can easily break this convex structure [21, 19]. It is unclear how to intro- duce, e.g., a power-law or a stretched exponential prior while maintaining tractability. In this paper, we tackle these chal- lenges and make the following contributions: • We present a generic scheme for introducing degree- dependent priors to social network inference problems. Even though the resulting inference problem is not con- vex, we show that our priors are amenable to analysis through the majorize-minimize (MM) method [15]. It is also highly scalable: multiple MM iterations, executed in parallel, lead to an increase in the likelihood of the computed estimates. Each iteration solves a convex op- timization problem, thus ensuring tractability. • We evaluate our method over both synthetic and real datasets. Despite the lack of convexity and potential convergence to local minima, our method outperforms convex methods in inferring both the underlying topology and the influence strengths. We describe prior work in Sec. 2. Sections 3 and 4 present our problem formulation and proposed solution, respectively. We evaluate our approach in Sec. 5, and conclude in Sec. 6. RELATED WORK There are several recent approaches inferring the under- lying unobserved social network from cascade traces [21, 23,
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    3
    Citations
    NaN
    KQI
    []