Device Graphing By Example

Authors:
Keith Funkhouser (comScore)
Matthew Malloy (comScore)
Enis Ceyhun Alp (Ecole Polytechnique Fédérale de Lausanne)
Phillip Poon (comScore)
Paul Barford (comScore, University of Wisconsin)

Introduction:

The authors demonstrate how measurement, tracking, and other internet entities can associate multiple identifiers with a single device or user after coarse associations are made. The authors employ a Bayesian similarity algorithm that relies on examples of pairs of identifiers and their associated telemetry to establish pair-wise scores.

Abstract:

Datasets that organize and associate the many identifiers produced by PCs, smartphones, and tablets accessing the internet are referred to as internet device graphs. In this paper, we demonstrate how measurement, tracking, and other internet entities can associate multiple identifiers with a single device or user after coarse associations, e.g., based on IP-colocation, are made. We employ a Bayesian similarity algorithm that relies on examples of pairs of identifiers and their associated telemetry, including user agent, screen size, and domains visited, to establish pair-wise scores. Community detection algorithms are applied to group identifiers that belong to the same device or user. We train and validate our methodology using a unique dataset collected from a client panel with full visibility, apply it to a dataset of 700 million device identifiers collected over the course of six weeks in the United States, and show that it outperforms several unsupervised learning approaches. Results show mean precision and recall exceeding 90% for association of identifiers at both the device and user levels.
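
To make the two-stage idea in the abstract concrete (pair-wise scoring learned from labeled example pairs, followed by grouping), here is a minimal Python sketch. It is not the authors' algorithm: it assumes a naive-Bayes log-likelihood ratio over simple feature agreement as a stand-in for the Bayesian similarity score, and plain connected components as a stand-in for community detection. The feature names (`ua`, `screen`), the function names, the smoothing constant, and the threshold are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

def train(examples, features, alpha=1.0):
    """Estimate P(feature agrees | same device) and P(feature agrees | different device)
    from labeled example pairs, with Laplace smoothing (alpha is an assumed value)."""
    # counts[feature][is_match] = [number of agreeing pairs, total pairs]
    counts = {f: {True: [0, 0], False: [0, 0]} for f in features}
    for a, b, is_match in examples:
        for f in features:
            c = counts[f][is_match]
            c[0] += int(a[f] == b[f])
            c[1] += 1
    return {
        f: tuple((counts[f][m][0] + alpha) / (counts[f][m][1] + 2 * alpha)
                 for m in (True, False))
        for f in features
    }

def score(a, b, model):
    """Naive-Bayes log-likelihood ratio for a pair; positive favors 'same device'."""
    s = 0.0
    for f, (p_match, p_non) in model.items():
        if a[f] == b[f]:
            s += math.log(p_match / p_non)
        else:
            s += math.log((1 - p_match) / (1 - p_non))
    return s

def group(ids, telemetry, model, threshold=0.0):
    """Link pairs scoring above the threshold and return connected components
    (a simple substitute for a community detection algorithm)."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, x in enumerate(ids):
        for y in ids[i + 1:]:
            if score(telemetry[x], telemetry[y], model) > threshold:
                parent[find(x)] = find(y)

    clusters = defaultdict(set)
    for i in ids:
        clusters[find(i)].add(i)
    return sorted(sorted(c) for c in clusters.values())

# Toy labeled pairs: (telemetry of id A, telemetry of id B, same device?)
examples = [
    ({"ua": "A", "screen": "1"}, {"ua": "A", "screen": "1"}, True),
    ({"ua": "B", "screen": "2"}, {"ua": "B", "screen": "2"}, True),
    ({"ua": "A", "screen": "1"}, {"ua": "B", "screen": "2"}, False),
    ({"ua": "A", "screen": "1"}, {"ua": "C", "screen": "1"}, False),
]
model = train(examples, ["ua", "screen"])

# Identifiers that were coarsely associated, e.g. by IP-colocation
telemetry = {
    "c1": {"ua": "A", "screen": "1"},
    "c2": {"ua": "A", "screen": "1"},
    "c3": {"ua": "B", "screen": "2"},
}
groups = group(["c1", "c2", "c3"], telemetry, model)
```

In this toy run, `c1` and `c2` agree on both features and end up grouped, while `c3` stays on its own. A set-valued signal such as domains visited would need a similarity measure rather than exact equality, and the all-pairs loop is only practical within small, coarsely associated candidate sets.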

You may want to know: