Propensity-Independent Bias Recovery in Offline Learning-to-Rank Systems

2021 
Learning-to-rank systems often utilize user-item interaction data (e.g., clicks) to provide users with high-quality rankings. However, this data suffers from several biases, and if naively used as training data, it can lead to suboptimal ranking algorithms. Most existing bias-correcting methods focus on position bias, the fact that higher-ranked results are more likely to receive interaction, and address this bias by leveraging inverse propensity weighting. However, it is not always possible to accurately estimate propensity scores, and in addition to position bias, selection bias is often encountered in real-world recommender systems. Selection bias occurs because users are exposed to a truncated list of results, which gives a zero chance for some items to be observed and, therefore, interacted with, even if they are relevant. Here, we propose a new counterfactual method that uses a two-stage correction approach and jointly addresses selection and position bias in learning-to-rank systems without relying on propensity scores. Our experimental results show that our method is better than state-of-the-art propensity-independent methods and either better than or comparable to methods that make the strong assumption for which the propensity model is known.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    30
    References
    0
    Citations
    NaN
    KQI
    []