Human-in-the-Loop Challenges for Entity Matching: A Midterm Report

2017 
Entity matching (EM) has been a long-standing challenge in data management. In the past few years we have started two major projects on EM (Magellan and Corleone/Falcon). These projects have raised many human-in-the-loop (HIL) challenges. In this paper we discuss these challenges. In particular, we show how these challenges forced us to revise our solution architecture, from a typical RDBMS-style architecture to a very human-centric one, in which human users are first-class objects driving the EM process, using tools at pain-point places. We discuss how such solution architectures can be viewed as combining "tools in the loop" with "human in the loop". Finally, we discuss lessons learned which can potentially be applied to other problem settings. We also hope that more researchers will investigate EM, as it can be a rich "playground" for HIL research.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    22
    References
    24
    Citations
    NaN
    KQI
    []