Counting and Enumerating Preferred Database Repairs

2020 
Abstract In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a “minimal way.” Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by these considerations, researchers have introduced and investigated the framework of preferred repairs, in the context of denial constraints and subset repairs. There, a priority relation between facts is lifted towards a priority relation between consistent databases, and repairs are restricted to the ones that are optimal in the lifted sense. Three notions of lifting (and preferred repairs) have been proposed: Pareto, global, and completion. In this manuscript, we investigate the complexity of three problems on preferred repairs. The first is the problem of deciding whether the priority relation contains enough information to clean the database unambiguously, or in other words, whether there is exactly one preferred repair. We show that the different lifting semantics entail highly different complexities for this problem. Then, we study the ability to quantify ambiguity, by investigating two classes of problems. The first is that of counting the preferred repairs. We establish a dichotomy in data complexity for the entire space of (sets of) functional dependencies for all three notions. The second class of problems is that of enumerating (i.e., generating) the preferred repairs. We devise enumeration algorithms with efficiency guarantees on the delay between generated repairs, even for constraints represented as general conflict graphs or hypergraphs.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    42
    References
    0
    Citations
    NaN
    KQI
    []