Interactively Discovering and Ranking Desired Tuples without Writing SQL Queries

2020 
The first step of many data analytics tasks is to find and (possibly) rank desired tuples, typically by writing SQL queries. This is feasible only for data experts who can write SQL and know the data very well. Unfortunately, in practice, the queries can be complicated: for example, "find and rank good off-road cars based on a combination of Price, Make, Model, Age, Mileage, and so on" involves so much if-then-else, and, or, and not logic that even data experts cannot specify the SQL query precisely. The data may also be unknown, which is common in data discovery, where one tries to discover desired data in a data lake. Naturally, a system that helps users discover and rank desired tuples without writing SQL queries is needed. We propose to demonstrate such a system, namely DExPlorer. To use DExPlorer for data exploration, the user only needs to interactively perform two simple operations over a set of system-provided tuples: (1) annotate which tuples are desired (i.e., true labels) or not (i.e., false labels), and (2) annotate whether one tuple is preferred over another (i.e., partial orders or ranked lists). We will show that DExPlorer can find the user's desired tuples and rank them in a few interactions, even for complicated queries.
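The two kinds of user feedback described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not DExPlorer's actual method: the `Feedback` class, the `rank_labeled` function, and the car data are all hypothetical, and the ranking rule (pairwise wins minus losses) is a deliberately naive stand-in for whatever the system really learns from the annotations.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    """Hypothetical container for the two annotation types in the abstract."""
    desired: dict = field(default_factory=dict)   # tuple id -> True / False label
    prefers: list = field(default_factory=list)   # (better id, worse id) pairs

    def label(self, tuple_id, is_desired):
        # Operation (1): mark a tuple as desired (True) or not (False).
        self.desired[tuple_id] = is_desired

    def prefer(self, better_id, worse_id):
        # Operation (2): record that one tuple is preferred over another.
        self.prefers.append((better_id, worse_id))


def rank_labeled(feedback):
    """Naive stand-in ranking: keep tuples labeled desired and order them
    by pairwise wins minus losses (ties broken by tuple id)."""
    score = {tid: 0 for tid, ok in feedback.desired.items() if ok}
    for better, worse in feedback.prefers:
        if better in score:
            score[better] += 1
        if worse in score:
            score[worse] -= 1
    return sorted(score, key=lambda tid: (-score[tid], tid))


# Made-up tuples standing in for the system-provided sample.
cars = {
    1: {"Make": "A", "Price": 30000, "Mileage": 40000},
    2: {"Make": "B", "Price": 15000, "Mileage": 90000},
    3: {"Make": "C", "Price": 35000, "Mileage": 20000},
}

fb = Feedback()
fb.label(1, True)    # desired
fb.label(2, False)   # not desired
fb.label(3, True)    # desired
fb.prefer(3, 1)      # tuple 3 is preferred over tuple 1

ranking = rank_labeled(fb)
print([cars[tid]["Make"] for tid in ranking])
```

The point of the sketch is that the user never states a predicate or an ORDER BY clause; the only inputs are labels and pairwise preferences over concrete tuples.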