Towards Data Discovery by Example
2021
Data scientists today have to query an avalanche of multi-source data (e.g., data lakes, company databases) for diverse analytical tasks. Data discovery is labor-intensive as users have to find the right tables, and the combination thereof to answer their queries. Data discovery systems automatically find and link (e.g., joins) tables across various sources to aid users in finding the data they need. In this paper, we outline our ongoing efforts to build a data discovery by example system, DICE, that iteratively searches for new tables guided by user-provided data examples. Additionally, DICE asks users to validate results to improve the discovery process over multiple iterations.
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
11
References
1
Citations
NaN
KQI