Using Entropy Measures for Evaluating the Quality of Entity Resolution

2021 
This research reports results from an unsupervised entity resolution (ER) process that uses cluster entropy to self-regulate linking. The purpose is to allow ER processes to regulate their own linking decisions based on cluster entropy. The experiments were performed using synthetic person references of varying quality. The process obtained a linking accuracy of 93% for samples with moderate to high data quality, which is very promising for entity references of relatively high quality. Results for low-quality references were much lower, so using this process on low-quality data needs further improvement, and there are many possible avenues of research that could improve these results. The best overall result obtained from the sample was just over 50% linking accuracy.
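To make the idea of entropy-regulated linking concrete, the following is a minimal sketch, not the authors' method. The abstract does not define cluster entropy, so this sketch assumes it is the mean Shannon entropy of attribute values across the references in a cluster. The function names, the dictionary representation of a reference, and the `max_entropy` threshold of 0.5 are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the value distribution in `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def cluster_entropy(cluster):
    """Mean per-attribute entropy of a non-empty cluster of references.

    Assumption: each reference is a dict and all references share the
    same keys. A cluster whose references agree on every attribute has
    entropy 0; disagreement raises the entropy.
    """
    attributes = list(cluster[0].keys())
    per_attribute = [
        shannon_entropy([ref[attr] for ref in cluster]) for attr in attributes
    ]
    return sum(per_attribute) / len(attributes)


def should_link(cluster, candidate, max_entropy=0.5):
    """Accept the candidate only if the enlarged cluster stays orderly.

    `max_entropy` is a hypothetical threshold, not a value from the paper.
    """
    return cluster_entropy(cluster + [candidate]) <= max_entropy


if __name__ == "__main__":
    cluster = [
        {"first": "ann", "last": "lee"},
        {"first": "ann", "last": "lee"},
    ]
    near_match = {"first": "ann", "last": "leigh"}
    non_match = {"first": "bob", "last": "ray"}
    print(should_link(cluster, near_match))
    print(should_link(cluster, non_match))
```

In this sketch a candidate that differs from the cluster in one attribute raises the entropy only slightly and is linked, while a candidate that differs in every attribute pushes the entropy past the threshold and is rejected. A real process would likely compare tokens approximately rather than by exact string equality, which is one reason low-quality references are harder to link.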