CrowdRL: An End-to-End Reinforcement Learning Framework for Data Labelling

2021 
Data labelling is very important in many database and machine learning applications. Traditional methods rely on humans (workers or experts) to acquire labels. However, human labelling is expensive for a large dataset. Active learning based methods label only a small set of data with large uncertainty, train a model on the labelled data, and use the trained model to label the remaining unlabelled data. However, they have two limitations. First, they cannot judiciously select appropriate data (task selection) and assign the tasks to proper humans (task assignment). Moreover, they process task selection and task assignment independently, and so cannot capture the correlation between them. Second, they infer the truth of a task from the answers of humans and the trained model (truth inference) by modeling humans and models independently. In other words, they ignore the correlation between them (the labelled data may contain noise caused by biased humans, and a model trained on the noisy labels may introduce additional biases), which leads to poor inference results.
To address these limitations, in this paper, we propose CrowdRL, an end-to-end reinforcement learning (RL) based framework for data labelling. To the best of our knowledge, CrowdRL is the first RL framework designed for the data labelling workflow that seamlessly integrates task selection, task assignment and truth inference. CrowdRL combines the power of heterogeneous annotators (experts and crowdsourcing workers) and machine learning models to infer the truth, which substantially improves the quality of data labelling. CrowdRL uses RL to model task assignment and task selection, and designs an agent to judiciously assign tasks to appropriate workers. CrowdRL jointly models the answers of workers, experts and models, and designs a joint inference model to infer the truths.
Experimental results on real datasets show that CrowdRL outperforms state-of-the-art approaches, achieving 5%-20% higher accuracy at the same (or even lower) monetary cost.
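To make the baseline workflow concrete, the following is a minimal sketch of the two steps the abstract attributes to prior active-learning methods: picking the most uncertain items for labelling, and inferring a label from human and model answers treated independently. It is not CrowdRL's RL agent or its joint inference model. The function names, the use of predictive entropy as the uncertainty measure, and the per-annotator weights are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, budget):
    """Task selection (hypothetical helper): return the ids of the `budget`
    items whose predicted class distribution has the highest entropy.

    predictions: dict mapping item id -> list of class probabilities.
    """
    ranked = sorted(predictions, key=lambda i: entropy(predictions[i]), reverse=True)
    return ranked[:budget]


def weighted_vote(answers, weights):
    """Truth inference (hypothetical helper): weighted majority vote.

    Each annotator (worker, expert, or model) is scored independently,
    which is exactly the simplification the abstract criticises.

    answers: dict mapping annotator name -> label.
    weights: dict mapping annotator name -> assumed reliability weight.
    """
    scores = {}
    for annotator, label in answers.items():
        scores[label] = scores.get(label, 0.0) + weights[annotator]
    return max(scores, key=scores.get)


# Illustrative inputs only; these numbers are made up for the example.
predictions = {"a": [0.5, 0.5], "b": [0.9, 0.1], "c": [0.6, 0.4]}
to_label = select_most_uncertain(predictions, budget=2)

answers = {"worker1": "cat", "worker2": "dog", "expert": "dog", "model": "cat"}
weights = {"worker1": 0.6, "worker2": 0.6, "expert": 0.95, "model": 0.8}
label = weighted_vote(answers, weights)
```

Because the weights here are fixed per annotator, correlated errors (for example, a model that has inherited a worker's bias from noisy training labels) are counted twice; a joint model of workers, experts and models is what the abstract proposes to avoid this.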