Crowdsourcing Natural Language Data at Scale: A Hands-On Tutorial

Alexey Drutsa, Dmitry Ustalov, Valentina Fedorova, Olga Megorskaya, Daria Baidakova

2021

In this tutorial, we share industry experience in efficient natural language data annotation via crowdsourcing, presented by leading researchers and engineers from Yandex. We introduce data labeling on public crowdsourcing marketplaces and present the key components of efficient label collection. A practical session follows, in which participants address a real-world language resource production task, experiment with settings for the labeling process, and launch their own label collection project on one of the largest crowdsourcing marketplaces. The projects run on real crowds during the tutorial session. We also present useful quality control techniques and give attendees an opportunity to discuss their own annotation ideas.

Keywords:

Natural language
Computer science
Session
Task
Annotation
World Wide Web
Resource (project management)
Crowds
Process (engineering)
Crowdsourcing
