Predicting worker disagreement for more effective crowd labeling
Gezici, Gizem
2018
Abstract
Crowdsourcing is a popular mechanism for labeling tasks to produce large training corpora. However, producing a reliable crowd-labeled training corpus is challenging and resource-consuming. Research on crowdsourcing has shown that label quality is strongly affected by worker engagement and expertise. In this study, we postulate that label quality can also be affected by the inherent ambiguity of the documents to be labeled. Such ambiguities are not known in advance, of course, but, once encountered by the workers, they lead to disagreement in the labeling, a disagreement that cannot be resolved by employing more workers. To deal with this problem, we propose a crowd labeling framework: we train a disagreement predictor on a small seed of documents, and then use this predictor to decide which documents of the complete corpus should be labeled and which should be checked for document-inherent ambiguities before assigning (and potentially wasting) worker effort on them. We report on the findings of the experiments we conducted on crowdsourcing a Twitter corpus for sentiment classification.
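The abstract describes the framework only at a high level. The sketch below is an illustrative reading of that pipeline, not the authors' implementation: it assumes TF-IDF features, a logistic-regression disagreement classifier, and a hypothetical probability threshold for routing documents to crowd labeling versus an ambiguity check.

```python
# Minimal sketch of the triage idea described in the abstract.
# Assumptions (not from the paper): TF-IDF features, logistic regression,
# and a hypothetical probability threshold for routing documents.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def train_disagreement_predictor(seed_texts, seed_disagreement):
    """Fit a disagreement predictor on a small seed of crowd-labeled documents.

    seed_disagreement[i] is 1 if workers disagreed on document i
    (e.g. no clear majority label), 0 otherwise.
    """
    vectorizer = TfidfVectorizer(min_df=2)
    X = vectorizer.fit_transform(seed_texts)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, seed_disagreement)
    return vectorizer, clf


def triage_corpus(texts, vectorizer, clf, threshold=0.5):
    """Split the remaining corpus into documents to send to the crowd
    and documents to check first for document-inherent ambiguity."""
    X = vectorizer.transform(texts)
    p_disagree = clf.predict_proba(X)[:, 1]
    to_label = [t for t, p in zip(texts, p_disagree) if p < threshold]
    to_review = [t for t, p in zip(texts, p_disagree) if p >= threshold]
    return to_label, to_review
```

In this reading, only documents with low predicted disagreement are sent for crowd labeling, while the rest are inspected for inherent ambiguity before any worker effort is spent on them; the feature set, classifier, and threshold here are placeholders.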
File | Size | Format
---|---|---
DSAA_Conference_Predicting worker disagreement for more effective crowd labeling.pdf (open access; type: Published version; license: Read Only) | 430.05 kB | Adobe PDF