TY - GEN
T1 - The Origin and Value of Disagreement Among Data Labelers
T2 - 17th International Conference on Information for a Better World: Shaping the Global Future, iConference 2022
AU - Sang, Yisi
AU - Stanton, Jeffrey
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Human-annotated data is the cornerstone of today’s artificial intelligence efforts, yet data labeling processes can be complicated and expensive, especially when human labelers disagree with each other. The current work practice is to use majority-voted labels to overrule the disagreement. However, in subjective data labeling tasks such as hate speech annotation, disagreement among individual labelers can be difficult to resolve. In this paper, we explored why such disagreements occur, using a mixed-method approach – including interviews with experts, concept mapping exercises, and self-report items – to develop a multidimensional scale for distilling the process by which annotators label a hate speech corpus. We tested this scale with 170 annotators in a hate speech annotation task. Results showed that our scale can reveal facets of individual differences among annotators (e.g., age, personality) and how these facets relate to an annotator’s final label decision for an instance. We suggest that this work contributes to the understanding of how humans annotate data. The proposed scale can potentially improve the value of currently discarded minority-vote labels.
AB - Human-annotated data is the cornerstone of today’s artificial intelligence efforts, yet data labeling processes can be complicated and expensive, especially when human labelers disagree with each other. The current work practice is to use majority-voted labels to overrule the disagreement. However, in subjective data labeling tasks such as hate speech annotation, disagreement among individual labelers can be difficult to resolve. In this paper, we explored why such disagreements occur, using a mixed-method approach – including interviews with experts, concept mapping exercises, and self-report items – to develop a multidimensional scale for distilling the process by which annotators label a hate speech corpus. We tested this scale with 170 annotators in a hate speech annotation task. Results showed that our scale can reveal facets of individual differences among annotators (e.g., age, personality) and how these facets relate to an annotator’s final label decision for an instance. We suggest that this work contributes to the understanding of how humans annotate data. The proposed scale can potentially improve the value of currently discarded minority-vote labels.
KW - Content moderation
KW - Data labeler
KW - Disagreement
KW - Hate speech
KW - Label
KW - Multidimensional scale
UR - http://www.scopus.com/inward/record.url?scp=85126232054&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126232054&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-96957-8_36
DO - 10.1007/978-3-030-96957-8_36
M3 - Conference contribution
AN - SCOPUS:85126232054
SN - 9783030969561
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 425
EP - 444
BT - Information for a Better World
A2 - Smits, Malte
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 28 February 2022 through 4 March 2022
ER -