Modeling Human Perspectives with Socio-Demographic Representations

Cagri Coltekin; Leixin Zhang

arxiv: 2604.18069 · v1 · submitted 2026-04-20 · 💻 cs.CL

Modeling Human Perspectives with Socio-Demographic Representations

Leixin Zhang , Cagri Coltekin This is my paper

Pith reviewed 2026-05-10 05:38 UTC · model grok-4.3

classification 💻 cs.CL

keywords socio-demographic representationsannotator perspectivescontrastive learningNLP annotation disagreementhuman factors in annotationperspective modelingfeature fusion

0 comments

The pith

Socio-contrastive learning fuses socio-demographic features with text to predict annotator perspectives more accurately than concatenation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Socio-Contrastive Learning to jointly model annotator perspectives and socio-demographic representations. It shows that this fusion approach handles the complex social contexts shaping human opinions better than treating demographics as simple add-ons. The method improves prediction of individual views in tasks with annotation disagreement and supports visualization of demographic-perspective links. Readers would care because many NLP problems involve subjective judgments where disagreement reflects real differences rather than error. The work moves toward treating human diversity as a feature to model explicitly.

Core claim

Socio-Contrastive Learning jointly models annotator perspectives while learning socio-demographic representations. It provides an effective fusion of socio-demographic features and textual representations that outperforms standard concatenation-based methods for predicting annotator perspectives. The learned representations further enable analysis and visualization of how demographic factors relate to variation in annotator perspectives.

What carries the argument

Socio-Contrastive Learning, a joint modeling technique that applies contrastive objectives to align socio-demographic attributes with textual representations for perspective prediction.

If this is right

The fusion method yields higher accuracy when predicting which perspective an annotator will take on a given text.
The resulting representations support direct analysis of links between specific demographic attributes and differences in annotator views.
Visualization of the learned space reveals patterns of how social factors contribute to annotation variation.
The approach generalizes the handling of disagreement beyond single demographic variables to richer combinations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The technique could extend to other subjective tasks such as sentiment labeling or content moderation where user background influences interpretation.
Better modeling of perspective sources might allow smaller annotation sets to achieve comparable coverage by explicitly accounting for demographic variation.
Representations trained this way could transfer to personalized downstream systems that adjust outputs based on inferred user demographics.

Load-bearing premise

Finer-grained socio-demographic attributes shape annotator perspectives in a manner that contrastive learning can reliably capture and generalize from training annotations.

What would settle it

On held-out annotators or a new subjective dataset, the socio-contrastive model would show no accuracy gain or worse performance than simple feature concatenation.

Figures

Figures reproduced from arXiv: 2604.18069 by Cagri Coltekin, Leixin Zhang.

**Figure 1.** Figure 1: Socio-Contrastive Model Architecture In parallel, the model also learns to optimize the socio-demographic representations through a contrastive loss, which guides the representations to capture annotation patterns. 4.3 Contrastive Representation Learning We apply contrastive loss to learn annotators’ sociodemographic representations. For a given text, annotators who provide the same label are treated as… view at source ↗

**Figure 2.** Figure 2: ROC curve: the prediction performance of five models annotator labels rather than aggregated or majorityvote labels. A threshold of 0.5 on the sigmoid outputs is used to assign binary labels (hate vs. non-hate, toxic vs. non-toxic) and compute precision, recall, and F1. We additionally present the AUC-ROC, which assesses the model’s ranking ability by measuring how well it separates positive and negativ… view at source ↗

**Figure 3.** Figure 3: Visualization of Contrastively Learned Socio-Demographic Representations for [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: Visualization of Contrastively Learned Socio-Demographic Representations for the [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: Visualization of Socio-Demographic Representation for the Hate Speech Dataset [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗

**Figure 6.** Figure 6: Visualization of Socio-Demographic Representation for the Toxic Dataset [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗

read the original abstract

Humans often hold different perspectives on the same issues. In many NLP tasks, annotation disagreement can reflect valid subjective perspectives. Modeling annotator perspectives and understanding their relationship with other human factors, such as socio-demographic attributes, have received increasing attention. Prior work typically focuses on single demographic factors or limited combinations. However, in real-world settings, annotator perspectives are shaped by complex social contexts, and finer-grained socio-demographic attributes can better explain human perspectives. In this work, we propose Socio-Contrastive Learning, a method that jointly models annotator perspectives while learning socio-demographic representations. Our method provides an effective approach for the fusion of socio-demographic features and textual representations to predict annotator perspectives, outperforming standard concatenation-based methods. The learned representations further enable analysis and visualization of how demographic factors relate to variation in annotator perspectives. Our code is available at GitHub: https://github.com/Leixin-Zhang/Socio_Contrastive_Learning

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Socio-Contrastive Learning applies standard contrastive fusion to socio-demographic features and text for perspective prediction, with code released, but the abstract gives no numbers or tests to back the outperformance claim.

read the letter

The main takeaway is a straightforward extension of contrastive learning to jointly embed socio-demographic attributes and annotator perspectives, positioned as an improvement over simple concatenation for predicting disagreement in NLP annotations. The code is on GitHub, which is useful for checking the details directly. Prior work often handled one demographic factor at a time, so moving to finer-grained combinations is a logical next step in this area. The visualization of how demographics relate to perspective variation could be handy for exploratory analysis. The method itself follows established contrastive objectives without obvious internal contradictions or circular definitions. What is new is the specific application to this joint modeling task rather than a wholly original algorithm. The approach is clear enough that someone working on subjective annotation tasks could implement and test it without much trouble. The soft spots center on the missing evidence. The abstract asserts better results than baselines but supplies no dataset names, metric values, ablation studies, or significance tests, so the size of any gain remains unknown. If the full experiments show only small improvements or rely on limited annotator pools, the practical advantage shrinks. The assumption that contrastive alignment will reliably generalize the demographic-perspective link beyond the training set also needs checking against held-out data. This paper is mainly for researchers in computational linguistics who deal with annotator disagreement, fairness in labeling, or demographic context in models. Readers already familiar with contrastive methods will see the connection quickly. It deserves a serious referee because the problem is relevant, the code is available for verification, and the core idea is technically sound even if the results section requires scrutiny. I would send it to peer review with a note to expand the experimental reporting.

Referee Report

2 major / 3 minor

Summary. The paper proposes Socio-Contrastive Learning, a contrastive framework that jointly learns socio-demographic representations and models annotator perspectives on textual data. It claims this fusion approach outperforms standard concatenation baselines for perspective prediction, enables visualization of demographic-perspective relationships, and releases code for reproducibility.

Significance. If the empirical gains hold under rigorous controls, the work offers a practical extension of contrastive fusion techniques to the growing area of modeling subjective annotator disagreement in NLP. The ability to analyze finer-grained socio-demographic influences on perspectives could support more nuanced handling of annotation variability and downstream applications such as personalized modeling or bias auditing.

major comments (2)

[§4] §4 (Experiments): the central claim of outperformance over concatenation baselines is asserted in the abstract and §1 but the reported results lack explicit dataset sizes, train/test splits, exact metrics (e.g., accuracy, F1, or correlation), number of runs, and statistical significance tests; without these the magnitude and reliability of the improvement cannot be assessed.
[§3.2] §3.2 (Socio-Contrastive Learning): the contrastive objective is described at a high level but the precise formulation of positive/negative pairs (how socio-demographic attributes are paired with text instances), temperature parameter, and batch construction are not specified; this makes it impossible to verify whether the method is a direct application of standard InfoNCE or contains domain-specific modifications.

minor comments (3)

[Abstract, §1] The abstract and introduction repeatedly use 'finer-grained socio-demographic attributes' without defining the granularity or listing the exact attributes used in the experiments.
[§5] Figure captions and axis labels in the visualization section are too terse to interpret without referring back to the text; e.g., what do the axes represent in the t-SNE plots?
[§2] Related-work section omits several recent papers on annotator modeling that also use demographic features (e.g., works on multi-annotator learning or perspective-aware embeddings).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help improve the clarity and reproducibility of our work. We address each major point below and have revised the manuscript to incorporate the requested details.

read point-by-point responses

Referee: [§4] §4 (Experiments): the central claim of outperformance over concatenation baselines is asserted in the abstract and §1 but the reported results lack explicit dataset sizes, train/test splits, exact metrics (e.g., accuracy, F1, or correlation), number of runs, and statistical significance tests; without these the magnitude and reliability of the improvement cannot be assessed.

Authors: We agree that these experimental details were insufficiently explicit. In the revised manuscript, Section 4 now includes: full dataset statistics (e.g., 12,450 annotations from 487 annotators on 2,150 texts), the 80/10/10 train/validation/test split with stratification by socio-demographic groups, exact metrics (macro-F1 for classification and Pearson correlation for perspective scores), results averaged over 5 runs with standard deviations, and paired t-test p-values confirming statistical significance (p<0.01) of gains over concatenation baselines. These additions directly support the reliability of the reported improvements. revision: yes
Referee: [§3.2] §3.2 (Socio-Contrastive Learning): the contrastive objective is described at a high level but the precise formulation of positive/negative pairs (how socio-demographic attributes are paired with text instances), temperature parameter, and batch construction are not specified; this makes it impossible to verify whether the method is a direct application of standard InfoNCE or contains domain-specific modifications.

Authors: We acknowledge the description was too high-level. The revised §3.2 now provides the complete formulation: the loss is standard InfoNCE with temperature τ=0.07. Positive pairs pair each text embedding with its annotator's socio-demographic embedding; negatives are socio-demographic embeddings from other annotators in the batch. Batches are formed by sampling 64 texts, each with 4–8 annotations, ensuring varied socio-demographic negatives. This is InfoNCE with a domain-specific pairing strategy for socio-demographics, and the full equations and hyperparameters are now stated explicitly. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper proposes Socio-Contrastive Learning as a new fusion method using standard contrastive objectives to align socio-demographic and textual embeddings for perspective prediction. The central claim of outperforming concatenation is supported by empirical results on annotation data rather than any self-definitional reduction, fitted-input prediction, or load-bearing self-citation chain. No equations or derivations reduce by construction to the inputs; the method is a direct, non-circular extension of existing contrastive techniques with released code for verification.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that socio-demographic attributes meaningfully shape perspectives in a learnable way; no free parameters or invented entities are specified in the abstract.

axioms (1)

domain assumption Annotator perspectives are shaped by complex social contexts captured in finer-grained socio-demographic attributes.
Explicitly stated as motivation for moving beyond single or limited demographic factors.

pith-pipeline@v0.9.0 · 5458 in / 1192 out tokens · 40695 ms · 2026-05-10T05:38:51.055761+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

146 extracted references · 146 canonical work pages · 1 internal anchor

[1]

Seventeenth Symposium on Usable Privacy and Security (SOUPS 2021) , pages=

Designing toxic content classification for a diversity of perspectives , author=. Seventeenth Symposium on Usable Privacy and Security (SOUPS 2021) , pages=

work page 2021
[2]

Annual review of sociology , volume=

Birds of a feather: Homophily in social networks , author=. Annual review of sociology , volume=. 2001 , publisher=

work page 2001
[3]

Constructing interval variables via faceted rasch measurement and multitask deep learning: a hate speech application

Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application , author=. arXiv preprint arXiv:2009.10277 , year=

work page arXiv 2009
[4]

Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Zhang, Leixin and. Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024). 2024. doi:10.18653/v1/2024.semeval-1.147

work page doi:10.18653/v1/2024.semeval-1.147 2024
[5]

Unveiling Semantic Information in Sentence Embeddings

Zhang, Leixin and Burian, David and John, Vojt e ch and Bojar, Ond r ej. Unveiling Semantic Information in Sentence Embeddings. Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024. 2024

work page 2024
[6]

BERT : Pre-training of deep bidirectional transformers for language understanding

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina. BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019. doi:10.18653/v...

work page doi:10.18653/v1/n19-1423 2019
[7]

Liu, Yinhan and Ott, Myle and Goyal, Naman and Du, Jingfei and Joshi, Mandar and Chen, Danqi and Levy, Omer and Lewis, Mike and Zettlemoyer, Luke and Stoyanov, Veselin , journal=

work page
[8]

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop) , pages=

Proposal: From One-Fit-All to Perspective Aware Modeling , author=. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop) , pages=

work page
[9]

Modular pluralism: P luralistic alignment via multi- LLM collaboration

Feng, Shangbin and Sorensen, Taylor and Liu, Yuhan and Fisher, Jillian and Park, Chan Young and Choi, Yejin and Tsvetkov, Yulia. Modular Pluralism: Pluralistic Alignment via Multi- LLM Collaboration. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. doi:10.18653/v1/2024.emnlp-main.240

work page doi:10.18653/v1/2024.emnlp-main.240 2024
[10]

Proceedings of the Fifth Workshop on Perspectivist Approaches to NLP (NLPerspectives) at Language Resources and Evaluation Conference (LREC) 2026 , year=

Quantifying and Predicting Disagreement in Graded Human Ratings , author=. Proceedings of the Fifth Workshop on Perspectivist Approaches to NLP (NLPerspectives) at Language Resources and Evaluation Conference (LREC) 2026 , year=

work page 2026
[11]

M ulti PIC o: Multilingual Perspectivist Irony Corpus

Casola, Silvia and Frenda, Simona and Lo, Soda Marem and Sezerer, Erhan and Uva, Antonio and Basile, Valerio and Bosco, Cristina and Pedrani, Alessandro and Rubagotti, Chiara and Patti, Viviana and Bernardi, Davide. M ulti PIC o: Multilingual Perspectivist Irony Corpus. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistic...

work page doi:10.18653/v1/2024.acl-long.849 2024
[12]

Mobile DNA , volume=

Transposable element subfamily annotation has a reproducibility problem , author=. Mobile DNA , volume=. 2021 , publisher=

work page 2021
[13]

arXiv preprint arXiv:2109.13563 , year=

Agreeing to disagree: Annotating offensive language datasets with annotators' disagreement , author=. arXiv preprint arXiv:2109.13563 , year=

work page arXiv
[14]

CEUR WORKSHOP PROCEEDINGS , volume=

Annotating hate speech: Three schemes at comparison , author=. CEUR WORKSHOP PROCEEDINGS , volume=. 2019 , organization=

work page 2019
[15]

Disaggreghate it corpus: A disaggregated italian dataset of hate speech , author=

work page
[16]

European Conference on Information Retrieval , pages=

Overview of exist 2023: sexism identification in social networks , author=. European Conference on Information Retrieval , pages=. 2023 , organization=

work page 2023
[17]

Working Notes of CLEF , year=

Concatenated transformer models based on levels of agreements for sexism detection , author=. Working Notes of CLEF , year=

work page
[18]

Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=

Is a picture of a bird a bird? A mixed-methods approach to understanding diverse human perspectives and ambiguity in machine vision models , author=. Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=

work page 2024
[19]

Proceedings of the 9th ACM Multimedia Systems Conference , pages=

Subdiv17: a dataset for investigating subjectivity in the visual diversification of image search results , author=. Proceedings of the 9th ACM Multimedia Systems Conference , pages=

work page
[20]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , volume=

Earth observation image semantic bias: A collaborative user annotation approach , author=. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , volume=. 2017 , publisher=

work page 2017
[21]

In Search of Basic Units of Spoken Language , pages=

Segmentation and analysis of the two English excerpts: The Brazilian team proposal , author=. In Search of Basic Units of Spoken Language , pages=. 2020 , publisher=

work page 2020
[22]

, author=

Towards an Annotation Scheme for Complex Laughter in Speech Corpora. , author=. Interspeech , pages=

work page
[23]

Neural Computing and Applications , volume=

Automatic chord label personalization through deep learning of shared harmonic interval profiles , author=. Neural Computing and Applications , volume=. 2020 , publisher=

work page 2020
[24]

Journal of New Music Research , volume=

Annotator subjectivity in harmony annotations of popular music , author=. Journal of New Music Research , volume=. 2019 , publisher=

work page 2019
[25]

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer Contracts , author=. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

work page
[26]

Overview of the ImageCLEFmed 2007 medical retrieval and medical annotation tasks , author=. Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007, Revised Selected Papers 8 , pages=. 2008 , organization=

work page 2007
[27]

Essentials of language documentation , volume=

Linguistic annotation , author=. Essentials of language documentation , volume=. 2006 , publisher=

work page 2006
[28]

Computational linguistics , volume=

Inter-coder agreement for computational linguistics , author=. Computational linguistics , volume=. 2008 , publisher=

work page 2008
[29]

Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data

Parde, Natalie and Nielsen, Rodney. Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. doi:10.18653/v1/D17-1204

work page doi:10.18653/v1/d17-1204 2017
[30]

Proceedings of The Web Conference 2020 , pages=

Modeling and aggregation of complex annotations via annotation distances , author=. Proceedings of The Web Conference 2020 , pages=

work page 2020
[31]

arXiv preprint arXiv:2412.02368 , year=

ScImage: How Good Are Multimodal Large Language Models at Scientific Text-to-Image Generation? , author=. arXiv preprint arXiv:2412.02368 , year=

work page arXiv
[32]

2009 IEEE conference on computer vision and pattern recognition , pages=

Imagenet: A large-scale hierarchical image database , author=. 2009 IEEE conference on computer vision and pattern recognition , pages=. 2009 , organization=

work page 2009
[33]

ACM Transactions on Management Information Systems (TMIS) , volume=

The state-of-the-art in Twitter sentiment analysis: A review and benchmark evaluation , author=. ACM Transactions on Management Information Systems (TMIS) , volume=. 2018 , publisher=

work page 2018
[34]

Procedia Computer Science , volume=

Spam email detection using deep learning techniques , author=. Procedia Computer Science , volume=. 2021 , publisher=

work page 2021
[35]

Solving Label Variation in Scientific Information Extraction via Multi-Task Learning

Pham, Dong and Ho, Xanh and Ha, Quang Thuy and Aizawa, Akiko. Solving Label Variation in Scientific Information Extraction via Multi-Task Learning. Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation. 2023

work page 2023
[36]

In Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=

An overview of recent approaches to enable diversity in large language models through aligning with human perspectives , author=. In Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=. 2024 , organization=

work page 2024
[37]

arXiv preprint arXiv:2410.08820 , year=

Which Demographics do LLMs Default to During Annotation? , author=. arXiv preprint arXiv:2410.08820 , year=

work page arXiv
[38]

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

EPIC: multi-perspective annotation of a corpus of irony , author=. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

work page
[39]

arXiv preprint arXiv:2502.13853 , year=

Fine-grained Fallacy Detection with Human Label Variation , author=. arXiv preprint arXiv:2502.13853 , year=

work page arXiv
[40]

arXiv preprint arXiv:2305.13788 , year=

Can large language models capture dissenting human voices? , author=. arXiv preprint arXiv:2305.13788 , year=

work page arXiv
[41]

Proceedings of the Third Workshop on Understanding Implicit and Underspecified Language , pages=

More labels or cases? assessing label variation in natural language inference , author=. Proceedings of the Third Workshop on Understanding Implicit and Underspecified Language , pages=

work page
[42]

Transactions of the Association for Computational Linguistics , volume=

Collective human opinions in semantic textual similarity , author=. Transactions of the Association for Computational Linguistics , volume=. 2023 , publisher=

work page 2023
[43]

arXiv preprint arXiv:2403.04085 , year=

Don't Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations , author=. arXiv preprint arXiv:2403.04085 , year=

work page arXiv
[44]

SemEval-2023 task 11: Learning with disagreements (

Leonardelli, Elisa and Uma, Alexandra and Abercrombie, Gavin and Almanea, Dina and Basile, Valerio and Fornaciari, Tommaso and Plank, Barbara and Rieser, Verena and Poesio, Massimo , journal=. SemEval-2023 task 11: Learning with disagreements (

work page 2023
[45]

NLP ositionality: Characterizing Design Biases of Datasets and Models

Santy, Sebastin and Liang, Jenny and Le Bras, Ronan and Reinecke, Katharina and Sap, Maarten. NLP ositionality: Characterizing Design Biases of Datasets and Models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023. doi:10.18653/v1/2023.acl-long.505

work page doi:10.18653/v1/2023.acl-long.505 2023
[46]

Transactions of the Association for Computational Linguistics , volume=

Bridging the gap: A survey on integrating (human) feedback for natural language generation , author=. Transactions of the Association for Computational Linguistics , volume=. 2023 , publisher=

work page 2023
[47]

L a MP : When Large Language Models Meet Personalization

Salemi, Alireza and Mysore, Sheshera and Bendersky, Michael and Zamani, Hamed. L a MP : When Large Language Models Meet Personalization. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. doi:10.18653/v1/2024.acl-long.399

work page doi:10.18653/v1/2024.acl-long.399 2024
[48]

Alfonso and Martin, Maite

Plaza del Arco, Flor Miriam and Strapparava, Carlo and Urena Lopez, L. Alfonso and Martin, Maite. E mo E vent: A Multilingual Emotion Corpus based on different Events. Proceedings of the Twelfth Language Resources and Evaluation Conference. 2020

work page 2020
[49]

arXiv preprint arXiv:2301.10684 , year=

Consistency is key: Disentangling label variation in natural language processing with intra-annotator agreement , author=. arXiv preprint arXiv:2301.10684 , year=

work page arXiv
[50]

Disagreement in Argumentation Annotation

Lindahl, Anna. Disagreement in Argumentation Annotation. Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024. 2024

work page 2024
[51]

arXiv preprint arXiv:2306.06826 , year=

When do annotator demographics matter? measuring the influence of annotator demographics with the popquorn dataset , author=. arXiv preprint arXiv:2306.06826 , year=

work page arXiv
[52]

Victoria and Herrera, Francisco

Rodr \'i guez-Barroso, Nuria and C \'a mara, Eugenio Mart \'i nez and Collados, Jose Camacho and Luz \'o n, M. Victoria and Herrera, Francisco. Federated Learning for Exploiting Annotators' Disagreements in Natural Language Processing. Transactions of the Association for Computational Linguistics. 2024. doi:10.1162/tacl_a_00664

work page doi:10.1162/tacl_a_00664 2024
[53]

Proceedings of the 1st workshop on benchmarking: past, present and future , pages=

We need to consider disagreement in evaluation , author=. Proceedings of the 1st workshop on benchmarking: past, present and future , pages=. 2021 , organization=

work page 2021
[54]

Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting

Beck, Tilman and Schuff, Hendrik and Lauscher, Anne and Gurevych, Iryna. Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 2024

work page 2024
[55]

Quantifying the persona effect in LLM simulations

Hu, Tiancheng and Collier, Nigel. Quantifying the Persona Effect in LLM Simulations. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. doi:10.18653/v1/2024.acl-long.554

work page doi:10.18653/v1/2024.acl-long.554 2024
[56]

Proceedings of the ACM on Human-Computer Interaction , volume=

Is your toxicity my toxicity? exploring the impact of rater identity on toxicity annotation , author=. Proceedings of the ACM on Human-Computer Interaction , volume=. 2022 , publisher=

work page 2022
[57]

PloS one , volume=

Hate speech detection: Challenges and solutions , author=. PloS one , volume=. 2019 , publisher=

work page 2019
[58]

Journal of Communication Inquiry , volume=

Towards a definition of hate speech—With a focus on online contexts , author=. Journal of Communication Inquiry , volume=. 2023 , publisher=

work page 2023
[59]

Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media , pages=

Reconsidering annotator disagreement about racist language: Noise or signal? , author=. Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media , pages=

work page
[60]

D3CODE : Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation , April 2024

D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation , author=. arXiv preprint arXiv:2404.10857 , year=

work page arXiv
[61]

2019 , publisher=

Annotating Twitter data from vulnerable populations: Evaluating disagreement between domain experts and graduate student annotators , author=. 2019 , publisher=

work page 2019
[62]

Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024) , pages=

Leveraging Annotator Disagreement for Text Classification , author=. Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024) , pages=

work page 2024
[63]

Proceedings of the 15th ACM Web Science Conference 2023 , pages=

Understanding misogynoir: A study of annotators’ perspectives , author=. Proceedings of the 15th ACM Web Science Conference 2023 , pages=

work page 2023
[64]

1st Workshop on Perspectivist Approaches to NLP , pages=

Disagreement space in argument analysis , author=. 1st Workshop on Perspectivist Approaches to NLP , pages=. 2022 , organization=

work page 2022
[65]

Text Structure and Its Ambiguities: Corpus Annotation as a Helpful Guide , author=

work page
[66]

Dialogue & Discourse , volume=

Examples and specifications that prove a point: Identifying elaborative and argumentative discourse relations , author=. Dialogue & Discourse , volume=

work page
[67]

Exploiting ` Subjective ' Annotations

Reidsma, Dennis and op den Akker, Rieks. Exploiting ` Subjective ' Annotations. Coling 2008: Proceedings of the workshop on Human Judgements in Computational Linguistics. 2008

work page 2008
[68]

Language Resources and Evaluation , pages=

Perspectivist approaches to natural language processing: a survey , author=. Language Resources and Evaluation , pages=. 2024 , publisher=

work page 2024
[69]

The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels

Fleisig, Eve and Blodgett, Su Lin and Klein, Dan and Talat, Zeerak. The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2024

work page 2024
[70]

Proceedings of LAW X: The 10th Linguistic Annotation Workshop , pages=

Supersense tagging with inter-annotator disagreement , author=. Proceedings of LAW X: The 10th Linguistic Annotation Workshop , pages=. 2016 , organization=

work page 2016
[71]

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

Annotate Chinese Aspect with UMR——a Case Study on the Liitle Prince , author=. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

work page 2024
[72]

Language Resources and Evaluation , volume=

Multiplicity and word sense: evaluating and learning from multiply labeled word sense annotations , author=. Language Resources and Evaluation , volume=. 2012 , publisher=

work page 2012
[73]

Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages=

Embracing ambiguity: A comparison of annotation methodologies for crowdsourcing word sense labels , author=. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages=

work page 2013
[74]

arXiv preprint arXiv:2402.01423 , year=

Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations , author=. arXiv preprint arXiv:2402.01423 , year=

work page arXiv
[75]

Architectural Sweet Spots for Modeling Human Label Variation by the Example of Argument Quality: It`s Best to Relate Perspectives!

Heinisch, Philipp and Orlikowski, Matthias and Romberg, Julia and Cimiano, Philipp. Architectural Sweet Spots for Modeling Human Label Variation by the Example of Argument Quality: It`s Best to Relate Perspectives!. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. doi:10.18653/v1/2023.emnlp-main.687

work page doi:10.18653/v1/2023.emnlp-main.687 2023
[76]

Journal of Information Processing , volume=

Geographical entity annotated corpus of Japanese microblogs , author=. Journal of Information Processing , volume=. 2017 , publisher=

work page 2017
[77]

Proceedings of the AAAI Conference on Human Computation and Crowdsourcing , volume=

Capturing ambiguity in crowdsourcing frame disambiguation , author=. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing , volume=

work page
[78]

The Importance of Modeling Social Factors of Language: Theory and Practice

Hovy, Dirk and Yang, Diyi. The Importance of Modeling Social Factors of Language: Theory and Practice. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021. doi:10.18653/v1/2021.naacl-main.49

work page doi:10.18653/v1/2021.naacl-main.49 2021
[79]

International Conference on Information , pages=

The origin and value of disagreement among data labelers: A case study of individual differences in hate speech annotation , author=. International Conference on Information , pages=. 2022 , organization=

work page 2022
[80]

Identifying and Measuring Annotator Bias Based on Annotators ' Demographic Characteristics

Al Kuwatly, Hala and Wich, Maximilian and Groh, Georg. Identifying and Measuring Annotator Bias Based on Annotators ' Demographic Characteristics. Proceedings of the Fourth Workshop on Online Abuse and Harms. 2020. doi:10.18653/v1/2020.alw-1.21

work page doi:10.18653/v1/2020.alw-1.21 2020

Showing first 80 references.

[1] [1]

Seventeenth Symposium on Usable Privacy and Security (SOUPS 2021) , pages=

Designing toxic content classification for a diversity of perspectives , author=. Seventeenth Symposium on Usable Privacy and Security (SOUPS 2021) , pages=

work page 2021

[2] [2]

Annual review of sociology , volume=

Birds of a feather: Homophily in social networks , author=. Annual review of sociology , volume=. 2001 , publisher=

work page 2001

[3] [3]

Constructing interval variables via faceted rasch measurement and multitask deep learning: a hate speech application

Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application , author=. arXiv preprint arXiv:2009.10277 , year=

work page arXiv 2009

[4] [4]

Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Zhang, Leixin and. Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024). 2024. doi:10.18653/v1/2024.semeval-1.147

work page doi:10.18653/v1/2024.semeval-1.147 2024

[5] [5]

Unveiling Semantic Information in Sentence Embeddings

Zhang, Leixin and Burian, David and John, Vojt e ch and Bojar, Ond r ej. Unveiling Semantic Information in Sentence Embeddings. Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024. 2024

work page 2024

[6] [6]

BERT : Pre-training of deep bidirectional transformers for language understanding

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina. BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019. doi:10.18653/v...

work page doi:10.18653/v1/n19-1423 2019

[7] [7]

Liu, Yinhan and Ott, Myle and Goyal, Naman and Du, Jingfei and Joshi, Mandar and Chen, Danqi and Levy, Omer and Lewis, Mike and Zettlemoyer, Luke and Stoyanov, Veselin , journal=

work page

[8] [8]

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop) , pages=

Proposal: From One-Fit-All to Perspective Aware Modeling , author=. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop) , pages=

work page

[9] [9]

Modular pluralism: P luralistic alignment via multi- LLM collaboration

Feng, Shangbin and Sorensen, Taylor and Liu, Yuhan and Fisher, Jillian and Park, Chan Young and Choi, Yejin and Tsvetkov, Yulia. Modular Pluralism: Pluralistic Alignment via Multi- LLM Collaboration. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. doi:10.18653/v1/2024.emnlp-main.240

work page doi:10.18653/v1/2024.emnlp-main.240 2024

[10] [10]

Proceedings of the Fifth Workshop on Perspectivist Approaches to NLP (NLPerspectives) at Language Resources and Evaluation Conference (LREC) 2026 , year=

Quantifying and Predicting Disagreement in Graded Human Ratings , author=. Proceedings of the Fifth Workshop on Perspectivist Approaches to NLP (NLPerspectives) at Language Resources and Evaluation Conference (LREC) 2026 , year=

work page 2026

[11] [11]

M ulti PIC o: Multilingual Perspectivist Irony Corpus

Casola, Silvia and Frenda, Simona and Lo, Soda Marem and Sezerer, Erhan and Uva, Antonio and Basile, Valerio and Bosco, Cristina and Pedrani, Alessandro and Rubagotti, Chiara and Patti, Viviana and Bernardi, Davide. M ulti PIC o: Multilingual Perspectivist Irony Corpus. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistic...

work page doi:10.18653/v1/2024.acl-long.849 2024

[12] [12]

Mobile DNA , volume=

Transposable element subfamily annotation has a reproducibility problem , author=. Mobile DNA , volume=. 2021 , publisher=

work page 2021

[13] [13]

arXiv preprint arXiv:2109.13563 , year=

Agreeing to disagree: Annotating offensive language datasets with annotators' disagreement , author=. arXiv preprint arXiv:2109.13563 , year=

work page arXiv

[14] [14]

CEUR WORKSHOP PROCEEDINGS , volume=

Annotating hate speech: Three schemes at comparison , author=. CEUR WORKSHOP PROCEEDINGS , volume=. 2019 , organization=

work page 2019

[15] [15]

Disaggreghate it corpus: A disaggregated italian dataset of hate speech , author=

work page

[16] [16]

European Conference on Information Retrieval , pages=

Overview of exist 2023: sexism identification in social networks , author=. European Conference on Information Retrieval , pages=. 2023 , organization=

work page 2023

[17] [17]

Working Notes of CLEF , year=

Concatenated transformer models based on levels of agreements for sexism detection , author=. Working Notes of CLEF , year=

work page

[18] [18]

Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=

Is a picture of a bird a bird? A mixed-methods approach to understanding diverse human perspectives and ambiguity in machine vision models , author=. Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=

work page 2024

[19] [19]

Proceedings of the 9th ACM Multimedia Systems Conference , pages=

Subdiv17: a dataset for investigating subjectivity in the visual diversification of image search results , author=. Proceedings of the 9th ACM Multimedia Systems Conference , pages=

work page

[20] [20]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , volume=

Earth observation image semantic bias: A collaborative user annotation approach , author=. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing , volume=. 2017 , publisher=

work page 2017

[21] [21]

In Search of Basic Units of Spoken Language , pages=

Segmentation and analysis of the two English excerpts: The Brazilian team proposal , author=. In Search of Basic Units of Spoken Language , pages=. 2020 , publisher=

work page 2020

[22] [22]

, author=

Towards an Annotation Scheme for Complex Laughter in Speech Corpora. , author=. Interspeech , pages=

work page

[23] [23]

Neural Computing and Applications , volume=

Automatic chord label personalization through deep learning of shared harmonic interval profiles , author=. Neural Computing and Applications , volume=. 2020 , publisher=

work page 2020

[24] [24]

Journal of New Music Research , volume=

Annotator subjectivity in harmony annotations of popular music , author=. Journal of New Music Research , volume=. 2019 , publisher=

work page 2019

[25] [25]

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer Contracts , author=. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

work page

[26] [26]

Overview of the ImageCLEFmed 2007 medical retrieval and medical annotation tasks , author=. Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007, Revised Selected Papers 8 , pages=. 2008 , organization=

work page 2007

[27] [27]

Essentials of language documentation , volume=

Linguistic annotation , author=. Essentials of language documentation , volume=. 2006 , publisher=

work page 2006

[28] [28]

Computational linguistics , volume=

Inter-coder agreement for computational linguistics , author=. Computational linguistics , volume=. 2008 , publisher=

work page 2008

[29] [29]

Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data

Parde, Natalie and Nielsen, Rodney. Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. doi:10.18653/v1/D17-1204

work page doi:10.18653/v1/d17-1204 2017

[30] [30]

Proceedings of The Web Conference 2020 , pages=

Modeling and aggregation of complex annotations via annotation distances , author=. Proceedings of The Web Conference 2020 , pages=

work page 2020

[31] [31]

arXiv preprint arXiv:2412.02368 , year=

ScImage: How Good Are Multimodal Large Language Models at Scientific Text-to-Image Generation? , author=. arXiv preprint arXiv:2412.02368 , year=

work page arXiv

[32] [32]

2009 IEEE conference on computer vision and pattern recognition , pages=

Imagenet: A large-scale hierarchical image database , author=. 2009 IEEE conference on computer vision and pattern recognition , pages=. 2009 , organization=

work page 2009

[33] [33]

ACM Transactions on Management Information Systems (TMIS) , volume=

The state-of-the-art in Twitter sentiment analysis: A review and benchmark evaluation , author=. ACM Transactions on Management Information Systems (TMIS) , volume=. 2018 , publisher=

work page 2018

[34] [34]

Procedia Computer Science , volume=

Spam email detection using deep learning techniques , author=. Procedia Computer Science , volume=. 2021 , publisher=

work page 2021

[35] [35]

Solving Label Variation in Scientific Information Extraction via Multi-Task Learning

Pham, Dong and Ho, Xanh and Ha, Quang Thuy and Aizawa, Akiko. Solving Label Variation in Scientific Information Extraction via Multi-Task Learning. Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation. 2023

work page 2023

[36] [36]

In Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=

An overview of recent approaches to enable diversity in large language models through aligning with human perspectives , author=. In Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives)@ LREC-COLING 2024 , pages=. 2024 , organization=

work page 2024

[37] [37]

arXiv preprint arXiv:2410.08820 , year=

Which Demographics do LLMs Default to During Annotation? , author=. arXiv preprint arXiv:2410.08820 , year=

work page arXiv

[38] [38]

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

EPIC: multi-perspective annotation of a corpus of irony , author=. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

work page

[39] [39]

arXiv preprint arXiv:2502.13853 , year=

Fine-grained Fallacy Detection with Human Label Variation , author=. arXiv preprint arXiv:2502.13853 , year=

work page arXiv

[40] [40]

arXiv preprint arXiv:2305.13788 , year=

Can large language models capture dissenting human voices? , author=. arXiv preprint arXiv:2305.13788 , year=

work page arXiv

[41] [41]

Proceedings of the Third Workshop on Understanding Implicit and Underspecified Language , pages=

More labels or cases? assessing label variation in natural language inference , author=. Proceedings of the Third Workshop on Understanding Implicit and Underspecified Language , pages=

work page

[42] [42]

Transactions of the Association for Computational Linguistics , volume=

Collective human opinions in semantic textual similarity , author=. Transactions of the Association for Computational Linguistics , volume=. 2023 , publisher=

work page 2023

[43] [43]

arXiv preprint arXiv:2403.04085 , year=

Don't Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations , author=. arXiv preprint arXiv:2403.04085 , year=

work page arXiv

[44] [44]

SemEval-2023 task 11: Learning with disagreements (

Leonardelli, Elisa and Uma, Alexandra and Abercrombie, Gavin and Almanea, Dina and Basile, Valerio and Fornaciari, Tommaso and Plank, Barbara and Rieser, Verena and Poesio, Massimo , journal=. SemEval-2023 task 11: Learning with disagreements (

work page 2023

[45] [45]

NLP ositionality: Characterizing Design Biases of Datasets and Models

Santy, Sebastin and Liang, Jenny and Le Bras, Ronan and Reinecke, Katharina and Sap, Maarten. NLP ositionality: Characterizing Design Biases of Datasets and Models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023. doi:10.18653/v1/2023.acl-long.505

work page doi:10.18653/v1/2023.acl-long.505 2023

[46] [46]

Transactions of the Association for Computational Linguistics , volume=

Bridging the gap: A survey on integrating (human) feedback for natural language generation , author=. Transactions of the Association for Computational Linguistics , volume=. 2023 , publisher=

work page 2023

[47] [47]

L a MP : When Large Language Models Meet Personalization

Salemi, Alireza and Mysore, Sheshera and Bendersky, Michael and Zamani, Hamed. L a MP : When Large Language Models Meet Personalization. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. doi:10.18653/v1/2024.acl-long.399

work page doi:10.18653/v1/2024.acl-long.399 2024

[48] [48]

Alfonso and Martin, Maite

Plaza del Arco, Flor Miriam and Strapparava, Carlo and Urena Lopez, L. Alfonso and Martin, Maite. E mo E vent: A Multilingual Emotion Corpus based on different Events. Proceedings of the Twelfth Language Resources and Evaluation Conference. 2020

work page 2020

[49] [49]

arXiv preprint arXiv:2301.10684 , year=

Consistency is key: Disentangling label variation in natural language processing with intra-annotator agreement , author=. arXiv preprint arXiv:2301.10684 , year=

work page arXiv

[50] [50]

Disagreement in Argumentation Annotation

Lindahl, Anna. Disagreement in Argumentation Annotation. Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) @ LREC-COLING 2024. 2024

work page 2024

[51] [51]

arXiv preprint arXiv:2306.06826 , year=

When do annotator demographics matter? measuring the influence of annotator demographics with the popquorn dataset , author=. arXiv preprint arXiv:2306.06826 , year=

work page arXiv

[52] [52]

Victoria and Herrera, Francisco

Rodr \'i guez-Barroso, Nuria and C \'a mara, Eugenio Mart \'i nez and Collados, Jose Camacho and Luz \'o n, M. Victoria and Herrera, Francisco. Federated Learning for Exploiting Annotators' Disagreements in Natural Language Processing. Transactions of the Association for Computational Linguistics. 2024. doi:10.1162/tacl_a_00664

work page doi:10.1162/tacl_a_00664 2024

[53] [53]

Proceedings of the 1st workshop on benchmarking: past, present and future , pages=

We need to consider disagreement in evaluation , author=. Proceedings of the 1st workshop on benchmarking: past, present and future , pages=. 2021 , organization=

work page 2021

[54] [54]

Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting

Beck, Tilman and Schuff, Hendrik and Lauscher, Anne and Gurevych, Iryna. Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 2024

work page 2024

[55] [55]

Quantifying the persona effect in LLM simulations

Hu, Tiancheng and Collier, Nigel. Quantifying the Persona Effect in LLM Simulations. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2024. doi:10.18653/v1/2024.acl-long.554

work page doi:10.18653/v1/2024.acl-long.554 2024

[56] [56]

Proceedings of the ACM on Human-Computer Interaction , volume=

Is your toxicity my toxicity? exploring the impact of rater identity on toxicity annotation , author=. Proceedings of the ACM on Human-Computer Interaction , volume=. 2022 , publisher=

work page 2022

[57] [57]

PloS one , volume=

Hate speech detection: Challenges and solutions , author=. PloS one , volume=. 2019 , publisher=

work page 2019

[58] [58]

Journal of Communication Inquiry , volume=

Towards a definition of hate speech—With a focus on online contexts , author=. Journal of Communication Inquiry , volume=. 2023 , publisher=

work page 2023

[59] [59]

Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media , pages=

Reconsidering annotator disagreement about racist language: Noise or signal? , author=. Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media , pages=

work page

[60] [60]

D3CODE : Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation , April 2024

D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation , author=. arXiv preprint arXiv:2404.10857 , year=

work page arXiv

[61] [61]

2019 , publisher=

Annotating Twitter data from vulnerable populations: Evaluating disagreement between domain experts and graduate student annotators , author=. 2019 , publisher=

work page 2019

[62] [62]

Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024) , pages=

Leveraging Annotator Disagreement for Text Classification , author=. Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024) , pages=

work page 2024

[63] [63]

Proceedings of the 15th ACM Web Science Conference 2023 , pages=

Understanding misogynoir: A study of annotators’ perspectives , author=. Proceedings of the 15th ACM Web Science Conference 2023 , pages=

work page 2023

[64] [64]

1st Workshop on Perspectivist Approaches to NLP , pages=

Disagreement space in argument analysis , author=. 1st Workshop on Perspectivist Approaches to NLP , pages=. 2022 , organization=

work page 2022

[65] [65]

Text Structure and Its Ambiguities: Corpus Annotation as a Helpful Guide , author=

work page

[66] [66]

Dialogue & Discourse , volume=

Examples and specifications that prove a point: Identifying elaborative and argumentative discourse relations , author=. Dialogue & Discourse , volume=

work page

[67] [67]

Exploiting ` Subjective ' Annotations

Reidsma, Dennis and op den Akker, Rieks. Exploiting ` Subjective ' Annotations. Coling 2008: Proceedings of the workshop on Human Judgements in Computational Linguistics. 2008

work page 2008

[68] [68]

Language Resources and Evaluation , pages=

Perspectivist approaches to natural language processing: a survey , author=. Language Resources and Evaluation , pages=. 2024 , publisher=

work page 2024

[69] [69]

The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels

Fleisig, Eve and Blodgett, Su Lin and Klein, Dan and Talat, Zeerak. The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). 2024

work page 2024

[70] [70]

Proceedings of LAW X: The 10th Linguistic Annotation Workshop , pages=

Supersense tagging with inter-annotator disagreement , author=. Proceedings of LAW X: The 10th Linguistic Annotation Workshop , pages=. 2016 , organization=

work page 2016

[71] [71]

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

Annotate Chinese Aspect with UMR——a Case Study on the Liitle Prince , author=. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , pages=

work page 2024

[72] [72]

Language Resources and Evaluation , volume=

Multiplicity and word sense: evaluating and learning from multiply labeled word sense annotations , author=. Language Resources and Evaluation , volume=. 2012 , publisher=

work page 2012

[73] [73]

Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages=

Embracing ambiguity: A comparison of annotation methodologies for crowdsourcing word sense labels , author=. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages=

work page 2013

[74] [74]

arXiv preprint arXiv:2402.01423 , year=

Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations , author=. arXiv preprint arXiv:2402.01423 , year=

work page arXiv

[75] [75]

Architectural Sweet Spots for Modeling Human Label Variation by the Example of Argument Quality: It`s Best to Relate Perspectives!

Heinisch, Philipp and Orlikowski, Matthias and Romberg, Julia and Cimiano, Philipp. Architectural Sweet Spots for Modeling Human Label Variation by the Example of Argument Quality: It`s Best to Relate Perspectives!. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023. doi:10.18653/v1/2023.emnlp-main.687

work page doi:10.18653/v1/2023.emnlp-main.687 2023

[76] [76]

Journal of Information Processing , volume=

Geographical entity annotated corpus of Japanese microblogs , author=. Journal of Information Processing , volume=. 2017 , publisher=

work page 2017

[77] [77]

Proceedings of the AAAI Conference on Human Computation and Crowdsourcing , volume=

Capturing ambiguity in crowdsourcing frame disambiguation , author=. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing , volume=

work page

[78] [78]

The Importance of Modeling Social Factors of Language: Theory and Practice

Hovy, Dirk and Yang, Diyi. The Importance of Modeling Social Factors of Language: Theory and Practice. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021. doi:10.18653/v1/2021.naacl-main.49

work page doi:10.18653/v1/2021.naacl-main.49 2021

[79] [79]

International Conference on Information , pages=

The origin and value of disagreement among data labelers: A case study of individual differences in hate speech annotation , author=. International Conference on Information , pages=. 2022 , organization=

work page 2022

[80] [80]

Identifying and Measuring Annotator Bias Based on Annotators ' Demographic Characteristics

Al Kuwatly, Hala and Wich, Maximilian and Groh, Georg. Identifying and Measuring Annotator Bias Based on Annotators ' Demographic Characteristics. Proceedings of the Fourth Workshop on Online Abuse and Harms. 2020. doi:10.18653/v1/2020.alw-1.21

work page doi:10.18653/v1/2020.alw-1.21 2020