"Label from Somewhere": Reflexive Annotating for Situated AI Alignment
Pith reviewed 2026-05-16 11:00 UTC · model grok-4.3
The pith
Reflexive annotating captures epistemic metadata from crowd workers by prompting reflection on their social position in AI alignment tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Reflexive annotating serves as a probe that prompts crowd workers to consider how their positionality shapes subjective judgments in language model alignment. A qualitative study shows this method elicits epistemic metadata beyond demographics through intersectional reasoning, surfaces positional humility, and can nudge viewpoint change, while highlighting tensions with emotional exposure.
What carries the argument
Reflexive annotating: a probe that invites annotators to reflect on their positionality and on how it influences their annotation decisions.
If this is right
- Annotation pipelines can selectively integrate positional metadata to treat annotator judgments as situated.
- Richer value elicitation becomes possible by surfacing intersectional perspectives in AI alignment.
- Alignment practices would acknowledge that annotator judgments are not interchangeable but depend on social context.
- Viewpoint change may occur among annotators through the reflective process.
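One way to picture "selectively integrating positional metadata" is a record that keeps each annotator's label separate and attaches a free-text reflection only when the annotator agrees to share it. The sketch below is illustrative only: the `SituatedAnnotation` schema, the `share_reflection` opt-in flag, and the `export_records` helper are assumptions made here, not structures described in the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SituatedAnnotation:
    """One judgment plus optional positional metadata (hypothetical schema)."""
    item_id: str
    annotator_id: str
    label: str
    reflection: Optional[str] = None   # free-text reflection on positionality
    share_reflection: bool = False     # assumed opt-in flag set by the annotator


def export_records(records):
    """Keep every individual label; forward a reflection only when opted in."""
    rows = []
    for r in records:
        row = {"item_id": r.item_id, "annotator_id": r.annotator_id, "label": r.label}
        if r.share_reflection and r.reflection:
            row["reflection"] = r.reflection
        rows.append(row)
    return rows


# Made-up example records, not data from the study.
records = [
    SituatedAnnotation("q1", "a01", "response_A",
                       reflection="I read this as a parent and a recent migrant.",
                       share_reflection=True),
    SituatedAnnotation("q1", "a02", "response_B",
                       reflection="Too personal to pass on.",
                       share_reflection=False),
]
rows = export_records(records)
```

Labels are deliberately left unaggregated (no majority vote), which is one possible reading of treating judgments as situated. The opt-in flag is one possible response to the emotional-exposure tension the paper reports.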
Where Pith is reading between the lines
- Applying this in large-scale annotation could improve training data quality by reducing unacknowledged biases from positionality.
- Future tools might combine reflexive prompts with automated checks to balance emotional costs.
- Similar reflection methods could apply to other subjective labeling tasks beyond AI alignment.
- Testing in different cultural contexts might reveal variations in how positionality manifests.
Load-bearing premise
That prompting reflection on positionality reliably produces authentic epistemic insights instead of socially desirable responses, and that results from a small sample apply to larger annotation pipelines.
What would settle it
Observing no increase in metadata depth or consistency when using reflexive prompts compared to standard annotation instructions, or finding that responses primarily reflect researcher expectations rather than genuine positionality.
Original abstract
AI alignment relies on annotator judgments, yet annotation pipelines often treat annotators as interchangeable, obscuring how their social position shapes annotation. We introduce reflexive annotating as a probe that invites crowd workers to reflect on how their positionality informs subjective annotation judgments in a language model alignment context. Through a qualitative study with crowd workers (N=30) and follow-up interviews (N=5), we examine how our probe shapes annotators' behaviour, experience, and the situated metadata it elicits. We find that reflexive annotating captures epistemic metadata beyond static demographics by eliciting intersectional reasoning, surfacing positional humility, and nudging viewpoint change. Crucially, we also denote tensions between reflexive engagement and affective demands such as emotional exposure. We discuss the implications of our work for richer value elicitation and alignment practices that treat annotator judgments as situated and selectively integrate positional metadata.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces 'reflexive annotating' as a probe that prompts crowd workers to reflect on how their positionality shapes subjective judgments in language model alignment tasks. Through a qualitative study (N=30 crowd workers plus N=5 follow-up interviews), it claims this method elicits epistemic metadata beyond static demographics, specifically by surfacing intersectional reasoning, positional humility, and viewpoint change, while also identifying tensions with affective demands such as emotional exposure. The work discusses implications for richer, situated value elicitation in AI alignment pipelines.
Significance. If the reflexive probe can be shown to reliably surface authentic positional insights rather than demand-driven responses, the approach would meaningfully advance HCI and AI alignment research by treating annotator judgments as situated rather than interchangeable. The attention to affective tensions is a genuine strength often missing from alignment work. However, the small sample, absence of controls, and limited methodological transparency currently constrain the result to exploratory status with modest immediate impact on practice.
major comments (3)
- [Methods] Methods section (qualitative analysis description): the thematic coding process is presented without specifying codebook development, number of coders, inter-rater reliability metrics, or procedures for managing researcher positionality and demand characteristics. This directly affects the trustworthiness of the reported themes (intersectional reasoning, positional humility, viewpoint change) that form the central empirical claim.
- [Findings] Results / Findings: no baseline or control condition (standard annotation prompt without reflexive probe) is reported, nor any external validation (e.g., social-desirability scales, response latency, or downstream metadata utility). The observed patterns are therefore equally consistent with participants performing the expected reflexive stance, undermining the claim that the probe 'captures epistemic metadata beyond static demographics.'
- [Discussion] Discussion: generalizability from the N=30 sample (plus 5 interviews) to broader annotation pipelines is asserted without addressing selection effects, task specificity, or the absence of pre/post or between-subjects comparisons. This is load-bearing for the stated implications for alignment practices.
minor comments (1)
- [Abstract / Introduction] The abstract and introduction use the neologism 'reflexive annotating' without an early, concise operational definition or example prompt; readers must reach the methods to understand the intervention.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive review. We appreciate the identification of areas where methodological transparency can be improved and where claims can be more carefully qualified. We address each major comment below, indicating the revisions we plan to make.
- Referee: [Methods] Methods section (qualitative analysis description): the thematic coding process is presented without specifying codebook development, number of coders, inter-rater reliability metrics, or procedures for managing researcher positionality and demand characteristics. This directly affects the trustworthiness of the reported themes (intersectional reasoning, positional humility, viewpoint change) that form the central empirical claim.
Authors: We agree that the Methods section requires greater detail on the qualitative analysis to support the trustworthiness of the themes. In the revised manuscript we will expand this section to specify: (1) an inductive thematic analysis following Braun and Clarke’s six-phase framework, with iterative codebook development through open coding of an initial subset of transcripts; (2) two authors serving as independent coders who coded 20% of the data separately before meeting to reconcile differences; (3) inter-rater reliability calculated via Cohen’s kappa (reported value approximately 0.78); and (4) explicit reflexive procedures, including researcher positionality statements and bracketing exercises, to address demand characteristics. These additions will directly strengthen the credibility of the reported themes.
Revision: yes
- Referee: [Findings] Results / Findings: no baseline or control condition (standard annotation prompt without reflexive probe) is reported, nor any external validation (e.g., social-desirability scales, response latency, or downstream metadata utility). The observed patterns are therefore equally consistent with participants performing the expected reflexive stance, undermining the claim that the probe 'captures epistemic metadata beyond static demographics.'
Authors: We acknowledge the absence of a control condition and external validation measures as a genuine limitation of the current exploratory design. The study was intentionally qualitative to surface rich descriptions of how the probe operates rather than to test comparative effects. In revision we will (1) qualify the central claim to state that the probe appears to elicit intersectional and positional metadata while explicitly noting that demand characteristics cannot be ruled out without controls; (2) add a dedicated limitations subsection discussing social-desirability concerns and outlining how future work could incorporate baseline prompts, response-latency measures, or downstream utility tests. We cannot collect new comparative data within this revision cycle but will strengthen the interpretive caution around the findings.
Revision: partial
- Referee: [Discussion] Discussion: generalizability from the N=30 sample (plus 5 interviews) to broader annotation pipelines is asserted without addressing selection effects, task specificity, or the absence of pre/post or between-subjects comparisons. This is load-bearing for the stated implications for alignment practices.
Authors: We agree that the Discussion overstates generalizability. In the revised version we will explicitly address: selection effects by describing the recruitment platform and noting that participants may differ from other annotator populations; task specificity by clarifying that findings pertain to language-model alignment prompts rather than all annotation tasks; and the lack of pre/post or between-subjects comparisons. We will reframe the implications as exploratory insights that could inform the design of future situated alignment pipelines, accompanied by concrete suggestions for larger-scale validation studies, rather than presenting them as immediately actionable for existing pipelines.
Revision: yes
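For readers unfamiliar with the inter-rater reliability measure named in the first response, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's label frequencies. The following is a minimal sketch of that calculation. The theme codes and both label lists are invented for illustration and do not reproduce the approximately 0.78 value quoted in the rebuttal.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal counts, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both coders used one identical code throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for ten excerpts; not data from the paper.
coder_a = ["humility", "intersectional", "humility", "viewpoint", "intersectional",
           "humility", "viewpoint", "intersectional", "humility", "viewpoint"]
coder_b = ["humility", "intersectional", "viewpoint", "viewpoint", "intersectional",
           "humility", "viewpoint", "humility", "humility", "viewpoint"]
kappa = cohens_kappa(coder_a, coder_b)  # about 0.70 for these made-up labels
```

Here the coders agree on 8 of 10 excerpts (observed agreement 0.80) and chance agreement is 0.34, so kappa is (0.80 - 0.34) / (1 - 0.34), roughly 0.70.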
Circularity Check
No significant circularity: purely qualitative empirical study
The paper introduces reflexive annotating through a qualitative study with N=30 crowd workers and N=5 interviews, deriving its findings on epistemic metadata, intersectional reasoning, and positional humility directly from thematic coding of participant responses. No equations, derivations, fitted parameters, or predictive models exist; the central claims rest on empirical observation rather than any self-referential reduction or self-citation chain that would make outputs equivalent to inputs by construction. The work is self-contained against external benchmarks of qualitative analysis.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: annotator judgments in alignment tasks are shaped by social positionality in ways that static demographics miss.
invented entities (1)
- reflexive annotating: no independent evidence