Co-Constructing Alignment: A Participatory Approach to Situate AI Values
Pith reviewed 2026-05-16 12:23 UTC · model grok-4.3
The pith
Alignment between users and AI is co-constructed through their ongoing interactions rather than preset in the model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Alignment is an interactional practice co-constructed during human-AI interaction. In a workshop that paired misalignment diaries with generative design activities, researchers using large language models as research assistants reported that misalignments appear as unexpected responses and task or social breakdowns. Participants described contributing to alignment through roles that include adjusting model behavior, interpreting outputs, and using deliberate non-engagement as a strategy.
What carries the argument
The participatory workshop combining misalignment diaries with generative design activities, which makes visible how users experience misalignments in context and how they envision acting on them.
If this is right
- AI systems should provide interfaces that let users adjust or reinterpret outputs during use.
- Designs should recognize deliberate non-engagement as one legitimate way users maintain alignment.
- Alignment support must be ongoing and tied to specific tasks and social contexts rather than delivered once.
- User roles in alignment include active interpretation and response rather than passive reception of model values.
Where Pith is reading between the lines
- Interfaces could add lightweight logging features so users can note and revisit misalignments without extra effort.
- The same diary-plus-design method could be tested with other user groups such as educators or clinicians to see if patterns repeat.
- Over time, systems that treat alignment as co-construction might reduce the need for repeated retraining by letting users steer behavior in context.
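As one illustration of the "lightweight logging" idea above, a minimal in-app misalignment diary might look like the following sketch. The entry fields and category labels (unexpected response, task breakdown, social breakdown; adjusted, reinterpreted, disengaged) are assumptions drawn from the findings, not an interface the paper specifies:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class MisalignmentEntry:
    """One user-noted misalignment, loosely mirroring the paper's diary categories."""
    task: str       # what the user was doing, e.g. "literature search"
    kind: str       # hypothetical labels: "unexpected response", "task breakdown", "social breakdown"
    response: str   # user's role: "adjusted", "reinterpreted", "disengaged"
    note: str = ""  # optional free-text reflection
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class MisalignmentDiary:
    """In-memory diary; a real interface would persist entries and
    resurface them in the task context where they were logged."""

    def __init__(self) -> None:
        self.entries: List[MisalignmentEntry] = []

    def log(self, entry: MisalignmentEntry) -> None:
        self.entries.append(entry)

    def by_kind(self, kind: str) -> List[MisalignmentEntry]:
        """Filter entries so users can revisit one misalignment type at a time."""
        return [e for e in self.entries if e.kind == kind]
```

A diary like this keeps the logging cost close to zero (a task label, a category, a one-word response), which is the point of the speculation: capture misalignments as they occur rather than asking users to reconstruct them later.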
Load-bearing premise
That experiences shared in one workshop with researchers using large language models reflect how alignment works for wider groups of users and in actual daily practice.
What would settle it
A longitudinal study that logs real LLM interactions, records observed misalignments, and compares them against users' later self-reports would show whether the workshop descriptions match day-to-day dynamics.
Original abstract
As AI systems become embedded in everyday practice, value misalignment has emerged as a pressing concern. Yet, dominant alignment approaches remain model centric, treating users as passive recipients of prespecified values rather than as epistemic agents who encounter and respond to misalignment during interactions. Drawing on situated perspectives, we frame alignment as an interactional practice co-constructed during human-AI interaction. We investigate how users understand and wish to contribute to this process through a participatory workshop that combines misalignment diaries with generative design activities. We surface how misalignments materialise in practice and how users envision acting on them, grounded in the context of researchers using Large Language Models as research assistants. Our findings show that misalignments are experienced less as abstract ethical violations than as unexpected responses, and task or social breakdowns. Participants articulated roles ranging from adjusting and interpreting model behaviour to deliberate non-engagement as an alignment strategy. We conclude with implications for designing systems that support alignment as an ongoing, situated, and shared practice.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that dominant model-centric approaches to AI value alignment overlook users as epistemic agents who actively encounter and respond to misalignment in practice. Drawing on situated perspectives, it reframes alignment as an interactional, co-constructed practice. This is investigated via a participatory workshop combining misalignment diaries and generative design activities with researchers who use LLMs as research assistants. The study surfaces misalignments as unexpected responses or task/social breakdowns rather than abstract ethical violations, identifies user roles such as adjusting, interpreting, and deliberate non-engagement, and derives design implications for systems that support alignment as an ongoing, situated, and shared process.
Significance. If the empirical patterns hold, the work offers a useful counterpoint to technical alignment research by grounding value alignment in everyday HCI practices. It provides concrete examples of how users already manage misalignment through interactional strategies, which could inform the design of more responsive AI interfaces and participatory alignment methods. The participatory approach itself demonstrates a method for eliciting user perspectives on alignment that may be adaptable to other domains.
major comments (1)
- The central framing of alignment as a general interactional practice rests on data from a single participatory workshop with a narrow, self-selected sample of LLM researchers. The manuscript must clarify whether the surfaced misalignment types and roles are presented as context-specific to academic LLM use or as evidence for broader dynamics; without this scoping or additional validation, the leap to design implications for general systems risks overgeneralization.
minor comments (1)
- The abstract omits basic methodological details (participant count, recruitment, analysis procedure) that are standard for qualitative HCI papers and would help readers assess the findings' grounding.
Simulated Author's Rebuttal
We thank the referee for this constructive comment on scoping and generalizability. We agree that the single-workshop design with a specific participant group requires clearer boundaries in the manuscript and have revised accordingly to avoid overgeneralization while retaining the value of the exploratory insights.
Point-by-point responses
Referee: The central framing of alignment as a general interactional practice rests on data from a single participatory workshop with a narrow, self-selected sample of LLM researchers. The manuscript must clarify whether the surfaced misalignment types and roles are presented as context-specific to academic LLM use or as evidence for broader dynamics; without this scoping or additional validation, the leap to design implications for general systems risks overgeneralization.
Authors: We agree that the empirical basis is a single participatory workshop with a self-selected group of LLM researchers and that this constrains claims to broader populations. The original manuscript already situates the work in the specific context of academic researchers using LLMs as research assistants (see abstract and Section 3), but we acknowledge that the transition to design implications could be read as implying wider applicability. In the revision we have added explicit scoping language in the introduction, findings, and conclusion: the observed misalignment types (unexpected responses, task/social breakdowns) and user roles (adjusting, interpreting, deliberate non-engagement) are presented as patterns identified within this academic LLM-assistant setting rather than as universal. Design implications are now framed as context-informed suggestions that illustrate how systems might support ongoing, situated alignment practices, with an explicit caveat that further validation across other user groups and domains is needed. We have also strengthened the limitations section to discuss sample characteristics and the exploratory nature of the participatory method. These changes directly address the risk of overgeneralization without requiring new data collection.
Revision: yes
Circularity Check
No circularity in qualitative framing or derivation
Full rationale
The paper is a qualitative participatory study that frames alignment as co-constructed based on workshop findings with LLM researchers. No equations, fitted parameters, self-definitional loops, or load-bearing self-citations appear in the derivation. The central claim is inductively supported by the described misalignment diaries and design activities rather than reducing to its inputs by construction. External situated-perspective literature is invoked without smuggling ansatzes or uniqueness theorems from the authors' prior work. This is a standard honest non-finding for non-mathematical empirical papers.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: situated perspectives frame alignment as an interactional practice.