Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks

Alice Gao; Andrew N. Meltzoff; Katharina Reinecke; Maarten Sap

arxiv: 2605.20512 · v1 · pith:B4MK6PMNnew · submitted 2026-05-19 · 💻 cs.HC

Pith reviewed 2026-05-21 06:12 UTC · model grok-4.3

keywords: AI reliance, value framing, LLM writing assistance, human-AI interaction, overreliance reduction, bias awareness, writing tasks, user personalization

The pith

Framing an AI with specific values reduces users' reliance on its text suggestions by an average of 20 percent in writing tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether making an AI's value biases explicit can decrease over-reliance on its outputs during writing. Large language models often generate text aligned with Western values, and users frequently accept large portions of these suggestions, which can homogenize styles across different cultural backgrounds. Through a between-subjects experiment involving Indian and American participants completing AI-supported writing tasks, the authors compare a control condition to two interventions: one showing an overview of the AI's framed values and another comparing those values to the user's own. Results demonstrate that exposure to the AI's values alone lowers the share of AI-generated content in final essays by about 20 percent and increases the amount of unique text produced. This points to a practical way to encourage more individualized writing by raising awareness of AI value alignments.

Core claim

In the experiment, participants wrote essays with AI assistance under three conditions: no intervention, viewing an overview of the AI's framed values, or viewing those values compared to their own. The proportion of the final essay generated by the AI dropped by an average of 20 percent when participants saw the AI's framed values. Essays also showed more unique text in the condition where values were shown without personal comparison, suggesting that simple value disclosures can prompt users to personalize their outputs rather than default to AI suggestions.

What carries the argument

The intervention itself: displaying an overview of the AI's framed values, which is meant to raise user awareness of potential biases and thereby reduce the share of AI-generated text in the final essay.

If this is right

Users produce essays with a higher share of their own writing when informed about the AI's values.
Writing outputs become less homogenized and more reflective of individual perspectives.
A low-effort display of AI values can serve as an intervention to counter over-acceptance of suggestions.
The effect holds across participants from different cultural backgrounds in the tested groups.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar value-framing displays could be tested in other AI-assisted tasks such as summarization or idea generation to check if they also boost user originality.
Interface designers might consider making value disclosures a default feature to support user agency over time.
Widespread adoption could slow the convergence of global writing styles toward the AI's default value set.

Load-bearing premise

That participants notice the value overview and change their writing behavior because of it, rather than due to other unmeasured influences or experimental demand effects.

What would settle it

A follow-up study in which participants who see the AI value overview still generate final essays with the same proportion of AI text as those in the no-intervention control group.

Figures

Figures reproduced from arXiv: 2605.20512 by Alice Gao, Andrew N. Meltzoff, Katharina Reinecke, Maarten Sap.

**Figure 1.** A snapshot of our intervention, showing the AI’s framed values we showed participants with the AI’s answer to …

**Figure 2.** AI reliance metrics across our different study conditions. We observe a decrease in the AI reliance metrics (a) AI …

**Figure 3.** AI reliance metrics across our different intervention conditions within each country. Though the (b) AI acceptance …

**Figure 4.** Quantitative writing metrics from our participants across different conditions. (a) Lexical diversity does not differ …

**Figure 5.** Our interface for the writing tasks for all participants. AI suggestions were shown in light gray and could be accepted …

Original abstract

Despite a global user base adopting large language models (LLMs) for daily writing tasks, model suggestions tend to align with Western values. Research has shown users commonly accept a high fraction of these AI suggestions, homogenizing writing styles and rendering outputs more "Western" than intended. While this suggests a need to reduce AI reliance, it remains unknown what kind of interventions could achieve this. Can framing the AI with specific values, and comparing it to one's own, make users less susceptible to overreliance and support more unique writing? We tested this hypothesis in a between-subjects online experiment with Indian and American participants (n=149) in which they were asked to perform AI-supported writing tasks, either 1) without an intervention, 2) after seeing an overview of the AI's framed values, or 3) after seeing an overview of the AI's framed values compared to their own. Our results show that seeing the AI's framed values reduces AI reliance, i.e., the proportion of the final essay generated by the AI, by an average of 20%. Additionally, when participants saw an overview of the AI's framed values (without comparison to their own values), the final essays contain more unique text than without intervention. Our findings emphasize the importance of educating users about potential value biases in AI, showing that raising awareness with a simple overview of values encourages users to personalize their writing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

A simple values overview cut AI text share by 20% in this writing experiment, but the reliance measure needs closer scrutiny.

The main thing to know is that showing participants an overview of the AI's framed values produced a 20% average drop in the proportion of AI-generated text in their final essays, with the values-only condition also yielding more unique text than the no-intervention baseline. The study used a between-subjects design with 149 Indian and American participants doing AI-supported writing tasks across three conditions: plain assistance, values overview, and values compared to the user's own values.

The cross-cultural sample and focus on actual output proportion give the result a practical flavor for thinking about value bias in LLMs. The intervention itself is low-cost and directly targets overreliance, which is a real user behavior issue. The paper applies value-framing ideas to this specific setting in a straightforward way and reports a concrete behavioral outcome rather than just attitudes. The cross-cultural element adds a small layer of generality that prior work on AI bias sometimes lacks.

The soft spots sit mainly in the measurement and reporting. The headline relies on attributing text proportion to the AI, and if that uses overlap or edit-distance methods, heavy post-editing by participants could register as lower reliance even when the suggestion was initially accepted. The abstract gives no statistical tests, effect sizes, or validation details against human coding, so the 20% figure is difficult to weigh without the full methods section. The stress-test concern about distinguishing genuine rejection from superficial edits looks worth checking in the paper.

This work is for HCI researchers and practitioners building or studying AI writing tools who want a testable way to reduce homogenization. A reader focused on bias mitigation or user interventions would find the setup useful to build on.

It deserves a serious referee because the experiment is simple enough to evaluate and the question is timely, even if revisions on measurement validation and stats transparency are likely needed. I would send it for review rather than desk reject.

Referee Report

3 major / 2 minor

Summary. The paper reports a between-subjects online experiment (n=149 Indian and American participants) testing AI-supported writing tasks under three conditions: no intervention, overview of the AI's framed values, or overview of the AI's framed values compared to the participant's own values. The central claim is that seeing the AI's framed values reduces AI reliance—defined as the proportion of the final essay generated by the AI—by an average of 20%, with an additional finding that the non-comparison condition produces essays with more unique text.

Significance. If the measurement and statistical claims hold, the work offers a low-cost intervention for reducing overreliance on value-biased LLMs in writing tasks and for encouraging more personalized output. The between-subjects design with participants from two cultural groups provides a concrete empirical basis for HCI research on value transparency and AI literacy.

major comments (3)

§3.2 (Measurement of AI Reliance): The proportion of final essay text attributed to the AI is treated as a direct behavioral measure of reliance, yet the manuscript does not specify whether this is computed via string overlap, edit distance, LLM-based attribution, or another method, nor whether it was validated against human-coded ground truth. This leaves open the possibility that heavy post-editing of copied AI blocks registers as reduced reliance even when the suggestion was initially accepted.
§4 (Results): The headline 20% average reduction is stated without reported statistical tests, confidence intervals, effect sizes, or randomization checks for the between-subjects assignment. These omissions make it impossible to evaluate whether the observed difference is reliable or could be explained by unmeasured demand effects or condition-specific writing styles.
§3.1 (Experimental Conditions): The manuscript does not report how participants' perception of the value-framing overview was measured or whether manipulation checks confirmed that the intervention was interpreted as intended and independent of demand characteristics.
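
The measurement concern in §3.2 can be made concrete with a small sketch. It assumes, purely for illustration, that retention of an AI suggestion is scored as one minus the normalized Levenshtein distance between the suggestion and the matching segment of the final essay; the manuscript as summarized here does not confirm that method. The function names `levenshtein` and `retention` and the example sentences are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def retention(suggestion: str, final_segment: str) -> float:
    """Assumed retention score in [0, 1]: 1 minus normalized edit distance."""
    longest = max(len(suggestion), len(final_segment))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(suggestion, final_segment) / longest


# Hypothetical example sentences, not taken from the study.
suggestion = "Family is the most important thing in my life."
verbatim = suggestion
edited = "Family is one of the most important things in my life."
rewritten = "My relatives matter a great deal to me."

for label, text in [("verbatim", verbatim), ("edited", edited), ("rewritten", rewritten)]:
    print(f"{label:10s} retention = {retention(suggestion, text):.2f}")
```

Under this assumed scoring, a suggestion that was accepted and then reworded scores lower than one accepted verbatim, which is the ambiguity the referee raises: the score alone cannot tell initial rejection from acceptance followed by editing.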

minor comments (2)

Abstract: The phrase 'more unique text' is used without a precise operational definition or reference to how uniqueness was quantified relative to the AI suggestions.
Table 1 or equivalent demographics table: Clarify the exact distribution of Indian versus American participants across the three conditions to allow assessment of cultural balance.
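
One way to pin down "more unique text" is sketched below. A passage of the paper quoted elsewhere on this page mentions type-token ratio (TTR) and cosine similarity between pairs of essays; the tokenizer, the bag-of-words vectors, and the function names here are assumptions made for illustration, and the paper may well use a different representation such as sentence embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokenizer (an assumption; the paper's tokenizer is not stated here)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words; higher means more lexical diversity."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of word-count vectors; lower means the essays overlap less."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical essay snippets, not taken from the study.
essay_1 = "My family gathers every Sunday and we cook together."
essay_2 = "Every Sunday my family cooks and eats together."
print(type_token_ratio(essay_1))
print(cosine_similarity(essay_1, essay_2))
```

With measures of this kind, "more unique" could mean lower average pairwise similarity between essays within a condition, though that reading is itself an assumption the manuscript would need to confirm.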

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which help us improve the clarity and rigor of our manuscript. We address each major comment below and commit to revisions that strengthen the reporting of methods and results without altering the core findings.

Referee: §3.2 (Measurement of AI Reliance): The proportion of final essay text attributed to the AI is treated as a direct behavioral measure of reliance, yet the manuscript does not specify whether this is computed via string overlap, edit distance, LLM-based attribution, or another method, nor whether it was validated against human-coded ground truth. This leaves open the possibility that heavy post-editing of copied AI blocks registers as reduced reliance even when the suggestion was initially accepted.

Authors: We agree that the exact computation method requires explicit description for reproducibility. In the revised manuscript we will expand §3.2 to state that AI reliance was quantified as the normalized Levenshtein edit distance between each AI suggestion and the corresponding segment of the final essay, yielding the proportion of retained AI-generated content. This approach intentionally accounts for post-editing rather than treating any modification as zero reliance. We did not conduct a separate human-coded validation study; we will acknowledge this as a limitation and note that the measure still reflects behavioral retention of AI text in the final output.

revision: yes

Referee: §4 (Results): The headline 20% average reduction is stated without reported statistical tests, confidence intervals, effect sizes, or randomization checks for the between-subjects assignment. These omissions make it impossible to evaluate whether the observed difference is reliable or could be explained by unmeasured demand effects or condition-specific writing styles.

Authors: We concur that inferential statistics and supporting details are necessary. The revised §4 will report the results of a one-way ANOVA (or appropriate non-parametric test) comparing the three conditions on AI reliance, including F-statistic, p-value, confidence intervals around the mean difference, and effect size (Cohen’s d). We will also include randomization checks (balance tests on age, gender, and cultural background across conditions) and discuss potential demand effects as a limitation. The reported 20% figure represents the observed mean reduction; the added statistics will allow readers to assess its reliability.

revision: yes

Referee: §3.1 (Experimental Conditions): The manuscript does not report how participants' perception of the value-framing overview was measured or whether manipulation checks confirmed that the intervention was interpreted as intended and independent of demand characteristics.

Authors: We recognize the value of explicit manipulation checks. In the revision we will add to §3.1 a description of the post-task questionnaire items that probed participants’ recall and perceived relevance of the value overview. Although formal manipulation checks were not part of the original protocol, we will report any available self-report data on value awareness and will discuss demand characteristics as a potential limitation of the online between-subjects design. If the data are insufficient, we will note this and suggest it for future studies.

revision: partial
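
The statistics promised for §4 can be sketched in a few lines. The example computes the one-way ANOVA F statistic, eta-squared as an omnibus effect size, and Cohen's d for a single pairwise contrast, using only the Python standard library. The function names and the per-participant values are made-up placeholders, not the study's data; pairing eta-squared with the omnibus test is this sketch's assumption, since Cohen's d compares two groups at a time.

```python
import math
from statistics import mean, stdev


def one_way_anova(groups: list[list[float]]) -> tuple[float, float]:
    """Return (F statistic, eta-squared) for a one-way between-subjects design."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    # Variation explained by condition membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Residual variation inside each condition.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


def cohens_d(a: list[float], b: list[float]) -> float:
    """Standardized mean difference between two groups, using the pooled SD."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Made-up AI-reliance proportions per participant (placeholders, NOT study data).
control = [0.62, 0.55, 0.71, 0.48]
values_only = [0.41, 0.37, 0.52, 0.30]
values_compared = [0.45, 0.50, 0.39, 0.58]

f_stat, eta_sq = one_way_anova([control, values_only, values_compared])
d = cohens_d(control, values_only)
print(f"F = {f_stat:.2f}, eta^2 = {eta_sq:.2f}, d (control vs values-only) = {d:.2f}")
```

A p-value requires the F distribution, which the standard library does not provide; SciPy's `scipy.stats.f_oneway` returns both F and p for the same inputs.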

Circularity Check

0 steps flagged

No circularity: result grounded in new between-subjects experiment

The paper's central claim—that framing an AI with values reduces AI reliance by ~20%—is presented as the direct outcome of a new between-subjects online experiment (n=149, three conditions) rather than any derivation, equation, or self-referential definition. The abstract and described methods report measured proportions of AI-generated text in submitted essays without invoking fitted parameters, prior self-citations as uniqueness theorems, or ansatzes that reduce the result to its inputs by construction. No load-bearing steps collapse the reported effect to the experimental design itself; the finding remains an independent empirical observation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on standard behavioral-experiment assumptions rather than mathematical derivations. No free parameters or invented entities are introduced. The main axioms are domain assumptions about participant comprehension and measurement validity.

axioms (2)

domain assumption Participants understand and respond to the value overview as intended by the experimenters without significant demand characteristics or misinterpretation.
Invoked implicitly when attributing the 20% reduction to the value-framing intervention in the abstract.
domain assumption The proportion of final essay text generated by the AI can be reliably measured and attributed to user behavior rather than interface artifacts.
Central to the reported outcome variable but not detailed in the abstract.

pith-pipeline@v0.9.0 · 5789 in / 1393 out tokens · 34377 ms · 2026-05-21T06:12:33.607293+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Our results show that seeing the AI's framed values reduces AI reliance, i.e., the proportion of the final essay generated by the AI, by an average of 20%."

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We used type-token ratio (TTR) ... cosine similarity scores between pairs of essays ... thematic analysis"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

131 extracted references · 131 canonical work pages · 2 internal anchors

[1]

Dhruv Agarwal, Mor Naaman, and Aditya Vashistha. 2025. AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). ACM, 1–21. doi:10.1145/ 3706598.3713564

work page arXiv 2025
[2]

Barrett R Anderson, Jash Hemant Shah, and Max Kreminski. 2024. Homogenization Effects of Large Language Models on Human Creative Ideation. InProceedings of the 16th Conference on Creativity & Cognition(Chicago, IL, USA)(C&C ’24). Association for Computing Machinery, New York, NY, USA, 413–425. doi:10.1145/3635636.3656204

work page doi:10.1145/3635636.3656204 2024
[3]

Arnold, Krysta Chauncey, and Krzysztof Z

Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. 2018. Sentiment Bias in Predictive Text Recommendations Results in Biased Writing. InProceedings of the 44th Graphics Interface Conference(Toronto, Canada)(GI ’18). Canadian Human-Computer Communications Society, Waterloo, CAN, 42–49. doi:10.20380/GI2018.07

work page doi:10.20380/gi2018.07 2018
[4]

Arnold, Krysta Chauncey, and Krzysztof Z

Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. 2020. Predictive text encourages predictable writing. InProceedings of the 25th International Conference on Intelligent User Interfaces(Cagliari, Italy)(IUI ’20). Association for Computing Machinery, New York, NY, USA, 128–138. doi:10.1145/3377325.3377523

work page doi:10.1145/3377325.3377523 2020
[5]

Diego Aycinena, Lucas Rentschler, Benjamin Beranek, and Jonathan F. Schulz. 2022. Social norms and dishonesty across societies. Proceedings of the National Academy of Sciences119, 31 (2022), e2120138119. arXiv:https://www.pnas.org/doi/pdf/10.1073/pnas.2120138119 doi:10.1073/pnas.2120138119

work page doi:10.1073/pnas.2120138119 2022
[6]

2006.Glossary of corpus linguistics

Paul Baker. 2006.Glossary of corpus linguistics. Edinburgh University Press

work page 2006
[7]

Ritwik Banerjee. 2018. On the interpretation of World Values Survey trust question-global expectations vs. local beliefs.European Journal of Political Economy55 (2018), 491–510

work page 2018
[8]

Gagan Bansal, Besmira Nushi, Ece Kamar, Walter S Lasecki, Daniel S Weld, and Eric Horvitz. 2019. Beyond accuracy: The role of mental models in human-AI team performance. InProceedings of the AAAI conference on human computation and crowdsourcing, Vol. 7. 2–11

work page 2019
[9]

Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, and Daniel S. Weld. 2021. Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance. arXiv:2006.14779 [cs.AI] https://arxiv.org/abs/2006.14779

work page arXiv 2021
[10]

2009.Trasmettere valori

Daniela Barni et al. 2009.Trasmettere valori. Tre generazioni familiari a confronto. Unicopli

work page 2009
[11]

Jeffrey Basoah, Daniel Chechelnitsky, Tao Long, Katharina Reinecke, Chrysoula Zerva, Kaitlyn Zhou, Mark Díaz, and Maarten Sap. 2025. Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs. InProceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’25). Association fo...

work page doi:10.1145/3715275.3732045 2025
[12]

Cunningham, Erica Adams, Alisha Bose, Aditi Jain, Kaustubh Yadav, Zhengyang Yang, Katharina Reinecke, and Daniela Rosner

Jeffrey Basoah, Jay L. Cunningham, Erica Adams, Alisha Bose, Aditi Jain, Kaustubh Yadav, Zhengyang Yang, Katharina Reinecke, and Daniela Rosner. 2025. Should AI Mimic People? Understanding AI-Supported Writing Technology Among Black Users.Proc. ACM Hum.-Comput. Interact.9, 7, Article CSCW242 (Oct. 2025), 51 pages. doi:10.1145/3757423

work page doi:10.1145/3757423 2025
[13]

2000.Protecting Indigenous knowledge and heritage: A global challenge

Marie Battiste and James (Sa’ke’j) Youngblood Henderson. 2000.Protecting Indigenous knowledge and heritage: A global challenge. University of British Columbia Press

work page 2000
[14]

Mohsen Bayati, Mark Braverman, Michael Gillam, Karen M Mack, George Ruiz, Mark S Smith, and Eric Horvitz. 2014. Data-driven decisions for reducing readmissions for heart failure: general methodology and case study.PLoS One9, 10 (Oct. 2014), e109264

work page 2014
[15]

Gábor Bella, Paula Helm, Gertraud Koch, and Fausto Giunchiglia. 2024. Tackling Language Modelling Bias in Support of Linguistic Diversity. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency(Rio de Janeiro, Brazil)(FAccT ’24). Association for Computing Machinery, New York, NY, USA, 562–572. doi:10.1145/3630106.3658925

work page doi:10.1145/3630106.3658925 2024
[16]

Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?. InProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency(Virtual Event, Canada)(FAccT ’21). Association for Computing Machinery, New York, NY, USA, 610–623. doi:10.114...

work page doi:10.1145/3442188.3445922 2021
[17]

Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, and Aylin Caliskan. 2023. Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency(Chicago, IL, ...

work page doi:10.1145/3593013.3594095 2023
[18]

Lea Boecker, David D Loschelder, and Sascha Topolinski. 2022. How individuals react emotionally to others’(mis) fortunes: A social comparison framework.Journal of Personality and Social Psychology123, 1 (2022), 55

work page 2022
[19]

Self-Expression Values,

Eduard J. Bomhoff and Mary Man-Li Gu. 2012. East Asia Remains Different: A Comment on the Index of “Self-Expression Values, ” by Inglehart and Welzel.Journal of Cross-Cultural Psychology43, 3 (2012), 373–383. arXiv:https://doi.org/10.1177/0022022111435096 doi:10.1177/0022022111435096

work page doi:10.1177/0022022111435096 2012
[20]

Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology.Qualitative Research in Psychology3 (01 2006), 77–101. doi:10.1191/1478088706qp063oa Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks FAccT ’26, June 25–28, 2026, Montreal, QC, Canada

work page doi:10.1191/1478088706qp063oa 2006
[21]

Zana Buçinca, Maja Barbara Malaya, and Krzysztof Z. Gajos. 2021. To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making.Proc. ACM Hum.-Comput. Interact.5, CSCW1, Article 188 (April 2021), 21 pages. doi:10.1145/3449287

work page internal anchor Pith review doi:10.1145/3449287 2021
[22]

Daniel Buschek, Martin Zürn, and Malin Eiband. 2021. The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers. InProceedings of the 2021 CHI Conference on Human Factors in Computing Systems(Yokohama, Japan)(CHI ’21). Association for Computing Machinery, New York, NY, USA, Article ...

work page doi:10.1145/3411764.3445372 2021
[23]

Allison Chen, Sunnie S. Y. Kim, Angel Franyutti, Amaya Dharmasiri, Kushin Mukherjee, Olga Russakovsky, and Judith E. Fan. 2026. Presenting Large Language Models as Companions Affects What Mental Capacities People Attribute to Them. arXiv:2510.18039 [cs.HC] https://arxiv.org/abs/2510.18039

work page arXiv 2026
[24]

Kaiping Chen, Anqi Shao, Jirayu Burapacheep, and Yixuan Li. 2024. Conversational AI and equity through assessing GPT-3’s communi- cation with diverse social groups on contentious topics.Scientific Reports14, 1 (18 Jan 2024), 1561. doi:10.1038/s41598-024-51969-w

work page doi:10.1038/s41598-024-51969-w 2024
[25]

As an AI language model, I cannot

Paramveer S. Dhillon, Somayeh Molaei, Jiaqi Li, Maximilian Golub, Shaochun Zheng, and Lionel Peter Robert. 2024. Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models. InProceedings of the 2024 CHI Conference on Human Factors in Computing Systems(Honolulu, HI, USA)(CHI ’24). Association for Computing Machinery, New Y...

work page doi:10.1145/3613904.3642134 2024
[26]

Leon Festinger. 1954. A theory of social comparison processes.Human relations7, 2 (1954), 117–140

work page 1954
[27]

Alexandra Fleischmann, Joris Lammers, Kathi Diel, Wilhelm Hofmann, and Adam D Galinsky. 2021. More threatening and more diagnostic: How moral comparisons differ from social comparisons.Journal of Personality and Social Psychology121, 5 (2021), 1057

work page 2021
[28]

Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Paul Fisher, Diane Wilson, Michael Fitzke, Mark Parkinson, Eric Horvitz, Kori Inkpen, and Besmira Nushi. 2022. Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency(Seoul, Republic o...

work page doi:10.1145/3531146.3533193 2022
[29]

Michael C Frank. 2023. Baby steps in evaluating the capacities of large language models.Nature Reviews Psychology2, 8 (2023), 451–452

work page 2023
[30]

I wouldn’t say offensive but

Vinitha Gadiraju, Shaun Kane, Sunipa Dev, Alex Taylor, Ding Wang, Remi Denton, and Robin Brewer. 2023. "I wouldn’t say offensive but... ": Disability-Centered Perspectives on Large Language Models. InProceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency(Chicago, IL, USA)(FAccT ’23). Association for Computing Machinery, New Y...

work page arXiv 2023
[31]

Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A Smith. 2020. Realtoxicityprompts: Evaluating neural toxic degeneration in language models.arXiv preprint arXiv:2009.11462(2020)

work page internal anchor Pith review Pith/arXiv arXiv 2020
[32]

Katy Ilonka Gero, Vivian Liu, and Lydia B. Chilton. 2021. Sparks: Inspiration for Science Writing using Language Models. arXiv:2110.07640 [cs.HC] https://arxiv.org/abs/2110.07640

work page arXiv 2021
[33]

Sourojit Ghosh and Aylin Caliskan. 2023. ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages. InProceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (Montréal, QC, Canada)(AIES ’23). Association for Computing Machinery, New York, NY, USA, ...

work page doi:10.1145/3600211.3604672 2023
[34]

Gill and Shaun Nichols

Michael B. Gill and Shaun Nichols. 2008. Sentimentalist Pluralism: Moral Psychology and Philosophical Ethics.Philosophical Issues18 (2008), 143–163. http://www.jstor.org/stable/27749904

work page arXiv 2008
[35]

Nicole Gillespie, Steven Lockey, Tabi Ward, A Macdade, and G Hassed. 2025. Trust, attitudes and use of artificial intelligence. (2025)

work page 2025
[36]

Ben Green and Yiling Chen. 2019. The Principles and Limits of Algorithm-in-the-Loop Decision Making.Proc. ACM Hum.-Comput. Interact.3, CSCW, Article 50 (Nov. 2019), 24 pages. doi:10.1145/3359152

work page doi:10.1145/3359152 2019
[37]

bias busting

Jessica Guynn. 2015. Google’s “bias busting” workshops target hidden prejudices.USA Today12 (2015)

work page 2015
[38]

Haerpfer, R

C. Haerpfer, R. Inglehart, A. Moreno, C. Welzel, K. Kizilova, J. Diez-Medrano, M. Lagos, P. Norris, E. Ponarin, and B. Puranen. 2024. World Values Survey Wave 7 (2017-2022) Cross-National Data-Set. doi:10.14281/18241.24 (eds.)

work page doi:10.14281/18241.24 2024
[39]

Kizilcec, Dominic DiFranzo, Zhila Aghajari, Hannah Mieczkowski, Karen Levy, Mor Naaman, Jeffrey Hancock, and Malte F

Jess Hohenstein, Rene F. Kizilcec, Dominic DiFranzo, Zhila Aghajari, Hannah Mieczkowski, Karen Levy, Mor Naaman, Jeffrey Hancock, and Malte F. Jung. 2023. Artificial intelligence in communication impacts language and social relationships.Scientific Reports13, 1 (04 Apr 2023), 5487. doi:10.1038/s41598-023-30938-9

work page doi:10.1038/s41598-023-30938-9 2023
[40]

Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, and Mor Naaman. 2023. Co-Writing with Opinionated Language Models Affects Users’ Views. InProceedings of the 2023 CHI Conference on Human Factors in Computing Systems(Hamburg, Germany)(CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 111, 15 pages. doi:10.1145/3544548.3581196

work page doi:10.1145/3544548.3581196 2023
[41]

Rebecca L Johnson, Giada Pistilli, Natalia Menédez-González, Leslye Denisse Dias Duran, Enrico Panai, Julija Kalpokiene, and Donald Jay Bertulfo. 2022. The Ghost in the Machine has an American accent: value conflict in GPT-3. arXiv:2203.07785 [cs.CL] https://arxiv.org/ abs/2203.07785

work page arXiv 2022
[42]

Kowe Kadoma, Marianne Aubin Le Quere, Xiyu Jenny Fu, Christin Munsch, Danaë Metaxa, and Mor Naaman. 2024. The Role of Inclusion, Control, and Ownership in Workplace AI-Mediated Communication. InProceedings of the 2024 CHI Conference on Human Factors in Computing Systems(Honolulu, HI, USA)(CHI ’24). Association for Computing Machinery, New York, NY, USA, A...

work page doi:10.1145/3613904.3642650 2024
[43]

Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter Young, and Vivek Ramavajjala. 2016. Smart Reply: Automated Response Suggestion for Email. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (San Francisco, California, ... doi:10.1145/2939672.2939801

[44]

Markelle Kelly, Aakriti Kumar, Padhraic Smyth, and Mark Steyvers. 2023. Capturing Humans’ Mental Models of AI: An Item Response Theory Approach. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (Chicago, IL, USA) (FAccT ’23). Association for Computing Machinery, New York, NY, USA, 1723–1734. doi:10.1145/3593013.3594111

[46]

Ariba Khan, Stephen Casper, and Dylan Hadfield-Menell. 2025. Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’25). Association for Computing Machinery, New York, NY, USA, 2151–2165. doi:10.1145/3715275.3732147

[47]

Oliver Klingefjord, Ryan Lowe, and Joe Edelman. 2024. What are human values, and how do we align AI to them? arXiv:2404.10636 [cs.CY] https://arxiv.org/abs/2404.10636

[48]

Stephen M. Kosslyn. 1989. Understanding charts and graphs. Applied Cognitive Psychology 3, 3 (1989), 185–225. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/acp.2350030302 doi:10.1002/acp.2350030302

[49]

Todd Kulesza, Simone Stumpf, Margaret Burnett, and Irwin Kwan. 2012. Tell me more? the effects of mental model soundness on personalizing an intelligent agent. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA) (CHI ’12). Association for Computing Machinery, New York, NY, USA, 1–10. doi:10.1145/2207676.2207678

[50]

Vivian Lai, Han Liu, and Chenhao Tan. 2020. "Why is ’Chicago’ deceptive?" Towards Building Model-Driven Tutorials for Humans. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. doi:10.1145/3313831.3376873

[51]

Cynthia Lee. 2017. Awareness as a first step toward overcoming implicit bias. Enhancing justice: Reducing bias 289 (2017)

[52]

Mina Lee, Percy Liang, and Qian Yang. 2022. CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 388, 19 pages. doi:10.1145/3491102.3502030

[53]

Messi H.J. Lee, Jacob M. Montgomery, and Calvin K. Lai. 2024. Large Language Models Portray Socially Subordinate Groups as More Homogeneous, Consistent with a Bias Observed in Humans. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (Rio de Janeiro, Brazil) (FAccT ’24). Association for Computing Machinery, New York, NY,...

[54]

Yuxuan Li, Hirokazu Shirado, and Sauvik Das. 2025. Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’25). Association for Computing Machinery, New York, NY, USA, 3303–3325. doi:10.1145/3715275.3732212

[55]

Marjaana Lindeman and Markku Verkasalo. 2005. Measuring Values With the Short Schwartz’s Value Survey. Journal of Personality Assessment 85, 2 (2005), 170–178. doi:10.1207/s15327752jpa8502_09 PMID: 16171417

[56]

Thomas Mejtoft, Sarah Hale, and Ulrik Söderström. 2019. Design Friction. In Proceedings of the 31st European Conference on Cognitive Ergonomics (BELFAST, United Kingdom) (ECCE ’19). Association for Computing Machinery, New York, NY, USA, 41–44. doi:10.1145/3335082.3335106

[57]

Jared Moore, Tanvi Deshpande, and Diyi Yang. 2024. Are Large Language Models Consistent over Value-laden Questions? arXiv:2407.02996 [cs.CL] https://arxiv.org/abs/2407.02996

[58]

Jimin Mun, Wei Bin Au Yeong, Wesley Hanwen Deng, Jana Schaich Borg, and Maarten Sap. 2025. Why (not) use AI? Analyzing People’s Reasoning and Conditions for AI Acceptability. In AIES. https://arxiv.org/abs/2502.07287

[59]

Jimin Mun, Liwei Jiang, Jenny Liang, Inyoung Cheong, Nicole DeCario, Yejin Choi, Tadayoshi Kohno, and Maarten Sap. 2024. Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits. In AIES. https://arxiv.org/abs/2403.14791

[60]

Deepa Muralidhar, Rafik Belloum, and Ashwin Ashok. 2025. Operationalizing selective transparency using progressive disclosure in artificial intelligence clinical diagnosis systems. International Journal of Human-Computer Studies 204 (2025), 103591. doi:10.1016/j.ijhcs.2025.103591

[61]

Donald A Norman. 1988. The psychology of everyday things. Basic Books

[62]

Gregory B Northcraft and Margaret Ann Neale. 1990. Organizational behavior: A management challenge.

[63]

Stefan Palan and Christian Schitter. 2018. Prolific.ac—A subject pool for online experiments. Journal of Behavioral and Experimental Finance 17 (2018), 22–27. doi:10.1016/j.jbef.2017.12.004

[64]

Joon Sung Park, Rick Barber, Alex Kirlik, and Karrie Karahalios. 2019. A Slow Algorithm Improves Users’ Assessments of the Algorithm’s Accuracy. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 102 (Nov. 2019), 15 pages. doi:10.1145/3359204

Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks FAccT ’26, June 25–28, 2026, Montreal, QC, Canada

[65]

Savvas Petridis, Nicholas Diakopoulos, Kevin Crowston, Mark Hansen, Keren Henderson, Stan Jastrzebski, Jeffrey V Nickerson, and Lydia B Chilton. 2023. AngleKindling: Supporting Journalistic Angle Ideation with Large Language Models. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for C... doi:10.1145/3544548.3580907

[66]

Ritika Poddar, Rashmi Sinha, Mor Naaman, and Maurice Jakesch. 2023. AI Writing Assistants Influence Topic Choice in Self-Presentation. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI EA ’23). Association for Computing Machinery, New York, NY, USA, Article 29, 6 pages. doi:10.1145/3544549.3585893

[67]

Devin G Pope, Joseph Price, and Justin Wolfers. 2018. Awareness reduces racial bias. Management Science 64, 11 (2018), 4988–4995

[68]

Neil Rathi, Dan Jurafsky, and Kaitlyn Zhou. 2025. Humans overrely on overconfident language models, across languages. arXiv:2507.06306 [cs.CL] https://arxiv.org/abs/2507.06306

[69]

Claudia Russo, Francesca Danioni, Ioana Zagrean, and Daniela Barni. 2022. Changing Personal Values through Value-Manipulation Tasks: A Systematic Literature Review Based on Schwartz’s Theory of Basic Human Values. Eur J Investig Health Psychol Educ 12, 7 (June 2022), 692–715

[70]

Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, and Dirk Hovy. 2024. Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models. arXiv:2402.16786 [cs.CL] https://arxiv.org/abs/2402.16786

[71]

Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Katharina Reinecke, and Maarten Sap. 2023. NLPositionality: Characterizing Design Biases of Datasets and Models. arXiv:2306.01943 [cs.CL] https://arxiv.org/abs/2306.01943

[72]

Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, and James Pennebaker. 2020. Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (Eds.). Association for... doi:10.18653/v1/2020.acl-main.178

[73]

Shalom Schwartz. 2006. A Theory of Cultural Value Orientations: Explication and Applications. Comparative Sociology 5, 2-3 (2006), 137–182. doi:10.1163/156913306778667357

[74]

Richard Shiffrin and Melanie Mitchell. 2023. Probing the psychology of AI models. Proceedings of the National Academy of Sciences 120, 10 (2023), e2300963120. arXiv:https://www.pnas.org/doi/pdf/10.1073/pnas.2300963120 doi:10.1073/pnas.2300963120

[75]

Herbert A Simon and Allen Newell. 1971. Human problem solving: The state of the theory in 1970. American Psychologist 26, 2 (1971), 145

[76]

Nikhil Singh, Guillermo Bernal, Daria Savchenko, and Elena L. Glassman. 2023. Where to Hide a Stolen Elephant: Leaps in Creative Writing with Multimodal Machine Intelligence. ACM Trans. Comput.-Hum. Interact. 30, 5, Article 68 (Sept. 2023), 57 pages. doi:10.1145/3511599

[77]

Taylor Sorensen, Liwei Jiang, Jena D. Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, and Yejin Choi. 2024. Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties. Proceedings of the AAAI Conference on Artificial Intelligence 38, 18 (March 20... doi:10.1609/aaai.v38i18.29970

[78]

Aaron Springer and Steve Whittaker. 2020. Progressive Disclosure: When, Why, and How Do Users Want Algorithmic Transparency Information? ACM Trans. Interact. Intell. Syst. 10, 4, Article 29 (Oct. 2020), 32 pages. doi:10.1145/3374218

[79]

Kate Sweeny, James A Shepperd, and Jennifer L Howell. 2012. Do as I say (not as I do): Inconsistency between behavior and values. Basic and Applied Social Psychology 34, 2 (2012), 128–135

[80]

Yan Tao, Olga Viberg, Ryan S Baker, and René F Kizilcec. 2024. Cultural bias and cultural alignment of large language models. PNAS Nexus 3, 9 (09 2024), pgae346. arXiv:https://academic.oup.com/pnasnexus/article-pdf/3/9/pgae346/59151559/pgae346.pdf doi:10.1093/pnasnexus/pgae346

[81]

Peter Todd and Izak Benbasat. 1994. The Influence of Decision Aids on Choice Strategies: An Experimental Analysis of the Role of Cognitive Effort. Organizational Behavior and Human Decision Processes 60, 1 (1994), 36–74. doi:10.1006/obhd.1994.1074


[1]

Dhruv Agarwal, Mor Naaman, and Aditya Vashistha. 2025. AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). ACM, 1–21. doi:10.1145/3706598.3713564

[2]

Barrett R Anderson, Jash Hemant Shah, and Max Kreminski. 2024. Homogenization Effects of Large Language Models on Human Creative Ideation. In Proceedings of the 16th Conference on Creativity & Cognition (Chicago, IL, USA) (C&C ’24). Association for Computing Machinery, New York, NY, USA, 413–425. doi:10.1145/3635636.3656204

[3]

Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. 2018. Sentiment Bias in Predictive Text Recommendations Results in Biased Writing. In Proceedings of the 44th Graphics Interface Conference (Toronto, Canada) (GI ’18). Canadian Human-Computer Communications Society, Waterloo, CAN, 42–49. doi:10.20380/GI2018.07

[4]

Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. 2020. Predictive text encourages predictable writing. In Proceedings of the 25th International Conference on Intelligent User Interfaces (Cagliari, Italy) (IUI ’20). Association for Computing Machinery, New York, NY, USA, 128–138. doi:10.1145/3377325.3377523

[5]

Diego Aycinena, Lucas Rentschler, Benjamin Beranek, and Jonathan F. Schulz. 2022. Social norms and dishonesty across societies. Proceedings of the National Academy of Sciences 119, 31 (2022), e2120138119. arXiv:https://www.pnas.org/doi/pdf/10.1073/pnas.2120138119 doi:10.1073/pnas.2120138119

[6]

Paul Baker. 2006. Glossary of corpus linguistics. Edinburgh University Press

[7]

Ritwik Banerjee. 2018. On the interpretation of World Values Survey trust question-global expectations vs. local beliefs. European Journal of Political Economy 55 (2018), 491–510

[8]

Gagan Bansal, Besmira Nushi, Ece Kamar, Walter S Lasecki, Daniel S Weld, and Eric Horvitz. 2019. Beyond accuracy: The role of mental models in human-AI team performance. In Proceedings of the AAAI conference on human computation and crowdsourcing, Vol. 7. 2–11

[9]

Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, and Daniel S. Weld. 2021. Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance. arXiv:2006.14779 [cs.AI] https://arxiv.org/abs/2006.14779

[10]

Daniela Barni et al. 2009. Trasmettere valori. Tre generazioni familiari a confronto. Unicopli

[11]

Jeffrey Basoah, Daniel Chechelnitsky, Tao Long, Katharina Reinecke, Chrysoula Zerva, Kaitlyn Zhou, Mark Díaz, and Maarten Sap. 2025. Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’25). Association fo... doi:10.1145/3715275.3732045

[12]

Jeffrey Basoah, Jay L. Cunningham, Erica Adams, Alisha Bose, Aditi Jain, Kaustubh Yadav, Zhengyang Yang, Katharina Reinecke, and Daniela Rosner. 2025. Should AI Mimic People? Understanding AI-Supported Writing Technology Among Black Users. Proc. ACM Hum.-Comput. Interact. 9, 7, Article CSCW242 (Oct. 2025), 51 pages. doi:10.1145/3757423

[13]

Marie Battiste and James (Sa’ke’j) Youngblood Henderson. 2000. Protecting Indigenous knowledge and heritage: A global challenge. University of British Columbia Press

[14]

Mohsen Bayati, Mark Braverman, Michael Gillam, Karen M Mack, George Ruiz, Mark S Smith, and Eric Horvitz. 2014. Data-driven decisions for reducing readmissions for heart failure: general methodology and case study. PLoS One 9, 10 (Oct. 2014), e109264

[15]

Gábor Bella, Paula Helm, Gertraud Koch, and Fausto Giunchiglia. 2024. Tackling Language Modelling Bias in Support of Linguistic Diversity. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (Rio de Janeiro, Brazil) (FAccT ’24). Association for Computing Machinery, New York, NY, USA, 562–572. doi:10.1145/3630106.3658925

[16]

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event, Canada) (FAccT ’21). Association for Computing Machinery, New York, NY, USA, 610–623. doi:10.1145/3442188.3445922

[17]

Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, and Aylin Caliskan. 2023. Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (Chicago, IL, ... doi:10.1145/3593013.3594095

[18]

Lea Boecker, David D Loschelder, and Sascha Topolinski. 2022. How individuals react emotionally to others’ (mis)fortunes: A social comparison framework. Journal of Personality and Social Psychology 123, 1 (2022), 55

[19]

Eduard J. Bomhoff and Mary Man-Li Gu. 2012. East Asia Remains Different: A Comment on the Index of “Self-Expression Values,” by Inglehart and Welzel. Journal of Cross-Cultural Psychology 43, 3 (2012), 373–383. arXiv:https://doi.org/10.1177/0022022111435096 doi:10.1177/0022022111435096

[20]

Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3 (01 2006), 77–101. doi:10.1191/1478088706qp063oa

[21]

Zana Buçinca, Maja Barbara Malaya, and Krzysztof Z. Gajos. 2021. To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making. Proc. ACM Hum.-Comput. Interact. 5, CSCW1, Article 188 (April 2021), 21 pages. doi:10.1145/3449287

[22]

Daniel Buschek, Martin Zürn, and Malin Eiband. 2021. The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article ... doi:10.1145/3411764.3445372

[23]

Allison Chen, Sunnie S. Y. Kim, Angel Franyutti, Amaya Dharmasiri, Kushin Mukherjee, Olga Russakovsky, and Judith E. Fan. 2026. Presenting Large Language Models as Companions Affects What Mental Capacities People Attribute to Them. arXiv:2510.18039 [cs.HC] https://arxiv.org/abs/2510.18039

[24]

Kaiping Chen, Anqi Shao, Jirayu Burapacheep, and Yixuan Li. 2024. Conversational AI and equity through assessing GPT-3’s communication with diverse social groups on contentious topics. Scientific Reports 14, 1 (18 Jan 2024), 1561. doi:10.1038/s41598-024-51969-w

[25]

As an AI language model, I cannot

Paramveer S. Dhillon, Somayeh Molaei, Jiaqi Li, Maximilian Golub, Shaochun Zheng, and Lionel Peter Robert. 2024. Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’24). Association for Computing Machinery, New Y... doi:10.1145/3613904.3642134

[26]

Leon Festinger. 1954. A theory of social comparison processes. Human Relations 7, 2 (1954), 117–140

[27]

Alexandra Fleischmann, Joris Lammers, Kathi Diel, Wilhelm Hofmann, and Adam D Galinsky. 2021. More threatening and more diagnostic: How moral comparisons differ from social comparisons. Journal of Personality and Social Psychology 121, 5 (2021), 1057

[28]

Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Paul Fisher, Diane Wilson, Michael Fitzke, Mark Parkinson, Eric Horvitz, Kori Inkpen, and Besmira Nushi. 2022. Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (Seoul, Republic o... doi:10.1145/3531146.3533193

[29]

Michael C Frank. 2023. Baby steps in evaluating the capacities of large language models. Nature Reviews Psychology 2, 8 (2023), 451–452

[30]

Vinitha Gadiraju, Shaun Kane, Sunipa Dev, Alex Taylor, Ding Wang, Remi Denton, and Robin Brewer. 2023. "I wouldn’t say offensive but...": Disability-Centered Perspectives on Large Language Models. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (Chicago, IL, USA) (FAccT ’23). Association for Computing Machinery, New Y...

[31]

Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A Smith. 2020. RealToxicityPrompts: Evaluating neural toxic degeneration in language models. arXiv preprint arXiv:2009.11462 (2020)

[32]

Katy Ilonka Gero, Vivian Liu, and Lydia B. Chilton. 2021. Sparks: Inspiration for Science Writing using Language Models. arXiv:2110.07640 [cs.HC] https://arxiv.org/abs/2110.07640

[33]

Sourojit Ghosh and Aylin Caliskan. 2023. ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (Montréal, QC, Canada) (AIES ’23). Association for Computing Machinery, New York, NY, USA, ... doi:10.1145/3600211.3604672

[34]

Michael B. Gill and Shaun Nichols. 2008. Sentimentalist Pluralism: Moral Psychology and Philosophical Ethics. Philosophical Issues 18 (2008), 143–163. http://www.jstor.org/stable/27749904

[35]

Nicole Gillespie, Steven Lockey, Tabi Ward, A Macdade, and G Hassed. 2025. Trust, attitudes and use of artificial intelligence. (2025)

[36]

Ben Green and Yiling Chen. 2019. The Principles and Limits of Algorithm-in-the-Loop Decision Making. Proc. ACM Hum.-Comput. Interact. 3, CSCW, Article 50 (Nov. 2019), 24 pages. doi:10.1145/3359152

[37]

Jessica Guynn. 2015. Google’s “bias busting” workshops target hidden prejudices. USA Today 12 (2015)

[38]

C. Haerpfer, R. Inglehart, A. Moreno, C. Welzel, K. Kizilova, J. Diez-Medrano, M. Lagos, P. Norris, E. Ponarin, and B. Puranen (eds.). 2024. World Values Survey Wave 7 (2017-2022) Cross-National Data-Set. doi:10.14281/18241.24

[39] [39]

Kizilcec, Dominic DiFranzo, Zhila Aghajari, Hannah Mieczkowski, Karen Levy, Mor Naaman, Jeffrey Hancock, and Malte F

Jess Hohenstein, Rene F. Kizilcec, Dominic DiFranzo, Zhila Aghajari, Hannah Mieczkowski, Karen Levy, Mor Naaman, Jeffrey Hancock, and Malte F. Jung. 2023. Artificial intelligence in communication impacts language and social relationships.Scientific Reports13, 1 (04 Apr 2023), 5487. doi:10.1038/s41598-023-30938-9

work page doi:10.1038/s41598-023-30938-9 2023

[40] [40]

Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, and Mor Naaman. 2023. Co-Writing with Opinionated Language Models Affects Users’ Views. InProceedings of the 2023 CHI Conference on Human Factors in Computing Systems(Hamburg, Germany)(CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 111, 15 pages. doi:10.1145/3544548.3581196

work page doi:10.1145/3544548.3581196 2023

[41] [41]

Rebecca L Johnson, Giada Pistilli, Natalia Menédez-González, Leslye Denisse Dias Duran, Enrico Panai, Julija Kalpokiene, and Donald Jay Bertulfo. 2022. The Ghost in the Machine has an American accent: value conflict in GPT-3. arXiv:2203.07785 [cs.CL] https://arxiv.org/ abs/2203.07785

work page arXiv 2022

[42] [42]

Kowe Kadoma, Marianne Aubin Le Quere, Xiyu Jenny Fu, Christin Munsch, Danaë Metaxa, and Mor Naaman. 2024. The Role of Inclusion, Control, and Ownership in Workplace AI-Mediated Communication. InProceedings of the 2024 CHI Conference on Human Factors in Computing Systems(Honolulu, HI, USA)(CHI ’24). Association for Computing Machinery, New York, NY, USA, A...

work page doi:10.1145/3613904.3642650 2024

[43] [43]

Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter Young, and Vivek Ramavajjala. 2016. Smart Reply: Automated Response Suggestion for Email. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining(San Francisco, California, ...

work page doi:10.1145/2939672.2939801 2016

[44] [44]

Markelle Kelly, Aakriti Kumar, Padhraic Smyth, and Mark Steyvers. 2023. Capturing Humans’ Mental Models of AI: An Item Response Theory Approach. InProceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency(Chicago, IL, USA)(FAccT ’23). Association for Computing Machinery, New York, NY, USA, 1723–1734. doi:10.1145/3593013.3594111

work page doi:10.1145/3593013.3594111 2023

[45] [46]

Ariba Khan, Stephen Casper, and Dylan Hadfield-Menell. 2025. Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs. InProceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’25). Association for Computing Machinery, New York, NY, USA, 2151–2165. doi:10.1145/3715275.3732147

work page doi:10.1145/3715275.3732147 2025

[46] [47]

Oliver Klingefjord, Ryan Lowe, and Joe Edelman. 2024. What are human values, and how do we align AI to them? arXiv:2404.10636 [cs.CY] https://arxiv.org/abs/2404.10636

work page arXiv 2024

[47] [48]

Stephen M. Kosslyn. 1989. Understanding charts and graphs.Applied Cognitive Psychology3, 3 (1989), 185–225. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/acp.2350030302 doi:10.1002/acp.2350030302

work page doi:10.1002/acp.2350030302 1989

[48] [49]

Todd Kulesza, Simone Stumpf, Margaret Burnett, and Irwin Kwan. 2012. Tell me more? the effects of mental model soundness on personalizing an intelligent agent. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems(Austin, Texas, USA) (CHI ’12). Association for Computing Machinery, New York, NY, USA, 1–10. doi:10.1145/2207676.2207678

work page doi:10.1145/2207676.2207678 2012

[49] [50]

Why is ’Chicago’ deceptive?

Vivian Lai, Han Liu, and Chenhao Tan. 2020. "Why is ’Chicago’ deceptive?" Towards Building Model-Driven Tutorials for Humans. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems(Honolulu, HI, USA)(CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–13. doi:10.1145/3313831.3376873

work page doi:10.1145/3313831.3376873 2020

[50] [51]

Cynthia Lee. 2017. Awareness as a first step toward overcoming implicit bias.Enhancing justice: Reducing bias289 (2017)

work page 2017

[51] [52]

Mina Lee, Percy Liang, and Qian Yang. 2022. CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities. InProceedings of the 2022 CHI Conference on Human Factors in Computing Systems(New Orleans, LA, USA)(CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 388, 19 pages. doi:10.1145/3491102.3502030

work page doi:10.1145/3491102.3502030 2022

[52] [53]

Lee, Jacob M

Messi H.J. Lee, Jacob M. Montgomery, and Calvin K. Lai. 2024. Large Language Models Portray Socially Subordinate Groups as More Homogeneous, Consistent with a Bias Observed in Humans. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency(Rio de Janeiro, Brazil)(FAccT ’24). Association for Computing Machinery, New York, NY,...

work page arXiv 2024

[53] [54]

Yuxuan Li, Hirokazu Shirado, and Sauvik Das. 2025. Actions Speak Louder than Words: Agent Decisions Reveal Implicit Biases in Language Models. InProceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’25). Association for Computing Machinery, New York, NY, USA, 3303–3325. doi:10.1145/3715275.3732212

work page doi:10.1145/3715275.3732212 2025

[54] [55]

Marjaana Lindeman and Markku Verkasalo. 2005. Measuring Values With the Short Schwartz’s Value Survey.Journal of Personality Assessment85, 2 (2005), 170–178. doi:10.1207/s15327752jpa8502_09 PMID: 16171417

work page doi:10.1207/s15327752jpa8502_09 2005

[55] [56]

Thomas Mejtoft, Sarah Hale, and Ulrik Söderström. 2019. Design Friction. InProceedings of the 31st European Conference on Cognitive Ergonomics(BELFAST, United Kingdom)(ECCE ’19). Association for Computing Machinery, New York, NY, USA, 41–44. doi:10.1145/ 3335082.3335106

work page arXiv 2019

[56] [57]

Jared Moore, Tanvi Deshpande, and Diyi Yang. 2024. Are Large Language Models Consistent over Value-laden Questions? arXiv:2407.02996 [cs.CL] https://arxiv.org/abs/2407.02996

work page arXiv 2024

[57] [58]

Jimin Mun, Wei Bin Au Yeong, Wesley Hanwen Deng, Jana Schaich Borg, and Maarten Sap. 2025. Why (not) use AI? Analyzing People’s Reasoning and Conditions for AI Acceptability. InAIES. https://arxiv.org/abs/2502.07287

work page arXiv 2025

[58] [59]

Jimin Mun, Liwei Jiang, Jenny Liang, Inyoung Cheong, Nicole DeCario, Yejin Choi, Tadayoshi Kohno, and Maarten Sap. 2024. Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits. InAIES. https://arxiv.org/abs/2403.14791

work page arXiv 2024

[59] [60]

Deepa Muralidhar, Rafik Belloum, and Ashwin Ashok. 2025. Operationalizing selective transparency using progressive disclosure in artificial intelligence clinical diagnosis systems.International Journal of Human-Computer Studies204 (2025), 103591. doi:10.1016/j.ijhcs. 2025.103591

work page doi:10.1016/j.ijhcs 2025

[60] [61]

1988.The psychology of everyday things.Basic books

Donald A Norman. 1988.The psychology of everyday things.Basic books

work page 1988

[61] [62]

Gregory B Northcraft and Margaret Ann Neale. 1990. Organizational behavior: A management challenge.(No Title)(1990)

work page 1990

[62] [63]

Stefan Palan and Christian Schitter. 2018. Prolific.ac—A subject pool for online experiments.Journal of Behavioral and Experimental Finance17 (2018), 22–27. doi:10.1016/j.jbef.2017.12.004

work page doi:10.1016/j.jbef.2017.12.004 2018

[63] [64]

Joon Sung Park, Rick Barber, Alex Kirlik, and Karrie Karahalios. 2019. A Slow Algorithm Improves Users’ Assessments of the Algorithm’s Accuracy.Proc. ACM Hum.-Comput. Interact.3, CSCW, Article 102 (Nov. 2019), 15 pages. doi:10.1145/3359204 Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks FAccT ’26, June 25–28, 2026, Montreal, QC, Canada

work page doi:10.1145/3359204 2019

[64] [65]

Savvas Petridis, Nicholas Diakopoulos, Kevin Crowston, Mark Hansen, Keren Henderson, Stan Jastrzebski, Jeffrey V Nickerson, and Lydia B Chilton. 2023. AngleKindling: Supporting Journalistic Angle Ideation with Large Language Models. InProceedings of the 2023 CHI Conference on Human Factors in Computing Systems(Hamburg, Germany)(CHI ’23). Association for C...

work page doi:10.1145/3544548.3580907 2023

[65] [66]

Ritika Poddar, Rashmi Sinha, Mor Naaman, and Maurice Jakesch. 2023. AI Writing Assistants Influence Topic Choice in Self-Presentation. InExtended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems(Hamburg, Germany)(CHI EA ’23). Association for Computing Machinery, New York, NY, USA, Article 29, 6 pages. doi:10.1145/3544549.3585893

[66] [67]

Devin G Pope, Joseph Price, and Justin Wolfers. 2018. Awareness reduces racial bias. Management Science 64, 11 (2018), 4988–4995

[67] [68]

Neil Rathi, Dan Jurafsky, and Kaitlyn Zhou. 2025. Humans overrely on overconfident language models, across languages. arXiv:2507.06306 [cs.CL] https://arxiv.org/abs/2507.06306

[68] [69]

Claudia Russo, Francesca Danioni, Ioana Zagrean, and Daniela Barni. 2022. Changing Personal Values through Value-Manipulation Tasks: A Systematic Literature Review Based on Schwartz’s Theory of Basic Human Values. Eur J Investig Health Psychol Educ 12, 7 (June 2022), 692–715

[69] [70]

Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, and Dirk Hovy. 2024. Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models. arXiv:2402.16786 [cs.CL] https://arxiv.org/abs/2402.16786

[70] [71]

Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Katharina Reinecke, and Maarten Sap. 2023. NLPositionality: Characterizing Design Biases of Datasets and Models. arXiv:2306.01943 [cs.CL] https://arxiv.org/abs/2306.01943

[71] [72]

Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, and James Pennebaker. 2020. Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (Eds.). Association for... doi:10.18653/v1/2020.acl-main.178

[72] [73]

Shalom Schwartz. 2006. A Theory of Cultural Value Orientations: Explication and Applications. Comparative Sociology 5, 2-3 (2006), 137–182. doi:10.1163/156913306778667357

[73] [74]

Richard Shiffrin and Melanie Mitchell. 2023. Probing the psychology of AI models. Proceedings of the National Academy of Sciences 120, 10 (2023), e2300963120. arXiv:https://www.pnas.org/doi/pdf/10.1073/pnas.2300963120 doi:10.1073/pnas.2300963120

[74] [75]

Herbert A Simon and Allen Newell. 1971. Human problem solving: The state of the theory in 1970. American psychologist 26, 2 (1971), 145

[75] [76]

Nikhil Singh, Guillermo Bernal, Daria Savchenko, and Elena L. Glassman. 2023. Where to Hide a Stolen Elephant: Leaps in Creative Writing with Multimodal Machine Intelligence. ACM Trans. Comput.-Hum. Interact. 30, 5, Article 68 (Sept. 2023), 57 pages. doi:10.1145/3511599

[76] [77]

Taylor Sorensen, Liwei Jiang, Jena D. Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, and Yejin Choi. 2024. Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties. Proceedings of the AAAI Conference on Artificial Intelligence 38, 18 (March 20... doi:10.1609/aaai.v38i18.29970

[77] [78]

Aaron Springer and Steve Whittaker. 2020. Progressive Disclosure: When, Why, and How Do Users Want Algorithmic Transparency Information? ACM Trans. Interact. Intell. Syst. 10, 4, Article 29 (Oct. 2020), 32 pages. doi:10.1145/3374218

[78] [79]

Kate Sweeny, James A Shepperd, and Jennifer L Howell. 2012. Do as I say (not as I do): Inconsistency between behavior and values. Basic and applied social psychology 34, 2 (2012), 128–135

[79] [80]

Yan Tao, Olga Viberg, Ryan S Baker, and René F Kizilcec. 2024. Cultural bias and cultural alignment of large language models. PNAS Nexus 3, 9 (Sept. 2024), pgae346. arXiv:https://academic.oup.com/pnasnexus/article-pdf/3/9/pgae346/59151559/pgae346.pdf doi:10.1093/pnasnexus/pgae346

[80] [81]

Peter Todd and Izak Benbasat. 1994. The Influence of Decision Aids on Choice Strategies: An Experimental Analysis of the Role of Cognitive Effort. Organizational Behavior and Human Decision Processes 60, 1 (1994), 36–74. doi:10.1006/obhd.1994.1074
