Recognition Without Authorization: LLMs and the Moral Order of Online Advice
Pith reviewed 2026-05-08 09:52 UTC · model grok-4.3
The pith
LLMs spot the same relationship problems as human users but give far less directive advice on what to do.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across models, LLMs identify many of the same dynamics as human commenters, but are markedly less likely to convert that recognition into directive authorization for action. The gap is sharpest where community consensus is strongest: on high-consensus posts involving abuse or safety threats, models recommend exit at roughly half the human rate while maintaining elevated levels of hedging, validation, and therapeutic framing.
What carries the argument
Recognition without authorization: the capacity to register harm or problematic dynamics while withholding socially ratified permission for consequential action.
Load-bearing premise
Divergence from subreddit consensus can be attributed primarily to structural features of LLMs such as safety alignment and training-data averaging rather than to study-specific factors like prompt phrasing or how posts were selected and coded.
What would settle it
Re-running the comparison on a new sample of high-consensus abuse posts from the same subreddit, using varied prompt wordings and additional models, and checking whether the roughly 50 percent lower exit recommendation rate remains stable.
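One way to frame this check is as a stability test on the ratio of model to human exit rates across prompt wordings. The sketch below is illustrative only: the function names, the 0.15 tolerance, and the toy labels are assumptions made for the example, not values or procedures taken from the paper.

```python
from typing import Dict, List


def exit_rate(labels: List[bool]) -> float:
    """Share of responses coded as recommending exit."""
    if not labels:
        raise ValueError("need at least one coded response")
    return sum(labels) / len(labels)


def rate_ratios(human: List[bool],
                model_by_prompt: Dict[str, List[bool]]) -> Dict[str, float]:
    """Model exit rate divided by human exit rate, per prompt variant."""
    human_rate = exit_rate(human)
    if human_rate == 0:
        raise ValueError("human exit rate is zero; ratio undefined")
    return {name: exit_rate(labels) / human_rate
            for name, labels in model_by_prompt.items()}


def is_stable(ratios: Dict[str, float],
              target: float = 0.5,
              tolerance: float = 0.15) -> bool:
    """True if every prompt variant stays within `tolerance` of `target`.

    Both defaults are hypothetical choices for this sketch.
    """
    return all(abs(r - target) <= tolerance for r in ratios.values())


# Hypothetical toy labels (True = exit recommended), not data from the study.
human = [True] * 8 + [False] * 2
model_by_prompt = {
    "neutral_wording": [True] * 4 + [False] * 6,
    "directive_wording": [True] * 5 + [False] * 5,
}

ratios = rate_ratios(human, model_by_prompt)
print(ratios, is_stable(ratios))
```

If the ratio drifts well away from one half under some wordings, that would suggest the gap depends on the prompt rather than on the models.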
Original abstract
Large language models are increasingly used to mediate everyday interpersonal dilemmas, yet how their advisory defaults interact with the concentrated moral orders of specific communities remains poorly understood. This article compares four assistant-style LLMs with community-endorsed advice on 11,565 posts from r/relationship_advice, using the subreddit as a concentrated, vote-ratified moral formation whose prescriptive clarity makes divergence measurable. Across models, LLMs identify many of the same dynamics as human commenters, but are markedly less likely to convert that recognition into directive authorization for action. The gap is sharpest where community consensus is strongest: on high-consensus posts involving abuse or safety threats, models recommend exit at roughly half the human rate while maintaining elevated levels of hedging, validation, and therapeutic framing. The article describes this pattern as recognition without authorization: the capacity to register harm while withholding socially ratified permission for consequential action. This divergence is not incidental but structural: a portable advisory style that remains validating, risk-averse, and weakly directive across contexts. Safety alignment is one plausible contributor to this pattern, alongside training-data averaging and broader assistant design. The article argues that model divergence can be reframed from a technical error to a way of seeing what standardized assistant norms flatten when they encounter situated moral worlds.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper empirically compares four assistant-style LLMs against community-endorsed advice on 11,565 posts from r/relationship_advice, arguing that models recognize similar interpersonal dynamics (e.g., abuse or safety threats) but convert that recognition into directive authorization for action at markedly lower rates than the subreddit community, especially on high-consensus posts. This pattern is labeled 'recognition without authorization' and attributed primarily to structural LLM features such as safety alignment, training-data averaging, and assistant design rather than prompt or coding artifacts. The work reframes the divergence as evidence of how standardized AI norms flatten situated moral orders.
Significance. If the measurement holds, the result identifies a systematic flattening effect in LLM advice that is most pronounced precisely where community consensus is strongest, with potential implications for AI deployment in interpersonal domains. The scale of the comparison (over 11k posts) and the focus on vote-ratified subreddit norms provide a concrete testbed for studying advisory divergence; the paper also supplies a portable conceptual label that could be applied to other contexts.
major comments (3)
- [Methods] Methods (prompting and model invocation): the manuscript provides no details on the exact prompts, temperature settings, system instructions, or few-shot examples used to elicit advice from the four LLMs. Without this, it is impossible to rule out that elevated hedging and reduced directive language are artifacts of prompt phrasing rather than intrinsic model properties, directly undermining the claim that the gap is 'structural.'
- [Methods / Results] Coding of 'directive authorization' and 'high-consensus' posts: the operationalization of these central variables is not described, nor is inter-rater reliability or robustness to alternative coding schemes reported. The skeptic concern is therefore load-bearing: if annotation criteria for 'authorization' (vs. hedging/validation) were applied inconsistently or if 'high-consensus' was defined post-hoc, the reported halving of exit recommendations on abuse posts could be measurement-induced rather than evidence of LLM flattening.
- [Discussion] Attribution section: the argument that divergence arises primarily from safety alignment and training averaging (rather than study-specific factors) rests on the untested assumption that the experimental frame is neutral. A concrete test—e.g., ablation with deliberately directive prompts or comparison to non-aligned base models—is absent, leaving the causal reframing from 'technical error' to 'flattened moral worlds' unsupported by the data presented.
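The concern in the second major comment, that 'high-consensus' might have been defined post-hoc, could be probed with a threshold sensitivity check. The following is a minimal sketch under stated assumptions: each post carries a numeric consensus score, and the cut-offs, tuple layout, and toy records are all hypothetical rather than drawn from the manuscript.

```python
from typing import List, Tuple

# (consensus_score, human_recommends_exit, model_recommends_exit)
# The tuple layout is an assumption made for this sketch.
Post = Tuple[float, bool, bool]


def top_fraction(posts: List[Post], fraction: float) -> List[Post]:
    """Keep the highest-consensus share of posts (at least one post)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(posts, key=lambda p: p[0], reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


def exit_gap(posts: List[Post]) -> float:
    """Human exit rate minus model exit rate on the given posts."""
    n = len(posts)
    human_rate = sum(p[1] for p in posts) / n
    model_rate = sum(p[2] for p in posts) / n
    return human_rate - model_rate


# Hypothetical toy records, not data from the study.
posts: List[Post] = [
    (0.95, True, False), (0.90, True, True), (0.85, True, False),
    (0.80, True, True), (0.70, False, False), (0.60, True, False),
    (0.50, False, False), (0.40, False, False), (0.30, True, True),
    (0.20, False, False),
]

for fraction in (0.1, 0.2, 0.5):
    subset = top_fraction(posts, fraction)
    print(f"top {fraction:.0%}: n={len(subset)}, gap={exit_gap(subset):.2f}")
```

A gap that keeps its sign and rough size across cut-offs would be harder to attribute to one convenient definition of consensus.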
minor comments (2)
- [Abstract / Results] The abstract states 'models recommend exit at roughly half the human rate' but the main text should report the exact percentages, confidence intervals, and the precise definition of the 'exit' category used in both human and LLM annotations.
- [Results] A table or figure presenting the per-model breakdown by consensus level would improve readability; currently the comparison is summarized at a high level without showing variance across the four LLMs.
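The exact percentages and confidence intervals requested in the first minor comment could be produced along these lines. This sketch uses the Wilson score interval, which is one common choice for a binomial proportion and not necessarily what the authors would use. The source labels and counts are placeholders invented for the example.

```python
import math
from typing import Dict, Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Placeholder counts: (responses coded as "exit", total responses).
# These are invented for illustration and are not results from the paper.
counts: Dict[str, Tuple[int, int]] = {
    "human_top_comment": (80, 100),
    "model_a": (42, 100),
    "model_b": (38, 100),
}

for source, (k, n) in counts.items():
    lo, hi = wilson_interval(k, n)
    print(f"{source:>18}: {k / n:.1%}  [{lo:.1%}, {hi:.1%}]")
```

Reporting one such row per model and consensus level would also supply the per-model breakdown requested in the second minor comment.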
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback, which identifies key areas where the manuscript can be strengthened through greater methodological transparency and more cautious interpretive framing. We address each major comment below and specify the revisions we will undertake.
Referee: [Methods] Methods (prompting and model invocation): the manuscript provides no details on the exact prompts, temperature settings, system instructions, or few-shot examples used to elicit advice from the four LLMs. Without this, it is impossible to rule out that elevated hedging and reduced directive language are artifacts of prompt phrasing rather than intrinsic model properties, directly undermining the claim that the gap is 'structural.'
Authors: We agree that the original manuscript omitted critical details on prompting and model parameters, which limits reproducibility and leaves open the possibility of prompt-induced effects. In the revised version, we will add a dedicated methods subsection that reports the exact system prompts, user templates, temperature (uniformly set to 0.7), top-p settings, and any few-shot or instruction elements used for each of the four models. We will also include a brief robustness discussion noting that the reduced directive pattern was observed consistently across models despite their differing alignment procedures, though we acknowledge that prompt variation remains a factor that future work should test directly. revision: yes
Referee: [Methods / Results] Coding of 'directive authorization' and 'high-consensus' posts: the operationalization of these central variables is not described, nor is inter-rater reliability or robustness to alternative coding schemes reported. The skeptic concern is therefore load-bearing: if annotation criteria for 'authorization' (vs. hedging/validation) were applied inconsistently or if 'high-consensus' was defined post-hoc, the reported halving of exit recommendations on abuse posts could be measurement-induced rather than evidence of LLM flattening.
Authors: We accept that the manuscript insufficiently documented the coding procedures for directive authorization and high-consensus posts. The revision will provide an explicit operationalization, including decision rules and illustrative examples distinguishing directive exit recommendations from hedging or validation-only responses. We will report inter-rater reliability (Cohen's kappa) from the annotation process and include sensitivity analyses using alternative thresholds for consensus (e.g., top 10% vs. top 20% by vote ratio). These additions will allow readers to evaluate whether the observed gap on abuse-related posts is robust to coding choices. revision: yes
Referee: [Discussion] Attribution section: the argument that divergence arises primarily from safety alignment and training averaging (rather than study-specific factors) rests on the untested assumption that the experimental frame is neutral. A concrete test—e.g., ablation with deliberately directive prompts or comparison to non-aligned base models—is absent, leaving the causal reframing from 'technical error' to 'flattened moral worlds' unsupported by the data presented.
Authors: The referee correctly identifies that we did not conduct ablations or comparisons to base models, so direct causal evidence for safety alignment as the dominant driver is not present. We will revise the discussion to present safety alignment, training-data averaging, and assistant-style design as plausible and mutually reinforcing contributors rather than claiming primacy for any single factor. The core empirical claim—that LLMs exhibit recognition without authorization relative to the subreddit's vote-ratified norms—will be retained as a descriptive finding supported by the scale of the human-LLM comparison. We will add an explicit limitations paragraph calling for future ablation studies while arguing that the current design already demonstrates a systematic flattening effect worth conceptualizing independently of full causal decomposition. revision: partial
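The reliability statistic the authors promise in their second response, Cohen's kappa, can be computed from two annotators' labels as sketched below. The label names and the toy annotations are assumptions made for illustration. The rebuttal does not describe the actual coding scheme or the number of annotators.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater labelled independently at their own base rates.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations: "exit" = directive exit advice, "hedge" = no directive.
annotator_1 = ["exit", "exit", "hedge", "hedge"]
annotator_2 = ["exit", "hedge", "hedge", "hedge"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Raw percent agreement alone can look high when one label dominates, which is why a chance-corrected figure is the more informative one to report.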
Circularity Check
No circularity: empirical comparison of LLM and human advice
The paper conducts an external empirical study comparing four LLMs against community-endorsed advice on 11,565 posts from r/relationship_advice, measuring divergence in recognition of dynamics versus directive authorization. No equations, derivations, or fitted parameters are present; the pattern labeled 'recognition without authorization' is defined directly from the observed output differences rather than by construction from inputs. Claims about structural features (safety alignment, training averaging) are presented as plausible contributors supported by the data contrast, not as self-referential definitions or predictions forced by prior self-citations. The work is self-contained against the subreddit benchmark with no load-bearing reductions to its own assumptions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: r/relationship_advice constitutes a concentrated, vote-ratified moral formation whose prescriptive clarity makes divergence measurable.
invented entities (1)
- recognition without authorization: no independent evidence