
arxiv: 2605.22975 · v1 · pith:A3PYQVYYnew · submitted 2026-05-21 · 💻 cs.CL · cs.CY

When AI Takes Sides on Questions of Faith: Persistent Asymmetries in AI-Mediated Faith Guidance

Pith reviewed 2026-05-25 05:48 UTC · model grok-4.3

classification 💻 cs.CL cs.CY
keywords large language models · religious conversion · AI bias · faith guidance · asymmetry · LLM evaluation · ethics

The pith

Large language models give asymmetric advice on religious conversions, favoring some faiths over others.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether LLMs treat questions about switching religions symmetrically and finds they do not. Models consistently use more encouraging language for transitions toward Catholic, Bahá'í, and Sikh faiths while using more discouraging language for transitions toward Atheism, Agnosticism, or Jehovah's Witnesses. This pattern holds across 20 models and 182 religion pairings when the same query is reversed. A reader would care because these repeatable differences could shape real user decisions if AI systems are used for personal guidance at scale. The asymmetries appear tied to model behavior rather than the scoring method alone.

Core claim

When prompted for advice on hypothetical faith transitions and then asked the reversed question, every tested LLM produced consistent asymmetries: higher support for joining some religions and lower support for leaving them, while the opposite held for others. Catholic, Bahá'í, and Sikh faiths received broadly favorable treatment on average, whereas Atheists, Agnostics, and Jehovah's Witnesses were primarily disfavored. The pattern varied by model size and provider yet remained reproducible across multiple trials, phrasings, and dataset variations.

What carries the argument

A human-verified LLM-as-a-judge framework that scores the encouraging versus discouraging language in model responses to simulated user queries about joining or leaving a given religion.
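
One way to turn such judge scores into a per-religion asymmetry is sketched below. This is an illustrative assumption, not the paper's stated metric: it supposes every ordered transition (source, target) has a numeric judge score where higher means more encouraging, and it defines net favorability as the mean score for joining a religion minus the mean score for leaving it. The function name, the score scale, and the placeholder labels A, B, C are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def net_favorability(scores):
    """Map each label to (mean score joining it) - (mean score leaving it).

    scores: dict of (source, target) -> judge score, higher = more encouraging.
    The metric itself is an assumption made for illustration.
    """
    joining = defaultdict(list)
    leaving = defaultdict(list)
    for (source, target), score in scores.items():
        leaving[source].append(score)   # advice about leaving `source`
        joining[target].append(score)   # advice about joining `target`
    # Only labels seen in both roles have a well-defined difference.
    labels = set(joining) & set(leaving)
    return {r: mean(joining[r]) - mean(leaving[r]) for r in labels}


# Toy scores for three placeholder labels (not data from the paper).
toy_scores = {
    ("A", "B"): 0.8, ("B", "A"): 0.2,
    ("A", "C"): 0.6, ("C", "A"): 0.4,
    ("B", "C"): 0.5, ("C", "B"): 0.5,
}
favorability = net_favorability(toy_scores)
```

With a complete set of ordered pairs, these differences sum to zero across labels, so a positive value for one label is always offset by negative values elsewhere. In that sense the quantity describes relative treatment rather than an absolute level of encouragement.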

If this is right

  • All 20 tested models exhibit reproducible asymmetry in religious advice.
  • The specific pattern of favored and disfavored religions differs by model size and provider.
  • Asymmetries remain stable across changes in question phrasing and the set of religion pairings.
  • Any imbalances that are reproduced at scale could carry real-world effects on users.
  • The observed preferences are a property of model behavior rather than an artifact of the evaluation method.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • AI developers may want to audit training data or alignment processes for similar religion-related patterns before deploying models in advisory roles.
  • Individuals using AI for faith-related questions could benefit from cross-checking outputs against multiple models or human sources.
  • The results suggest a need to examine whether similar asymmetries appear in other domains involving personal identity or belief.
  • Controlled experiments could test whether targeted fine-tuning on balanced conversion examples reduces the observed differences.

Load-bearing premise

The LLM judge accurately captures the tested models' genuine preferences instead of injecting its own systematic biases when scoring language.

What would settle it

Re-running the full set of queries with a different judge model or with human scorers and finding that the direction or strength of the asymmetries changes or disappears.
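
A minimal sketch of how that comparison could be summarized, assuming per-religion asymmetry values are already available from two different judges. It reports the share of labels whose asymmetry keeps the same sign (direction) and a Spearman rank correlation (ordering). The function names, the choice of these two summaries, and the toy values are assumptions made for illustration, not the paper's procedure.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def compare_judges(judge_a, judge_b):
    """Return (sign agreement, Spearman correlation) over shared labels."""
    keys = sorted(set(judge_a) & set(judge_b))
    a = [judge_a[k] for k in keys]
    b = [judge_b[k] for k in keys]
    sign_agreement = sum((x > 0) == (y > 0) for x, y in zip(a, b)) / len(keys)
    spearman = pearson(average_ranks(a), average_ranks(b))
    return sign_agreement, spearman


# Toy asymmetries for placeholder labels (not data from the paper).
judge_a = {"A": -0.4, "B": 0.3, "C": 0.1, "D": -0.1}
judge_b = {"A": -0.2, "B": 0.5, "C": -0.05, "D": -0.1}
sign_agreement, spearman = compare_judges(judge_a, judge_b)
```

In this toy case the ordering of labels is identical under both judges while one label changes sign, which is why it helps to look at direction and ranking separately.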

Original abstract

We ask whether large language models (LLMs) treat queries about religious conversion symmetrically. The answer is no. When asked for advice on hypothetical faith transitions from one religion to another, then asked the reversed question, models exhibited consistent asymmetries, favoring some religions while subtly discouraging conversion to others. On average Catholic, Bahá'í, and Sikh religions were broadly favored (high support for joining, low support for leaving), while Atheists, Agnostics, and Jehovah's Witnesses were primarily disfavored. Patterns varied by model size and model provider, with Grok 4.20 exhibiting the strongest asymmetries. We tested 20 commercial and open-source language models across 182 religion pairings using a human-verified LLM-as-a-judge framework. Each model was probed via interactions with a simulated user asking for advice on a potential faith conversion. Models tended to use more encouraging language for some faith transitions over others; these patterns were systematically repeatable across multiple trials. All LLMs tested exhibited reproducible asymmetry, though the pattern of preferences differed for each. Overall preferences persist across multiple question phrasings and variations in the religious pairing dataset. Taken together, these results suggest that asymmetry is a robust property of model behavior rather than an artifact of how the models' answers were scored. It is important to consider that any imbalances deployed and reproduced en masse can have real-world implications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reports an empirical study of 20 commercial and open-source LLMs probed on 182 religion-pair queries about hypothetical faith transitions. Using a human-verified LLM-as-a-judge framework, the authors find reproducible asymmetries: models on average favor conversions toward Catholic, Bahá'í, and Sikh faiths (high encouragement to join, low encouragement to leave) while disfavoring transitions involving Atheism, Agnosticism, and Jehovah's Witnesses. Patterns vary by model and provider (strongest in Grok 4.20) but persist across phrasings and are claimed not to be scoring artifacts.

Significance. If the measured asymmetries reflect the probed models' output distributions rather than downstream judge artifacts, the work documents a reproducible form of value-laden bias in LLMs on sensitive personal-advice domains. The multi-model scope and human-verification step are strengths; however, the absence of detailed prompting, rubric, and statistical controls in the reported methods limits the strength of the robustness claim.

major comments (2)
  1. [Methods] Methods section: the description of the LLM-as-a-judge prompt, scoring rubric, and human-verification protocol is insufficient to evaluate whether the judge introduces religion-correlated lexical biases that could produce or amplify the reported pattern (Catholic/Bahá'í/Sikh favored, Atheist/Agnostic/JW disfavored). No inter-annotator agreement statistics or ablation replacing the judge with full human scoring on the full set are provided, which is load-bearing for the central claim that asymmetries reside in the probed models.
  2. [Abstract and Results] Abstract and Results: the assertion that 'results are not an artifact of how the models' answers were scored' and that asymmetries 'persist across multiple question phrasings' lacks quantitative support such as effect-size comparisons, statistical tests for phrasing invariance, or exclusion criteria for outlier responses. Without these, it is impossible to determine whether the asymmetries survive basic robustness checks.
minor comments (2)
  1. [Abstract] The abstract states '182 religion pairings' but does not list the exact set of religions or the pairing construction method; a table or appendix listing the 182 pairs would improve reproducibility.
  2. [Results] Model-size and provider variation is mentioned but not accompanied by a table breaking down asymmetry strength by model family or parameter count.
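
On the pairing count raised in the first minor comment: the text here does not say how the 182 pairings were built. One reading that matches the number is every ordered pair drawn from 14 labels, since 14 × 13 = 182, which would also guarantee that each query has its reversed counterpart. The sketch below only illustrates that arithmetic. The figure of 14 and the placeholder labels are assumptions, not something the source confirms.

```python
from itertools import permutations


def ordered_pairs(labels):
    """All (source, target) pairs with source != target."""
    return list(permutations(labels, 2))


# 14 placeholder labels; the actual list of religions is not given here.
labels = [f"R{i}" for i in range(1, 15)]
pairs = ordered_pairs(labels)

# Every pair has its reverse in the set, so each query can be mirrored.
pair_set = set(pairs)
all_reversible = all((b, a) in pair_set for a, b in pairs)
```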

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below and commit to revisions that strengthen the methodological transparency and quantitative robustness of the claims.

  1. Referee: [Methods] Methods section: the description of the LLM-as-a-judge prompt, scoring rubric, and human-verification protocol is insufficient to evaluate whether the judge introduces religion-correlated lexical biases that could produce or amplify the reported pattern (Catholic/Bahá'í/Sikh favored, Atheist/Agnostic/JW disfavored). No inter-annotator agreement statistics or ablation replacing the judge with full human scoring on the full set are provided, which is load-bearing for the central claim that asymmetries reside in the probed models.

    Authors: We agree the original Methods section lacked sufficient detail. In revision we will add the complete LLM-as-a-judge prompts, the full scoring rubric with anchor examples, and a precise description of the human-verification protocol. We will also report inter-annotator agreement (Fleiss' kappa) on a 100-response subset double-annotated by three humans. A full human re-scoring of every response is not feasible at the scale of the study; instead we will add an ablation on a stratified 200-response sample comparing judge scores to human scores, confirming high agreement and absence of religion-correlated systematic discrepancies.

    revision: partial

  2. Referee: [Abstract and Results] Abstract and Results: the assertion that 'results are not an artifact of how the models' answers were scored' and that asymmetries 'persist across multiple question phrasings' lacks quantitative support such as effect-size comparisons, statistical tests for phrasing invariance, or exclusion criteria for outlier responses. Without these, it is impossible to determine whether the asymmetries survive basic robustness checks.

    Authors: We accept that the original text would be strengthened by explicit quantitative evidence. The revised Results section will include standardized effect-size comparisons (Cohen's d) between primary and alternative phrasings, statistical tests (repeated-measures ANOVA with post-hoc contrasts) for phrasing invariance, and documented outlier detection/exclusion criteria together with sensitivity analyses demonstrating that the reported asymmetries remain statistically significant after outlier removal.

    revision: yes
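
The two statistics named in the responses above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: `fleiss_kappa` takes a table of rating counts per item and category, and `cohens_d` uses the pooled-standard-deviation form for two independent samples, which is one convention among several (a paired design might call for a different variant). Function names, categories, and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def fleiss_kappa(table):
    """Fleiss' kappa for table[i][j] = raters placing item i in category j.

    Every row must sum to the same number of raters (at least 2).
    Undefined (division by zero) if all ratings fall in a single category.
    """
    n_items = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("each item needs the same number of ratings")
    # Observed agreement per item, then averaged.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_bar = mean(p_items)
    # Chance agreement from the overall category proportions.
    n_categories = len(table[0])
    p_cats = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cats)
    return (p_bar - p_e) / (1 - p_e)


def cohens_d(a, b):
    """Cohen's d with pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Toy data: 4 items, 3 raters, 2 categories (e.g. discouraging / encouraging).
toy_table = [[3, 0], [2, 1], [0, 3], [1, 2]]
kappa = fleiss_kappa(toy_table)

# Toy scores under two hypothetical phrasings.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

For the toy table, observed agreement is 2/3 and chance agreement is 1/2, giving a kappa of 1/3. The two toy samples differ by one unit with a pooled standard deviation of 2, giving d = 0.5.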

Circularity Check

0 steps flagged

No circularity: empirical measurement study with independent observations


This paper reports an empirical study that probes multiple LLMs with 182 religion-pair queries, scores responses via a human-verified LLM-as-a-judge framework, and aggregates observed asymmetries in language use. No equations, derivations, fitted parameters, or predictions appear in the provided text. The central claim rests on repeatable patterns across models, phrasings, and trials rather than any self-referential reduction, self-citation chain, or ansatz. The setup is self-contained against external benchmarks (human verification and cross-model consistency), so no load-bearing step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

This is an empirical measurement paper; no mathematical free parameters, invented physical entities, or non-standard axioms are introduced in the abstract.

axioms (1)
  • domain assumption LLM-generated text can be reliably scored for encouragement versus discouragement by another LLM after human verification of the judge.
    The paper relies on an LLM-as-a-judge framework to quantify asymmetries.

pith-pipeline@v0.9.0 · 5796 in / 1308 out tokens · 19213 ms · 2026-05-25T05:48:03.977548+00:00 · methodology

