When AI Takes Sides on Questions of Faith: Persistent Asymmetries in AI-Mediated Faith Guidance
Pith reviewed 2026-05-25 05:48 UTC · model grok-4.3
The pith
Large language models give asymmetric advice on religious conversions, favoring some faiths over others.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When prompted for advice on hypothetical faith transitions and then asked the reversed question, every tested LLM produced consistent asymmetries: higher support for joining some religions and lower support for leaving them, while the opposite held for others. Catholic, Bahá'í, and Sikh faiths received broadly favorable treatment on average, whereas Atheists, Agnostics, and Jehovah's Witnesses were primarily disfavored. The pattern varied by model size and provider yet remained reproducible across multiple trials, phrasings, and dataset variations.
What carries the argument
A human-verified LLM-as-a-judge framework that scores the encouraging versus discouraging language in model responses to simulated user queries about joining or leaving a given religion.
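One way to picture the quantity being measured is the sketch below. It assumes the judge returns a single number per response, negative for discouraging and positive for encouraging; the score range, the placeholder religion labels, and the function names `favorability` and `pair_asymmetry` are illustrative assumptions, not the paper's actual rubric or code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical judge scores in [-1, 1]: negative = discouraging, positive = encouraging.
# Each record is (from_religion, to_religion, score of the model's reply).
records = [
    ("A", "B", 0.6),
    ("B", "A", -0.2),
    ("A", "C", 0.1),
    ("C", "A", 0.3),
    ("B", "C", -0.4),
    ("C", "B", 0.5),
]

def favorability(records):
    """Mean support for joining a religion minus mean support for leaving it."""
    join, leave = defaultdict(list), defaultdict(list)
    for src, dst, score in records:
        leave[src].append(score)  # the reply bears on leaving src
        join[dst].append(score)   # the same reply bears on joining dst
    return {r: mean(join[r]) - mean(leave[r]) for r in join if r in leave}

def pair_asymmetry(records):
    """Score(X -> Y) minus score(Y -> X), reported once per unordered pair."""
    by_pair = {(s, d): v for s, d, v in records}
    return {
        (s, d): v - by_pair[(d, s)]
        for (s, d), v in by_pair.items()
        if s < d and (d, s) in by_pair
    }

scores = favorability(records)
gaps = pair_asymmetry(records)
for religion, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(religion, round(value, 2))
```

Under this reading, a religion counts as favored when its value is positive (high support for joining, low support for leaving), and a perfectly symmetric model would give zero for every pair.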
If this is right
- All 20 tested models exhibit reproducible asymmetry in religious advice.
- The specific pattern of favored and disfavored religions differs by model size and provider.
- Asymmetries remain stable across changes in question phrasing and the set of religion pairings.
- Any imbalances that are reproduced at scale could carry real-world effects on users.
- The observed preferences are a property of model behavior rather than an artifact of the evaluation method.
Where Pith is reading between the lines
- AI developers may want to audit training data or alignment processes for similar religion-related patterns before deploying models in advisory roles.
- Individuals using AI for faith-related questions could benefit from cross-checking outputs against multiple models or human sources.
- The results suggest a need to examine whether similar asymmetries appear in other domains involving personal identity or belief.
- Controlled experiments could test whether targeted fine-tuning on balanced conversion examples reduces the observed differences.
Load-bearing premise
The LLM judge accurately captures the tested models' genuine preferences instead of injecting its own systematic biases when scoring language.
What would settle it
Re-running the full set of queries with a different judge model or with human scorers and finding that the direction or strength of the asymmetries changes or disappears.
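A minimal sketch of that check follows. It assumes per-religion favorability values are already available from two scorers (a second judge model or human raters); the numbers and the names `judge_a` and `judge_b` are made up for illustration. Sign agreement speaks to whether the direction survives, and Spearman rank correlation to whether the ordering does.

```python
def sign_agreement(a, b):
    """Fraction of shared keys whose scores have the same sign under both scorers."""
    keys = [k for k in a if k in b]
    same = sum(1 for k in keys if (a[k] > 0) == (b[k] > 0))
    return same / len(keys)

def ranks(values):
    """Ranks starting at 1. Assumes no ties, which keeps the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

def spearman(a, b):
    """Spearman rank correlation over shared keys (no-ties formula)."""
    keys = sorted(k for k in a if k in b)
    ra = ranks([a[k] for k in keys])
    rb = ranks([b[k] for k in keys])
    n = len(keys)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical per-religion favorability under two different scorers.
judge_a = {"A": -0.30, "B": 0.85, "C": -0.55, "D": 0.10}
judge_b = {"A": -0.10, "B": 0.60, "C": -0.40, "D": -0.05}

agreement = sign_agreement(judge_a, judge_b)
rho = spearman(judge_a, judge_b)
print(agreement, rho)
```

Low sign agreement or a weak rank correlation between scorers would point toward the judge, rather than the probed models, as the source of the pattern.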
Original abstract
We ask whether large language models (LLMs) treat queries about religious conversion symmetrically. The answer is no. When asked for advice on hypothetical faith transitions from one religion to another, then asked the reversed question, models exhibited consistent asymmetries, favoring some religions while subtly discouraging conversion to others. On average Catholic, Bahá'í, and Sikh religions were broadly favored (high support for joining, low support for leaving), while Atheists, Agnostics, and Jehovah's Witnesses were primarily disfavored. Patterns varied by model size and model provider, with Grok 4.20 exhibiting the strongest asymmetries. We tested 20 commercial and open-source language models across 182 religion pairings using a human-verified LLM-as-a-judge framework. Each model was probed via interactions with a simulated user asking for advice on a potential faith conversion. Models tended to use more encouraging language for some faith transitions over others; these patterns were systematically repeatable across multiple trials. All LLMs tested exhibited reproducible asymmetry, though the pattern of preferences differed for each. Overall preferences persist across multiple question phrasings and variations in the religious pairing dataset. Taken together, these results suggest that asymmetry is a robust property of model behavior rather than an artifact of how the models' answers were scored. It is important to consider that any imbalances deployed and reproduced en masse can have real-world implications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports an empirical study of 20 commercial and open-source LLMs probed on 182 religion-pair queries about hypothetical faith transitions. Using a human-verified LLM-as-a-judge framework, the authors find reproducible asymmetries: models on average favor conversions toward Catholic, Bahá'í, and Sikh faiths (high encouragement to join, low encouragement to leave) while disfavoring transitions involving Atheism, Agnosticism, and Jehovah's Witnesses. Patterns vary by model and provider (strongest in Grok 4.20) but persist across phrasings and are claimed not to be scoring artifacts.
Significance. If the measured asymmetries reflect the probed models' output distributions rather than downstream judge artifacts, the work documents a reproducible form of value-laden bias in LLMs on sensitive personal-advice domains. The multi-model scope and human-verification step are strengths; however, the absence of detailed prompting, rubric, and statistical controls in the reported methods limits the strength of the robustness claim.
major comments (2)
- [Methods] Methods section: the description of the LLM-as-a-judge prompt, scoring rubric, and human-verification protocol is insufficient to evaluate whether the judge introduces religion-correlated lexical biases that could produce or amplify the reported pattern (Catholic/Bahá'í/Sikh favored, Atheist/Agnostic/JW disfavored). No inter-annotator agreement statistics or ablation replacing the judge with full human scoring on the full set are provided, which is load-bearing for the central claim that asymmetries reside in the probed models.
- [Abstract and Results] Abstract and Results: the assertion that 'results are not an artifact of how the models' answers were scored' and that asymmetries 'persist across multiple question phrasings' lacks quantitative support such as effect-size comparisons, statistical tests for phrasing invariance, or exclusion criteria for outlier responses. Without these, it is impossible to determine whether the asymmetries survive basic robustness checks.
minor comments (2)
- [Abstract] The abstract states '182 religion pairings' but does not list the exact set of religions or the pairing construction method; a table or appendix listing the 182 pairs would improve reproducibility.
- [Results] Model-size and provider variation is mentioned but not accompanied by a table breaking down asymmetry strength by model family or parameter count.
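On the pairing count: the abstract does not say how the 182 pairings were built. One construction that matches the number, and fits a design where each question is also asked in reverse, is every ordered pair drawn from 14 items, since 14 × 13 = 182. This is an arithmetic observation only; the count of 14 and the placeholder labels below are assumptions, not something the source confirms.

```python
from itertools import permutations

def ordered_pairs(religions):
    """All (from, to) pairs with from != to, so each pair appears in both directions."""
    return list(permutations(religions, 2))

# 14 placeholder labels. These are NOT the paper's religion list.
labels = [f"R{i}" for i in range(1, 15)]
pairs = ordered_pairs(labels)
print(len(pairs))
```

If the actual set differs (for example, an incomplete design that skips some pairs), a table of the pairs would make that visible.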
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We address each major comment below and commit to revisions that strengthen the methodological transparency and quantitative robustness of the claims.
Point-by-point responses
- Referee: [Methods] Methods section: the description of the LLM-as-a-judge prompt, scoring rubric, and human-verification protocol is insufficient to evaluate whether the judge introduces religion-correlated lexical biases that could produce or amplify the reported pattern (Catholic/Bahá'í/Sikh favored, Atheist/Agnostic/JW disfavored). No inter-annotator agreement statistics or ablation replacing the judge with full human scoring on the full set are provided, which is load-bearing for the central claim that asymmetries reside in the probed models.
  Authors: We agree the original Methods section lacked sufficient detail. In revision we will add the complete LLM-as-a-judge prompts, the full scoring rubric with anchor examples, and a precise description of the human-verification protocol. We will also report inter-annotator agreement (Fleiss' kappa) on a 100-response subset annotated independently by three humans. A full human re-scoring of every response is not feasible at the scale of the study; instead we will add an ablation on a stratified 200-response sample that compares judge scores to human scores and tests for religion-correlated systematic discrepancies.
  Revision: partial
- Referee: [Abstract and Results] Abstract and Results: the assertion that 'results are not an artifact of how the models' answers were scored' and that asymmetries 'persist across multiple question phrasings' lacks quantitative support such as effect-size comparisons, statistical tests for phrasing invariance, or exclusion criteria for outlier responses. Without these, it is impossible to determine whether the asymmetries survive basic robustness checks.
  Authors: We accept that the original text would be strengthened by explicit quantitative evidence. The revised Results section will include standardized effect-size comparisons (Cohen's d) between primary and alternative phrasings, statistical tests (repeated-measures ANOVA with post-hoc contrasts) for phrasing invariance, and documented outlier detection and exclusion criteria, together with sensitivity analyses that test whether the reported asymmetries remain statistically significant after outlier removal.
  Revision: yes
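The two statistics named in the responses can be sketched as follows. This uses the textbook definitions of Fleiss' kappa and of Cohen's d with a pooled standard deviation; the three rating categories, the toy rating table, and the sample values are assumptions for illustration, and the rebuttal does not specify the exact variants the authors would use.

```python
from math import sqrt
from statistics import mean, variance

def fleiss_kappa(table):
    """table[i][j] = number of raters who put item i in category j.
    Every row must sum to the same number of raters."""
    n_items = len(table)
    n_raters = sum(table[0])
    # Per-item agreement: share of rater pairs that agree on the item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    observed = mean(per_item)
    # Chance agreement from the overall category proportions.
    n_cats = len(table[0])
    props = [sum(row[j] for row in table) / (n_items * n_raters) for j in range(n_cats)]
    expected = sum(p * p for p in props)
    return (observed - expected) / (1 - expected)

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical: 5 responses, 3 raters, categories (discouraging, neutral, encouraging).
ratings = [
    [3, 0, 0],
    [0, 3, 0],
    [0, 0, 3],
    [1, 2, 0],
    [0, 1, 2],
]
kappa = fleiss_kappa(ratings)

# Hypothetical asymmetry values under a primary and an alternative phrasing.
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
print(round(kappa, 3), d)
```

Because the same religion pairs are scored under each phrasing, a paired effect size may suit the phrasing comparison better than the independent-samples form shown here; the pooled version is used only because it is the most familiar definition.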
Circularity Check
No circularity: empirical measurement study with independent observations
full rationale
This paper reports an empirical study that probes multiple LLMs with 182 religion-pair queries, scores responses via a human-verified LLM-as-a-judge framework, and aggregates observed asymmetries in language use. No equations, derivations, fitted parameters, or predictions appear in the provided text. The central claim rests on repeatable patterns across models, phrasings, and trials rather than any self-referential reduction, self-citation chain, or ansatz. The setup is self-contained against external benchmarks (human verification and cross-model consistency), so no load-bearing step reduces to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM-generated text can be reliably scored for encouragement versus discouragement by another LLM after human verification of the judge.