pith. machine review for the scientific record.

arxiv: 2511.13658 · v2 · pith:QNCXLJHTnew · submitted 2025-11-17 · 💻 cs.CL · cs.LG

Why is "Chicago" Predictive of Deceptive Reviews? Using LLMs to Discover Language Phenomena from Lexical Cues

Pith reviewed 2026-05-17 21:27 UTC · model grok-4.3

classification 💻 cs.CL cs.LG
keywords deceptive reviews · lexical cues · language phenomena · LLMs · conjecture-then-validate · deception detection · online reviews · explainability

The pith

A conjecture-then-validate process lets LLMs convert lexical cues such as 'Chicago' into human-readable language phenomena that predict deceptive reviews better than direct prompting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks whether large language models can turn subtle, fragmented word patterns that classifiers use to flag fake reviews into explanations people can actually understand and apply. It introduces a two-step method in which the model first generates candidate language phenomena behind a given lexical cue and then checks those candidates against real review data. Phenomena produced this way turn out to be backed by empirical patterns, hold up across related review domains, and outperform explanations drawn from the model's prior knowledge or simple in-context examples. The result matters because it offers a way to help users judge review credibility in settings where full machine-learning detectors are not available or trusted.

Core claim

The central claim is that language phenomena obtained from lexical cues through a conjecture-then-validate framework are empirically grounded in data, generalizable across similar domains, and more predictive of deception than phenomena derived from LLMs' prior knowledge or in-context learning.

What carries the argument

The conjecture-then-validate framework that first generates candidate language phenomena for a lexical cue and then empirically validates them against labeled review data.
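
The two steps can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `conjecture` function is a hard-coded stand-in for the LLM call, each phenomenon is reduced to a hypothetical text predicate, the four reviews are fabricated toy data, and "lift over the base rate" is an assumed validation score chosen for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Phenomenon:
    description: str
    matches: Callable[[str], bool]  # hypothetical detector for the phenomenon


def conjecture(cue: str) -> List[Phenomenon]:
    """Stand-in for the LLM conjecture step (hard-coded, no model call)."""
    return [
        Phenomenon(
            f"names the city '{cue}' rather than describing the stay",
            lambda text: cue.lower() in text.lower(),
        ),
        Phenomenon(
            "contains an exclamation mark",
            lambda text: "!" in text,
        ),
    ]


def validate(p: Phenomenon, data: List[Tuple[str, int]]) -> float:
    """Return P(deceptive | phenomenon present) minus the overall base rate."""
    labels = [y for _, y in data]
    base_rate = sum(labels) / len(labels)
    hits = [y for text, y in data if p.matches(text)]
    if not hits:
        return 0.0
    return sum(hits) / len(hits) - base_rate


# Toy labeled reviews: 1 = deceptive, 0 = genuine (fabricated for illustration).
reviews = [
    ("My trip to Chicago was amazing, best hotel in Chicago", 1),
    ("Chicago hotel, would recommend to anyone visiting", 1),
    ("Room was small but the bathroom was clean", 0),
    ("Front desk lost our booking! Long wait", 0),
]

# Score every conjecture, then keep only those with positive lift.
results = {p.description: validate(p, reviews) for p in conjecture("Chicago")}
retained = [d for d, lift in results.items() if lift > 0]
```

The point of the separation is that `validate` never consults the generator: a conjecture survives only if the labeled data supports it.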

If this is right

  • The derived phenomena let people assess review credibility directly without needing a running deception classifier.
  • The phenomena remain predictive when moved to other review domains that share similar lexical cues.
  • They outperform language phenomena that an LLM generates from its built-in knowledge or from a few in-context examples.
  • The approach supplies explicit, inspectable reasons that can increase user trust in automated signals of deception.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same framework could be tested on lexical cues in other short-form deceptive text such as social media posts or product descriptions.
  • If the validated phenomena prove stable, they could serve as lightweight features that improve the transparency of existing black-box detectors.
  • Extending the validation step to measure causal impact rather than mere correlation would strengthen claims about why the phenomena matter.

Load-bearing premise

The conjectures that survive validation are causally connected to deceptive intent rather than being artifacts created by the prompting process or by selecting only the guesses that worked.

What would settle it

Testing the same lexical cues on a fresh dataset from a similar domain and finding that the validated phenomena show no better correlation with deception labels than the original cues or random guesses would falsify the central claim.
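
One way to make the "no better than random guesses" comparison concrete is a label-permutation test on the fresh dataset. The sketch below is an assumption about how such a check could be run, not a procedure taken from the paper: `flags` marks whether a validated phenomenon is present in each review, `labels` marks deception, and both lists are fabricated toy data.

```python
import random
from typing import List


def rate_gap(flags: List[bool], labels: List[int]) -> float:
    """Deceptive rate among flagged reviews minus the rate among unflagged ones."""
    on = [y for f, y in zip(flags, labels) if f]
    off = [y for f, y in zip(flags, labels) if not f]
    if not on or not off:
        return 0.0
    return sum(on) / len(on) - sum(off) / len(off)


def permutation_p_value(
    flags: List[bool], labels: List[int], n_perm: int = 2000, seed: int = 0
) -> float:
    """Share of label shufflings whose gap is at least as large as the observed one."""
    rng = random.Random(seed)
    observed_gap = rate_gap(flags, labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between phenomenon and label
        if rate_gap(flags, shuffled) >= observed_gap:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_perm + 1)


# Toy fresh-domain data: phenomenon present in the first 8 reviews only.
flags = [True] * 8 + [False] * 8
labels = [1] * 7 + [0] + [0] * 7 + [1]  # 1 = deceptive, 0 = genuine

observed = rate_gap(flags, labels)
p_value = permutation_p_value(flags, labels)
```

A large p-value on the fresh dataset would mean the phenomenon does no better than chance there, which is the outcome described above as falsifying the central claim. Running the same test on the raw lexical cue gives the second comparison point.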

Original abstract

Deceptive reviews mislead consumers, harm businesses, and undermine trust in online marketplaces. Machine learning classifiers can learn from large amounts of data to distinguish deceptive reviews from genuine ones. However, the distinguishing features learned by these classifiers are often subtle, fragmented, and difficult for humans to interpret, which can hinder user understanding and trust. In this work, we study whether large language models (LLMs) can translate such unintuitive lexical cues into human-understandable language phenomena. We propose a conjecture-then-validate framework, and show that language phenomena obtained in this manner are empirically grounded in data, generalizable across similar domains, and more predictive than phenomena derived from LLMs' prior knowledge or in-context learning. Such phenomena can aid people in critically assessing the credibility of online reviews in environments where deception detection classifiers are unavailable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a conjecture-then-validate framework in which LLMs first generate candidate language phenomena from lexical cues (such as the word 'Chicago' being predictive of deceptive reviews) identified by ML classifiers, then validate those phenomena on held-out data. It claims that the resulting phenomena are empirically grounded, generalize across similar review domains, and outperform phenomena derived solely from the LLM's prior knowledge or in-context learning in predictive power for deception detection.

Significance. If the empirical claims are substantiated, the work offers a practical route to making black-box deception classifiers more interpretable for end users, which could increase trust in online marketplaces. The core idea of using LLMs to surface data-driven linguistic patterns rather than relying on prompting alone is a useful contribution to the interpretability literature in NLP.

major comments (2)
  1. [§3] §3 (Conjecture-then-Validate Framework): The procedure for generating and retaining conjectures is not accompanied by an exhaustive list of all generated conjectures together with their validation statistics. Without this or a pre-specified selection rule, it is impossible to rule out post-hoc selection bias, which directly undermines the claim that the retained phenomena are strictly more predictive than prior-knowledge or ICL baselines.
  2. [§4.2] §4.2 (Predictiveness Experiments): The reported superiority over baselines lacks explicit controls for LLM hallucination or confirmation bias during the validation step (e.g., no mention of using a separate judge model or human verification protocol). This leaves open the possibility that measured gains reflect the LLM's pre-trained knowledge of review patterns rather than the data-driven conjecture process.
minor comments (2)
  1. [Abstract / §1] The abstract and introduction use the phrase 'empirically grounded' without a precise operational definition; a short paragraph clarifying what counts as empirical grounding (e.g., statistical significance on a held-out test set after correction for multiple comparisons) would improve clarity.
  2. [Figure 2] Figure 2 (example phenomena) would benefit from an additional column showing the raw lexical cue frequency in deceptive vs. genuine reviews to allow readers to assess the strength of the original signal.
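
The per-class cue frequency requested in the second minor comment is cheap to compute. A minimal sketch, assuming whole-word, case-insensitive matching and a list of `(text, label)` pairs; the function name and the six toy reviews are hypothetical and not drawn from the paper's data.

```python
import re
from typing import Dict, List, Tuple


def cue_frequency(cue: str, reviews: List[Tuple[str, int]]) -> Dict[str, float]:
    """Share of reviews in each class that mention the cue as a whole word."""
    pattern = re.compile(rf"\b{re.escape(cue)}\b", re.IGNORECASE)
    counts = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for text, label in reviews:
        totals[label] += 1
        if pattern.search(text):
            counts[label] += 1
    return {
        "deceptive": counts[1] / totals[1] if totals[1] else 0.0,
        "genuine": counts[0] / totals[0] if totals[0] else 0.0,
    }


# Toy data: 1 = deceptive, 0 = genuine (fabricated for illustration).
reviews = [
    ("We stayed in Chicago for business and loved Chicago", 1),
    ("Great Chicago location, perfect for my husband and me", 1),
    ("My experience was wonderful from start to finish", 1),
    ("The elevator was slow and the carpet smelled musty", 0),
    ("Walked from the hotel to the Chicago river in ten minutes", 0),
    ("Check-in took 20 minutes; room 1412 faced the alley", 0),
]

freq = cue_frequency("Chicago", reviews)
```

Reporting both shares side by side lets a reader judge how strong the original lexical signal was before any phenomenon was conjectured from it.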

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and insightful comments. We address each major comment below and describe the revisions we will make to improve transparency and rigor.

Point-by-point responses
  1. Referee: [§3] §3 (Conjecture-then-Validate Framework): The procedure for generating and retaining conjectures is not accompanied by an exhaustive list of all generated conjectures together with their validation statistics. Without this or a pre-specified selection rule, it is impossible to rule out post-hoc selection bias, which directly undermines the claim that the retained phenomena are strictly more predictive than prior-knowledge or ICL baselines.

    Authors: We agree that full transparency is required to rule out selection bias. In the revised manuscript we will append an exhaustive list of every conjecture generated by the LLM together with its validation statistics on the held-out set. We will also state the pre-specified retention rule explicitly: a conjecture is retained only if it yields a statistically significant lift (p < 0.05, Bonferroni-corrected) over the strongest baseline on the validation data. These additions will allow readers to reproduce the filtering process.
    revision: yes

  2. Referee: [§4.2] §4.2 (Predictiveness Experiments): The reported superiority over baselines lacks explicit controls for LLM hallucination or confirmation bias during the validation step (e.g., no mention of using a separate judge model or human verification protocol). This leaves open the possibility that measured gains reflect the LLM's pre-trained knowledge of review patterns rather than the data-driven conjecture process.

    Authors: The core validation step already consists of fitting a downstream classifier on held-out data using the conjectured phenomena as features; this empirical test is independent of the LLM that proposed the phenomena. Nevertheless, to address the referee's concern we will add two explicit controls: (1) a separate judge LLM that re-evaluates the validation predictions without access to the original conjecture prompt, and (2) a small human verification study on a random sample of retained phenomena. These controls will be reported in §4.2 and the appendix.
    revision: yes
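
The retention rule named in the first response (p < 0.05, Bonferroni-corrected) can be sketched as below. With m conjectures tested, the correction compares each p-value against 0.05 / m instead of 0.05. The phenomenon names and p-values are invented for illustration, and the rebuttal itself is simulated, so this shows the arithmetic of the rule and nothing about the authors' actual results.

```python
from typing import Dict, List


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests if n_tests else alpha


def retain(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Keep conjectures whose p-value clears the corrected threshold."""
    cutoff = bonferroni_threshold(len(p_values), alpha)
    return [name for name, p in p_values.items() if p < cutoff]


# Hypothetical p-values for the lift of each conjecture over the strongest baseline.
p_values = {
    "names the destination city": 0.004,
    "addresses the reader directly": 0.03,
    "mentions travel companions": 0.012,
    "uses superlatives": 0.2,
}

threshold = bonferroni_threshold(len(p_values))  # 0.05 / 4
retained = retain(p_values)
```

Note that "addresses the reader directly" would pass an uncorrected 0.05 cutoff but fails the corrected one, which is why the number of conjectures generated has to be reported alongside the ones retained.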

Circularity Check

0 steps flagged

No significant circularity; conjecture-then-validate remains data-driven and externally benchmarked

Full rationale

The paper's central derivation uses LLMs to generate conjectures from lexical cues then validates them against held-out data, claiming the resulting phenomena are more predictive than those from prior knowledge or ICL. This chain does not reduce to self-definition or fitted-input renaming because the validation step is described as an independent empirical check that retains only phenomena showing measurable lift on the target task. No equations equate the output directly to the conjecture generator, no self-citation supplies a uniqueness theorem, and the comparison baselines are external to the discovery loop. The method is therefore self-contained against the reported benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unstated premise that LLMs can reliably conjecture and validate language phenomena without introducing systematic bias from their training data or prompting style.

axioms (1)
  • domain assumption: LLMs possess sufficient linguistic knowledge to generate plausible conjectures about deceptive language from isolated lexical cues.
    Invoked implicitly when the framework uses LLMs to translate cues into phenomena.

pith-pipeline@v0.9.0 · 5446 in / 1120 out tokens · 24036 ms · 2026-05-17T21:27:08.261114+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    We propose a conjecture-then-validate framework, and show that language phenomena obtained in this manner are empirically grounded in data, generalizable across similar domains, and more predictive than phenomena derived from LLMs' prior knowledge or in-context learning.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
