arxiv: 2511.13658 · v2 · pith:QNCXLJHTnew · submitted 2025-11-17 · 💻 cs.CL · cs.LG

Why is "Chicago" Predictive of Deceptive Reviews? Using LLMs to Discover Language Phenomena from Lexical Cues

Jiaming Qu, Mengtian Guo, Yue Wang

Pith reviewed 2026-05-17 21:27 UTC · model grok-4.3

keywords deceptive reviews · lexical cues · language phenomena · LLMs · conjecture-then-validate · deception detection · online reviews · explainability

The pith

A conjecture-then-validate process lets LLMs convert lexical cues such as 'Chicago' into human-readable language phenomena that predict deceptive reviews better than direct prompting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks whether large language models can turn subtle, fragmented word patterns that classifiers use to flag fake reviews into explanations people can actually understand and apply. It introduces a two-step method in which the model first generates candidate language phenomena behind a given lexical cue and then checks those candidates against real review data. Phenomena produced this way turn out to be backed by empirical patterns, hold up across related review domains, and outperform explanations drawn from the model's prior knowledge or simple in-context examples. The result matters because it offers a way to help users judge review credibility in settings where full machine-learning detectors are not available or trusted.

Core claim

The central claim is that language phenomena obtained from lexical cues through a conjecture-then-validate framework are empirically grounded in data, generalizable across similar domains, and more predictive of deception than phenomena derived from LLMs' prior knowledge or in-context learning.

What carries the argument

The conjecture-then-validate framework that first generates candidate language phenomena for a lexical cue and then empirically validates them against labeled review data.
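
The two steps named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `conjecture` is a hypothetical stand-in for the LLM call and returns hand-written candidates, the substring detectors and the `min_lift` threshold are assumptions, and the four reviews are invented toy data.

```python
from typing import Callable, List, Tuple

Review = Tuple[str, bool]                       # (text, is_deceptive)
Phenomenon = Tuple[str, Callable[[str], bool]]  # (description, detector)


def conjecture(cue: str) -> List[Phenomenon]:
    """Hypothetical stand-in for the LLM step: propose candidates for one cue."""
    return [
        (f"names the city '{cue}' instead of hotel specifics",
         lambda text: cue.lower() in text.lower()),
        ("mentions a concrete room detail",
         lambda text: "bathroom" in text.lower()),
    ]


def deceptive_rate(reviews: List[Review], detector: Callable[[str], bool]) -> float:
    """Share of deceptive reviews among those the detector fires on."""
    hits = [label for text, label in reviews if detector(text)]
    return sum(hits) / len(hits) if hits else 0.0


def validate(candidates: List[Phenomenon], reviews: List[Review],
             min_lift: float = 0.1) -> List[Tuple[str, float]]:
    """Keep candidates whose deceptive rate beats the base rate by min_lift."""
    base = sum(label for _, label in reviews) / len(reviews)
    kept = []
    for description, detector in candidates:
        lift = deceptive_rate(reviews, detector) - base
        if lift >= min_lift:
            kept.append((description, lift))
    return kept


# Invented toy data, for illustration only.
REVIEWS: List[Review] = [
    ("My stay in Chicago was amazing, best hotel in Chicago!", True),
    ("Chicago trip with my husband, the hotel was perfect.", True),
    ("The bathroom was small but the bed was comfortable.", False),
    ("Front desk was slow; bathroom fan rattled all night.", False),
]

retained = validate(conjecture("Chicago"), REVIEWS)
for description, lift in retained:
    print(f"{description}: lift {lift:+.2f}")
```

On this toy data only the city-name candidate survives, because the room-detail candidate fires exclusively on genuine reviews. A real validation step would use held-out labeled data and a significance test rather than a fixed lift threshold.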

If this is right

The derived phenomena let people assess review credibility directly without needing a running deception classifier.
The phenomena remain predictive when moved to other review domains that share similar lexical cues.
They outperform language phenomena that an LLM generates from its built-in knowledge or from a few in-context examples.
The approach supplies explicit, inspectable reasons that can increase user trust in automated signals of deception.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same framework could be tested on lexical cues in other short-form deceptive text such as social media posts or product descriptions.
If the validated phenomena prove stable, they could serve as lightweight features that improve the transparency of existing black-box detectors.
Extending the validation step to measure causal impact rather than mere correlation would strengthen claims about why the phenomena matter.

Load-bearing premise

The conjectures that survive validation are causally connected to deceptive intent rather than being artifacts created by the prompting process or by selecting only the guesses that worked.

What would settle it

Testing the same lexical cues on a fresh dataset from a similar domain and finding that the validated phenomena show no better correlation with deception labels than the original cues or random guesses would falsify the central claim.
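
One way to make "no better than random guesses" concrete is a label-permutation test on the fresh dataset. The sketch below is an assumption about how such a check might be run, not a procedure taken from the paper; `rate_gap`, `permutation_p_value`, and the example flags and labels are hypothetical.

```python
import random
from typing import List


def rate_gap(flags: List[bool], labels: List[bool]) -> float:
    """Deceptive rate among flagged reviews minus the rate among unflagged ones."""
    flagged = [label for flag, label in zip(flags, labels) if flag]
    rest = [label for flag, label in zip(flags, labels) if not flag]
    if not flagged or not rest:
        return 0.0
    return sum(flagged) / len(flagged) - sum(rest) / len(rest)


def permutation_p_value(flags: List[bool], labels: List[bool],
                        n_shuffles: int = 2000, seed: int = 0) -> float:
    """Estimate how often shuffled labels give a gap at least as large as observed."""
    rng = random.Random(seed)
    observed = rate_gap(flags, labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if rate_gap(flags, shuffled) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least + 1) / (n_shuffles + 1)


# Invented example: one phenomenon that tracks the labels, one that does not.
labels = [True] * 8 + [False] * 8
informative_flags = [True] * 8 + [False] * 8
uninformative_flags = [True, False] * 8

p_informative = permutation_p_value(informative_flags, labels)
p_uninformative = permutation_p_value(uninformative_flags, labels)
```

A phenomenon whose p-value on fresh data stays large, like the uninformative flags here, would count as evidence against the central claim.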

Original abstract

Deceptive reviews mislead consumers, harm businesses, and undermine trust in online marketplaces. Machine learning classifiers can learn from large amounts of data to distinguish deceptive reviews from genuine ones. However, the distinguishing features learned by these classifiers are often subtle, fragmented, and difficult for humans to interpret, which can hinder user understanding and trust. In this work, we study whether large language models (LLMs) can translate such unintuitive lexical cues into human-understandable language phenomena. We propose a conjecture-then-validate framework, and show that language phenomena obtained in this manner are empirically grounded in data, generalizable across similar domains, and more predictive than phenomena derived from LLMs' prior knowledge or in-context learning. Such phenomena can aid people in critically assessing the credibility of online reviews in environments where deception detection classifiers are unavailable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper's main move is a conjecture-then-validate loop that turns a deception classifier's lexical cues into LLM-generated language phenomena, but the validation details are too thin to tell if the claimed gains are real or just post-hoc selection.

The core claim is that starting from a classifier's lexical signals, having an LLM conjecture human-readable language phenomena, and then validating them produces explanations that are more predictive than direct LLM prompting or prior knowledge. That pipeline is the new piece here, and it targets a real usability gap: people who want to spot deceptive reviews without running the black-box model themselves.

The authors show some examples, like the word 'Chicago' flagging reviews, and turn that into testable patterns, which is a practical framing for marketplace tools. They also try to check generalizability across domains, which is a step beyond pure explanation work. That part earns credit for trying to ground the output in actual classifier behavior rather than letting the LLM freewheel. The writing is clear on the motivation and the high-level loop.

The soft spots are in the execution details that the abstract leaves out. Without seeing exhaustive lists of all conjectures generated, the exact validation metrics, or controls that separate data-driven discovery from the LLM's pre-trained patterns on reviews, it's hard to rule out selection bias or circularity. The stress-test note on post-hoc retention of successful conjectures lands as a legitimate worry here; if only the winners get reported, the superiority claim weakens. The comparison to in-context learning baselines also needs tighter controls to show the conjecture step adds something beyond what the model already knows.

This is aimed at NLP researchers working on interpretability for deception or misinformation detection, plus anyone building consumer-facing review tools. A reader who wants concrete ways to make classifiers more usable will find the idea worth testing, even if the current evidence is preliminary. It deserves a serious referee to pressure-test the validation protocol and selection rules. I would send it for review with requests for full conjecture logs, pre-specified success criteria, and stronger baseline comparisons.

Referee Report

2 major / 2 minor

Summary. The paper introduces a conjecture-then-validate framework in which LLMs first generate candidate language phenomena from lexical cues (such as the word 'Chicago' being predictive of deceptive reviews) identified by ML classifiers, then validate those phenomena on held-out data. It claims that the resulting phenomena are empirically grounded, generalize across similar review domains, and outperform phenomena derived solely from the LLM's prior knowledge or in-context learning in predictive power for deception detection.

Significance. If the empirical claims are substantiated, the work offers a practical route to making black-box deception classifiers more interpretable for end users, which could increase trust in online marketplaces. The core idea of using LLMs to surface data-driven linguistic patterns rather than relying on prompting alone is a useful contribution to the interpretability literature in NLP.

major comments (2)

§3 (Conjecture-then-Validate Framework): The procedure for generating and retaining conjectures is not accompanied by an exhaustive list of all generated conjectures together with their validation statistics. Without this or a pre-specified selection rule, it is impossible to rule out post-hoc selection bias, which directly undermines the claim that the retained phenomena are strictly more predictive than prior-knowledge or ICL baselines.
§4.2 (Predictiveness Experiments): The reported superiority over baselines lacks explicit controls for LLM hallucination or confirmation bias during the validation step (e.g., no mention of using a separate judge model or human verification protocol). This leaves open the possibility that measured gains reflect the LLM's pre-trained knowledge of review patterns rather than the data-driven conjecture process.

minor comments (2)

[Abstract / §1] The abstract and introduction use the phrase 'empirically grounded' without a precise operational definition; a short paragraph clarifying what counts as empirical grounding (e.g., statistical significance on a held-out test set after correction for multiple comparisons) would improve clarity.
Figure 2 (example phenomena) would benefit from an additional column showing the raw lexical cue frequency in deceptive vs. genuine reviews to allow readers to assess the strength of the original signal.
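
The column requested in the last comment is cheap to compute. The following is a rough sketch under stated assumptions: `cue_rates` is a hypothetical helper, it uses naive case-insensitive substring matching rather than proper tokenization, and the sample reviews are invented.

```python
from collections import Counter
from typing import Dict, List, Tuple


def cue_rates(reviews: List[Tuple[str, bool]], cue: str) -> Dict[str, float]:
    """Fraction of deceptive and of genuine reviews that contain the cue."""
    totals: Counter = Counter()
    hits: Counter = Counter()
    for text, is_deceptive in reviews:
        totals[is_deceptive] += 1
        # Naive substring match; a real analysis would tokenize first.
        if cue.lower() in text.lower():
            hits[is_deceptive] += 1
    return {
        "deceptive": hits[True] / totals[True] if totals[True] else 0.0,
        "genuine": hits[False] / totals[False] if totals[False] else 0.0,
    }


# Invented sample, for illustration only.
sample = [
    ("Best hotel in Chicago, we loved it!", True),
    ("Our Chicago vacation was perfect thanks to this hotel.", True),
    ("Walked from the hotel to the Chicago river in ten minutes.", False),
    ("Room was clean, elevator was slow.", False),
]

rates = cue_rates(sample, "Chicago")
print(rates)
```

Reporting both rates side by side lets a reader see how much of a phenomenon's predictive power was already present in the raw cue.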

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and insightful comments. We address each major comment below and describe the revisions we will make to improve transparency and rigor.

Referee: §3 (Conjecture-then-Validate Framework): The procedure for generating and retaining conjectures is not accompanied by an exhaustive list of all generated conjectures together with their validation statistics. Without this or a pre-specified selection rule, it is impossible to rule out post-hoc selection bias, which directly undermines the claim that the retained phenomena are strictly more predictive than prior-knowledge or ICL baselines.

Authors: We agree that full transparency is required to rule out selection bias. In the revised manuscript we will append an exhaustive list of every conjecture generated by the LLM together with its validation statistics on the held-out set. We will also state the pre-specified retention rule explicitly: a conjecture is retained only if it yields a statistically significant lift (p < 0.05, Bonferroni-corrected) over the strongest baseline on the validation data. These additions will allow readers to reproduce the filtering process.
revision: yes

Referee: §4.2 (Predictiveness Experiments): The reported superiority over baselines lacks explicit controls for LLM hallucination or confirmation bias during the validation step (e.g., no mention of using a separate judge model or human verification protocol). This leaves open the possibility that measured gains reflect the LLM's pre-trained knowledge of review patterns rather than the data-driven conjecture process.

Authors: The core validation step already consists of fitting a downstream classifier on held-out data using the conjectured phenomena as features; this empirical test is independent of the LLM that proposed the phenomena. Nevertheless, to address the referee's concern we will add two explicit controls: (1) a separate judge LLM that re-evaluates the validation predictions without access to the original conjecture prompt, and (2) a small human verification study on a random sample of retained phenomena. These controls will be reported in §4.2 and the appendix.
revision: yes
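
The retention rule quoted in the first response (p < 0.05, Bonferroni-corrected) amounts to dividing the significance level by the number of conjectures tested. A minimal sketch follows, assuming the p-values have already been computed elsewhere; `bonferroni_retain`, the conjecture names, and the p-values are all made up for illustration.

```python
from typing import Dict


def bonferroni_retain(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, float]:
    """Keep conjectures whose p-value clears alpha divided by the number tested."""
    m = len(p_values)
    threshold = alpha / m if m else alpha
    return {name: p for name, p in p_values.items() if p < threshold}


# Hypothetical conjectures with invented p-values.
p_values = {
    "names-the-city": 0.001,
    "first-person-family-mentions": 0.02,
    "superlative-density": 0.2,
}

kept = bonferroni_retain(p_values)
print(sorted(kept))
```

With three conjectures the corrected threshold is roughly 0.0167, so the second conjecture would pass an uncorrected 0.05 cutoff but is dropped here. This is why the correction only guards against selection bias if every generated conjecture is counted, not only the reported ones.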

Circularity Check

0 steps flagged

No significant circularity; conjecture-then-validate remains data-driven and externally benchmarked

The paper's central derivation uses LLMs to generate conjectures from lexical cues then validates them against held-out data, claiming the resulting phenomena are more predictive than those from prior knowledge or ICL. This chain does not reduce to self-definition or fitted-input renaming because the validation step is described as an independent empirical check that retains only phenomena showing measurable lift on the target task. No equations equate the output directly to the conjecture generator, no self-citation supplies a uniqueness theorem, and the comparison baselines are external to the discovery loop. The method is therefore self-contained against the reported benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the unstated premise that LLMs can reliably conjecture and validate language phenomena without introducing systematic bias from their training data or prompting style.

axioms (1)

domain assumption: LLMs possess sufficient linguistic knowledge to generate plausible conjectures about deceptive language from isolated lexical cues.
Invoked implicitly when the framework uses LLMs to translate cues into phenomena.

pith-pipeline@v0.9.0 · 5446 in / 1120 out tokens · 24036 ms · 2026-05-17T21:27:08.261114+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

We propose a conjecture-then-validate framework, and show that language phenomena obtained in this manner are empirically grounded in data, generalizable across similar domains, and more predictive than phenomena derived from LLMs' prior knowledge or in-context learning.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
