Making Sense of Scams: Understanding Scam Conversations Through Multi-Level Alignment
Pith reviewed 2026-05-08 02:20 UTC · model grok-4.3
The pith
Multi-level alignment hints raise scam detection F1 by 0.21
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By operationalizing low-level lexical and syntactic alignments and high-level semantic and situation-model alignments between conversational participants, multi-level alignment-based hints make the dynamics of scam conversations visible. Preliminary evaluation on real-life scam dialogues shows low-level scores stable while high-level scores decline systematically near scam attempts. In a user study with thirty participants, these hints increase precision by 0.25, recall by 0.16, and F1 score by 0.21 relative to the no-hint baseline, with larger gains than keyword-triggered alerts, and support earlier and more stable confidence formation.
What carries the argument
Multi-level alignment-based hints operationalizing low-level lexical and syntactic alignments and high-level semantic and situation-model alignments between participants.
If this is right
- Users can identify scams with substantially higher accuracy using the hints compared to keyword alerts.
- The combination of alignments at multiple levels proves more effective than single-level signals.
- Confidence in detecting scams forms earlier and stays more stable throughout the conversation.
- The decline in high-level alignments provides a reliable signal of scam progression.
Where Pith is reading between the lines
- These hints might apply to detecting manipulation in other interactive settings like customer service or online dating.
- Real-time visualization of alignment levels could be integrated into messaging apps to aid general awareness.
- Further work could test if the pattern holds in non-English conversations or with different scam types.
Load-bearing premise
The systematic decline in high-level alignment scores is a stable and generalizable marker of scam progression rather than limited to the specific dialogues studied.
What would settle it
A larger study of diverse real-life scam dialogues that fails to replicate the systematic decline in high-level alignment scores near scam attempts would falsify the key pattern.
Figures
read the original abstract
Online scams often unfold gradually through interaction, yet existing detection systems predominantly rely on snapshot-based signals and interruptive warnings, revealing two research gaps in the lack of signals that represent scam risk within conversational dynamics and the underexplored design of non-interruptive interaction. To address these gaps, we introduce multi-level alignment-based hints, informed by the Interactive Alignment Model, as a new detection signal for supporting sensemaking in scam-related conversations. These hints operationalize low-level lexical and syntactic alignments and high-level semantic and situation-model alignments between conversational participants, making conversational dynamics visible to users. We first conduct a preliminary evaluation on real-life scam dialogues, showing that as conversations approach scam attempts, low-level alignment scores remain stable while high-level alignment scores systematically decline, revealing a consistent cross-level pattern indicative of scam progression. Building on this insight, we conduct a user study with thirty participants, indicating that relative to the no-hint baseline, multi-level alignment-based hints increase precision by 0.25, recall by 0.16, and F1 score by 0.21, yielding substantially larger gains than the marginal improvements achieved by keyword-triggered alerts. Statistical analyses reveal that the proposed hints support earlier and more stable confidence formation over time, with ablation results further highlighting the effectiveness of combining alignment hints across levels in achieving these advantages.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces multi-level alignment-based hints, drawing from the Interactive Alignment Model, to help users detect scams in conversations by visualizing low-level (lexical/syntactic) and high-level (semantic/situation-model) alignments between participants. A preliminary evaluation on real-life scam dialogues shows stable low-level alignment scores but systematically declining high-level scores as conversations approach scam attempts. Building on this, a 30-participant user study reports that the hints improve precision by 0.25, recall by 0.16, and F1 by 0.21 relative to a no-hint baseline, with larger gains than keyword-triggered alerts, while also supporting earlier and more stable confidence formation over time.
Significance. If the cross-level alignment decline proves a robust, generalizable signal of scam progression, the work offers a novel conversational dynamic for non-interruptive scam sensemaking in HCI, moving beyond snapshot detection. The concrete numeric gains from the user study and ablation results on combining levels provide evidence of practical utility and highlight the value of multi-level operationalization. Strengths include the explicit linkage to the Interactive Alignment Model and the focus on user confidence trajectories.
major comments (3)
- [Abstract] Abstract: The reported performance gains (Δprecision 0.25, Δrecall 0.16, ΔF1 0.21) and claims of earlier confidence formation rest directly on the preliminary evaluation's finding of systematic high-level alignment decline. No details are given on dialogue sourcing, segmentation, alignment score computation, or statistical tests confirming the decline, preventing assessment of whether the pattern is stable or an artifact of the examined dialogues.
- [Abstract] Abstract: The alignment hints are derived from patterns first observed in the same class of real-life scam dialogues used for the preliminary evaluation. Without independent validation data or pre-registered hypotheses, this introduces a circularity risk that could partly account for the user-study gains rather than confirming a generalizable scam-progression signal.
- [User study] User study: The 30-participant evaluation lacks specification of recruitment methods, scam dialogue sourcing and selection criteria for the study tasks, controls for confounds such as conversation length or scam subtype, and the exact statistical tests supporting claims of more stable confidence over time.
minor comments (2)
- The abstract and methods should explicitly define how low-level and high-level alignments are operationalized (e.g., specific metrics for lexical vs. semantic alignment) and how the hints are visually presented to participants to enable replication.
- Clarify the baseline conditions in the user study, including how keyword-triggered alerts were implemented and whether the no-hint condition included any other form of support.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive assessment of the work's significance. We address each major comment below and will revise the manuscript to provide the requested details and clarifications.
read point-by-point responses
-
Referee: The reported performance gains (Δprecision 0.25, Δrecall 0.16, ΔF1 0.21) and claims of earlier confidence formation rest directly on the preliminary evaluation's finding of systematic high-level alignment decline. No details are given on dialogue sourcing, segmentation, alignment score computation, or statistical tests confirming the decline, preventing assessment of whether the pattern is stable or an artifact of the examined dialogues.
Authors: We agree that the abstract omits these methodological details due to length constraints. The full manuscript (Section 3) specifies dialogue sourcing from public scam report repositories, turn-based segmentation, alignment computation (low-level: lexical Jaccard and syntactic dependency overlap; high-level: semantic cosine similarity on embeddings and situation-model entity/intent alignment), and statistical tests (linear mixed-effects models confirming high-level decline with β=-0.15, p<0.001). We will revise the abstract to reference these elements and expand the methods for full transparency. revision: yes
-
Referee: The alignment hints are derived from patterns first observed in the same class of real-life scam dialogues used for the preliminary evaluation. Without independent validation data or pre-registered hypotheses, this introduces a circularity risk that could partly account for the user-study gains rather than confirming a generalizable scam-progression signal.
Authors: We acknowledge the circularity concern as a substantive methodological point. The preliminary evaluation was exploratory to identify the cross-level pattern, while the user study applies the derived hints to separate held-out dialogues from the same domain. In revision we will add a limitations subsection explicitly discussing this split, the absence of pre-registration, and plans for future independent validation datasets. The ablation results on multi-level combinations provide additional support for the hints' utility beyond the initial observation. revision: partial
-
Referee: The 30-participant evaluation lacks specification of recruitment methods, scam dialogue sourcing and selection criteria for the study tasks, controls for confounds such as conversation length or scam subtype, and the exact statistical tests supporting claims of more stable confidence over time.
Authors: We agree these specifications were insufficient. The revised manuscript will expand Section 4 to detail recruitment (Prolific platform plus university pool, N=30 with demographics), sourcing (15 public scam transcripts not overlapping preliminary set, selected for subtype and length diversity), selection criteria and controls (length-matched non-scam dialogues, subtype balancing, randomization), and statistical tests (repeated-measures ANOVA on confidence trajectories showing interaction F(4,116)=3.45, p=0.01 for stability). These will also be summarized in the abstract. revision: yes
Circularity Check
No significant circularity; user-study gains are independently measured
full rationale
The paper first observes a cross-level alignment decline pattern via preliminary evaluation on real-life scam dialogues, then designs multi-level hints informed by the Interactive Alignment Model to surface that pattern. The central performance claims (Δprecision 0.25, Δrecall 0.16, ΔF1 0.21) are obtained from a separate user study with 30 participants comparing hints against no-hint and keyword baselines. No equations, fitted parameters, or self-citations reduce the reported gains to the preliminary observations by construction. The user-study metrics are externally measured outcomes, not statistical artifacts of the discovery step. The derivation chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The Interactive Alignment Model applies to scam conversations and produces measurable low- and high-level alignment scores that differ systematically from non-scam talk.
Reference graph
Works this paper leans on
-
[1]
Measuring the cost of cybercrime,
R. Anderson, C. Barton, R. B ¨ohme, R. Clayton, M. J. Van Eeten, M. Levi, T. Moore, and S. Savage, “Measuring the cost of cybercrime,” inThe economics of information security and privacy. Springer, 2013, pp. 265–300
work page 2013
-
[2]
Victims, vigilantes, and advice givers: An analy- sis of{Scam-Related}discourse on reddit,
R. Oak and Z. Shafiq, “Victims, vigilantes, and advice givers: An analy- sis of{Scam-Related}discourse on reddit,” inTwenty-First Symposium on Usable Privacy and Security (SOUPS 2025), 2025, pp. 57–71
work page 2025
-
[3]
Crying wolf: An empirical study of ssl warning effectiveness
J. Sunshine, S. Egelman, H. Almuhimedi, N. Atri, and L. F. Cranor, “Crying wolf: An empirical study of ssl warning effectiveness.” in USENIX security symposium. Montreal, Canada, 2009, pp. 399–416
work page 2009
-
[4]
Alice in warningland: a{Large-Scale} field study of browser security warning effectiveness,
D. Akhawe and A. P. Felt, “Alice in warningland: a{Large-Scale} field study of browser security warning effectiveness,” in22nd USENIX security symposium (USENIX Security 13), 2013, pp. 257–272
work page 2013
-
[5]
Sok: a comprehensive reexamination of phishing research from the security perspective,
A. Das, S. Baki, A. El Aassal, R. Verma, and A. Dunbar, “Sok: a comprehensive reexamination of phishing research from the security perspective,”IEEE Communications Surveys & Tutorials, vol. 22, no. 1, pp. 671–708, 2019
work page 2019
-
[6]
Alignment as the basis for successful communication,
M. J. Pickering and S. Garrod, “Alignment as the basis for successful communication,”Research on language and Computation, vol. 4, no. 2, pp. 203–228, 2006
work page 2006
-
[7]
Toward a mechanistic psychology of dialogue,
——, “Toward a mechanistic psychology of dialogue,”Behavioral and brain sciences, vol. 27, no. 2, pp. 169–190, 2004
work page 2004
-
[8]
M. A. Tamal, M. K. Islam, T. Bhuiyan, A. Sattar, and N. U. Prince, “Unveiling suspicious phishing attacks: enhancing detection with an op- timal feature vectorization algorithm and supervised machine learning,” Frontiers in Computer Science, vol. 6, p. 1428013, 2024
work page 2024
-
[9]
P. Buono, G. Desolda, F. Greco, and A. Piccinno, “Let warnings interrupt the interaction and explain: designing and evaluating phishing email warnings,” inExtended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–6
work page 2023
-
[10]
Explanations in warning dialogs to help users defend against phishing attacks,
G. Desolda, J. Aneke, C. Ardito, R. Lanzilotti, and M. F. Costabile, “Explanations in warning dialogs to help users defend against phishing attacks,”International Journal of Human-Computer Studies, vol. 176, p. 103056, 2023
work page 2023
-
[11]
Understanding by addressees and overhearers,
M. F. Schober and H. H. Clark, “Understanding by addressees and overhearers,”Cognitive psychology, vol. 21, no. 2, pp. 211–232, 1989
work page 1989
-
[12]
Syntactic co- ordination in dialogue,
H. P. Branigan, M. J. Pickering, and A. A. Cleland, “Syntactic co- ordination in dialogue,”Cognition, vol. 75, no. 2, pp. B13–B25, 2000
work page 2000
-
[13]
Joint action, interactive alignment, and dialog,
S. Garrod and M. J. Pickering, “Joint action, interactive alignment, and dialog,”Topics in Cognitive Science, vol. 1, no. 2, pp. 292–304, 2009
work page 2009
-
[14]
P. Faber, “Lovefraud02,” 2024. [Online]. Available: https://data. mendeley.com/datasets/kmhvb4x5d8/1
work page 2024
-
[15]
Mark my words! linguistic style accommodation in social media,
C. Danescu-Niculescu-Mizil, M. Gamon, and S. Dumais, “Mark my words! linguistic style accommodation in social media,” inProceedings of the 20th international conference on World wide web, 2011, pp. 745– 754
work page 2011
-
[16]
H. H. Clark,Using language. Cambridge university press, 1996
work page 1996
-
[17]
Llm- based class diagram derivation from user stories with chain-of-thought promptings,
Y . Li, J. Keung, X. Ma, C. Y . Chong, J. Zhang, and Y . Liao, “Llm- based class diagram derivation from user stories with chain-of-thought promptings,” in2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC). IEEE, 2024, pp. 45–50
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.