Read This Paper to Get $50 Million:* An Analysis of Mobile Messaging Scams Using Reddit Data

Allison Lu; Bernardo B. P. Medeiros; Kevin R. B. Butler; Patrick Traynor

arxiv: 2605.16656 · v1 · pith:A5GFGOONnew · submitted 2026-05-15 · 💻 cs.CR · cs.CY


Pith reviewed 2026-05-20 15:58 UTC · model grok-4.3

classification 💻 cs.CR cs.CY

keywords mobile messaging scams · SMS fraud · reply-based scams · click-based scams · scam detection tools · Reddit user reports · phishing trends · cybersecurity measurement


The pith

Reply-based mobile messaging scams grow nearly twice as fast as click-based ones and evade off-the-shelf detectors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gathers 175,430 user-reported mobile messaging scams posted on Reddit from June 2020 through December 2025. It separates them into reply-based scams, which ask the recipient to text back, and click-based scams, which direct users to links or calls. Reply-based scams make up half the reports yet expand at a 99.98 percent compound annual growth rate, almost double the 57.29 percent rate for click-based scams. Even though messages within each category share consistent text patterns and phone-number origins, commercial and open-source detectors perform worst on the reply-based group. The results point to the need for detectors that handle this faster-growing category more effectively.

Core claim

Analysis of the Reddit dataset shows that reply-based scams constitute 50 percent of reports and exhibit a compound annual growth rate of 99.98 percent, nearly twice that of click-based scams at 57.29 percent, while current off-the-shelf detection tools achieve their lowest performance on reply-based messages despite measurable similarities in text content and phone-number sources within categories.
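To make the growth-rate comparison concrete, here is a minimal Python sketch of the standard compound annual growth rate formula and of the total growth a given rate implies. The function names and the five-year horizon are assumptions made for illustration; the page does not state how the paper defines its periods or annualizes partial years.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("start_value, end_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def growth_multiplier(rate: float, years: float) -> float:
    """Total growth factor implied by compounding `rate` for `years` years."""
    return (1.0 + rate) ** years


# Rates as reported in the abstract.
REPLY_CAGR = 0.9998
CLICK_CAGR = 0.5729

# Hypothetical horizon, chosen for illustration only (an assumption,
# not the paper's own period definition).
YEARS = 5.0

reply_factor = growth_multiplier(REPLY_CAGR, YEARS)
click_factor = growth_multiplier(CLICK_CAGR, YEARS)

print(f"reply-based: x{reply_factor:.1f} over {YEARS:g} years")
print(f"click-based: x{click_factor:.1f} over {YEARS:g} years")

# Round trip: recovering the rate from the implied factor gives it back.
assert abs(cagr(1.0, reply_factor, YEARS) - REPLY_CAGR) < 1e-9
```

Under that assumed five-year horizon, the reported rates would imply roughly a 32-fold increase in reply-based reports against roughly a 9.6-fold increase in click-based ones, which is why a rate that is "nearly twice" as high compounds into a much larger gap.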

What carries the argument

Large-scale collection and categorization of Reddit user reports into reply-based versus click-based mobile messaging scams, followed by measurement of compound annual growth rates and direct testing of commercial and open-source detector accuracy on shared attributes such as text phrasing and originating phone numbers.

If this is right

Reply-based scams require prioritized attention because their faster growth is shifting the overall threat composition.
Consistent text and phone-number patterns within each scam category offer usable signals for improved detection rules.
Existing commercial and open-source tools leave measurable gaps that allow reply-based campaigns to succeed at higher rates.
The measured growth rates imply that scam operations are scaling rapidly and will continue to outpace static detectors without updates.
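One way to picture the "consistent text patterns" signal is a bag-of-words cosine similarity between messages. The sketch below is an illustration only: the tokenizer, the similarity measure, and the three sample messages are all assumptions invented for this example, and the page does not say which similarity method the paper uses.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the token-count vectors of two messages."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[token] * counts_b[token] for token in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty message is similar to nothing
    return dot / (norm_a * norm_b)


# Invented sample messages (not taken from the paper's dataset).
reply_1 = "Hi, is this Anna? We met at the conference last week."
reply_2 = "Hi, is this David? We met at the gym last week."
click_1 = "Your package is on hold. Confirm your address at the link below."

same_template = cosine_similarity(reply_1, reply_2)
different_template = cosine_similarity(reply_1, click_1)
print(f"same template: {same_template:.2f}")
print(f"different template: {different_template:.2f}")
```

Messages that follow one script score much higher against each other than against a message from a different script, which is the kind of within-category regularity a detection rule could exploit.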

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Security teams could group campaigns by reply versus click behavior to allocate blocking resources more efficiently.
Detection systems may gain accuracy by modeling short reply conversations rather than treating each message in isolation.
Public awareness campaigns could emphasize caution with any message that requests a direct response, given the steeper rise in that category.

Load-bearing premise

Reddit posts supply a representative and correctly labeled sample of actual mobile messaging scams with little selection bias or confusion between reply-based and click-based types.

What would settle it

A carrier-level or large-scale user survey of real SMS and messaging traffic over the same years that finds materially different growth rates between reply-based and click-based scams or markedly higher detector success rates on reply-based examples.

Figures

Figures reproduced from arXiv: 2605.16656 by Allison Lu, Bernardo B. P. Medeiros, Kevin R. B. Butler, Patrick Traynor.

**Figure 2.** Diagram of our data selection and processing

**Figure 3.** Year-by-year message volume by scam category, showing an increasing trend for all scams. The most common scam …

**Figure 4.** Among persistent click-based categories, Account Pay …

**Figure 5.** Numerous global shipping/delivery companies are …

**Figure 5.** Reddit Modmail Scam: These scams are a part of the Prize/Gift scam category, containing the least categorical similarity, but still following similar scripts. This scam leads Reddit users to a dating site using a link, typically ending with the phrase “this is not a scam.” The introduction of highly templated scam campaigns across scam categories shows that, even as tactics evolve, scammers continue to …

**Figure 6.** Fraudulent phone number origin distribution. Most numbers are from the U.S. and Canada, but the origin distribution …

**Figure 8.** The click-based E-Commerce and Account Paymen …

**Figure 9.** LLMs exhibit mixed performance when classifying …

**Figure 11.** Domains are diverse and messages often use link …

Original abstract

Mobile messaging scams--fraudulent messages delivered over SMS and other mobile applications--have become a persistent and evolving security threat, yet the attributes underlying these campaigns remain unclear. This study seeks to address this gap by examining trends in mobile messaging scams and testing the effectiveness of commercial and open-source off-the-shelf detection tools. We characterize mobile messaging scam operations, focusing on how phone numbers, URLs, and text content are used across campaigns. To achieve this objective, we collect and measure a dataset of 175,430 user-reported mobile messaging scams from Reddit between June 2020 and December 2025. While reply-based scams constitute only 50% of our dataset, their compound annual growth rate (99.98%) is nearly twice that of click-based scams (57.29%). Critically, reply-based scams also show the lowest detector performance--despite identifiable similarities in text content and phone number origin within categories--indicating that current off-the-shelf tools are ineffective. These results suggest that further development of detectors is necessary to defend against this rapidly changing ecosystem. By examining a range of message attributes, this work provides new insights into mobile messaging scams, informing the design of more targeted and robust detection methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper quantifies faster growth for reply-based scams from Reddit reports and shows detector weaknesses, though the data's representativeness is the main open question.


The standout result is the growth rate difference: reply-based scams at nearly 100% compound annual growth compared to 57% for click-based, based on 175k Reddit posts, along with evidence that detectors do worse on the reply type.

The authors pulled a large collection of user reports from Reddit over five and a half years. They looked at patterns in phone numbers, links, and message text across the scams. Then they ran off-the-shelf detection tools on the set to measure how well they catch each kind. This gives a practical view of trends and tool gaps that prior work on scam characterization did not quantify at this scale. The direct performance comparison is a plus because it uses the same data for both the trend analysis and the detector tests.

On the downside, everything rests on Reddit user reports being a good proxy for actual scam activity. People might post more about reply-based scams because they require responding or seem more personal, which could skew the growth numbers. Without details on how they decided what counts as reply-based versus click-based or checks for consistent labeling, it's hard to know how solid the split is. That part feels under-specified from what is shown.

This kind of paper is aimed at security researchers and practitioners building or evaluating mobile messaging protections. Someone studying consumer fraud or SMS threats would get usable numbers and ideas for better detectors from it. The empirical core is strong enough that it should go to peer review rather than get desk rejected. A referee could help tighten the methods section on data handling and bias discussion. I'd recommend sending it out for review.

Referee Report

2 major / 2 minor

Summary. The paper collects and analyzes a dataset of 175,430 user-reported mobile messaging scams from Reddit spanning June 2020 to December 2025. It partitions the reports into reply-based and click-based categories, finding that reply-based scams comprise 50% of the data yet exhibit a compound annual growth rate of 99.98% (nearly double the 57.29% for click-based scams). The study further tests commercial and open-source off-the-shelf detectors, reports the lowest performance on reply-based scams despite observable similarities in text content and phone-number origins within categories, and concludes that current tools are ineffective, motivating improved detection methods.

Significance. If the empirical trends and detector comparisons hold after validation, the work supplies a large-scale, directly measured view of mobile scam evolution that could guide targeted detector design. The scale of the Reddit corpus and the explicit CAGR comparison between scam types constitute measurable, falsifiable observations that other researchers could replicate or refute with independent data sources.

major comments (2)

[Data collection and classification] Data collection and classification section: The central claims (50% share, 99.98% vs. 57.29% CAGR, and lowest detector performance for reply-based scams) rest on the accuracy of partitioning the 175,430 reports into reply-based versus click-based categories. The manuscript provides no explicit classification rules, inter-rater reliability statistics, or ground-truth validation against actual message flows or carrier logs; without these, systematic mislabeling or selection effects cannot be excluded and directly weaken the comparative growth-rate and ineffectiveness conclusions.
[Detector evaluation] Detector evaluation subsection: The assertion that reply-based scams exhibit the lowest detector performance is load-bearing for the recommendation to develop new tools, yet the text does not report per-category quantitative metrics (precision, recall, or F1), the exact set of tools and versions tested, or the decision thresholds applied. This omission prevents assessment of whether the performance gap is statistically or practically significant.

minor comments (2)

[Abstract] The abstract states the dataset spans 'June 2020 and December 2025' but does not clarify whether the endpoint is inclusive or how partial-year data for 2025 were annualized for CAGR computation.
[Figures and tables] Figure captions and table headers should explicitly define 'reply-based' and 'click-based' to avoid reader ambiguity when interpreting the growth-rate and detector results.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thoughtful and constructive review of our manuscript. We address each of the major comments below and describe the changes we will make to strengthen the paper.


Referee: [Data collection and classification] Data collection and classification section: The central claims (50% share, 99.98% vs. 57.29% CAGR, and lowest detector performance for reply-based scams) rest on the accuracy of partitioning the 175,430 reports into reply-based versus click-based categories. The manuscript provides no explicit classification rules, inter-rater reliability statistics, or ground-truth validation against actual message flows or carrier logs; without these, systematic mislabeling or selection effects cannot be excluded and directly weaken the comparative growth-rate and ineffectiveness conclusions.

Authors: We agree that explicit classification rules should have been included. The reports were classified as reply-based if the scam message requested the recipient to respond via SMS or call a provided phone number, and as click-based if it directed the user to click a link, visit a website, or provide information through a form. We will add a new subsection in the revised manuscript detailing these rules with illustrative examples from the dataset. For inter-rater reliability, the classification was conducted by the lead author following these rules; we will note this as a limitation and, if feasible, have a second author independently classify a random sample of 500 reports to compute Cohen's kappa. Regarding ground-truth validation using carrier logs or actual message flows, this is not possible in our setting due to privacy laws and the absence of access to such proprietary data. We will expand the limitations section to discuss potential selection biases inherent in Reddit-reported data. revision: partial
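The inter-rater check mentioned in this response can be sketched in a few lines. The following is a minimal pure-Python version of Cohen's kappa for two raters; the function name and the ten toy labels are assumptions made for illustration and are not data from the paper.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty sample")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels for ten reports (invented; not from the paper's sample).
lead_author = ["reply", "reply", "click", "click", "reply",
               "click", "click", "reply", "reply", "click"]
second_author = ["reply", "click", "click", "click", "reply",
                 "click", "reply", "reply", "reply", "click"]

kappa = cohens_kappa(lead_author, second_author)
print(f"kappa = {kappa:.2f}")
```

In this toy sample the raters agree on 8 of 10 reports while chance alone would give 0.5, so kappa is 0.6. Raw agreement would overstate reliability whenever one category dominates, which is the reason to report kappa rather than percent agreement.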
Referee: [Detector evaluation] Detector evaluation subsection: The assertion that reply-based scams exhibit the lowest detector performance is load-bearing for the recommendation to develop new tools, yet the text does not report per-category quantitative metrics (precision, recall, or F1), the exact set of tools and versions tested, or the decision thresholds applied. This omission prevents assessment of whether the performance gap is statistically or practically significant.

Authors: We concur that detailed metrics are necessary for a rigorous evaluation. In the revised version, we will add a table presenting precision, recall, and F1 scores broken down by scam category (reply-based and click-based) for each detector. We will also explicitly list the commercial and open-source tools tested along with their versions and describe the decision thresholds or classification criteria used by each tool. This will enable a clear evaluation of the performance differences. revision: yes
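A per-category table of the kind promised here could be produced along the following lines. This is a sketch under stated assumptions: the `Verdict` record, the presence of benign control messages assigned to a category, and the toy verdicts are all hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Verdict:
    category: str   # "reply" or "click" (hypothetical labels)
    is_scam: bool   # ground-truth label
    flagged: bool   # what the detector returned


def per_category_metrics(verdicts: Iterable[Verdict]) -> dict[str, dict[str, float]]:
    """Precision, recall and F1 of one detector, broken down by category."""
    tallies: dict[str, dict[str, int]] = defaultdict(
        lambda: {"tp": 0, "fp": 0, "fn": 0}
    )
    for v in verdicts:
        tally = tallies[v.category]
        if v.flagged and v.is_scam:
            tally["tp"] += 1
        elif v.flagged and not v.is_scam:
            tally["fp"] += 1
        elif v.is_scam:
            tally["fn"] += 1
        # True negatives do not enter precision, recall or F1.

    results: dict[str, dict[str, float]] = {}
    for category, t in tallies.items():
        tp, fp, fn = t["tp"], t["fp"], t["fn"]
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        results[category] = {"precision": precision, "recall": recall, "f1": f1}
    return results


# Invented verdicts for illustration only.
TOY_VERDICTS = (
    [Verdict("reply", True, True)] * 2
    + [Verdict("reply", True, False)] * 2
    + [Verdict("reply", False, True)]
    + [Verdict("reply", False, False)]
    + [Verdict("click", True, True)] * 3
    + [Verdict("click", True, False)]
    + [Verdict("click", False, False)]
)

metrics = per_category_metrics(TOY_VERDICTS)
for category, scores in sorted(metrics.items()):
    print(category, {name: round(value, 3) for name, value in scores.items()})
```

If the evaluation set contains only scam messages, precision is trivially 1.0 for any detector that flags anything, so recall per category would be the informative number; precision and F1 only become meaningful once benign messages are included.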

standing simulated objections not resolved

Ground-truth validation against actual message flows or carrier logs cannot be provided, as we do not have access to such data and it would violate user privacy regulations.

Circularity Check

0 steps flagged

No significant circularity in direct empirical measurement of Reddit scam reports


The paper collects a dataset of 175,430 user-reported mobile messaging scams from Reddit over a fixed time window and performs direct statistical computations on observable attributes such as temporal distribution for CAGR, text/phone-number similarities within categories, and detector accuracy on the collected messages. No modeling equations, fitted parameters presented as predictions, self-referential definitions, or load-bearing self-citations appear in the derivation chain. All central claims (50% reply-based share, 99.98% vs 57.29% CAGR, lowest detector performance) are straightforward summaries of the raw collected data without reduction to inputs by construction. The analysis remains self-contained against external benchmarks because it relies on measurable message attributes rather than any theoretical loop.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis is an empirical measurement study whose conclusions depend on the assumption that Reddit user reports are a reliable proxy for scam prevalence and characteristics; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Reddit user reports accurately reflect real-world mobile messaging scam activities and can be reliably partitioned into reply-based and click-based categories.
The growth-rate calculations and detector-performance claims rest entirely on this data source and classification step.



Reference graph

Works this paper leans on

113 extracted references · 113 canonical work pages · 3 internal anchors

[1]

[n. d.]. North American Numbering Plan General Management and Oversight | Federal Communications Commission. https://www.fcc.gov/north-american- numbering-plan-general-management-and-oversight

work page
[2]

[n. d.]. Twilio help center. https://help.twilio.com/articles/11587910480155- A2P-10DLC-Campaign-Vetting-Changes-January-2023

work page arXiv 2023
[3]

[n. d.]. VirusTotal. https://www.virustotal.com/gui/home/upload Accessed: 2025-11-13

work page 2025
[4]

Mistral-7B-LLM-Fraud-Detection

2023. Mistral-7B-LLM-Fraud-Detection. https://huggingface.co/Bilic/Mistral- 7B-LLM-Fraud-Detection Accessed: 2025-11-13

work page 2023
[5]

Smishing Triad

2023. "Smishing Triad" Targeted USPS And US Citizens For Data Theft. https: //www.resecurity.com/blog/article/smishing-triad-targeted-usps-and-us- citizens-for-data-theft Accessed: 2025-11-13

work page 2023
[6]

Reddit Terms of Service

2025. Reddit Terms of Service. https://redditinc.com/policies/user-agreement Accessed: 2025-11-13

work page 2025
[7]

Bhupendra Acharya and Thorsten Holz. 2024. An Explorative Study of Pig Butchering Scams. arXiv:2412.15423 [cs.CR] https://arxiv.org/abs/2412.15423

work page arXiv 2024
[8]

Olivia Acland. 2025. I was tricked, tortured, finally freed: Inside a Burmese scam farm. https://www.thetimes.com/world/asia/article/scam-farms-burma- chinese-l35j7jz8g

work page 2025
[9]

Sadia Afroz and Rachel Greenstadt. 2011. Phishzoo: Detecting phishing websites by looking at them. InProceedings of the 2011 IEEE Fifth International Conference on Semantic Computing. IEEE, 368–375

work page 2011
[10]

Sharad Agarwal, Emma Harvey, Enrico Mariconti, Guillermo Suarez-Tangil, Marie Vasek, et al. 2025. ‘Hey mum, I dropped my phone down the toilet’: Investigating Hi Mum and Dad SMS Scams in the United Kingdom. InUSENIX Security Symposium

work page 2025
[11]

Sharad Agarwal, Guillermo Suarez-Tangil, and Marie Vasek. 2025. An Overview of 7726 User Reports: Uncovering SMS Scams and Scammer Strategies. arXiv:2508.05276 [cs.CR] https://arxiv.org/abs/2508.05276

work page arXiv 2025
[12]

Majid Hameed Ahmed, Sabrina Tiun, Nazlia Omar, and Nor Samsiah Sani

work page
[13]

Applied Sciences13, 1 (2023)

Short Text Clustering Algorithms, Application and Challenges: A Survey. Applied Sciences13, 1 (2023). doi:10.3390/app13010342

work page doi:10.3390/app13010342 2023
[14]

Almeida, José María G

Tiago A. Almeida, José María G. Hidalgo, and Akebo Yamakami. 2011. Con- tributions to the study of SMS spam filtering: new collection and results. In Proceedings of the 11th ACM Symposium on Document Engineering(Mountain View, California, USA)(DocEng ’11). Association for Computing Machinery, New York, NY, USA, 259–262. doi:10.1145/2034691.2034742

work page doi:10.1145/2034691.2034742 2011
[15]

Arctic Shift. 2024. Arctic Shift Reddit API. https://arctic- shif t.photon- reddit.com. Accessed: 2026-03-25

work page 2024
[16]

Tom Bartlett. 2025. ‘The Worst Internet-Research Ethics Violation I Have Ever Seen’. https://www.theatlantic.com/technology/archive/2025/05/reddit-ai- persuasion-experiment-ethics/682676/

work page 2025
[17]

Marzieh Bitaab, Haehyun Cho, Adam Oest, Zhuoer Lyu, Wei Wang, Jorij Abraham, Ruoyu Wang, Tiffany Bao, Yan Shoshitaishvili, and Adam Doupé

work page
[18]

alien traces

Beyond Phish: Toward Detecting Fraudulent e-Commerce Websites at Scale. In2023 IEEE Symposium on Security and Privacy (SP). 2566–2583. doi:10.1109/SP46215.2023.10179461

work page doi:10.1109/sp46215.2023.10179461 2023
[19]

Bitdefender. [n. d.]. The anatomy of Illuminati scams: We spoke to the grand masters so you don’t have to. https://www.bitdefender.com/en-us/blog/hotfor security/the-anatomy-of-illuminati-scams-we-spoke-to-the-grand-masters- so-you-dont-have-to Accessed: 2025-11-13

work page 2025
[20]

It was honestly just gambling

Elijah Bouma-Sims, Hiba Hassan, Alexandra Nisenoff, Lorrie Faith Cranor, and Nicolas Christin. 2024. "It was honestly just gambling": Investigating the Experiences of Teenage Cryptocurrency Users on Reddit. InTwentieth Symposium on Usable Privacy and Security (SOUPS 2024). USENIX Association, Philadelphia, PA, 333–352. https://www.usenix.org/conference/so...

work page 2024
[21]

Is this a scam?

Elijah Bouma-Sims, Mandy Lanyon, and Lorrie Faith Cranor. 2025. “Is this a scam?”: The Nature and Quality of Reddit Discussion about Scams(CCS ’25). Association for Computing Machinery, New York, NY, USA

work page 2025
[22]

Danielle K Brown, Yee Man Margaret Ng, Martin J Riedl, and Ivan Lacasa-Mas

work page
[23]

Social Media + Society

Reddit’s veil of anonymity: Predictors of engagement and participation in media environments with hostile reputations."Social Media + Society"4, 4 (2018)

work page 2018
[24]

Eshwar Chandrasekharan, Mattia Samory, Shagun Jhaver, Hunter Charvat, Amy Bruckman, Cliff Lampe, Jacob Eisenstein, and Eric Gilbert. 2018. The Internet’s Hidden Rules: An Empirical Study of Reddit Norm Violations at Micro, Meso, and Macro Scales.Proc. ACM Hum.-Comput. Interact.2, CSCW, Article 32 (Nov. 2018), 25 pages. doi:10.1145/3274301

work page doi:10.1145/3274301 2018
[25]

Bill Chappell. 2024. FBI warns Americans to keep their text messages secure: What to know. https://www.npr.org/2024/12/17/nx-s1-5223490/text- messaging-security-fbi-chinese-hackers-security-encryption

work page 2024
[26]

Wenbin Chen and Changqing Chen. 2025. Deep Learning-Based Model for Detecting Fraudulent SMS Messages. InProceedings of the 2024 2nd Interna- tional Conference on Information Education and Artificial Intelligence (ICIEAI ’24). Association for Computing Machinery, New York, NY, USA, 346–350. doi:10.1145/3724504.3724561

work page doi:10.1145/3724504.3724561 2025
[27]

Kevin Collier. 2025. Text scams warning of unpaid road tolls fueled by cybercrim- inal salesmen on Telegram. https://www.nbcnews.com/tech/security/unpaid- toll-bill-e-zpass-text-scams-fueled-telegram-salesmen-rcna196347 Accessed: 2025-11-13

work page 2025
[28]

Anna Coluccia, Andrea Pozza, Fabio Ferretti, Fulvio Carabellese, Alessandra Masti, and Giacomo Gualtieri. 2020. Online Romance Scams: Relational Dynam- ics and Psychological Characteristics of the Victims and Scammers. A Scoping Review.Clinical practice and epidemiology in mental health: CP & EMH16 (2020), 24

work page 2020
[29]

and Gómez Hidalgo, José María and Sánz, Enrique Puertas

Cormack, Gordon V. and Gómez Hidalgo, José María and Sánz, Enrique Puertas

work page
[30]

InProceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management(Lisbon, Portugal)(CIKM ’07)

Spam filtering for short messages. InProceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management(Lisbon, Portugal)(CIKM ’07). Association for Computing Machinery, New York, NY, USA, 313–320. doi:10.1145/1321440.1321486

work page doi:10.1145/1321440.1321486
[31]

Ben Cost. 2025. ‘Relentless’ scammers are trying to rip off people by asking to use their pictures for fake ‘art project’ — here’s how. https://nypost.com/2025/ 07/28/lifestyle/fraudsters-target-bank-details-with-fake-art-project-scam/

work page 2025
[32]

Andrei Costin, Jelena Isacenkova, Marco Balduzzi, Aurélien Francillon, and Davide Balzarotti. 2013. The role of phone numbers in understanding cyber- crime schemes. In2013 Eleventh Annual Conference on Privacy, Security and Trust. 213–220. doi:10.1109/PST.2013.6596056

work page doi:10.1109/pst.2013.6596056 2013
[33]

Greta Cross. 2025. Don’t click that link: Authorities warn of new DMV scam texts. https://www.usatoday.com/story/tech/news/2025/05/30/dmv-text- message-scam/83944066007/

work page arXiv 2025
[34]

Tobias Dam, Lukas Daniel Klausner, Damjan Buhov, and Sebastian Schrittwieser

work page
[35]

InProceed- ings of the 14th International Conference on A vailability, Reliability and Security (Canterbury, CA, United Kingdom)(ARES ’19)

Large-Scale Analysis of Pop-Up Scam on Typosquatting URLs. InProceed- ings of the 14th International Conference on A vailability, Reliability and Security (Canterbury, CA, United Kingdom)(ARES ’19). Association for Computing Ma- chinery, New York, NY, USA, Article 53, 9 pages. doi:10.1145/3339252.3340332

work page doi:10.1145/3339252.3340332
[36]

Sarah Jane Delany, Mark Buckley, and Derek Greene. 2012. SMS spam filtering: Methods and data.Expert Systems with Applications39, 10 (2012), 9899–9908. doi:10.1016/j.eswa.2012.02.053

work page doi:10.1016/j.eswa.2012.02.053 2012
[37]

Estqlal Hammad Dhah, Mohammed Abdullah Naser, and Suhad A. Ali. 2019. Spam Email Image Classification Based on Text and Image Features. In2019 First International Conference of Computer and Applied Sciences (CAS). 148–153. doi:10.1109/CAS47993.2019.9075725

work page doi:10.1109/cas47993.2019.9075725 2019
[38]

Brian Eyler, Allison Pytlak, Courtney Weatherby, and Shreya Lad. 2024. To Protect Americans, Prioritize Countering Cyber Scam Operations in the Indo- Pacific. https://www.stimson.org/2024/to-protect-americans-prioritize- countering-cyber-scam-operations-in-the-indo-pacific/

work page 2024
[39]

Polra Victor Falade. 2023. Analysis of 419 Scams: The Trends and New Variants in Emerging Types.Int. J. Sci. Res. in Computer Science and Engineering Vol11, 5 (2023)

work page 2023
[40]

Casey Fiesler, Michael Zimmer, Nicholas Proferes, Sarah Gilbert, and Naiyan Jones. 2024. Remember the Human: A Systematic Review of Ethical Consider- ations in Reddit Research.Proc. ACM Hum.-Comput. Interact.8, GROUP (Feb. 2024). doi:10.1145/3633070

work page doi:10.1145/3633070 2024
[41]

Emily Fishbein. 2024. ‘A Global Monster’: Myanmar-Based Cyber Scams Widen the Net. https://pulitzercenter.org/stories/global-monster-myanmar-based- cyber-scams-widen-net Allison Lu, Bernardo B. P. Medeiros, Kevin R. B. Butler, and Patrick Traynor

work page 2024
[42]

Hongyu Gao, Yan Chen, Kathy Lee, Diana Palsetia, and Alok N Choudhary

work page
[43]

InNDSS, Vol

Towards Online Spam Filtering in Social Networks. InNDSS, Vol. 12. 1–16

work page
[44]

Maria Glenski, Emily Saldanha, and Svitlana Volkova. 2019. Characterizing Speed and Scale of Cryptocurrency Discussion Spread on Reddit. InThe World Wide Web Conference(San Francisco, CA, USA)(WWW ’19). Association for Computing Machinery, New York, NY, USA, 560–570. doi:10.1145/3308558.33 13702

work page doi:10.1145/3308558.33 2019
[45]

Karthik Gopalakrishnan, Behnam Hedayatnia, Qinlang Chen, Anna Gottardi, Sanjeev Kwatra, Anu Venkatesh, Raefer Gabriel, and Dilek Hakkani-Tur. 2023. Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations. arXiv:2308.11995 [cs.CL] https://arxiv.org/abs/2308.11995

work page arXiv 2023
[46]

Yael Grauer. 2025. Text Message Scam Attempts Have Increased by 50 Percent, a Consumer Reports Survey Finds. https://www.consumerreports.org/mo ney/scams-fraud/texting-and-messaging-scam-attempts-increased-by-50- percent-a1001405682/

work page 2025
[47]

Maarten Grootendorst. 2022. BERTopic: Neural topic modeling with a class- based TF-IDF procedure.arXiv preprint arXiv:2203.05794(2022)

work page internal anchor Pith review Pith/arXiv arXiv 2022
[48]

Yuting Guo and Abeed Sarker. 2025. Benchmarking Open-Source Large Lan- guage Models on Healthcare Text Classification Tasks. arXiv:2503.15169 [cs.CL] https://arxiv.org/abs/2503.15169

work page arXiv 2025
[49]

Mehul Gupta, Aditya Bakliwal, Shubhangi Agarwal, and Pulkit Mehndiratta

work page
[50]

In2018 Eleventh International Conference on Contemporary Comput- ing (IC3)

A Comparative Study of Spam SMS Detection Using Machine Learning Classifiers. In2018 Eleventh International Conference on Contemporary Comput- ing (IC3). 1–7. doi:10.1109/IC3.2018.8530469

work page doi:10.1109/ic3.2018.8530469 2018
[51]

2023.Individual frauds in China: exploring the impact and response to telecommunication network fraud and pig butchering scams

Bing Han. 2023.Individual frauds in China: exploring the impact and response to telecommunication network fraud and pig butchering scams. Ph. D. Dissertation. University of Portsmouth Portsmouth, UK

work page 2023
[52]

Nathan Hart. 2024. Spoofing scams: How to recognize and protect yourself from fake numbers. https://www.dispatch.com/story/news/state/2024/11/07/s poofing-text-message-scams-how-to-block-spam-phone-calls/76112609007/

work page arXiv 2024
[53]

Voelker, and David Wagner

Grant Ho, Asaf Cidon, Lior Gavish, Marco Schweighauser, Vern Paxson, Stefan Savage, Geoffrey M. Voelker, and David Wagner. 2019. Detecting and Charac- terizing Lateral Phishing at Scale. InProceedings of the 28th USENIX Security Symposium (USENIX Security 19). USENIX Association, Santa Clara, CA, 1273–

work page 2019
[54]

https://www.usenix.org/conference/usenixsecurity19/presentation/ho

work page
[55]

Mohamed Houtti, Abhishek Roy, Venkata Narsi Reddy Gangula, and Ashley Walker. 2024. A Survey of Scam Exposure, Victimization, Types, Vectors, and Reporting in 12 Countries.Journal of Online Trust and Safety2, 4 (2024)

work page 2024
[56]

Christian Hudspeth. 2024. More smishing: Beware of a USPS text messaging scam circulating this holiday season. https://www.ktnv.com/news/more-smishi ng-beware-of-a-usps-text-messaging-scam-circulating-this-holiday-season

work page 2024
[57]

Liming Jiang. 2024. Detecting Scams Using Large Language Models. arXiv:2402.03147 [cs.CR] https://arxiv.org/abs/2402.03147

work page arXiv 2024
[58]

Chandrasekar V

Chinthan Kambar, Mrinalini K, and Dr. Chandrasekar V. 2023. Content Based SMS Fraud Detection Using Supervised Learning Approach. https://api.sema nticscholar.org/CorpusID:259923225

work page 2023
[59]

Michael Kan. 2025. Beware the friendly texts from strangers: US sanctions web host tied to $200m in online scam losses. https://www.pcmag.com/news/treas ury-dept-sanctions-funnull-pig-butchering-fbi-scam-texts

work page 2025
[60]

Kelly Kendall. 2025. Watch out for unpaid toll text SCAM, NC officials warn. https://www.wxii12.com/article/unpaid-toll-text-scam-targeting-massive- number-of-people-in-nc/64178713

work page arXiv 2025
[61]

Mahmoud Khonji, Youssef Iraqi, and Andrew Jones. 2013. Phishing Detection: A Literature Survey.IEEE Communications Surveys & Tutorials15, 4 (2013), 2091–2121. doi:10.1109/SURV.2013.032213.00009

work page doi:10.1109/surv.2013.032213.00009 2013
[62]

Brian Krebs. 2025. China-based SMS phishing triad pivots to Banks. https://kreb sonsecurity.com/2025/04/china-based-sms-phishing-triad-pivots-to-banks/

work page 2025
[63]

Alfirna Rizqi Lahitani, Adhistya Erna Permanasari, and Noor Akhmad Setiawan

work page
[64]

InProceedings of the 2016 4th International Conference on Cyber and IT Service Management

Cosine similarity to determine similarity measure: Study case in online essay assessment. InProceedings of the 2016 4th International Conference on Cyber and IT Service Management. 1–6. doi:10.1109/CITSM.2016.7577578

work page doi:10.1109/citsm.2016.7577578 2016
[65]

Seth Layton, Bernardo B.P. Medeiros, Kevin Butler, and Patrick Traynor. 2026. AI Wrote My Paper and All I Got Was This False Negative: Measuring the Efficacy of Commercial AI Text Detectors. In 47th IEEE Symposium on Security and Privacy (SP 2026)

[66]

Sophie Le Page, Guy-Vincent Jourdan, Gregor V. Bochmann, Jason Flood, and Iosif-Viorel Onut. 2018. Using URL shorteners to compare phishing and malware attacks. In 2018 APWG Symposium on Electronic Crime Research (eCrime). 1–13. doi:10.1109/ECRIME.2018.8376215

[67]

Kiho Lee, Kyungchan Lim, Hyoungshick Kim, Yonghwi Kwon, and Doowon Kim. 2025. 7 Days Later: Analyzing Phishing-Site Lifespan After Detected. In Proceedings of the ACM on Web Conference 2025 (Sydney NSW, Australia) (WWW ’25). Association for Computing Machinery, New York, NY, USA, 945–956. doi:10.1145/3696410.3714678

[68]

Rui Li, Yongzheng Zhang, Yupeng Tuo, and Peng Chang. 2018. A Novel Method for Detecting Telecom Fraud User. In Proceedings of the 2018 3rd International Conference on Information Systems Engineering (ICISE). 46–50. doi:10.1109/ICISE.2018.00016

[69]

Xigao Li, Amir Rahmati, and Nick Nikiforakis. 2024. Like, comment, get scammed: Characterizing comment scams on media platforms. Network and Distributed System Security (NDSS) Symposium

[70]

Xigao Li, Amir Rahmati, and Nick Nikiforakis. 2024. Like, Comment, Get Scammed: Characterizing Comment Scams on Media Platforms. In Proceedings of the 31st Network and Distributed Systems Security (NDSS) Symposium. doi:10.14722/ndss.2024.24060

[71]

Zhehui Liao, Maria Antoniak, Inyoung Cheong, Evie Yu-Yen Cheng, Ai-Heng Lee, Kyle Lo, Joseph Chee Chang, and Amy X. Zhang. 2024. LLMs as Research Tools: A Large Scale Survey of Researchers’ Usage and Perceptions. arXiv:2411.05025 [cs.CL] https://arxiv.org/abs/2411.05025

[72]

Mingxuan Liu, Yiming Zhang, Baojun Liu, Zhou Li, Haixin Duan, and Donghong Sun. 2021. Detecting and Characterizing SMS Spearphishing Attacks. In Proceedings of the 37th Annual Computer Security Applications Conference (ACSAC) (Virtual Event, USA) (ACSAC ’21). Association for Computing Machinery, New York, NY, USA, 930–943. doi:10.1145/3485832.3488012

[73]

Bernard Marr. 2023. A Short History Of ChatGPT: How We Got To Where We Are Today. https://www.forbes.com/sites/bernardmarr/2023/05/19/a-short-history-of-chatgpt-how-we-got-to-where-we-are-today/

[74]

Leland McInnes, John Healy, Steve Astels, et al. 2017. hdbscan: Hierarchical density based clustering. J. Open Source Softw. 2, 11 (2017), 205

[75]

Leland McInnes, John Healy, and James Melville. 2018. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018)

[76]

Alexey N Medvedev, Renaud Lambiotte, and Jean-Charles Delvenne. 2017. The anatomy of Reddit: An overview of academic research. Dynamics on and of Complex Networks III, 183–204

[77]

Sandhya Mishra and Devpriya Soni. 2020. Smishing Detector: A security model to detect smishing through SMS content analysis and URL behavior analysis. Future Generation Computer Systems 108 (2020), 803–815

[78]

Morium Akter Munny, Mahbub Alam, Sonjoy Kumar Paul, Daniel Timko, Muhammad Lutfor Rahman, and Nitesh Saxena. 2025. Infrastructure Patterns in Toll Scam Domains: A Comprehensive Analysis of Cybercriminal Registration and Hosting Strategies. In 2025 APWG Symposium on Electronic Crime Research (eCrime). 1–13. doi:10.1109/eCrime66972.2025.11327851

[79]

Aleksandr Nahapetyan, Sathvik Prasad, Kevin Childs, Adam Oest, Yeganeh Ladwig, Alexandros Kapravelos, and Bradley Reaves. 2024. On SMS Phishing Tactics and Infrastructure. In 2024 IEEE Symposium on Security and Privacy (SP). 1–16. doi:10.1109/SP54263.2024.00169

[80]

David Nield. 2025. How to Spot and Guard Against Wrong Number Scams. https://www.wired.com/story/how-to-spot-and-guard-against-wrong-number-scams/

[1]

[n. d.]. North American Numbering Plan General Management and Oversight | Federal Communications Commission. https://www.fcc.gov/north-american-numbering-plan-general-management-and-oversight

[2]

[n. d.]. Twilio help center. https://help.twilio.com/articles/11587910480155-A2P-10DLC-Campaign-Vetting-Changes-January-2023

[3]

[n. d.]. VirusTotal. https://www.virustotal.com/gui/home/upload Accessed: 2025-11-13

[4]

2023. Mistral-7B-LLM-Fraud-Detection. https://huggingface.co/Bilic/Mistral-7B-LLM-Fraud-Detection Accessed: 2025-11-13

[5]

2023. "Smishing Triad" Targeted USPS And US Citizens For Data Theft. https://www.resecurity.com/blog/article/smishing-triad-targeted-usps-and-us-citizens-for-data-theft Accessed: 2025-11-13

[6]

2025. Reddit Terms of Service. https://redditinc.com/policies/user-agreement Accessed: 2025-11-13

[7]

Bhupendra Acharya and Thorsten Holz. 2024. An Explorative Study of Pig Butchering Scams. arXiv:2412.15423 [cs.CR] https://arxiv.org/abs/2412.15423

[8]

Olivia Acland. 2025. I was tricked, tortured, finally freed: Inside a Burmese scam farm. https://www.thetimes.com/world/asia/article/scam-farms-burma-chinese-l35j7jz8g

[9]

Sadia Afroz and Rachel Greenstadt. 2011. Phishzoo: Detecting phishing websites by looking at them. In Proceedings of the 2011 IEEE Fifth International Conference on Semantic Computing. IEEE, 368–375

[10]

Sharad Agarwal, Emma Harvey, Enrico Mariconti, Guillermo Suarez-Tangil, Marie Vasek, et al. 2025. ‘Hey mum, I dropped my phone down the toilet’: Investigating Hi Mum and Dad SMS Scams in the United Kingdom. In USENIX Security Symposium

[11]

Sharad Agarwal, Guillermo Suarez-Tangil, and Marie Vasek. 2025. An Overview of 7726 User Reports: Uncovering SMS Scams and Scammer Strategies. arXiv:2508.05276 [cs.CR] https://arxiv.org/abs/2508.05276

[12]

Majid Hameed Ahmed, Sabrina Tiun, Nazlia Omar, and Nor Samsiah Sani. 2023. Short Text Clustering Algorithms, Application and Challenges: A Survey. Applied Sciences 13, 1 (2023). doi:10.3390/app13010342

[14]

Tiago A. Almeida, José María G. Hidalgo, and Akebo Yamakami. 2011. Contributions to the study of SMS spam filtering: new collection and results. In Proceedings of the 11th ACM Symposium on Document Engineering (Mountain View, California, USA) (DocEng ’11). Association for Computing Machinery, New York, NY, USA, 259–262. doi:10.1145/2034691.2034742

[15]

Arctic Shift. 2024. Arctic Shift Reddit API. https://arctic-shift.photon-reddit.com. Accessed: 2026-03-25

[16]

Tom Bartlett. 2025. ‘The Worst Internet-Research Ethics Violation I Have Ever Seen’. https://www.theatlantic.com/technology/archive/2025/05/reddit-ai-persuasion-experiment-ethics/682676/

[17]

Marzieh Bitaab, Haehyun Cho, Adam Oest, Zhuoer Lyu, Wei Wang, Jorij Abraham, Ruoyu Wang, Tiffany Bao, Yan Shoshitaishvili, and Adam Doupé. 2023. Beyond Phish: Toward Detecting Fraudulent e-Commerce Websites at Scale. In 2023 IEEE Symposium on Security and Privacy (SP). 2566–2583. doi:10.1109/SP46215.2023.10179461

[19]

Bitdefender. [n. d.]. The anatomy of Illuminati scams: We spoke to the grand masters so you don’t have to. https://www.bitdefender.com/en-us/blog/hotforsecurity/the-anatomy-of-illuminati-scams-we-spoke-to-the-grand-masters-so-you-dont-have-to Accessed: 2025-11-13

[20]

Elijah Bouma-Sims, Hiba Hassan, Alexandra Nisenoff, Lorrie Faith Cranor, and Nicolas Christin. 2024. "It was honestly just gambling": Investigating the Experiences of Teenage Cryptocurrency Users on Reddit. In Twentieth Symposium on Usable Privacy and Security (SOUPS 2024). USENIX Association, Philadelphia, PA, 333–352. https://www.usenix.org/conference/so...

[21]

Elijah Bouma-Sims, Mandy Lanyon, and Lorrie Faith Cranor. 2025. “Is this a scam?”: The Nature and Quality of Reddit Discussion about Scams (CCS ’25). Association for Computing Machinery, New York, NY, USA

[22]

Danielle K Brown, Yee Man Margaret Ng, Martin J Riedl, and Ivan Lacasa-Mas. 2018. Reddit’s veil of anonymity: Predictors of engagement and participation in media environments with hostile reputations. Social Media + Society 4, 4 (2018)

[24]

Eshwar Chandrasekharan, Mattia Samory, Shagun Jhaver, Hunter Charvat, Amy Bruckman, Cliff Lampe, Jacob Eisenstein, and Eric Gilbert. 2018. The Internet’s Hidden Rules: An Empirical Study of Reddit Norm Violations at Micro, Meso, and Macro Scales. Proc. ACM Hum.-Comput. Interact. 2, CSCW, Article 32 (Nov. 2018), 25 pages. doi:10.1145/3274301

[25]

Bill Chappell. 2024. FBI warns Americans to keep their text messages secure: What to know. https://www.npr.org/2024/12/17/nx-s1-5223490/text-messaging-security-fbi-chinese-hackers-security-encryption

[26]

Wenbin Chen and Changqing Chen. 2025. Deep Learning-Based Model for Detecting Fraudulent SMS Messages. In Proceedings of the 2024 2nd International Conference on Information Education and Artificial Intelligence (ICIEAI ’24). Association for Computing Machinery, New York, NY, USA, 346–350. doi:10.1145/3724504.3724561

[27]

Kevin Collier. 2025. Text scams warning of unpaid road tolls fueled by cybercriminal salesmen on Telegram. https://www.nbcnews.com/tech/security/unpaid-toll-bill-e-zpass-text-scams-fueled-telegram-salesmen-rcna196347 Accessed: 2025-11-13

[28]

Anna Coluccia, Andrea Pozza, Fabio Ferretti, Fulvio Carabellese, Alessandra Masti, and Giacomo Gualtieri. 2020. Online Romance Scams: Relational Dynamics and Psychological Characteristics of the Victims and Scammers. A Scoping Review. Clinical practice and epidemiology in mental health: CP & EMH 16 (2020), 24

[29]

Gordon V. Cormack, José María Gómez Hidalgo, and Enrique Puertas Sánz. 2007. Spam filtering for short messages. In Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management (Lisbon, Portugal) (CIKM ’07). Association for Computing Machinery, New York, NY, USA, 313–320. doi:10.1145/1321440.1321486

[31]

Ben Cost. 2025. ‘Relentless’ scammers are trying to rip off people by asking to use their pictures for fake ‘art project’ — here’s how. https://nypost.com/2025/07/28/lifestyle/fraudsters-target-bank-details-with-fake-art-project-scam/

[32]

Andrei Costin, Jelena Isacenkova, Marco Balduzzi, Aurélien Francillon, and Davide Balzarotti. 2013. The role of phone numbers in understanding cyber-crime schemes. In 2013 Eleventh Annual Conference on Privacy, Security and Trust. 213–220. doi:10.1109/PST.2013.6596056

[33]

Greta Cross. 2025. Don’t click that link: Authorities warn of new DMV scam texts. https://www.usatoday.com/story/tech/news/2025/05/30/dmv-text-message-scam/83944066007/

[34]

Tobias Dam, Lukas Daniel Klausner, Damjan Buhov, and Sebastian Schrittwieser. 2019. Large-Scale Analysis of Pop-Up Scam on Typosquatting URLs. In Proceedings of the 14th International Conference on Availability, Reliability and Security (Canterbury, CA, United Kingdom) (ARES ’19). Association for Computing Machinery, New York, NY, USA, Article 53, 9 pages. doi:10.1145/3339252.3340332

[36]

Sarah Jane Delany, Mark Buckley, and Derek Greene. 2012. SMS spam filtering: Methods and data. Expert Systems with Applications 39, 10 (2012), 9899–9908. doi:10.1016/j.eswa.2012.02.053

[37]

Estqlal Hammad Dhah, Mohammed Abdullah Naser, and Suhad A. Ali. 2019. Spam Email Image Classification Based on Text and Image Features. In 2019 First International Conference of Computer and Applied Sciences (CAS). 148–153. doi:10.1109/CAS47993.2019.9075725

[38]

Brian Eyler, Allison Pytlak, Courtney Weatherby, and Shreya Lad. 2024. To Protect Americans, Prioritize Countering Cyber Scam Operations in the Indo-Pacific. https://www.stimson.org/2024/to-protect-americans-prioritize-countering-cyber-scam-operations-in-the-indo-pacific/

[39]

Polra Victor Falade. 2023. Analysis of 419 Scams: The Trends and New Variants in Emerging Types. Int. J. Sci. Res. in Computer Science and Engineering Vol 11, 5 (2023)

[40]

Casey Fiesler, Michael Zimmer, Nicholas Proferes, Sarah Gilbert, and Naiyan Jones. 2024. Remember the Human: A Systematic Review of Ethical Considerations in Reddit Research. Proc. ACM Hum.-Comput. Interact. 8, GROUP (Feb. 2024). doi:10.1145/3633070

[41]

Emily Fishbein. 2024. ‘A Global Monster’: Myanmar-Based Cyber Scams Widen the Net. https://pulitzercenter.org/stories/global-monster-myanmar-based-cyber-scams-widen-net

[42]

Hongyu Gao, Yan Chen, Kathy Lee, Diana Palsetia, and Alok N Choudhary. 2012. Towards Online Spam Filtering in Social Networks. In NDSS, Vol. 12. 1–16

[44]

Maria Glenski, Emily Saldanha, and Svitlana Volkova. 2019. Characterizing Speed and Scale of Cryptocurrency Discussion Spread on Reddit. In The World Wide Web Conference (San Francisco, CA, USA) (WWW ’19). Association for Computing Machinery, New York, NY, USA, 560–570. doi:10.1145/3308558.3313702

[45]

Karthik Gopalakrishnan, Behnam Hedayatnia, Qinlang Chen, Anna Gottardi, Sanjeev Kwatra, Anu Venkatesh, Raefer Gabriel, and Dilek Hakkani-Tur. 2023. Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations. arXiv:2308.11995 [cs.CL] https://arxiv.org/abs/2308.11995

[46]

Yael Grauer. 2025. Text Message Scam Attempts Have Increased by 50 Percent, a Consumer Reports Survey Finds. https://www.consumerreports.org/money/scams-fraud/texting-and-messaging-scam-attempts-increased-by-50-percent-a1001405682/

[47]

Maarten Grootendorst. 2022. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794 (2022)

[48]

Yuting Guo and Abeed Sarker. 2025. Benchmarking Open-Source Large Language Models on Healthcare Text Classification Tasks. arXiv:2503.15169 [cs.CL] https://arxiv.org/abs/2503.15169

[49]

Mehul Gupta, Aditya Bakliwal, Shubhangi Agarwal, and Pulkit Mehndiratta. 2018. A Comparative Study of Spam SMS Detection Using Machine Learning Classifiers. In 2018 Eleventh International Conference on Contemporary Computing (IC3). 1–7. doi:10.1109/IC3.2018.8530469

[51]

Bing Han. 2023. Individual frauds in China: exploring the impact and response to telecommunication network fraud and pig butchering scams. Ph. D. Dissertation. University of Portsmouth, Portsmouth, UK

[52]

Nathan Hart. 2024. Spoofing scams: How to recognize and protect yourself from fake numbers. https://www.dispatch.com/story/news/state/2024/11/07/spoofing-text-message-scams-how-to-block-spam-phone-calls/76112609007/

[53]

Grant Ho, Asaf Cidon, Lior Gavish, Marco Schweighauser, Vern Paxson, Stefan Savage, Geoffrey M. Voelker, and David Wagner. 2019. Detecting and Characterizing Lateral Phishing at Scale. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19). USENIX Association, Santa Clara, CA, 1273– https://www.usenix.org/conference/usenixsecurity19/presentation/ho

[55]

Mohamed Houtti, Abhishek Roy, Venkata Narsi Reddy Gangula, and Ashley Walker. 2024. A Survey of Scam Exposure, Victimization, Types, Vectors, and Reporting in 12 Countries. Journal of Online Trust and Safety 2, 4 (2024)

[56]

Christian Hudspeth. 2024. More smishing: Beware of a USPS text messaging scam circulating this holiday season. https://www.ktnv.com/news/more-smishing-beware-of-a-usps-text-messaging-scam-circulating-this-holiday-season
