Understanding the promise and limits of automated fact-checking
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CR (1)
years: 2025 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Scam2Prompt: A Scalable Framework for Auditing Malicious Scam Endpoints in Production LLMs
Scam2Prompt is a framework that converts scam-site intents into developer-style prompts and measures how often production LLMs generate malicious code, finding rates from 4.24% to 47.3% across eleven models and showing that current guardrails do not block the behavior.