CySecBench: Generative AI-based cybersecurity-focused prompt dataset for benchmarking large language models

Johan Wahr´ eus, Ahmed Mohamed Hussain, Panagiotis Papadimitratos · 2024 · arXiv 2501.01335

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 1 dataset 1

citation-polarity summary

support 1 use dataset 1

representative citing papers

Cybersecurity AI (CAI) Dataset

cs.CR · 2026-05-27 · unverdicted · novelty 7.0

CAI Dataset is presented as the largest described corpus of LLM-driven hacker trajectories, with the claim that operator data concentration in frontier-model providers creates a major security risk best addressed by on-premise specialized LLMs.

Refusal Evaluation in Coding LLMs and Code Agents: A Systematic Review of Thirteen Malicious-Code Prompt Corpora (2023-2025)

cs.CR · 2026-05-19 · accept · novelty 7.0

Systematic review of thirteen malicious-code prompt corpora for coding LLM refusal evaluation that catalogs construction methods, surfaces gaps in human baselines, cross-corpus comparability, and malware taxonomies, and proposes methodological improvements.

Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Requests

cs.CR · 2026-05-27 · unverdicted · novelty 5.0

Consolidates eight corpora into a 6,671-prompt bank with five-judge consensus labels separating executable malicious code requests (4,748) from harmful security knowledge requests (1,923), achieving Fleiss' kappa 0.767.

A Validated Prompt Bank for Malicious Code Generation: Separating Executable Weapons from Security Knowledge in 1,554 Consensus-Labeled Prompts

cs.CR · 2026-05-04 · accept · novelty 5.0

The paper releases a 1,554-prompt consensus-labeled bank separating executable malicious code requests from security knowledge requests, validated by five-model majority labeling with Fleiss' kappa of 0.876.

Beyond Context: Large Language Models' Failure to Grasp Users' Intent

cs.AI · 2025-12-24 · unverdicted · novelty 3.0

LLMs fail to detect hidden harmful intent, allowing systematic bypass of safety mechanisms through framing techniques, with reasoning modes often worsening the issue.

citing papers explorer

Showing 5 of 5 citing papers.

Cybersecurity AI (CAI) Dataset cs.CR · 2026-05-27 · unverdicted · none · ref 15
CAI Dataset is presented as the largest described corpus of LLM-driven hacker trajectories, with the claim that operator data concentration in frontier-model providers creates a major security risk best addressed by on-premise specialized LLMs.
Refusal Evaluation in Coding LLMs and Code Agents: A Systematic Review of Thirteen Malicious-Code Prompt Corpora (2023-2025) cs.CR · 2026-05-19 · accept · none · ref 9
Systematic review of thirteen malicious-code prompt corpora for coding LLM refusal evaluation that catalogs construction methods, surfaces gaps in human baselines, cross-corpus comparability, and malware taxonomies, and proposes methodological improvements.
Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Requests cs.CR · 2026-05-27 · unverdicted · none · ref 4
Consolidates eight corpora into a 6,671-prompt bank with five-judge consensus labels separating executable malicious code requests (4,748) from harmful security knowledge requests (1,923), achieving Fleiss' kappa 0.767.
A Validated Prompt Bank for Malicious Code Generation: Separating Executable Weapons from Security Knowledge in 1,554 Consensus-Labeled Prompts cs.CR · 2026-05-04 · accept · none · ref 12
The paper releases a 1,554-prompt consensus-labeled bank separating executable malicious code requests from security knowledge requests, validated by five-model majority labeling with Fleiss' kappa of 0.876.
Beyond Context: Large Language Models' Failure to Grasp Users' Intent cs.AI · 2025-12-24 · unverdicted · none · ref 56
LLMs fail to detect hidden harmful intent, allowing systematic bypass of safety mechanisms through framing techniques, with reasoning modes often worsening the issue.

CySecBench: Generative AI-based cybersecurity-focused prompt dataset for benchmarking large language models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer