Title resolution pending

Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, et al · 2024 · arXiv 2403.13793

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks?

cs.CR · 2026-05-11 · conditional · novelty 7.0

ExploitGym benchmark shows frontier AI models can generate working exploits for 120-157 of 898 real vulnerabilities, with non-trivial success even when common security defenses are enabled.

CyberCertBench: Evaluating LLMs in Cybersecurity Certification Knowledge

cs.CR · 2026-04-22 · unverdicted · novelty 7.0

CyberCertBench shows frontier LLMs reach human-expert performance on general IT and networking security but drop on vendor-specific and formal standards questions such as IEC 62443, with a new framework for producing interpretable explanations.

Rollout Cards: A Reproducibility Standard for Agent Research

cs.AI · 2026-05-12 · conditional · novelty 6.0

Rollout cards preserve complete agent rollout records and declare the reporting rules behind scores, enabling reproducible evaluation where changing only the rule can alter success rates by over 20 percentage points.

The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems

cs.CY · 2026-02-19 · accept · novelty 6.0

The 2025 AI Agent Index catalogs technical and safety details for 30 deployed AI agents and finds low developer transparency on safety, evaluations, and societal impacts.

Towards an AI co-scientist

cs.AI · 2025-02-26 · unverdicted · novelty 6.0

A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.

From Disclosure to Self-Referential Opacity: Six Dimensions of Strain in Current AI Governance

cs.CY · 2026-04-15 · unverdicted · novelty 4.0

As AI capability asymmetry increases, disclosure-based governance fails because systems either game evaluations or become embedded in oversight, straining legitimacy and non-domination more than corrigibility or resilience.

Gemma 2: Improving Open Language Models at a Practical Size

cs.CL · 2024-07-31 · conditional · novelty 3.0

Gemma 2 models achieve leading performance at their sizes by combining established Transformer modifications with knowledge distillation for the 2B and 9B variants.

citing papers explorer

Showing 7 of 7 citing papers.

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks? cs.CR · 2026-05-11 · conditional · none · ref 44
ExploitGym benchmark shows frontier AI models can generate working exploits for 120-157 of 898 real vulnerabilities, with non-trivial success even when common security defenses are enabled.
CyberCertBench: Evaluating LLMs in Cybersecurity Certification Knowledge cs.CR · 2026-04-22 · unverdicted · none · ref 18
CyberCertBench shows frontier LLMs reach human-expert performance on general IT and networking security but drop on vendor-specific and formal standards questions such as IEC 62443, with a new framework for producing interpretable explanations.
Rollout Cards: A Reproducibility Standard for Agent Research cs.AI · 2026-05-12 · conditional · none · ref 2
Rollout cards preserve complete agent rollout records and declare the reporting rules behind scores, enabling reproducible evaluation where changing only the rule can alter success rates by over 20 percentage points.
The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems cs.CY · 2026-02-19 · accept · none · ref 103
The 2025 AI Agent Index catalogs technical and safety details for 30 deployed AI agents and finds low developer transparency on safety, evaluations, and societal impacts.
Towards an AI co-scientist cs.AI · 2025-02-26 · unverdicted · none · ref 38
A multi-agent AI system generates novel biomedical hypotheses that show promising experimental validation in drug repurposing for leukemia, new targets for liver fibrosis, and a bacterial gene transfer mechanism.
From Disclosure to Self-Referential Opacity: Six Dimensions of Strain in Current AI Governance cs.CY · 2026-04-15 · unverdicted · none · ref 69
As AI capability asymmetry increases, disclosure-based governance fails because systems either game evaluations or become embedded in oversight, straining legitimacy and non-domination more than corrigibility or resilience.
Gemma 2: Improving Open Language Models at a Practical Size cs.CL · 2024-07-31 · conditional · none · ref 37
Gemma 2 models achieve leading performance at their sizes by combining established Transformer modifications with knowledge distillation for the 2B and 9B variants.

Title resolution pending

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer