Kim, Stephen Fitz, and Dan Hendrycks

Richard Ren, Steven Basart, Adam Khoja, Alexander Pan, Alice Gatti, Long Phan, Xuwang Yin, Mantas Mazeika, Gabriel Mukobi, Ryan Hwang Kim, Stephen Fitz, Dan Hendrycks · 2024 · DOI 10.52202/079017-2190

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Quality Is Not a Safety Proxy Under Quantization

cs.LG · 2026-06-08 · conditional · novelty 6.0

Across 51 quantized checkpoints, quality metrics fail to predict safety drops in 36 pairings and 10 hidden-danger cases, while a new RTSI screen routes all 10 dangerous rows to testing at matched bucket size.

The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems

cs.CY · 2026-02-19 · accept · novelty 6.0

The 2025 AI Agent Index catalogs technical and safety details for 30 deployed AI agents and finds low developer transparency on safety, evaluations, and societal impacts.
