Emily Pronin, Daniel Y Lin, and Lee Ross

URL: https://arxiv · 2002 · arXiv 2512.12895

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

How Compliant Are GitHub Actions Workflows? A Checklist-Based Study with LLM-Assisted Auditing

cs.SE · 2026-05-03 · accept · novelty 6.0

GitHub Actions workflows achieve only 28% overall compliance with best practices, with LLMs enabling an 81% reduction in verification effort via hybrid adjudication but still requiring expert oversight for security judgments.

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

cs.AI · 2026-04-08 · unverdicted · novelty 6.0

Reasoning SFT generalizes cross-domain conditionally on sufficient optimization, high-quality long-CoT data, and strong base models, while degrading safety.

Agents of Chaos

cs.AI · 2026-02-23 · unverdicted · novelty 6.0

An exploratory red-teaming study documents eleven cases of security, privacy, and governance failures in autonomous language-model agents with tool access and persistent memory.

Escaping Mode Collapse in LLM Generation via Geometric Regulation

cs.CL · 2026-05-01

citing papers explorer

Showing 2 of 2 citing papers after filters.

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

cs.AI · 2026-04-08 · unverdicted · none · ref 4

Reasoning SFT generalizes cross-domain conditionally on sufficient optimization, high-quality long-CoT data, and strong base models, while degrading safety.

Agents of Chaos

cs.AI · 2026-02-23 · unverdicted · none · ref 8

An exploratory red-teaming study documents eleven cases of security, privacy, and governance failures in autonomous language-model agents with tool access and persistent memory.
