GitHub Actions workflows achieve only 28% overall compliance with best practices, with LLMs enabling an 81% reduction in verification effort via hybrid adjudication but still requiring expert oversight for security judgments.
Wait, wait, wait
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4representative citing papers
Reinforced Mode Regulation (RMR) uses low-rank damping on the value cache to prevent geometric collapse and mode collapse in autoregressive LLM generation, supporting stable output down to 0.8 nats/step entropy.
Reasoning SFT generalizes cross-domain conditionally on sufficient optimization, high-quality long-CoT data, and strong base models, while degrading safety.
An exploratory red-teaming study documents eleven cases of security, privacy, and governance failures in autonomous language-model agents with tool access and persistent memory.
citing papers explorer
-
How Compliant Are GitHub Actions Workflows? A Checklist-Based Study with LLM-Assisted Auditing
GitHub Actions workflows achieve only 28% overall compliance with best practices, with LLMs enabling an 81% reduction in verification effort via hybrid adjudication but still requiring expert oversight for security judgments.
-
Escaping Mode Collapse in LLM Generation via Geometric Regulation
Reinforced Mode Regulation (RMR) uses low-rank damping on the value cache to prevent geometric collapse and mode collapse in autoregressive LLM generation, supporting stable output down to 0.8 nats/step entropy.
-
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability
Reasoning SFT generalizes cross-domain conditionally on sufficient optimization, high-quality long-CoT data, and strong base models, while degrading safety.
-
Agents of Chaos
An exploratory red-teaming study documents eleven cases of security, privacy, and governance failures in autonomous language-model agents with tool access and persistent memory.