pith. sign in

super hub Mixed citations

GLM-5: from Vibe Coding to Agentic Engineering

Mixed citation behavior. Most common role is background (69%).

167 Pith papers citing it
Background 69% of classified citations
abstract

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintaining long-context fidelity. To advance model alignment and autonomy, we implement a new asynchronous reinforcement learning infrastructure that drastically improves post-training efficiency by decoupling generation from training. Furthermore, we propose novel asynchronous agent RL algorithms that further improve RL quality, enabling the model to learn from complex, long-horizon interactions more effectively. Through these innovations, GLM-5 achieves state-of-the-art performance on major open benchmarks. Most critically, GLM-5 demonstrates unprecedented capability in real-world coding tasks, surpassing previous baselines in handling end-to-end software engineering challenges. Code, models, and more information are available at https://github.com/zai-org/GLM-5.

hub tools

citation-role summary

background 24 baseline 6 dataset 2 method 2 other 2

citation-polarity summary

claims ledger

  • abstract We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintaining long-context fidelity. To advance model alignment and autonomy, we implement a new asynchronous reinforcement learning infrastructure that drastically improves post-training efficiency by decoupling generation from training. Furthermore, we propose novel asynchronous agent RL algorithms that fur

authors

co-cited works

years

2026 167

clear filters

representative citing papers

Dockerless: Environment-Free Program Verifier for Coding Agents

cs.SE · 2026-06-26 · unverdicted · novelty 7.0

Dockerless uses agentic repository exploration to verify patches without execution, enabling SFT and RL training of coding agents that reach 62.0/50.0/35.2% resolve rates on SWE-bench Verified/Multilingual/Pro while matching environment-based results.

Toward Agentic SysAdmin: Rethinking System Administration with AI Agents

cs.NI · 2026-06-25 · unverdicted · novelty 7.0

NetLLMeval is an emulation-based framework for benchmarking LLM solvers on network admin tasks, with a 24000-run study showing solver architecture lifts a 14B model from 0.43 to 0.88 accuracy and allows local models to match frontier systems.

MacAgentBench: Benchmarking AI Agents on Real-World macOS Desktop

cs.AI · 2026-06-21 · unverdicted · novelty 7.0

MacAgentBench is a new benchmark for macOS AI agents with 676 tasks, deterministic multi-checkpoint evaluation, and tests across frameworks showing skill libraries drive performance more than framework design.

StaminaBench: Stress-Testing Coding Agents over 100 Interaction Turns

cs.SE · 2026-06-17 · unverdicted · novelty 7.0

StaminaBench evaluates coding agents over 100 procedurally generated change requests to a REST API, finding that tested models fail within 5-6 turns without feedback but improve up to 12x with test feedback and good harnesses.

citing papers explorer

Showing 0 of 0 citing papers after filters.

No citing papers match the current filters.