pith. sign in

super hub Mixed citations

GLM-5: from Vibe Coding to Agentic Engineering

Mixed citation behavior. Most common role is background (68%).

106 Pith papers citing it
Background 68% of classified citations
abstract

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintaining long-context fidelity. To advance model alignment and autonomy, we implement a new asynchronous reinforcement learning infrastructure that drastically improves post-training efficiency by decoupling generation from training. Furthermore, we propose novel asynchronous agent RL algorithms that further improve RL quality, enabling the model to learn from complex, long-horizon interactions more effectively. Through these innovations, GLM-5 achieves state-of-the-art performance on major open benchmarks. Most critically, GLM-5 demonstrates unprecedented capability in real-world coding tasks, surpassing previous baselines in handling end-to-end software engineering challenges. Code, models, and more information are available at https://github.com/zai-org/GLM-5.

hub tools

citation-role summary

background 22 baseline 6 dataset 2 method 2 other 2

citation-polarity summary

claims ledger

  • abstract We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintaining long-context fidelity. To advance model alignment and autonomy, we implement a new asynchronous reinforcement learning infrastructure that drastically improves post-training efficiency by decoupling generation from training. Furthermore, we propose novel asynchronous agent RL algorithms that fur

authors

co-cited works

years

2026 106

clear filters

representative citing papers

AMUSE: Anytime Muon with Stable Gradient Evaluation

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

AMUSE is a new optimizer integrating Muon orthogonalization with Schedule-Free averaging via adaptive interpolation for schedule-free anytime training that improves Pareto frontiers on vision and LLM tasks.

KL for a KL: On-Policy Distillation with Control Variate Baseline

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

vOPD stabilizes on-policy distillation gradients by subtracting a closed-form per-token negative reverse KL baseline as a detached control variate, preserving unbiasedness while lowering variance and matching expensive full-vocabulary methods.

citing papers explorer

Showing 8 of 8 citing papers after filters.

  • MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems cs.CR · 2026-05-24 · unverdicted · none · ref 2 · internal anchor

    MemMark enables snapshot-only attribution for agent long-term memory by embedding signals via keyed distribution-preserving sampling at memory-write decisions, recovering 40-bit payloads with near-baseline utility.

  • When Alignment Isn't Enough: Response-Path Attacks on LLM Agents cs.CR · 2026-05-04 · unverdicted · none · ref 66 · internal anchor

    A malicious relay can strategically rewrite aligned LLM outputs in BYOK agent architectures to achieve up to 99.1% attack success on benchmarks like AgentDojo and ASB.

  • Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions cs.CR · 2026-05-21 · unverdicted · none · ref 44 · internal anchor

    A3S-Bench evaluates LLM agents against temporal, spatial, and semantic evasions, raising average risk trigger rates from 28.3% to 52.6% across 2,254 trajectories and 20 scenarios.

  • Exploiting LLM Agent Supply Chains via Payload-less Skills cs.CR · 2026-05-14 · conditional · none · ref 45 · internal anchor

    Semantic Compliance Hijacking lets attackers hijack LLM agents by disguising malicious instructions as compliance rules in skills, reaching up to 77.67% success on confidentiality breaches and 67.33% on RCE while evading all tested scanners.

  • LoopTrap: Termination Poisoning Attacks on LLM Agents cs.CR · 2026-05-07 · unverdicted · none · ref 9 · internal anchor

    LoopTrap is an automated red-teaming framework that crafts termination-poisoning prompts to amplify LLM agent steps by 3.57x on average (up to 25x) across 8 agents.

  • ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection cs.CR · 2026-04-13 · unverdicted · none · ref 44 · 2 links · internal anchor

    ClawGuard enforces deterministic, user-derived access constraints at tool boundaries to block indirect prompt injection without changing the underlying LLM.

  • An Independent Safety Evaluation of Kimi K2.5 cs.CR · 2026-04-03 · conditional · none · ref 3 · internal anchor

    Kimi K2.5 matches closed models on dual-use tasks but refuses fewer CBRNE requests and shows some sabotage and self-replication tendencies.

  • XekRung Technical Report cs.CR · 2026-04-30 · unverdicted · none · ref 204 · internal anchor

    XekRung achieves state-of-the-art performance on cybersecurity benchmarks among same-scale models via tailored data synthesis and multi-stage training while retaining strong general capabilities.