Embedding-based defenses fail against crafted attacks in LLM multi-agent systems (MAS); confidence scores from logits improve robustness but decay over communication rounds.
7 Pith papers cite this work.
Years: 2026 (7)
Representative citing papers
OptiVerse is a new benchmark spanning neglected optimization domains. It shows that LLMs suffer sharp accuracy drops on hard problems because of modeling and logic errors, and a Dual-View Auditor Agent is proposed to improve performance.
DMoA is a differentiable multi-agent framework for LLMs that uses recurrent context-aware routing and predictive entropy for test-time adaptation, claiming SOTA results on 9 benchmarks with efficiency and robustness.
Large-scale statistical analysis of four harmful language datasets reveals that interactions between annotator characteristics and linguistic cues drive annotation variation, with lexical features and attitudes prominent but patterns varying by dataset.
CAP is a reinforcement-learning-driven prompt optimization framework that suppresses target knowledge in LLMs while preserving general capabilities, enabling reversible unlearning without any parameter updates.
Cosine similarity poorly predicts performance degradation from layer removal in LLMs, making direct accuracy-drop ablation a more reliable relevance metric.