Title resolution pending

Measuring Massive Multitask Language Understanding

7 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph, and the title resolver retries the DOI and OpenAlex lookups on its next pass.

representative citing papers

When Embedding-Based Defenses Fail: Rethinking Safety in LLM-Based Multi-Agent Systems

cs.CR · 2026-05-01 · unverdicted · novelty 7.0

Embedding-based defenses fail against crafted attacks in LLM MAS; confidence scores from logits improve robustness but decay over communication rounds.

OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving

cs.CL · 2026-04-23 · unverdicted · novelty 7.0

OptiVerse is a new benchmark spanning neglected optimization domains that shows LLMs suffer sharp accuracy drops on hard problems due to modeling and logic errors, with a Dual-View Auditor Agent proposed to improve performance.

Differentiable Mixture-of-Agents Incentivizes Swarm Intelligence of Large Language Models

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

DMoA is a differentiable multi-agent framework for LLMs that uses recurrent context-aware routing and predictive entropy for test-time adaptation, claiming SOTA results on 9 benchmarks with efficiency and robustness.

Who and What? Using Linguistic Features and Annotator Characteristics to Analyze Annotation Variation

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Large-scale statistical analysis of four harmful language datasets reveals that interactions between annotator characteristics and linguistic cues drive annotation variation, with lexical features and attitudes prominent but patterns varying by dataset.

CAP: Controllable Alignment Prompting for Unlearning in LLMs

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

CAP is a reinforcement-learning-driven prompt optimization framework that suppresses target knowledge in LLMs while preserving general capabilities, enabling reversible unlearning without any parameter updates.

Rethinking Layer Relevance in Large Language Models Beyond Cosine Similarity

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

Cosine similarity poorly predicts performance degradation from layer removal in LLMs, making direct accuracy-drop ablation a more reliable relevance metric.

Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

cs.CL · 2026-04-22

citing papers explorer

Showing 1 of 1 citing paper after filters.

Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

cs.CL · 2026-04-22 · unreviewed · ref 108
