
Measuring massive multitask language understanding. Proceedings of the International Conference on Learning Representations (ICLR)

Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt · 2021

Tool reference. 80% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.

14 Pith papers citing it

Method reference: 80% of classified citations


citation-role summary

dataset: 4 · background: 1

citation-polarity summary

use dataset: 4 · background: 1

representative citing papers

Grid Games: The Power of Multiple Grids for Quantizing Large Language Models

cs.LG · 2026-05-12 · accept · novelty 8.0

Allowing each quantization group to select among multiple 4-bit grids improves accuracy over single-grid FP4 for both post-training and pre-training of LLMs.

Self-Improving In-Context Learning

cs.CL · 2026-05-22 · unverdicted · novelty 7.0

A test-time zeroth-order optimization of prompt embeddings using a bounded self-supervised proxy from demonstration log-probabilities improves ICL accuracy and correlates with gains across tasks.

Dynamic Chunking for Diffusion Language Models

cs.CL · 2026-05-15 · unverdicted · novelty 7.0

DCDM replaces positional blocks with learnable semantic chunks via differentiable Chunking Attention, yielding consistent gains over block and unstructured diffusion baselines up to 1.5B parameters.

Rethinking Vacuity for OOD Detection in Evidential Deep Learning

cs.AI · 2026-05-07 · accept · novelty 7.0

Vacuity-based OOD detection in evidential deep learning is highly sensitive to class cardinality differences between ID and OOD, which can artificially inflate AUROC and AUPR without any change in model predictions.

When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning

cs.AI · 2025-10-08 · unverdicted · novelty 7.0

Anonymization in multi-agent debate reduces identity bias by equalizing self and peer weights in a Bayesian update model, quantified by the Identity Bias Coefficient.

From 0-Order Selection to 2-Order Judgment: Combinatorial Hardening Exposes Compositional Failures in Frontier LLMs

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

LogiHard hardens reasoning benchmarks by transforming 0-order selection into 2-order judgment, causing 31-56% accuracy drops in 12 frontier LLMs and a 47% drop on zero-shot MMLU, revealing a combinatorial reasoning gap rather than knowledge deficits.

EMO: Pretraining Mixture of Experts for Emergent Modularity

cs.CL · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

EMO pretrains MoEs using document boundaries to induce semantic expert specialization, enabling modular subset deployment with minimal accuracy loss unlike standard MoEs.

Active Learning for Communication Structure Optimization in LLM-Based Multi-Agent Systems

cs.MA · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

An ensemble-based information-theoretic active learning method using ensemble Kalman inversion selects valuable tasks to optimize communication structures in LLM multi-agent systems more reliably than random sampling under limited training budgets.

Disentangling Intent from Role: Adversarial Self-Play for Persona-Invariant Safety Alignment

cs.AI · 2026-05-03 · unverdicted · novelty 6.0

PIA achieves lower attack success rates on persona-based jailbreaks via self-play co-evolution of attacks (PLE) and defenses (PICL) that structurally decouples safety from persona context using unilateral KL-divergence.

Spatiotemporal Hidden-State Dynamics as a Signature of Internal Reasoning in Large Language Models

cs.CL · 2026-05-03 · unverdicted · novelty 6.0

Large reasoning models show measurable hidden-state dynamics that a new statistic can use to distinguish correct reasoning trajectories without labels.

On the Implicit Reward Overfitting and the Low-rank Dynamics in RLVR

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

RLVR exhibits implicit reward overfitting to training data and optimizes heavy-tailed singular spectra with rank-1 focus on reasoning capability.

Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity

cs.LG · 2025-12-14 · unverdicted · novelty 5.0 · 2 refs

SPON adds a small set of trainable input-independent activation vectors as representational anchors, trained by distribution matching, to stabilize sparse activation in LLMs and recover performance lost to hidden-state distribution shifts.

Taklif.AI: LLM-Powered Platform for Interest-Based Personalized College Assignments

cs.AI · 2026-05-07 · unverdicted · novelty 4.0

Taklif.AI generates interest-based personalized college assignments via LLMs with prompt engineering and guardrails, receiving positive feedback from 84% of 68 preliminary users.

A Multi-Dimensional Audit of Politically Aligned Large Language Models

cs.CL · 2026-04-27 · unverdicted · novelty 4.0

A multi-dimensional audit framework for politically aligned LLMs finds consistent trade-offs: larger models are more effective and truthful but less fair with higher bias, while fine-tuned models reduce bias but increase hallucinations and reasoning decline, and all tested models show deficiencies.

citing papers explorer

Showing 14 of 14 citing papers.

Grid Games: The Power of Multiple Grids for Quantizing Large Language Models · cs.LG · 2026-05-12 · accept · none · ref 18
Self-Improving In-Context Learning · cs.CL · 2026-05-22 · unverdicted · none · ref 15
Dynamic Chunking for Diffusion Language Models · cs.CL · 2026-05-15 · unverdicted · none · ref 19
Rethinking Vacuity for OOD Detection in Evidential Deep Learning · cs.AI · 2026-05-07 · accept · none · ref 19
When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning · cs.AI · 2025-10-08 · unverdicted · none · ref 28
From 0-Order Selection to 2-Order Judgment: Combinatorial Hardening Exposes Compositional Failures in Frontier LLMs · cs.CL · 2026-05-08 · unverdicted · none · ref 11
EMO: Pretraining Mixture of Experts for Emergent Modularity · cs.CL · 2026-05-07 · unverdicted · none · ref 45 · 2 links
Active Learning for Communication Structure Optimization in LLM-Based Multi-Agent Systems · cs.MA · 2026-05-07 · unverdicted · none · ref 44 · 2 links
Disentangling Intent from Role: Adversarial Self-Play for Persona-Invariant Safety Alignment · cs.AI · 2026-05-03 · unverdicted · none · ref 61
Spatiotemporal Hidden-State Dynamics as a Signature of Internal Reasoning in Large Language Models · cs.CL · 2026-05-03 · unverdicted · none · ref 43
On the Implicit Reward Overfitting and the Low-rank Dynamics in RLVR · cs.LG · 2026-05-07 · unverdicted · none · ref 13
Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity · cs.LG · 2025-12-14 · unverdicted · none · ref 54 · 2 links
Taklif.AI: LLM-Powered Platform for Interest-Based Personalized College Assignments · cs.AI · 2026-05-07 · unverdicted · none · ref 25
A Multi-Dimensional Audit of Politically Aligned Large Language Models · cs.CL · 2026-04-27 · unverdicted · none · ref 37
