Title resolution pending

The Llama 3 Herd of Models · 2024

87 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 3

citation-polarity summary

background: 2 · unclear: 1

representative citing papers

RoPE Distinguishes Neither Positions Nor Tokens in Long Contexts, Provably

cs.CL · 2026-05-15 · conditional · novelty 7.0

Proves that RoPE attention loses locality bias and token distinction in long contexts, approaching random behavior independent of content.

Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

cs.CL · 2026-05-14 · unverdicted · novelty 7.0

DiHAL uses geometry proxies to pick where to replace the lower layers of a pretrained transformer with a diffusion bridge for hidden-state reconstruction, improving over token-level diffusion baselines on 8B models.

LLMs as annotators of credibility assessment in Danish asylum decisions: evaluating classification performance and errors beyond aggregated metrics

cs.CL · 2026-05-13 · accept · novelty 7.0

LLMs can provide cost-effective annotation of credibility in Danish asylum texts but produce inconsistent errors that vary by model and prompt, requiring checks beyond single-model accuracy.

How Much Do Circuits Tell Us? Measuring the Consistency and Specificity of Language Model Circuits

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

Language model circuits show high within-task consistency and necessity but substantial overlap across tasks, making them less specific than assumed.

Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Energy-navigated trajectory shaping during training produces 8-step discrete flow matching students that achieve 32% lower perplexity than 1024-step teachers on 170M language models with unchanged inference cost.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

CircuitFormer is a 511M-parameter encoder-decoder model that generates analog circuit topologies from text prompts at 100% syntactic correctness and 83% functional success using a new subcircuit-mining tokenizer that keeps vocabulary size fixed at 512.

Surprisal Minimisation over Goal-directed Alternatives Predicts Production Choice in Dialogue

cs.CL · 2026-05-01 · unverdicted · novelty 7.0

Surprisal minimization over goal-directed alternatives generated by language models provides the strongest account of production choices in open-ended dialogue compared to uniform information density or length-based costs.

BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

BWLA is the first post-training quantization method for LLMs that achieves 1-bit weights paired with low-bit activations such as 6 bits, using OKT to reshape weights and suppress activation tails plus PSP for low-rank refinement.

Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms

cs.CL · 2026-04-23 · unverdicted · novelty 7.0

LLMs display inconsistent factual recall across different surface forms of the same entity, with greater robustness to minor spelling changes than to aliases or abbreviations.

Adaptive Instruction Composition for Automated LLM Red-Teaming

cs.CR · 2026-04-22 · unverdicted · novelty 7.0

Adaptive Instruction Composition uses a neural contextual bandit with RL to adaptively combine crowdsourced texts, generating more effective and diverse LLM jailbreaks than random or prior adaptive methods on HarmBench.

Are LLM Uncertainty and Correctness Encoded by the Same Features? A Functional Dissociation via Sparse Autoencoders

cs.LG · 2026-04-21 · unverdicted · novelty 7.0

Uncertainty and correctness in LLMs are encoded by distinct feature populations, with suppression of confounded features improving accuracy and reducing entropy.

MUCOCO: Automated Consistency Testing of Code LLMs

cs.SE · 2026-04-21 · unverdicted · novelty 7.0

MUCOCO applies semantic-preserving mutation analysis to automatically expose inconsistent behaviors in code LLMs, detecting inconsistencies in about 15% of cases across 7 models and 4 tasks while outperforming the TURBULENCE baseline.

STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming

cs.CL · 2026-04-21 · unverdicted · novelty 7.0

STAR-Teaming uses a Strategy-Response Multiplex Network inside a multi-agent framework to organize attack strategies into semantic communities, delivering higher attack success rates on LLMs at lower computational cost than prior methods.

Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

R-CAI inverts constitutional AI to automatically generate diverse toxic data for LLM red teaming, with probability clamping improving output coherence by 15% while preserving adversarial strength.

When Vision-Language Models Judge Without Seeing: Exposing Informativeness Bias

cs.AI · 2026-04-20 · unverdicted · novelty 7.0

VLMs as judges exhibit informativeness bias by favoring detailed but image-inconsistent answers; BIRCH mitigates it by first correcting answers against the image, reducing bias up to 17% and improving performance up to 9.8%.

CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation

cs.CL · 2025-02-28 · unverdicted · novelty 7.0

CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

cs.SE · 2025-02-25 · unverdicted · novelty 7.0

SWE-RL applies RL to software evolution data to train LLMs that reach 41% on SWE-bench Verified and generalize to other reasoning tasks.

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

cs.CL · 2024-10-10 · conditional · novelty 7.0

Omni-MATH supplies 4428 human-verified Olympiad math problems, on the most difficult of which top LLMs reach only 52.55% to 60.54% accuracy.

PGT: Procedurally Generated Tasks for improving visual grounding in MLLMs

cs.CV · 2026-05-22 · unverdicted · novelty 6.0

PGT generates synthetic tasks via geometric overlays on images to supply dense visual supervision, improving spatial and relational understanding in MLLMs by up to 20% on targeted benchmarks.

How Many Different Outputs Can a Transformer Generate?

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Transformers are limited to a linearly growing number of accessible output sequences with prompt length, with exponential decay in accessible proportion beyond a critical point, even under unbounded context.

Ex-GraphRAG: Interpretable Evidence Routing for Graph-Augmented LLMs

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

Ex-GraphRAG replaces GNN encoders with M-GNAN for exact node-level decomposition in graph-augmented LLMs, matching black-box performance on STaRK-Prime while exposing semantic-structural mismatches that degrade multi-hop QA when low-attribution intermediaries are removed.

How Far Will They Go? Red-Teaming Online Influence with Large Language Models

cs.CL · 2026-05-20 · unverdicted · novelty 6.0

An empirical red-teaming study measures political Overton Windows across more than 30 open-source LLMs from 10 families and finds left-leaning bias, inverse size correlation, regional variation, and variable jailbreak effectiveness.

How Much Online RL is Enough? Informative Rollouts for Offline Preference Optimization in RLVR

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

Short GRPO warm-up followed by offline DPO on informative rollouts matches or beats full GRPO on math reasoning benchmarks at substantially lower compute cost.

citing papers explorer

Showing 50 of 87 citing papers.

RoPE Distinguishes Neither Positions Nor Tokens in Long Contexts, Provably cs.CL · 2026-05-15 · conditional · none · ref 14
Proves that RoPE attention loses locality bias and token distinction in long contexts, approaching random behavior independent of content.
Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement cs.CL · 2026-05-14 · unverdicted · none · ref 47
DiHAL uses geometry proxies to pick where to replace the lower layers of a pretrained transformer with a diffusion bridge for hidden-state reconstruction, improving over token-level diffusion baselines on 8B models.
LLMs as annotators of credibility assessment in Danish asylum decisions: evaluating classification performance and errors beyond aggregated metrics cs.CL · 2026-05-13 · accept · none · ref 79
LLMs can provide cost-effective annotation of credibility in Danish asylum texts but produce inconsistent errors that vary by model and prompt, requiring checks beyond single-model accuracy.
How Much Do Circuits Tell Us? Measuring the Consistency and Specificity of Language Model Circuits cs.CL · 2026-05-08 · unverdicted · none · ref 11
Language model circuits show high within-task consistency and necessity but substantial overlap across tasks, making them less specific than assumed.
Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation cs.LG · 2026-05-08 · unverdicted · none · ref 37
Energy-navigated trajectory shaping during training produces 8-step discrete flow matching students that achieve 32% lower perplexity than 1024-step teachers on 170M language models with unchanged inference cost.
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost cs.AI · 2026-05-07 · conditional · none · ref 25
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt cs.AI · 2026-05-07 · unverdicted · none · ref 22
CircuitFormer is a 511M-parameter encoder-decoder model that generates analog circuit topologies from text prompts at 100% syntactic correctness and 83% functional success using a new subcircuit-mining tokenizer that keeps vocabulary size fixed at 512.
Surprisal Minimisation over Goal-directed Alternatives Predicts Production Choice in Dialogue cs.CL · 2026-05-01 · unverdicted · none · ref 116
Surprisal minimization over goal-directed alternatives generated by language models provides the strongest account of production choices in open-ended dialogue compared to uniform information density or length-based costs.
BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs cs.LG · 2026-05-01 · unverdicted · none · ref 1
BWLA is the first post-training quantization method for LLMs that achieves 1-bit weights paired with low-bit activations such as 6 bits, using OKT to reshape weights and suppress activation tails plus PSP for low-rank refinement.
Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms cs.CL · 2026-04-23 · unverdicted · none · ref 24
LLMs display inconsistent factual recall across different surface forms of the same entity, with greater robustness to minor spelling changes than to aliases or abbreviations.
Adaptive Instruction Composition for Automated LLM Red-Teaming cs.CR · 2026-04-22 · unverdicted · none · ref 7
Adaptive Instruction Composition uses a neural contextual bandit with RL to adaptively combine crowdsourced texts, generating more effective and diverse LLM jailbreaks than random or prior adaptive methods on HarmBench.
Are LLM Uncertainty and Correctness Encoded by the Same Features? A Functional Dissociation via Sparse Autoencoders cs.LG · 2026-04-21 · unverdicted · none · ref 13
Uncertainty and correctness in LLMs are encoded by distinct feature populations, with suppression of confounded features improving accuracy and reducing entropy.
MUCOCO: Automated Consistency Testing of Code LLMs cs.SE · 2026-04-21 · unverdicted · none · ref 17
MUCOCO applies semantic-preserving mutation analysis to automatically expose inconsistent behaviors in code LLMs, detecting inconsistencies in about 15% of cases across 7 models and 4 tasks while outperforming the TURBULENCE baseline.
STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming cs.CL · 2026-04-21 · unverdicted · none · ref 34
STAR-Teaming uses a Strategy-Response Multiplex Network inside a multi-agent framework to organize attack strategies into semantic communities, delivering higher attack success rates on LLMs at lower computational cost than prior methods.
Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF cs.CL · 2026-04-20 · unverdicted · none · ref 12
R-CAI inverts constitutional AI to automatically generate diverse toxic data for LLM red teaming, with probability clamping improving output coherence by 15% while preserving adversarial strength.
When Vision-Language Models Judge Without Seeing: Exposing Informativeness Bias cs.AI · 2026-04-20 · unverdicted · none · ref 2
VLMs as judges exhibit informativeness bias by favoring detailed but image-inconsistent answers; BIRCH mitigates it by first correcting answers against the image, reducing bias up to 17% and improving performance up to 9.8%.
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation cs.CL · 2025-02-28 · unverdicted · none · ref 46
CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution cs.SE · 2025-02-25 · unverdicted · none · ref 147
SWE-RL applies RL to software evolution data to train LLMs that reach 41% on SWE-bench Verified and generalize to other reasoning tasks.
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models cs.CL · 2024-10-10 · conditional · none · ref 23
Omni-MATH supplies 4428 human-verified Olympiad math problems, on the most difficult of which top LLMs reach only 52.55% to 60.54% accuracy.
PGT: Procedurally Generated Tasks for improving visual grounding in MLLMs cs.CV · 2026-05-22 · unverdicted · none · ref 35
PGT generates synthetic tasks via geometric overlays on images to supply dense visual supervision, improving spatial and relational understanding in MLLMs by up to 20% on targeted benchmarks.
How Many Different Outputs Can a Transformer Generate? cs.LG · 2026-05-21 · unverdicted · none · ref 11
Transformers are limited to a linearly growing number of accessible output sequences with prompt length, with exponential decay in accessible proportion beyond a critical point, even under unbounded context.
Ex-GraphRAG: Interpretable Evidence Routing for Graph-Augmented LLMs cs.LG · 2026-05-21 · unverdicted · none · ref 7
Ex-GraphRAG replaces GNN encoders with M-GNAN for exact node-level decomposition in graph-augmented LLMs, matching black-box performance on STaRK-Prime while exposing semantic-structural mismatches that degrade multi-hop QA when low-attribution intermediaries are removed.
How Far Will They Go? Red-Teaming Online Influence with Large Language Models cs.CL · 2026-05-20 · unverdicted · none · ref 32
An empirical red-teaming study measures political Overton Windows across more than 30 open-source LLMs from 10 families and finds left-leaning bias, inverse size correlation, regional variation, and variable jailbreak effectiveness.
How Much Online RL is Enough? Informative Rollouts for Offline Preference Optimization in RLVR cs.LG · 2026-05-20 · unverdicted · none · ref 18
Short GRPO warm-up followed by offline DPO on informative rollouts matches or beats full GRPO on math reasoning benchmarks at substantially lower compute cost.
BLINKG: A Benchmark for LLM-Integrated Knowledge Graph Generation cs.AI · 2026-05-19 · unverdicted · none · ref 63
BLINKG is a benchmark for evaluating LLMs on mapping input data schemas to ontology concepts for knowledge graph construction, with experiments showing promising but limited performance in complex real-world scenarios.
Forecasting Downstream Performance of LLMs With Proxy Metrics cs.CL · 2026-05-18 · unverdicted · none · ref 85
Proxy metrics from next-token distributions over expert solutions outperform loss and compute baselines for ranking LLMs, selecting pretraining data, and extrapolating performance across compute scales.
DMN: A Compositional Framework for Jailbreaking Multimodal LLMs with Multi-Image Inputs cs.CR · 2026-05-18 · unverdicted · none · ref 47
DMN achieves over 90% attack success rate on GPT-4o, Gemini-2.5-pro and Claude Sonnet 4 by distributing instructions, supplying multimodal evidence, and adding number chain tasks across multiple images.
Self-Supervised On-Policy Distillation for Reasoning Language Models cs.LG · 2026-05-17 · unverdicted · none · ref 74
SSOPD converts intra-group correct-wrong contrast into process supervision by distilling a teacher distribution from the shortest correct completion into prefixes of the longest wrong completion, improving GRPO on AIME and HMMT benchmarks.
MixSD: Mixed Contextual Self-Distillation for Knowledge Injection cs.CL · 2026-05-16 · unverdicted · none · ref 16 · 2 links
MixSD mixes tokens from the base model's expert and naive conditionals to create distribution-aligned supervision for knowledge injection, yielding better memorization-retention trade-offs than SFT across scales and benchmarks.
GRLO: Towards Generalizable Reinforcement Learning in Open-Ended Environments from Zero cs.LG · 2026-05-14 · unverdicted · none · ref 24
GRLO shows RLHF from scratch on 5K open-ended prompts raises average performance from 24.1 to 63.1 across domains on Qwen3-4B-Base using 46x less data and 68x less compute than in-domain RLVR while remaining competitive with heavily post-trained models.
From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents cs.CL · 2026-05-14 · unverdicted · none · ref 16
A dataset-agnostic framework converts text tool-calling benchmarks to paired audio evaluations via TTS, speaker variation and noise, then evaluates seven omni-modal models showing model- and task-dependent performance with small text-to-voice gaps.
Polar probe linearly decodes semantic structures from LLMs cs.CL · 2026-05-13 · unverdicted · none · ref 15 · 2 links
LLMs represent semantic relations geometrically via embedding distance and direction; a linear Polar Probe decodes these structures from middle-layer activations and generalizes to new entities.
Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents cs.CL · 2026-05-13 · unverdicted · none · ref 186 · 2 links
A dual hierarchical RL framework with two agents coordinates high-level dialogue strategy and low-level question generation to emulate judicial questioning and extract key information from Supreme Court arguments, outperforming baselines.
Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time cs.CL · 2026-05-13 · conditional · none · ref 24
OP-Mix is an on-policy data mixing method that uses low-rank adapter interpolation to find near-optimal data mixtures throughout language model training with reduced compute.
Low-Rank Adapters Initialization via Gradient Surgery for Continual Learning cs.LG · 2026-05-12 · unverdicted · none · ref 38
SLICE applies gradient surgery via projection and truncated SVD to initialize LoRA adapters, yielding better stability-plasticity trade-offs on continual learning benchmarks including adversarial task sequences.
ODRPO: Ordinal Decompositions of Discrete Rewards for Robust Policy Optimization cs.LG · 2026-05-12 · unverdicted · none · ref 19 · 2 links
ODRPO decomposes discrete rewards into ordinal binary indicators to create robust, variance-aware advantage estimators for noisy RLAIF in LLM alignment.
Search Your Block Floating Point Scales! cs.LG · 2026-05-12 · unverdicted · none · ref 17
ScaleSearch optimizes block floating point scales via fine-grained search to cut quantization error by 27% for NVFP4, improving PTQ by up to 15 points on MATH500 for Qwen3-8B and attention PPL by 0.77 on Llama 3.1 70B.
MLCommons Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces cs.DC · 2026-05-11 · unverdicted · none · ref 37 · 2 links
Chakra introduces a standardized graph-based execution trace representation for distributed ML workloads along with supporting tools to enable benchmarking, analysis, generation, and co-design across simulators and hardware.
Intrinsic Guardrails: How Semantic Geometry of Personality Interacts with Emergent Misalignment in LLMs cs.CL · 2026-05-11 · unverdicted · none · ref 33
Stable personality vectors in LLMs function as intrinsic guardrails, with ablation increasing emergent misalignment above 40% and amplification reducing it below 3%, enabling zero-shot transfer from aligned to corrupted models.
Modeling Implicit Conflict Monitoring Mechanisms against Stereotypes in LLMs cs.SI · 2026-05-10 · unverdicted · none · ref 62
LLMs contain identifiable COCO neurons that enable implicit self-correction against stereotypes; targeted editing of these neurons improves fairness and robustness to jailbreaks while preserving generation quality.
Bilinear autoencoders find interpretable manifolds cs.LG · 2026-05-09 · unverdicted · none · ref 57
Bilinear autoencoders decompose neural activations into low-rank quadratic forms to discover interpretable multi-dimensional manifolds, improving reconstruction in language models and challenging linear representation assumptions.
BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models cs.CL · 2026-05-07 · unverdicted · none · ref 25
BioTool dataset enables fine-tuning a 4B-parameter LLM to outperform GPT-5.1 in biomedical tool calling while improving downstream answer quality per human experts.
ZAYA1-8B Technical Report cs.AI · 2026-05-06 · unverdicted · none · ref 133
ZAYA1-8B is a reasoning MoE model with 700M active parameters that matches larger models on math and coding benchmarks and reaches 91.9% on AIME'25 via Markovian RSA test-time compute.
Parallel Prefix Verification for Speculative Generation cs.AI · 2026-05-05 · unverdicted · none · ref 43
PARSE accelerates LLM inference via parallel semantic prefix verification in a single forward pass, delivering 1.25x-4.3x speedups alone and up to 4.5x when combined with EAGLE-3.
Logical Consistency as a Bridge: Improving LLM Hallucination Detection via Label Constraint Modeling between Responses and Self-Judgments cs.CL · 2026-05-05 · unverdicted · none · ref 65
LaaB improves LLM hallucination detection by mapping self-judgment labels back into neural feature space and using mutual learning under logical consistency constraints between responses and meta-judgments.
Effective Performance Measurement: Challenges and Opportunities in KPI Extraction from Earnings Calls cs.CL · 2026-05-04 · unverdicted · none · ref 32
Encoder models trained on SEC filings struggle with earnings calls due to domain shift, while LLMs enable open-ended KPI extraction with 79.7% human-verified precision on newly introduced benchmarks.
Programmatic Context Augmentation for LLM-based Symbolic Regression cs.AI · 2026-05-04 · unverdicted · none · ref 42
Programmatic context augmentation lets LLM-based symbolic regression perform code-driven data analysis during search, yielding superior efficiency and accuracy over baselines on LLM-SRBench.
InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition cs.CL · 2026-05-04 · unverdicted · none · ref 31
InfoLaw models pretraining as information accumulation where quality sets information density and repetition causes scale-dependent diminishing returns, predicting loss with low error on unseen mixtures and larger scales up to 7B models and 425B tokens.
Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting cs.LG · 2026-05-04 · unverdicted · none · ref 38
Sharpness-aware pretraining and related flat-minima interventions reduce catastrophic forgetting by up to 80% after post-training across 20M-150M models and by 31-40% at 1B scale.
Learn-to-learn on Arbitrary Textual Conditioning: A Hypernetwork-Driven Meta-Gated LLM cs.CL · 2026-05-03 · unverdicted · none · ref 66
A hypernetwork generates meta-gating parameters for SwiGLU blocks to let LLMs adapt their nonlinearity to arbitrary textual conditions, outperforming finetuning and meta-learning baselines with reasonable generalization to unseen cases.
