super hub Mixed citations

Gemma 2: Improving Open Language Models at a Practical Size

Cassidy Hardin, Gemma Team: Morgane Riviere, Pier Giuseppe Sessa, Shreya Pathak, Surya Bhupatiraju · 2024 · cs.CL · arXiv 2408.00118

Mixed citation behavior. Most common role is background (64%).

228 Pith papers citing it

Background 64% of classified citations

open full Pith review browse 228 citing papers more from Cassidy Hardin arXiv PDF

abstract

In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer competitive alternatives to models that are 2-3 times bigger. We release all our models to the community.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 23 method 6 baseline 2 dataset 1 other 1

citation-polarity summary

background 21 use method 6 unclear 3 baseline 2 use dataset 1

claims ledger

abstract In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer compe

authors

Cassidy Hardin Gemma Team: Morgane Riviere L\'eonard Hussenot Pier Giuseppe Sessa Shreya Pathak Surya Bhupatiraju

co-cited works

representative citing papers

Masked Generative Transformer Is What You Need for Image Editing

cs.CV · 2026-05-11 · unverdicted · novelty 8.0

EditMGT applies masked generative transformers with attention consolidation and region-hold sampling to deliver state-of-the-art localized image editing at 6x the speed of diffusion methods.

Acceptance Cards:A Four-Diagnostic Standard for Safe Fine-Tuning Defense Claims

cs.CR · 2026-05-11 · unverdicted · novelty 8.0

Acceptance Cards is a new four-diagnostic standard for safe fine-tuning defense claims that requires statistical reliability, fresh semantic generalization, mechanism alignment, and cross-task transfer; under this protocol SafeLoRA fails the full-card pass on Gemma-2-2B-it.

SLAM: Structural Linguistic Activation Marking for Language Models

cs.CL · 2026-05-06 · unverdicted · novelty 8.0 · 2 refs

SLAM achieves 100% detection on Gemma-2 models with only 1-2 point quality cost by causally steering SAE-identified residual-stream directions for linguistic structure.

ArgBench: Benchmarking LLMs on Computational Argumentation Tasks

cs.CL · 2026-04-19 · unverdicted · novelty 8.0

ArgBench unifies 33 existing datasets into a standardized benchmark for testing LLMs across 46 argumentation tasks and analyzes the impact of prompting techniques and model factors on performance.

LiveBench: A Challenging, Contamination-Limited LLM Benchmark

cs.CL · 2024-06-27 · unverdicted · novelty 8.0

LiveBench is a contamination-limited LLM benchmark with auto-scored challenging tasks from recent sources across math, coding, reasoning and more, where top models score below 70%.

Applicability Condition Extraction for Therapeutic Drug-Disease Relations

cs.AI · 2026-06-12 · unverdicted · novelty 7.0

Introduces applicability condition extraction for therapeutic drug-disease relations, creates first annotated dataset of 1,119 pairs, and proposes enhanced LoRA method outperforming baselines.

Every Act Has Its Price: Compressed Moral Composition in Frontier LLMs

cs.CL · 2026-05-29 · unverdicted · novelty 7.0

Moral Trolley Arena shows frontier LLMs produce composite moral preferences that are compressed rather than additive functions of calibrated component act strengths across Moral Foundations Theory.

Wait! There's a Way Out: A Decision Mechanism for Forecasting Conversational Derailment

cs.CL · 2026-05-28 · unverdicted · novelty 7.0

A deferral mechanism using forward-looking simulations reduces false positives in derailment forecasting by selectively waiting when recovery paths appear plausible.

Convergence Without Understanding: When Language Models Agree on Representations but Disagree on Reasoning

cs.CL · 2026-05-22 · unverdicted · novelty 7.0

Representational convergence across 16 LLMs on 800 reasoning problems is stronger for failed tasks and pre-decision stages but shows minimal causal influence on predictions, pointing to shared processing constraints over shared reasoning.

Self-Improving In-Context Learning

cs.CL · 2026-05-22 · unverdicted · novelty 7.0

A test-time zeroth-order optimization of prompt embeddings using a bounded self-supervised proxy from demonstration log-probabilities improves ICL accuracy and correlates with gains across tasks.

GraphFlow: A Graph-Based Workflow Management for Efficient LLM-Agent Serving

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

GraphFlow uses a unified wGraph to dynamically instantiate workflows and manage KV caches for LLM agents, reporting 4.95 pp average gains and 4x memory reduction on five benchmarks.

Tensor Cache: Eviction-conditioned Associative Memory for Transformers

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

Tensor Cache augments sliding-window attention with an eviction-fed outer-product associative memory and a training correction to improve long-context performance under bounded memory.

The Readout Shortcut: Positional Number Copying Dominates Arithmetic CoT Readout in Small Language Models

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

In 1-3B instruction-tuned LMs on GSM8K, arithmetic CoT readout is dominated by positional copying of the trailing number before the answer delimiter, accounting for 54-92 percentage points of accuracy.

Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

Chronicle is the first model jointly pretrained from scratch on text and time series in a unified transformer that matches a comparable language model on NLU tasks and sets new bars for time series classification and multimodal forecasting.

Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs

cs.RO · 2026-05-13 · unverdicted · novelty 7.0

A new speculative inference system speeds up diffusion VLAs to 19.1 ms average latency (3.04x faster) on LIBERO by replacing most full 58 ms inferences with 7.8 ms draft rounds while preserving task performance.

Uncovering Symmetry Transfer in Large Language Models via Layer-Peeled Optimization

math.OC · 2026-05-12 · conditional · novelty 7.0

Symmetries in next-token prediction targets induce corresponding geometric symmetries such as circulant matrices and equiangular tight frames in the optimal weights and embeddings of a layer-peeled LLM surrogate model.

Towards Automated Air Traffic Safety Assessment Around Non-Towered Airports Using Large Language Models

cs.AI · 2026-05-12 · unverdicted · novelty 7.0

Large language models achieve macro F1 scores above 0.85 on binary nominal-versus-danger classification from CTAF radio transcripts and METAR weather data using a new synthetic dataset with a 12-category hazard taxonomy.

Causal Bias Detection in Generative Artificial Intelligence

cs.AI · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

Develops a causal framework unifying generative AI fairness with standard ML, with new decompositions, identification conditions, and estimators demonstrated on LLM race and gender bias.

Cross-Family Universality of Behavioral Axes via Anchor-Projected Representations

cs.AI · 2026-05-11 · unverdicted · novelty 7.0

Behavioral directions from one LLM family transfer to others via projection into a shared anchor coordinate space, yielding 0.83 ten-way detection accuracy and steering effects up to 0.46% on held-out models.

PLOT: Progressive Localization via Optimal Transport in Neural Causal Abstraction

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

PLOT localizes causal variables in neural networks by fitting optimal transport couplings between abstract and neural intervention effect geometries, enabling fast handles or guided search.

Beyond Factor Aggregation: Gauge-Aware Low-Rank Server Representations for Federated LoRA

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

GLoRA replaces raw factor averaging with gauge-aware aggregation in a consensus subspace estimated from client projectors, enabling consistent low-rank federated LoRA under heterogeneity.

Implicit Representations of Grammaticality in Language Models

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

Linear probes on LM hidden states detect grammaticality better than string probabilities, generalize to human benchmarks and other languages, and correlate weakly with likelihood.

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

cs.SE · 2026-05-01 · unverdicted · novelty 7.0 · 2 refs

Themis introduces the largest open code preference dataset with over 350k pairs and trains multilingual reward models from 600M to 32B parameters that support flexible multi-criteria scoring, with experiments showing scaling trends and cross-lingual transfer.

E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems

cs.CR · 2026-05-01 · unverdicted · novelty 7.0

E-MIA converts document details into four types of exam questions and aggregates the RAG's answers into a membership score that separates member and non-member documents better than prior similarity-based or probe-based attacks.

citing papers explorer

Showing 50 of 228 citing papers.

Masked Generative Transformer Is What You Need for Image Editing cs.CV · 2026-05-11 · unverdicted · none · ref 16 · internal anchor
EditMGT applies masked generative transformers with attention consolidation and region-hold sampling to deliver state-of-the-art localized image editing at 6x the speed of diffusion methods.
Acceptance Cards:A Four-Diagnostic Standard for Safe Fine-Tuning Defense Claims cs.CR · 2026-05-11 · unverdicted · none · ref 19 · internal anchor
Acceptance Cards is a new four-diagnostic standard for safe fine-tuning defense claims that requires statistical reliability, fresh semantic generalization, mechanism alignment, and cross-task transfer; under this protocol SafeLoRA fails the full-card pass on Gemma-2-2B-it.
SLAM: Structural Linguistic Activation Marking for Language Models cs.CL · 2026-05-06 · unverdicted · none · ref 6 · 2 links · internal anchor
SLAM achieves 100% detection on Gemma-2 models with only 1-2 point quality cost by causally steering SAE-identified residual-stream directions for linguistic structure.
ArgBench: Benchmarking LLMs on Computational Argumentation Tasks cs.CL · 2026-04-19 · unverdicted · none · ref 57 · internal anchor
ArgBench unifies 33 existing datasets into a standardized benchmark for testing LLMs across 46 argumentation tasks and analyzes the impact of prompting techniques and model factors on performance.
LiveBench: A Challenging, Contamination-Limited LLM Benchmark cs.CL · 2024-06-27 · unverdicted · none · ref 37 · internal anchor
LiveBench is a contamination-limited LLM benchmark with auto-scored challenging tasks from recent sources across math, coding, reasoning and more, where top models score below 70%.
Applicability Condition Extraction for Therapeutic Drug-Disease Relations cs.AI · 2026-06-12 · unverdicted · none · ref 34 · internal anchor
Introduces applicability condition extraction for therapeutic drug-disease relations, creates first annotated dataset of 1,119 pairs, and proposes enhanced LoRA method outperforming baselines.
Every Act Has Its Price: Compressed Moral Composition in Frontier LLMs cs.CL · 2026-05-29 · unverdicted · none · ref 30 · internal anchor
Moral Trolley Arena shows frontier LLMs produce composite moral preferences that are compressed rather than additive functions of calibrated component act strengths across Moral Foundations Theory.
Wait! There's a Way Out: A Decision Mechanism for Forecasting Conversational Derailment cs.CL · 2026-05-28 · unverdicted · none · ref 3 · internal anchor
A deferral mechanism using forward-looking simulations reduces false positives in derailment forecasting by selectively waiting when recovery paths appear plausible.
Convergence Without Understanding: When Language Models Agree on Representations but Disagree on Reasoning cs.CL · 2026-05-22 · unverdicted · none · ref 14 · internal anchor
Representational convergence across 16 LLMs on 800 reasoning problems is stronger for failed tasks and pre-decision stages but shows minimal causal influence on predictions, pointing to shared processing constraints over shared reasoning.
Self-Improving In-Context Learning cs.CL · 2026-05-22 · unverdicted · none · ref 44 · internal anchor
A test-time zeroth-order optimization of prompt embeddings using a bounded self-supervised proxy from demonstration log-probabilities improves ICL accuracy and correlates with gains across tasks.
GraphFlow: A Graph-Based Workflow Management for Efficient LLM-Agent Serving cs.LG · 2026-05-21 · unverdicted · none · ref 20 · internal anchor
GraphFlow uses a unified wGraph to dynamically instantiate workflows and manage KV caches for LLM agents, reporting 4.95 pp average gains and 4x memory reduction on five benchmarks.
Tensor Cache: Eviction-conditioned Associative Memory for Transformers cs.LG · 2026-05-21 · unverdicted · none · ref 15 · internal anchor
Tensor Cache augments sliding-window attention with an eviction-fed outer-product associative memory and a training correction to improve long-context performance under bounded memory.
The Readout Shortcut: Positional Number Copying Dominates Arithmetic CoT Readout in Small Language Models cs.LG · 2026-05-20 · unverdicted · none · ref 27 · internal anchor
In 1-3B instruction-tuned LMs on GSM8K, arithmetic CoT readout is dominated by positional copying of the trailing number before the answer delimiter, accounting for 54-92 percentage points of accuracy.
Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding cs.LG · 2026-05-18 · unverdicted · none · ref 46 · internal anchor
Chronicle is the first model jointly pretrained from scratch on text and time series in a unified transformer that matches a comparable language model on NLU tasks and sets new bars for time series classification and multimodal forecasting.
Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs cs.RO · 2026-05-13 · unverdicted · none · ref 24 · internal anchor
A new speculative inference system speeds up diffusion VLAs to 19.1 ms average latency (3.04x faster) on LIBERO by replacing most full 58 ms inferences with 7.8 ms draft rounds while preserving task performance.
Uncovering Symmetry Transfer in Large Language Models via Layer-Peeled Optimization math.OC · 2026-05-12 · conditional · none · ref 45 · internal anchor
Symmetries in next-token prediction targets induce corresponding geometric symmetries such as circulant matrices and equiangular tight frames in the optimal weights and embeddings of a layer-peeled LLM surrogate model.
Towards Automated Air Traffic Safety Assessment Around Non-Towered Airports Using Large Language Models cs.AI · 2026-05-12 · unverdicted · none · ref 29 · internal anchor
Large language models achieve macro F1 scores above 0.85 on binary nominal-versus-danger classification from CTAF radio transcripts and METAR weather data using a new synthetic dataset with a 12-category hazard taxonomy.
Causal Bias Detection in Generative Artificial Intelligence cs.AI · 2026-05-12 · unverdicted · none · ref 53 · 2 links · internal anchor
Develops a causal framework unifying generative AI fairness with standard ML, with new decompositions, identification conditions, and estimators demonstrated on LLM race and gender bias.
Cross-Family Universality of Behavioral Axes via Anchor-Projected Representations cs.AI · 2026-05-11 · unverdicted · none · ref 15 · internal anchor
Behavioral directions from one LLM family transfer to others via projection into a shared anchor coordinate space, yielding 0.83 ten-way detection accuracy and steering effects up to 0.46% on held-out models.
PLOT: Progressive Localization via Optimal Transport in Neural Causal Abstraction cs.LG · 2026-05-07 · unverdicted · none · ref 7 · 2 links · internal anchor
PLOT localizes causal variables in neural networks by fitting optimal transport couplings between abstract and neural intervention effect geometries, enabling fast handles or guided search.
Beyond Factor Aggregation: Gauge-Aware Low-Rank Server Representations for Federated LoRA cs.LG · 2026-05-07 · unverdicted · none · ref 18 · internal anchor
GLoRA replaces raw factor averaging with gauge-aware aggregation in a consensus subspace estimated from client projectors, enabling consistent low-rank federated LoRA under heterogeneity.
Implicit Representations of Grammaticality in Language Models cs.CL · 2026-05-06 · unverdicted · none · ref 6 · internal anchor
Linear probes on LM hidden states detect grammaticality better than string probabilities, generalize to human benchmarks and other languages, and correlate weakly with likelihood.
Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring cs.SE · 2026-05-01 · unverdicted · none · ref 17 · 2 links · internal anchor
Themis introduces the largest open code preference dataset with over 350k pairs and trains multilingual reward models from 600M to 32B parameters that support flexible multi-criteria scoring, with experiments showing scaling trends and cross-lingual transfer.
E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems cs.CR · 2026-05-01 · unverdicted · none · ref 33 · internal anchor
E-MIA converts document details into four types of exam questions and aggregates the RAG's answers into a membership score that separates member and non-member documents better than prior similarity-based or probe-based attacks.
Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression cs.LG · 2026-04-30 · unverdicted · none · ref 64 · internal anchor
Auto-FlexSwitch achieves efficient dynamic model merging by decomposing task vectors into sparse masks, signs, and scalars, then making the compression learnable via gating and adaptive bit selection with KNN-based retrieval.
Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning astro-ph.GA · 2026-04-28 · unverdicted · none · ref 73 · internal anchor
A single end-to-end Transformer model unifies stellar labels from heterogeneous spectroscopic surveys into a self-consistent scale without post-hoc recalibration.
Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective cs.CL · 2026-04-25 · conditional · none · ref 59 · 2 links · internal anchor
A controlled formal language task reveals fine-tuning outperforms in-context learning on in-distribution generalization but equals it on out-of-distribution, with ICL showing greater sensitivity to model size and tokenization.
Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs cs.CL · 2026-04-23 · unverdicted · none · ref 2 · internal anchor
LLMs exhibit a clear preference for Japanese culture when answering open cultural questions, with this bias emerging after supervised fine-tuning rather than during pre-training.
How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models cs.LG · 2026-04-22 · unverdicted · none · ref 37 · internal anchor
A fitted iso-depth scaling law measures that one recurrence in looped transformers is worth r^0.46 unique blocks in validation loss.
Are LLM Uncertainty and Correctness Encoded by the Same Features? A Functional Dissociation via Sparse Autoencoders cs.LG · 2026-04-21 · unverdicted · none · ref 53 · internal anchor
Uncertainty and correctness in LLMs are encoded by distinct feature populations, with suppression of confounded features improving accuracy and reducing entropy.
MORPHOGEN: A Multilingual Benchmark for Evaluating Gender-Aware Morphological Generation cs.CL · 2026-04-20 · unverdicted · none · ref 44 · internal anchor
MORPHOGEN is a new multilingual benchmark for testing LLMs on gender-aware morphological generation via rewriting first-person sentences to the opposite gender in French, Arabic, and Hindi.
LQM: Linguistically Motivated Multidimensional Quality Metrics for Machine Translation cs.CL · 2026-04-20 · unverdicted · none · ref 53 · internal anchor
LQM introduces a six-level linguistically motivated error taxonomy for MT evaluation and applies it via expert annotation to LLM outputs on a new 3,850-sentence multi-dialect Arabic corpus.
Prune, Interpret, Evaluate: A Cross-Layer Transcoder-Native Framework for Efficient Circuit Discovery via Feature Attribution cs.CL · 2026-04-18 · unverdicted · none · ref 2 · 2 links · internal anchor
PIE prunes CLT features first via FAP and FAP-Synergy to match baseline circuit fidelity at lower feature budgets on IOI and Doc-String tasks, reducing interpretation costs.
Conjunctive Prompt Attacks in Multi-Agent LLM Systems cs.MA · 2026-04-17 · unverdicted · none · ref 35 · internal anchor
Conjunctive prompt attacks split adversarial elements across agents and routing paths in multi-agent LLM systems, evading isolated defenses and succeeding through topology-aware optimization.
Response-Aware User Memory Selection for LLM Personalization cs.AI · 2026-04-15 · unverdicted · none · ref 2 · internal anchor
RUMS selects LLM user memory via mutual information with model outputs to reduce response uncertainty, outperforming similarity-based methods in human alignment and response quality with up to 95% lower cost.
Ruling Out to Rule In: Contrastive Hypothesis Retrieval for Medical Question Answering cs.IR · 2026-04-06 · unverdicted · none · ref 13 · internal anchor
CHR improves medical question answering retrieval by explicitly promoting evidence aligned with a correct hypothesis while penalizing content aligned with a plausible incorrect alternative.
MetaSAEs: Joint Training with a Decomposability Penalty Produces More Atomic Sparse Autoencoder Latents cs.LG · 2026-04-03 · conditional · none · ref 8 · internal anchor
Joint training of a primary SAE with a meta SAE that applies a decomposability penalty on decoder directions produces more atomic latents, shown by 7.5% lower mean absolute phi and 7.6% higher fuzzing scores on GPT-2.
WMF-AM: Probing LLM Working Memory via Depth-Parameterized Cumulative State Tracking cs.AI · 2026-03-28 · unverdicted · none · ref 50 · internal anchor
WMF-AM is a depth-parameterized benchmark that measures LLMs' cumulative state tracking ability without scratchpads, validated on 28 models across arithmetic and non-arithmetic tasks with ablations confirming the construct.
DeEscalWild: A Real-World Benchmark for Automated De-Escalation Training with SLMs cs.CL · 2026-03-20 · unverdicted · none · ref 18 · internal anchor
DeEscalWild supplies 1,500 high-fidelity de-escalation scenarios that let fine-tuned 3B SLMs outperform general-purpose larger models on realism and dialogue metrics.
The Stepwise Informativeness Assumption: Why are Entropy Dynamics and Reasoning Correlated in LLMs? cs.CL · 2026-03-11 · unverdicted · none · ref 9 · internal anchor
The Stepwise Informativeness Assumption explains the correlation between LLM entropy dynamics and reasoning correctness by positing that correct traces accumulate answer-relevant information stepwise during generation.
PEEM: Prompt Engineering Evaluation Metrics for Interpretable Joint Evaluation of Prompts and Responses cs.CL · 2026-03-11 · unverdicted · none · ref 27 · internal anchor
PEEM is a multi-criteria LLM-based evaluator for prompts and responses that aligns with standard accuracy while enabling zero-shot prompt optimization via feedback.
Physical Commonsense Reasoning for Lower-Resourced Languages and Dialects: a Study on Basque cs.CL · 2026-02-16 · conditional · none · ref 12 · internal anchor
BasPhyCo is the first physical commonsense reasoning dataset for Basque and dialects, showing LLMs have limited performance on verifiability tasks especially with dialects.
VTC: DNN Compilation with Virtual Tensors for Data Movement Elimination cs.DC · 2026-02-11 · unverdicted · none · ref 29 · internal anchor
VTC eliminates unnecessary data movement in DNN compilation using virtual tensors tracked by index mappings, achieving up to 1.93x speedup and 60% memory savings on NVIDIA GPUs.
How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability cs.CL · 2026-01-27 · unverdicted · none · ref 16 · internal anchor
Transformer weights at early training stages are closed-form compositions of bigram, token-interchangeability, and context mappings that directly reflect text-corpus statistics and explain the emergence of semantic associations.
SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators cs.AI · 2025-11-05 · unverdicted · none · ref 20 · internal anchor
SnapStream deploys sparse KV attention in a production inference system on dataflow accelerators, delivering 4x on-chip memory savings for DeepSeek-671B at 128k context with up to 1832 tokens/sec and minimal accuracy loss on LongBench-v2, AIME24, and LiveCodeBench.
Activation Steering with a Feedback Controller cs.LG · 2025-10-05 · unverdicted · none · ref 7 · internal anchor
Popular LLM activation steering methods are shown to act as proportional controllers; a PID steering framework is proposed that improves robustness and outperforms baselines in experiments across model families.
Training Agents Inside of Scalable World Models cs.AI · 2025-09-29 · conditional · none · ref 37 · internal anchor
Dreamer 4 is the first agent to obtain diamonds in Minecraft from only offline data by reinforcement learning inside a scalable world model that accurately predicts game mechanics.
Clotho: Measuring Task-Specific Pre-Generation Test Adequacy for LLM Inputs cs.SE · 2025-09-22 · unverdicted · none · ref 41 · internal anchor
Clotho ranks LLM test inputs by failure likelihood using pre-generation hidden states and GMMs, achieving 0.716 ROC-AUC after labeling 5.4% of inputs on average across eight tasks and three models, with transfer to proprietary models.
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach cs.LG · 2025-02-07 · unverdicted · none · ref 156 · internal anchor
A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models cs.LG · 2024-10-07 · accept · none · ref 91 · internal anchor
LLMs display high variance and major accuracy drops on GSM-Symbolic variants of grade-school math problems, indicating they replicate training patterns rather than execute logical reasoning.

Gemma 2: Improving Open Language Models at a Practical Size

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer