arXiv preprint arXiv:2402.01761

Rethinking interpretability in the era of large language models · 2024 · arXiv 2402.01761

11 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

OceanCBM: A Concept Bottleneck Model for Mechanistic Interpretability in Ocean Forecasting

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

OceanCBM is the first concept bottleneck model for spatiotemporal ocean prediction that uses mixed supervision on physical concepts and a free concept to deliver consistent mechanistic representations for mixed layer heat content forecasts.

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

cs.LG · 2026-05-11 · unverdicted · novelty 6.0 · 3 refs

DECO is a sparse MoE architecture with ReLU-based routing, learnable expert scaling, and NormSiLU activation that matches dense Transformer performance at 20% expert activation and delivers 2.93x speedup on Jetson AGX Orin.

Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

cs.AI · 2026-04-05 · unverdicted · novelty 6.0

The paper introduces the Agentic Risk Standard (ARS) as a payment settlement framework that delivers predefined compensation for AI agent execution failures, misalignment, or unintended outcomes.

The Confusion is Real: GRAPHIC -- A Network Science Approach to Confusion Matrices in Deep Learning

cs.LG · 2026-02-23 · unverdicted · novelty 6.0

GRAPHIC interprets confusion matrices from linear classifiers on intermediate layers as graphs to visualize and quantify class confusion dynamics in deep learning.

MoveFM-R: Advancing Mobility Foundation Models via Language-driven Semantic Reasoning

cs.LG · 2025-09-26 · unverdicted · novelty 6.0

MoveFM-R is a framework that bridges mobility foundation models and LLMs using semantically enhanced location encoding, progressive curriculum alignment, and interactive self-reflection to generate plausible trajectories from language inputs.

Large Language Models for Combinatorial Optimization of Design Structure Matrix

cs.CE · 2025-06-11 · unverdicted · novelty 6.0

LLM framework combines network topology and domain knowledge for iterative DSM sequencing optimization and outperforms stochastic and deterministic baselines on convergence speed and solution quality.

Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

The authors propose creating data probes—synthetic sequences from defined random processes—to reveal how data properties drive LLM behavior across workflow stages.

Wearable AI in the Era of Large Sensor Models

eess.SP · 2026-04-11 · unverdicted · novelty 5.0

Large Sensor Models trained on large-scale multimodal wearable data can provide a scalable, general framework for wearable AI by learning transferable representations across modalities and tasks.

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

cs.CL · 2026-04-11 · unverdicted · novelty 5.0

FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

cs.CL · 2026-03-30

citing papers explorer

Showing 11 of 11 citing papers.

OceanCBM: A Concept Bottleneck Model for Mechanistic Interpretability in Ocean Forecasting
cs.LG · 2026-05-12 · unverdicted · none · ref 35

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices
cs.LG · 2026-05-11 · unverdicted · none · ref 121 · 3 links

Quantifying Trust: Financial Risk Management for Trustworthy AI Agents
cs.AI · 2026-04-05 · unverdicted · none · ref 37

The Confusion is Real: GRAPHIC -- A Network Science Approach to Confusion Matrices in Deep Learning
cs.LG · 2026-02-23 · unverdicted · none · ref 13

MoveFM-R: Advancing Mobility Foundation Models via Language-driven Semantic Reasoning
cs.LG · 2025-09-26 · unverdicted · none · ref 38

Large Language Models for Combinatorial Optimization of Design Structure Matrix
cs.CE · 2025-06-11 · unverdicted · none · ref 60

Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance
cs.AI · 2026-05-11 · unverdicted · none · ref 34

Wearable AI in the Era of Large Sensor Models
eess.SP · 2026-04-11 · unverdicted · none · ref 33

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning
cs.CL · 2026-04-11 · unverdicted · none · ref 136

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs
cs.CL · 2026-04-11 · unverdicted · none · ref 151

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning
cs.CL · 2026-03-30 · unreviewed · ref 15
