arXiv preprint arXiv:2402.01761
11 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- OceanCBM: A Concept Bottleneck Model for Mechanistic Interpretability in Ocean Forecasting
  OceanCBM is the first concept bottleneck model for spatiotemporal ocean prediction that uses mixed supervision on physical concepts and a free concept to deliver consistent mechanistic representations for mixed layer heat content forecasts.
- DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices
  DECO is a sparse MoE architecture with ReLU-based routing, learnable expert scaling, and NormSiLU activation that matches dense Transformer performance at 20% expert activation and delivers 2.93x speedup on Jetson AGX Orin.
- Quantifying Trust: Financial Risk Management for Trustworthy AI Agents
  The paper introduces the Agentic Risk Standard (ARS) as a payment settlement framework that delivers predefined compensation for AI agent execution failures, misalignment, or unintended outcomes.
- The Confusion is Real: GRAPHIC -- A Network Science Approach to Confusion Matrices in Deep Learning
  GRAPHIC interprets confusion matrices from linear classifiers on intermediate layers as graphs to visualize and quantify class confusion dynamics in deep learning.
- MoveFM-R: Advancing Mobility Foundation Models via Language-driven Semantic Reasoning
  MoveFM-R is a framework that bridges mobility foundation models and LLMs using semantically enhanced location encoding, progressive curriculum alignment, and interactive self-reflection to generate plausible trajectories from language inputs.
- Large Language Models for Combinatorial Optimization of Design Structure Matrix
  An LLM framework combines network topology and domain knowledge for iterative DSM sequencing optimization and outperforms stochastic and deterministic baselines on convergence speed and solution quality.
- Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance
  The authors propose creating data probes (synthetic sequences from defined random processes) to reveal how data properties drive LLM behavior across workflow stages.
- Wearable AI in the Era of Large Sensor Models
  Large Sensor Models trained on large-scale multimodal wearable data can provide a scalable, general framework for wearable AI by learning transferable representations across modalities and tasks.
- Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning
  APMPO boosts average Pass@1 scores on math reasoning benchmarks by 3 points over GRPO by using an adaptive power-mean policy objective and feedback-driven clipping bounds in RLVR training.
- Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs
  FREIA applies free energy principles and adaptive advantage shaping to unsupervised RL, outperforming baselines by 0.5-3.5 Pass@1 points on math reasoning with a 1.5B model.
- The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning