MIT press Cambridge

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background (2)

citation-polarity summary

years: 2026 (5)

verdicts: unverdicted (5)

roles: background (2)

polarities: background (2)

representative citing papers

State-Centric Decision Process

cs.AI · 2026-05-12 · unverdicted · novelty 7.0

SDP constructs a task-induced state space from raw text by having agents commit to and certify natural-language predicates as states, enabling structured planning and analysis in unstructured language environments.

Score-Based One-step MeanFlow Policy Optimization

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

SOM is an actor-critic algorithm that constructs the target velocity field for one-step MeanFlow policies directly from the Q-function via score estimation and probability flow ODE, achieving claimed SOTA on locomotion tasks with reduced training and inference time.

MeMo: Memory as a Model

cs.CL · 2026-05-14 · unverdicted · novelty 5.0

MeMo encodes new knowledge into a separate memory model that integrates with frozen LLMs, showing strong performance on QA benchmarks while avoiding catastrophic forgetting and working without access to model weights.

citing papers explorer

Showing 5 of 5 citing papers.

  • DepthAgent: Towards Better Universal Depth Estimation via Sample-wise Expert Selection · cs.CV · 2026-05-22 · unverdicted · none · ref 61

    A reinforcement-learned vision-language agent adaptively selects and fuses monocular depth experts per sample for better performance across camera geometries.

  • State-Centric Decision Process · cs.AI · 2026-05-12 · unverdicted · none · ref 40

    SDP constructs a task-induced state space from raw text by having agents commit to and certify natural-language predicates as states, enabling structured planning and analysis in unstructured language environments.

  • Score-Based One-step MeanFlow Policy Optimization · cs.LG · 2026-05-22 · unverdicted · none · ref 25

    SOM is an actor-critic algorithm that constructs the target velocity field for one-step MeanFlow policies directly from the Q-function via score estimation and probability flow ODE, achieving claimed SOTA on locomotion tasks with reduced training and inference time.

  • Iterative Critique-and-Routing Controller for Multi-Agent Systems with Heterogeneous LLMs · cs.AI · 2026-05-09 · unverdicted · none · ref 32

    A critique-and-routing controller cast as a finite-horizon MDP with policy-gradient optimization outperforms one-shot routing baselines on reasoning benchmarks while using the strongest agent for under 25% of calls.

  • MeMo: Memory as a Model · cs.CL · 2026-05-14 · unverdicted · none · ref 79

    MeMo encodes new knowledge into a separate memory model that integrates with frozen LLMs, showing strong performance on QA benchmarks while avoiding catastrophic forgetting and working without access to model weights.