
Title resolution pending

25 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Reward Design for Physical Reasoning in Vision-Language Models

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

Accuracy-based rewards outperform SFT and other reward variants in GRPO training of VLMs on the PhyX physics benchmark, with attention-weight rewards raising spatial reasoning accuracy from 0.27 to 0.50.

Introspective Diffusion Language Models

cs.AI · 2026-04-13 · unverdicted · novelty 6.0

I-DLM, a diffusion language model, matches the quality of same-scale autoregressive models by enforcing introspective consistency via strided decoding, and outperforms prior DLMs on 15 benchmarks, including a score of 69.6 on AIME-24.

MixFlow: Mixed Source Distributions Improve Rectified Flows

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

Mixing unconditional Gaussian noise with a κ-conditioned source during training of rectified flows reduces path curvature, yielding 12% better FID scores and faster sampling than standard rectified flows.

Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training

cs.AI · 2025-09-30 · unverdicted · novelty 6.0

Post-training on reasoning tasks sparks the emergence of specialized attention heads that enable structured computation, with SFT adding stable heads while GRPO uses dynamic activation and pruning tied to reward signals, and controllable think models relying on compensatory heads instead of specific

Dictionary learning for Kernel EDMD

math.DS · 2026-04-28 · unverdicted · novelty 5.0

A dictionary learning method optimizes weighted kernels via gradients for kEDMD to approximate Koopman operators, with pruning of unimportant kernels based on learned weights.

Learning Adaptive Reasoning Paths for Efficient Visual Reasoning

cs.CV · 2026-04-16 · unverdicted · novelty 5.0

AVR trains vision-language models to adaptively select among full reasoning, perception-only, or direct-answer formats using a modified policy optimization method, reducing token use by 50-90% with little accuracy loss.
