Beyond autoregression: Discrete diffusion for complex reasoning and planning. arXiv preprint arXiv:2410.14157

14 Pith papers cite this work. Polarity classification is still indexing.

hub tools

citation-role summary

background 4

citation-polarity summary

years

2026: 13 · 2025: 1

roles

background 4

polarities

background 4

representative citing papers

Adaptive Order Policies for Masked Diffusion

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

A policy network learns to choose unmasking order in masked diffusion by reweighting the loss, outperforming random and heuristic baselines on ordering-sensitive tasks.

Looped Diffusion Language Models

cs.LG · 2026-05-25 · conditional · novelty 6.0

LoopMDM loops the early-middle layers of a masked diffusion model (MDM), matching same-size MDM performance with up to 3.3x fewer training FLOPs and improving on reasoning tasks by up to 8.5 points on GSM8K.

Self-Supervised On-Policy Distillation for Reasoning Language Models

cs.LG · 2026-05-17 · unverdicted · novelty 6.0

SSOPD converts intra-group correct-wrong contrast into process supervision by distilling a teacher distribution from the shortest correct completion into prefixes of the longest wrong completion, improving GRPO on AIME and HMMT benchmarks.

Continuous Latent Diffusion Language Model

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Cola DLM proposes a hierarchical latent diffusion model that learns a text-to-latent mapping, fits a global semantic prior in continuous space with a block-causal DiT, and performs conditional decoding, establishing latent prior modeling as an alternative to token-level autoregressive language model

Reinforcement Learning from Denoising Feedback

cs.CL · 2026-05-25 · unverdicted · novelty 5.0

RLDF is a new RL paradigm for diffusion language models that optimizes toward clipped clean states with weighted timestep sampling and reports substantial gains on reasoning benchmarks for LLaDA and Dream.

SSA: Improving Performance With a Better Scoring Function

cs.CL · 2025-08-20 · unverdicted · novelty 5.0

Replacing Softmax with Scaled Signed Averaging in transformer attention improves generalization under distribution shifts for in-context learning and boosts results on NLP benchmarks.

Agentic Reasoning for Large Language Models

cs.AI · 2026-01-18 · unverdicted · novelty 4.0

The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.
