Reddi, Stefanie Jegelka, and Sanjiv Kumar

Khashayar Gatmiry, Nikunj Saunshi, Sashank J · 2024 · arXiv 2410.08292

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

Learned Relay Representations for Forward-Thinking Discrete Diffusion Models

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

Learned Relay Representations enable masked diffusion models to propagate useful latent information across denoising steps, scaling to Fast-dLLM v2 to outperform supervised finetuning on coding tasks while cutting inference latency by up to 32%.

Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent

cs.LG · 2026-05-07 · conditional · novelty 7.0

Multi-layer transformers can implement in-context logistic regression by performing normalized gradient descent steps layer by layer, obtained via supervised training of a single attention layer followed by recurrent application with convergence and OOD guarantees.

Scaling Latent Reasoning via Looped Language Models

cs.CL · 2025-10-29 · unverdicted · novelty 7.0

Looped language models with latent iterative computation and entropy-regularized depth allocation achieve performance matching up to 12B standard LLMs through superior knowledge manipulation.

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

cs.LG · 2025-02-07 · unverdicted · novelty 7.0

A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.

ELT: Elastic Looped Transformers for Visual Generation

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

Elastic Looped Transformers share weights across recurrent blocks and apply intra-loop self-distillation to deliver 4x parameter reduction while matching competitive FID and FVD scores on ImageNet and UCF-101.

Simply Stabilizing the Loop via Fully Looped Transformer

cs.LG · 2026-05-11

citing papers explorer

Showing 6 of 6 citing papers.

Learned Relay Representations for Forward-Thinking Discrete Diffusion Models cs.LG · 2026-05-21 · unverdicted · none · ref 5
Learned Relay Representations enable masked diffusion models to propagate useful latent information across denoising steps, scaling to Fast-dLLM v2 to outperform supervised finetuning on coding tasks while cutting inference latency by up to 32%.
Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent cs.LG · 2026-05-07 · conditional · none · ref 16
Multi-layer transformers can implement in-context logistic regression by performing normalized gradient descent steps layer by layer, obtained via supervised training of a single attention layer followed by recurrent application with convergence and OOD guarantees.
Scaling Latent Reasoning via Looped Language Models cs.CL · 2025-10-29 · unverdicted · none · ref 8
Looped language models with latent iterative computation and entropy-regularized depth allocation achieve performance matching up to 12B standard LLMs through superior knowledge manipulation.
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach cs.LG · 2025-02-07 · unverdicted · none · ref 58
A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.
ELT: Elastic Looped Transformers for Visual Generation cs.CV · 2026-04-10 · unverdicted · none · ref 20
Elastic Looped Transformers share weights across recurrent blocks and apply intra-loop self-distillation to deliver 4x parameter reduction while matching competitive FID and FVD scores on ImageNet and UCF-101.
Simply Stabilizing the Loop via Fully Looped Transformer cs.LG · 2026-05-11 · unreviewed · ref 11

Reddi, Stefanie Jegelka, and Sanjiv Kumar

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer