Token assorted: Mixing latent and text tokens for improved language model reasoning

DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng · 2025 · arXiv 2502.03275

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

read on arXiv browse 8 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Dynamic Latent Routing

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

Dynamic Latent Routing jointly learns discrete latent codes, routing policies, and model parameters via dynamic search to match or exceed supervised fine-tuning by 6.6 points on average in low-data settings across four datasets and six models.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

Shorthand for Thought: Compressing LLM Reasoning via Entropy-Guided Supertokens

cs.CL · 2026-04-29 · unverdicted · novelty 7.0

Entropy-guided supertokens from BPE on reasoning traces compress LLM outputs by 8.1% on average across models and math benchmarks with no accuracy loss while exposing strategy differences between correct and incorrect traces.

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

cs.AI · 2026-04-22 · unverdicted · novelty 6.0

HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.

Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

cs.CV · 2026-04-20 · unverdicted · novelty 6.0 · 2 refs

OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.

SeLaR: Selective Latent Reasoning in Large Language Models

cs.CL · 2026-04-09 · unverdicted · novelty 6.0

SeLaR selectively applies latent soft reasoning in LLMs via entropy gating and contrastive regularization, outperforming standard CoT on five benchmarks without training.

LEPO: Latent Reasoning Policy Optimization for Large Language Models

cs.LG · 2026-04-20 · unverdicted · novelty 5.0

LEPO applies RL to continuous latent representations in LLMs by injecting Gumbel-Softmax stochasticity for diverse trajectory sampling and unified gradient estimation, outperforming existing discrete and latent RL methods.

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

cs.CL · 2025-03-20 · accept · novelty 5.0

A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

citing papers explorer

Showing 8 of 8 citing papers.

Dynamic Latent Routing cs.LG · 2026-05-14 · unverdicted · none · ref 42
Dynamic Latent Routing jointly learns discrete latent codes, routing policies, and model parameters via dynamic search to match or exceed supervised fine-tuning by 6.6 points on average in low-data settings across four datasets and six models.
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost cs.AI · 2026-05-07 · conditional · none · ref 89
Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.
Shorthand for Thought: Compressing LLM Reasoning via Entropy-Guided Supertokens cs.CL · 2026-04-29 · unverdicted · none · ref 17
Entropy-guided supertokens from BPE on reasoning traces compress LLM outputs by 8.1% on average across models and math benchmarks with no accuracy loss while exposing strategy differences between correct and incorrect traces.
HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering cs.AI · 2026-04-22 · unverdicted · none · ref 86
HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.
Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation cs.CV · 2026-04-20 · unverdicted · none · ref 90 · 2 links
OneVL achieves superior accuracy to explicit chain-of-thought reasoning at answer-only latency by supervising latent tokens with a visual world model decoder that predicts future frames.
SeLaR: Selective Latent Reasoning in Large Language Models cs.CL · 2026-04-09 · unverdicted · none · ref 37
SeLaR selectively applies latent soft reasoning in LLMs via entropy gating and contrastive regularization, outperforming standard CoT on five benchmarks without training.
LEPO: Latent Reasoning Policy Optimization for Large Language Models cs.LG · 2026-04-20 · unverdicted · none · ref 11
LEPO applies RL to continuous latent representations in LLMs by injecting Gumbel-Softmax stochasticity for diverse trajectory sampling and unified gradient estimation, outperforming existing discrete and latent RL methods.
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models cs.CL · 2025-03-20 · accept · none · ref 163
A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

Token assorted: Mixing latent and text tokens for improved language model reasoning

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer