Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought
Pith reviewed 2026-05-08 11:46 UTC · model grok-4.3
The pith
Language models can reason by emitting short sequences of abstract tokens instead of natural-language chains of thought.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Abstract Chain-of-Thought lets a language model generate a short sequence of tokens from a reserved vocabulary in place of a natural-language chain-of-thought before producing the response. Training begins with a policy-iteration-style warm-up loop that alternates between two steps: (i) bottlenecking from a verbal CoT via masking, followed by supervised fine-tuning, and (ii) self-distillation, in which the model is trained to generate abstract tokens from the prompt alone via constrained decoding. After warm-up, warm-started reinforcement learning under constrained decoding optimizes the abstract sequences.
What carries the argument
Abstract Chain-of-Thought, a discrete latent reasoning mechanism in which the model produces tokens from a reserved vocabulary instead of verbal reasoning steps.
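To make "produces tokens from a reserved vocabulary" concrete, here is a minimal sketch of one constrained-decoding step. It is not the paper's implementation. It assumes a plain list of logits and a hypothetical list `reserved_ids` marking which vocabulary ids are the abstract tokens; the function names are invented for illustration.

```python
import math

def constrained_softmax(logits, allowed_ids):
    """Softmax over `logits` restricted to `allowed_ids`; every other id gets probability 0."""
    allowed = set(allowed_ids)
    if not allowed:
        raise ValueError("allowed_ids must not be empty")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(logits[i] for i in allowed)
    weights = [math.exp(x - peak) if i in allowed else 0.0
               for i, x in enumerate(logits)]
    total = sum(weights)
    return [w / total for w in weights]

def greedy_constrained_step(logits, allowed_ids):
    """Pick the highest-scoring token id among the allowed (reserved) ids."""
    return max(allowed_ids, key=lambda i: logits[i])

# Toy vocabulary of 6 ids; ids 4 and 5 stand in for reserved abstract tokens (assumption).
logits = [5.0, 1.0, 0.5, 2.0, 0.1, 0.7]
reserved_ids = [4, 5]
probs = constrained_softmax(logits, reserved_ids)
choice = greedy_constrained_step(logits, reserved_ids)
```

Even though id 0 has the largest raw logit, the step can only return a reserved id, which is the property the method relies on during the reasoning span.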
If this is right
- Reasoning-token usage falls by up to 11.6× while performance remains comparable on mathematical reasoning, instruction-following, and multi-hop reasoning tasks.
- The method transfers across language model families without requiring architecture changes.
- An emergent power-law frequency distribution develops over the abstract vocabulary and evolves across training phases.
- Post-training alone can install latent reasoning that reduces inference cost without changing the model architecture.
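One way to check the power-law bullet is to fit a line to log frequency against log rank over the abstract tokens a model emits. The sketch below is an illustration, not the paper's analysis: the token names such as `<abs_1>` are hypothetical, the sample is synthetic, and an ordinary least-squares fit on log-log data is only a rough diagnostic of power-law behavior.

```python
import math
from collections import Counter

def rank_frequency_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic sample: the token at rank k appears 1200 / k times (a Zipf-like law).
sample = []
for rank in range(1, 7):
    sample.extend([f"<abs_{rank}>"] * (1200 // rank))
slope = rank_frequency_slope(sample)
```

A slope near -1 on the synthetic sample corresponds to a Zipf-like distribution; tracking how the fitted slope moves between warm-up and reinforcement learning would be one way to quantify the "evolves across training phases" part of the claim.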
Where Pith is reading between the lines
- If the abstract tokens encode reusable reasoning structure, they could enable faster transfer to new domains than verbal chains do.
- The learned abstract sequences might allow direct inspection of internal reasoning steps without needing natural-language translation.
- Dynamic switching between verbal and abstract modes could further optimize cost on tasks of varying difficulty.
- The power-law pattern suggests the abstract vocabulary may scale in complexity similarly to natural language when models grow larger.
Load-bearing premise
The two-stage warm-up plus reinforcement learning under constrained decoding can teach the model to use the new abstract tokens for genuine reasoning rather than superficial pattern matching.
What would settle it
Performance on held-out reasoning tasks falls to the level of a model that outputs random tokens from the same reserved vocabulary.
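The control described here can be set up with very little code. The sketch below builds random abstract sequences and measures the accuracy gap between the learned and random conditions. All names (`random_abstract_sequence`, `gap_over_random`, the `<abs_i>` tokens) are hypothetical, and the model call that would turn a sequence into an answer is deliberately left out.

```python
import random

def random_abstract_sequence(reserved_vocab, length, rng):
    """Sample `length` tokens uniformly, with replacement, from the reserved vocabulary."""
    return [rng.choice(reserved_vocab) for _ in range(length)]

def accuracy(predictions, references):
    """Fraction of exact matches between predictions and references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    return sum(p == r for p, r in zip(predictions, references)) / len(references)

def gap_over_random(learned_acc, random_acc):
    """Accuracy gained by learned abstract sequences over the random-token control."""
    return learned_acc - random_acc

reserved_vocab = [f"<abs_{i}>" for i in range(8)]  # hypothetical token names
rng = random.Random(0)                             # fixed seed for a repeatable control
control = random_abstract_sequence(reserved_vocab, 5, rng)
```

A gap close to zero on held-out tasks would be the outcome that undermines the load-bearing premise; a large, stable gap would support it.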
Original abstract
While long, explicit chains-of-thought (CoT) have proven effective on complex reasoning tasks, they are costly to generate during inference. Non-verbal reasoning methods have emerged with shorter generation lengths by leveraging continuous representations, yet their performance lags behind verbalized CoT. We propose Abstract Chain-of-Thought, a discrete latent reasoning post-training mechanism in which the language model produces a short sequence of tokens from a reserved vocabulary in lieu of a natural language CoT, before generating a response. To make previously unseen "abstract" tokens useful, we introduce a policy iteration-style warm-up loop that alternates between (i) bottlenecking from a verbal CoT via masking and performing supervised fine-tuning, and (ii) self-distillation by training the model to generate abstract tokens from the prompt alone via constrained decoding with the codebook. After warm-up, we optimize the generation of abstract sequences with warm-started reinforcement learning under constrained decoding. Abstract-CoT achieves up to 11.6× fewer reasoning tokens while demonstrating comparable performance across mathematical reasoning, instruction-following, and multi-hop reasoning, and generalizes across language model families. We also find an emergent power law distribution over the abstract vocabulary, akin to those seen in natural language, that evolves across the training phases. Our findings highlight the potential for post-training latent reasoning mechanisms that enable efficient inference through a learned abstract reasoning language.
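The abstract does not spell out how "bottlenecking from a verbal CoT via masking" is implemented. One plausible reading, offered purely as an assumption, is an attention mask over a training sequence laid out as prompt, verbal CoT, abstract tokens, response: the abstract tokens may attend to the verbal CoT, but the response may not, so any information the response needs must pass through the abstract tokens. The layout, the function name, and the segment lengths below are all hypothetical.

```python
def bottleneck_mask(n_prompt, n_cot, n_abstract, n_response):
    """Return mask[q][k] == True when query position q may attend to key position k.

    Assumed layout: [prompt | verbal CoT | abstract tokens | response].
    """
    total = n_prompt + n_cot + n_abstract + n_response
    cot_start = n_prompt
    cot_end = n_prompt + n_cot          # exclusive
    resp_start = cot_end + n_abstract
    # Start from an ordinary causal mask.
    mask = [[k <= q for k in range(total)] for q in range(total)]
    # Hide the verbal CoT from every response position (the assumed bottleneck).
    for q in range(resp_start, total):
        for k in range(cot_start, cot_end):
            mask[q][k] = False
    return mask

# Positions: prompt 0-1, verbal CoT 2-4, abstract 5-6, response 7-8.
mask = bottleneck_mask(n_prompt=2, n_cot=3, n_abstract=2, n_response=2)
```

Under this reading, the self-distillation step would then drop the verbal CoT segment entirely, so the model has to produce the abstract tokens from the prompt alone.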
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Abstract Chain-of-Thought (Abstract-CoT), a post-training method in which language models generate short sequences of discrete tokens drawn from a reserved abstract vocabulary in place of explicit natural-language chains-of-thought. The approach uses a two-stage warm-up (masking-based bottlenecking followed by self-distillation under constrained decoding) and subsequent reinforcement learning to optimize abstract-sequence generation. It reports up to 11.6× reduction in reasoning tokens while maintaining comparable performance on mathematical reasoning, instruction-following, and multi-hop reasoning tasks, with generalization across model families and an emergent power-law distribution over the abstract vocabulary.
Significance. If the abstract tokens can be shown to carry genuine intermediate reasoning content rather than serving as compressed predictors, the method would offer a practical route to substantially lower inference cost for reasoning workloads while preserving performance. The cross-family generalization and the observed power-law statistics are noteworthy and, if robust, would strengthen the case for learned discrete latent reasoning languages.
major comments (2)
- [Method (warm-up loop and RL stage)] The central claim that abstract sequences implement latent reasoning (rather than direct prompt-to-answer mappings) is load-bearing yet unsupported by the described procedure. The two-stage warm-up plus RL under constrained decoding permits the model to learn abstract tokens as compressed class labels or surface cues distilled from verbal CoT; no causal intervention, information-probing, or out-of-distribution test is described that would distinguish these alternatives.
- [Experiments and Results] No ablation studies, error bars, training curves, or per-task quantitative tables are referenced that would allow assessment of whether performance is truly comparable or whether RL instabilities occurred. The 11.6× token-reduction figure therefore cannot be evaluated for reliability or sensitivity to the abstract-vocabulary size hyperparameter.
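The information-probing test requested in the first major comment could start from a plug-in mutual information estimate between emitted abstract tokens and labels for the reasoning step they accompany. The sketch below is a generic estimator, not something described in the manuscript; the step labels ("add", "multiply") and token names are invented, and how reasoning steps would be labeled in practice is left open.

```python
import math
from collections import Counter

def mutual_information_bits(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one observation")
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Toy data: tokens that perfectly track the step type vs. tokens unrelated to it.
dependent = [("<abs_0>", "add"), ("<abs_1>", "multiply")] * 50
independent = [(t, s) for t in ("<abs_0>", "<abs_1>")
               for s in ("add", "multiply")] * 25
mi_dependent = mutual_information_bits(dependent)
mi_independent = mutual_information_bits(independent)
```

The plug-in estimator is biased upward on small samples, so a real analysis would need a shuffled-label baseline or a bias correction before reading much into a nonzero value.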
minor comments (2)
- [Abstract and Method] The phrase 'policy iteration-style warm-up loop' is used without specifying the exact alternation schedule, reward shaping, or constrained-decoding implementation details needed for reproducibility.
- [Abstract] A concrete example showing a prompt, the generated abstract token sequence, and the corresponding verbal CoT would help readers understand the learned mapping.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below with point-by-point responses, indicating planned revisions where appropriate to strengthen the work.
- Referee: The central claim that abstract sequences implement latent reasoning (rather than direct prompt-to-answer mappings) is load-bearing yet unsupported by the described procedure. The two-stage warm-up plus RL under constrained decoding permits the model to learn abstract tokens as compressed class labels or surface cues distilled from verbal CoT; no causal intervention, information-probing, or out-of-distribution test is described that would distinguish these alternatives.
  Authors: We agree that direct evidence distinguishing latent reasoning from compressed direct mappings would strengthen the central claim. The warm-up procedure is explicitly designed to enforce an information bottleneck: verbal CoT is masked during supervised fine-tuning, and constrained decoding during self-distillation forces the model to produce and rely on abstract tokens before generating the answer. The subsequent RL stage further optimizes the abstract sequences for task performance. Indirect support comes from the emergent power-law distribution over the abstract vocabulary (evolving across phases, akin to natural language) and consistent generalization across model families, which would be unlikely for arbitrary class labels. We will add a dedicated discussion subsection addressing alternative interpretations and outlining future probing experiments (e.g., mutual information analysis between abstract tokens and reasoning steps).
  Revision: partial
- Referee: No ablation studies, error bars, training curves, or per-task quantitative tables are referenced that would allow assessment of whether performance is truly comparable or whether RL instabilities occurred. The 11.6× token-reduction figure therefore cannot be evaluated for reliability or sensitivity to the abstract-vocabulary size hyperparameter.
  Authors: We acknowledge that the current manuscript lacks these details, limiting assessment of robustness. In the revised version we will expand the Experiments and Results sections to include: ablation studies on the two warm-up stages and on abstract vocabulary size (32/64/128/256 tokens); error bars computed over three random seeds for all main results; training curves for the RL phase showing reward and token-length trajectories; and expanded per-task tables reporting accuracy, token counts, and reduction ratios. The 11.6× figure is the peak reduction observed on GSM8K with vocabulary size 128; we will add a sensitivity table across vocabulary sizes and clarify that all reported numbers use the same constrained decoding setup.
  Revision: yes
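The reporting promised in the second response (reduction ratios with error bars over three seeds) amounts to a small calculation. The sketch below shows it with placeholder token counts that are not results from the paper; `reduction_ratio` and `summarize` are hypothetical helper names.

```python
from statistics import mean, stdev

def reduction_ratio(verbal_tokens, abstract_tokens):
    """How many times fewer reasoning tokens the abstract run used."""
    if abstract_tokens <= 0:
        raise ValueError("abstract_tokens must be positive")
    return verbal_tokens / abstract_tokens

def summarize(values):
    """Mean and sample standard deviation across seeds."""
    return mean(values), stdev(values)

# Placeholder average reasoning-token counts for three seeds (illustrative only).
verbal = [400, 420, 440]
abstract = [40, 40, 40]
ratios = [reduction_ratio(v, a) for v, a in zip(verbal, abstract)]
avg, spread = summarize(ratios)
```

Reporting the mean ratio together with its spread, rather than only the peak, would let readers judge how representative a headline figure such as 11.6× is.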
Circularity Check
No significant circularity; derivation relies on standard empirical training procedures
The paper presents Abstract-CoT as a post-training procedure using masking-based bottlenecking, supervised fine-tuning, self-distillation via constrained decoding, and subsequent reinforcement learning. No equations, predictions, or first-principles claims are described that reduce the token reduction or performance results to a fitted parameter or self-referential definition by construction. The power-law observation over abstract tokens is reported as an empirical finding, not a derived necessity. The claims are checked against external benchmarks through the reported experiments on mathematical reasoning, instruction-following, and multi-hop reasoning tasks, with no load-bearing self-citations or ansatzes that loop back to the target claims.
Axiom & Free-Parameter Ledger
free parameters (1)
- abstract vocabulary size
invented entities (1)
- abstract tokens from reserved vocabulary: no independent evidence
Forward citations
Cited by 2 Pith papers
- Dynamic Latent Routing
  Dynamic Latent Routing jointly learns discrete latent codes, routing policies, and model parameters via dynamic search to match or exceed supervised fine-tuning by 6.6 points on average in low-data settings across fou...
- Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding
  Advanced language representations shape LLMs' schemas to improve knowledge activation and problem-solving.