TimelineReasoner: Advancing Timeline Summarization with Large Reasoning Models
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-14 21:29 UTC · model grok-4.3
The pith
TimelineReasoner uses large reasoning models to actively track events globally and fill gaps through targeted retrieval, producing more accurate and coherent timelines than passive LLM approaches.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TimelineReasoner shifts timeline summarization from static generation to an active, reasoning-driven process using large reasoning models. The framework consists of a Global Cognition stage that tracks events at a macroscopic level and continuously updates a global event memory, and a Detail Exploration stage that identifies informational gaps and refines the timeline via targeted document retrieval. It incorporates an Event Scraper for retrieving temporal event descriptions, a Timeline Updater for refining the timeline, and a Supervisor for detecting gaps and guiding retrieval. Experimental results demonstrate that this approach significantly outperforms existing LLM-based TLS methods on open-domain datasets in timeline accuracy, coverage, and coherence, while matching or exceeding state-of-the-art approaches on closed-domain datasets.
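To make the division of labor concrete, here is a minimal sketch of that loop. The stage and module names come from the paper; every interface below (the `lrm` and `retrieve` callables, the Event structure, the round limit) is a hypothetical stand-in for illustration, not the authors' implementation.

```python
# Minimal sketch of the two-stage loop described above. Stage and module names
# (Global Cognition, Detail Exploration, Event Scraper, Timeline Updater,
# Supervisor) come from the paper; `lrm`, `retrieve`, and the Event structure
# are hypothetical stand-ins, not the authors' code.
from dataclasses import dataclass, field

@dataclass
class Event:
    date: str        # e.g. "2024-03-01"
    summary: str

@dataclass
class Timeline:
    events: list[Event] = field(default_factory=list)

    def add(self, event: Event) -> None:
        # Timeline Updater role: insert and keep chronological order.
        self.events.append(event)
        self.events.sort(key=lambda e: e.date)

def build_timeline(query, corpus, lrm, retrieve, max_rounds=3):
    timeline = Timeline()
    memory = []  # global event memory, updated continuously

    # Stage 1: Global Cognition -- macroscopic pass over retrieved documents.
    for doc in retrieve(query, corpus):
        # Event Scraper role: ask the model for dated event descriptions.
        for event in lrm.extract_events(doc):
            memory.append(event)
            timeline.add(event)

    # Stage 2: Detail Exploration -- Supervisor flags gaps, retrieval fills them.
    for _ in range(max_rounds):
        gaps = lrm.detect_gaps(timeline, memory)  # Supervisor role
        if not gaps:
            break
        for gap in gaps:
            for doc in retrieve(gap, corpus):
                for event in lrm.extract_events(doc):
                    timeline.add(event)
    return timeline
```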
What carries the argument
The two-stage reasoning process: Global Cognition handles macroscopic event tracking with continuous memory updates, and Detail Exploration handles gap identification and targeted retrieval, supported by the Event Scraper, Timeline Updater, and Supervisor mechanisms.
If this is right
- Timelines produced from news will contain fewer missing events and fewer ordering inconsistencies.
- The method enables iterative acquisition of evidence and explicit validation of temporal consistency during construction.
- Performance gains hold on open-domain settings and remain competitive on closed-domain settings.
- The framework demonstrates that large reasoning models can move beyond passive generation to structured, memory-augmented information extraction.
Where Pith is reading between the lines
- The same two-stage structure could be adapted to build timelines from legal case files or historical archives where events are scattered across documents.
- Real-time news streams could feed directly into the Global Cognition stage for continuously updated public timelines.
- If the supervisor reliably detects gaps, the approach may reduce the need for exhaustive initial retrieval and lower overall token cost.
Load-bearing premise
The specialized mechanisms of Event Scraper, Timeline Updater, and Supervisor can be implemented reliably on top of existing large reasoning models without introducing new hallucinations or retrieval errors that cancel out the reported gains.
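The premise is at least partly testable without an LRM in the loop. As one illustration, a Supervisor-style gap check can start from a purely statistical heuristic; the sketch below flags unusually long silent stretches in a timeline as candidate retrieval targets. The heuristic and its threshold are our illustration, not the paper's mechanism.

```python
# Illustrative heuristic, not the paper's Supervisor: flag stretches of a
# timeline that stay silent for much longer than the median gap between
# events, as candidate spans for targeted retrieval.
from datetime import date
from statistics import median

def candidate_gaps(event_dates: list[date], factor: float = 3.0) -> list[tuple[date, date]]:
    ds = sorted(event_dates)
    if len(ds) < 3:
        return []
    spacings = [(b - a).days for a, b in zip(ds, ds[1:])]
    threshold = factor * median(spacings)
    return [(a, b) for a, b in zip(ds, ds[1:]) if (b - a).days > threshold]

# Example: a timeline quiet between March and August gets flagged.
dates = [date(2024, 1, 5), date(2024, 1, 20), date(2024, 2, 2),
         date(2024, 3, 1), date(2024, 8, 15)]
print(candidate_gaps(dates))  # [(datetime.date(2024, 3, 1), datetime.date(2024, 8, 15))]
```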
What would settle it
A controlled experiment on the same open-domain TLS datasets would settle it: if TimelineReasoner produced timelines with measurably lower accuracy, coverage, or coherence than a strong baseline LLM prompt, the central claim would be falsified.
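Any such experiment needs a concrete scoring rule. The paper's exact metric implementations are not visible in this excerpt, but date selection F1 is a standard TLS measure and serves as a minimal example, assuming gold and predicted timelines are reduced to sets of ISO date strings.

```python
# Date selection F1, a common TLS score: how well the predicted timeline's
# dates match the gold timeline's dates. This is a standard proxy, not the
# paper's own accuracy/coverage/coherence definitions.
def date_f1(predicted: set[str], gold: set[str]) -> float:
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)          # dates present in both timelines
    precision = tp / len(predicted)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(date_f1({"2024-01-05", "2024-02-02"}, {"2024-01-05", "2024-03-01"}))  # 0.5
```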
Original abstract
The proliferation of online news poses a challenge to extracting structured timelines from unstructured content. While recent studies have shown that Large Language Models (LLMs) can assist Timeline Summarization (TLS), these approaches primarily treat models as passive generators. The emergence of Large Reasoning Models (LRMs) presents an opportunity to reason over events actively, enabling iterative evidence acquisition, the detection of missing events, and the validation of temporal consistency. To systematically leverage the reasoning capabilities of LRMs, we propose TimelineReasoner, a novel framework that shifts TLS from static generation to an active, reasoning-driven process. Unlike prior work, TimelineReasoner adopts a two-stage framework: Global Cognition, which tracks events at a macroscopic level and continuously updates a global event memory, and Detail Exploration, which identifies informational gaps and refines the timeline via targeted document retrieval. To support this, TimelineReasoner incorporates several specialized mechanisms, including an Event Scraper for retrieving temporal event descriptions, a Timeline Updater for refining the timeline, and a Supervisor for detecting gaps in the timeline and guiding retrieval. Experimental results on open-domain TLS datasets demonstrate that TimelineReasoner significantly outperforms existing LLM-based TLS methods in terms of timeline accuracy, coverage, and coherence. On closed-domain TLS datasets, our method performs on par with or exceeds state-of-the-art approaches. This work not only pushes the boundaries of TLS but also highlights the broader potential of LRM-based reasoning frameworks for timeline summarization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes TimelineReasoner, a framework that leverages Large Reasoning Models (LRMs) to advance timeline summarization (TLS) by shifting from passive LLM generation to an active, iterative reasoning process. It introduces a two-stage architecture consisting of Global Cognition (macroscopic event tracking and global memory updates) and Detail Exploration (gap detection and targeted retrieval), supported by specialized modules including an Event Scraper, Timeline Updater, and Supervisor. Experiments on open-domain TLS datasets are reported to show significant gains in timeline accuracy, coverage, and coherence over prior LLM-based methods, with competitive or superior performance on closed-domain datasets.
Significance. If the empirical claims hold under rigorous controls, the work could meaningfully advance TLS by demonstrating how LRM reasoning enables dynamic evidence acquisition and consistency validation, moving beyond static prompting. This may have broader implications for applying active reasoning loops to other structured extraction tasks in news and knowledge summarization.
Major comments (2)
- [Abstract / Experimental Results] The central claim of significant outperformance on open-domain TLS datasets is presented without any description of the baselines employed, the precise definitions or implementations of the accuracy/coverage/coherence metrics, statistical significance testing, or experimental controls (e.g., retrieval-only ablations). This absence leaves the headline result unsupported by visible evidence and prevents assessment of whether the gains derive from the LRM-specific two-stage loop or from ancillary retrieval components.
- [Framework Description] For Global Cognition + Detail Exploration, no quantitative breakdown is supplied for the reliability of the added mechanisms, such as precision/recall of the Event Scraper on temporal event extraction, gap-detection accuracy of the Supervisor, or error-propagation rates through the Timeline Updater. Without such diagnostics, it remains unclear whether the iterative reasoning reduces hallucinations and retrieval noise or amplifies them relative to simpler LLM baselines.
Minor comments (1)
- [Abstract] The abstract would be strengthened by naming the specific open-domain and closed-domain datasets used and by reporting at least one key quantitative delta (e.g., absolute improvement in a primary metric).
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which highlight important areas for strengthening the presentation of our experimental results and framework analysis. We agree that greater transparency on baselines, metrics, controls, and component reliability is needed to fully support the claims. We will revise the manuscript to incorporate these elements.
Point-by-point responses
- Referee: [Abstract / Experimental Results] The central claim of significant outperformance on open-domain TLS datasets is presented without any description of the baselines employed, the precise definitions or implementations of the accuracy/coverage/coherence metrics, statistical significance testing, or experimental controls (e.g., retrieval-only ablations). This absence leaves the headline result unsupported by visible evidence and prevents assessment of whether the gains derive from the LRM-specific two-stage loop or from ancillary retrieval components.
  Authors: We acknowledge that the current version does not sufficiently detail these elements in the abstract and results summary. The full experimental setup section describes the baselines as prior LLM-based TLS approaches (e.g., static prompting methods from recent work on news timeline extraction). Metrics follow standard TLS evaluation protocols: accuracy assesses factual correctness and temporal ordering of events, coverage measures the proportion of key events captured, and coherence evaluates logical consistency and readability. In the revision, we will explicitly define and implement these metrics, report statistical significance via paired t-tests or bootstrap resampling (a paired-bootstrap sketch follows below), and add ablation studies, including retrieval-only variants, to isolate the contribution of the Global Cognition + Detail Exploration reasoning loop. This will clarify that performance gains stem from the LRM-driven iterative process rather than from retrieval alone. (Revision: yes)
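As a concrete reference point for the promised significance testing, here is a minimal paired-bootstrap sketch over per-topic scores. The score arrays are placeholders; nothing here reproduces the paper's data.

```python
# Paired bootstrap resampling over per-topic score differences, one of the
# significance tests the rebuttal proposes. The score lists are placeholders;
# real inputs would be per-topic metric values for TimelineReasoner and a
# baseline on the same topics.
import random

def paired_bootstrap_p(system: list[float], baseline: list[float],
                       n_resamples: int = 10_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    diffs = [s - b for s, b in zip(system, baseline)]
    losses = 0
    for _ in range(n_resamples):
        sample = [rng.choice(diffs) for _ in diffs]
        if sum(sample) / len(sample) <= 0:
            losses += 1  # resampled mean does not favor the system
    return losses / n_resamples  # approximate one-sided p-value

sys_scores = [0.52, 0.61, 0.48, 0.70, 0.55]   # placeholder per-topic scores
base_scores = [0.45, 0.58, 0.44, 0.62, 0.51]
print(paired_bootstrap_p(sys_scores, base_scores))
```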
- Referee: [Framework Description] For Global Cognition + Detail Exploration, no quantitative breakdown is supplied for the reliability of the added mechanisms, such as precision/recall of the Event Scraper on temporal event extraction, gap-detection accuracy of the Supervisor, or error-propagation rates through the Timeline Updater. Without such diagnostics, it remains unclear whether the iterative reasoning reduces hallucinations and retrieval noise or amplifies them relative to simpler LLM baselines.
  Authors: We agree that quantitative diagnostics on component reliability would strengthen the framework analysis and help demonstrate the value of the iterative reasoning. In the revised manuscript, we will add targeted evaluations: precision/recall for the Event Scraper on temporal event extraction using a held-out annotated sample (a minimal scoring sketch follows below); accuracy metrics for the Supervisor's gap detection; and an error-propagation study tracing how mistakes in one module affect downstream timeline quality. These will be compared against simpler LLM baselines to show that the two-stage loop reduces hallucinations and noise rather than amplifying them. (Revision: yes)
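For the Event Scraper diagnostic, a minimal precision/recall scorer might look like the sketch below. The matching rule (exact date plus case-folded summary text) is our assumption; a real evaluation would likely need softer matching.

```python
# Sketch of the component diagnostic promised above: precision/recall of an
# Event Scraper's extracted events against a held-out annotated sample. Events
# are (date, summary) pairs; the exact-match rule is an assumption.
def precision_recall(extracted: list[tuple[str, str]],
                     annotated: list[tuple[str, str]]) -> tuple[float, float]:
    norm = lambda ev: (ev[0], ev[1].strip().lower())
    pred = {norm(e) for e in extracted}
    gold = {norm(e) for e in annotated}
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

extracted = [("2024-02-02", "Ceasefire talks resume"), ("2024-02-09", "Aid convoy blocked")]
annotated = [("2024-02-02", "ceasefire talks resume"), ("2024-02-20", "Vote delayed")]
print(precision_recall(extracted, annotated))  # (0.5, 0.5)
```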
Circularity Check
No circularity: independent engineering framework on pre-trained LRMs
Full rationale
The paper proposes TimelineReasoner as a two-stage engineering framework (Global Cognition for macroscopic event tracking and Detail Exploration for gap-filling via retrieval) built directly on existing Large Reasoning Models, with auxiliary modules (Event Scraper, Timeline Updater, Supervisor) described as implementation components rather than derived quantities. No equations, fitted parameters, or mathematical derivations appear in the provided text; claims rest on experimental comparisons to prior LLM-based TLS methods. No self-definitional reductions, fitted-input predictions, load-bearing self-citations, uniqueness theorems, or ansatz smuggling are present. The contribution is self-contained as an applied system design whose performance is evaluated externally on open- and closed-domain datasets.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "two-stage framework: Global Cognition... Detail Exploration... Event Scraper... Timeline Updater... Supervisor"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "iterative reasoning and targeted information retrieval... dynamic timeline memory"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.