Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning

Jinyeop Song; Julian Shun; Junhong Lin; Shicheng Liu; Song Wang; Yada Zhu

arxiv: 2509.26383 · v5 · pith:JKVHLQFTnew · submitted 2025-09-30 · 💻 cs.CL · cs.AI


Junhong Lin, Shicheng Liu, Jinyeop Song, Song Wang, Julian Shun, Yada Zhu

Pith reviewed 2026-05-25 07:42 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords knowledge graph RAG · reinforcement learning · agentic framework · KGQA · transferability · efficiency · single-agent retrieval


The pith

A single reinforcement learning agent for knowledge-graph retrieval-augmented generation reaches higher accuracy with fewer tokens than multi-module systems that rely on larger models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents KG-R1 as an agentic method that replaces fixed pipelines of separate LLM modules with one agent trained by reinforcement learning. The agent treats the knowledge graph itself as its environment and learns to choose retrieval actions while folding the results directly into its ongoing reasoning and answer generation. Tested on KGQA benchmarks, the approach using a 3B-parameter base model produces more accurate answers than earlier multi-module methods that use substantially larger models, while consuming fewer generation tokens. After training, the same agent keeps its accuracy when applied to previously unseen knowledge graphs without any retraining. These outcomes point to lower inference costs and greater flexibility for deploying verifiable knowledge-graph systems.

Core claim

KG-R1 trains one agent through reinforcement learning to interact with a knowledge graph as its environment, selecting retrieval actions at each step and integrating the retrieved information into a single unified reasoning and generation process; on standard KGQA benchmarks this yields higher answer accuracy and lower token counts than prior multi-module workflows even when the base model is only 3B parameters, and the trained agent maintains performance on new graphs without retraining.

What carries the argument

The KG-R1 single agent that treats the knowledge graph as an RL environment and learns retrieval actions jointly with reasoning in one loop.
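The single-agent loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy graph, the action names `get_relations` and `get_tails`, and the `scripted_policy` that stands in for the trained LLM are all hypothetical.

```python
# Toy knowledge graph: (head, relation) -> list of tail entities.
# All entities, relations, and action names here are hypothetical.
TOY_KG = {
    ("Ada", "child_of"): ["Byron"],
    ("Byron", "born_in"): ["London"],
}


class KGEnvironment:
    """Minimal stand-in for a KG retrieval server with generic actions."""

    def __init__(self, triples):
        self.triples = triples

    def step(self, action, *args):
        if action == "get_relations":
            (entity,) = args
            return sorted(r for (h, r) in self.triples if h == entity)
        if action == "get_tails":
            entity, relation = args
            return list(self.triples.get((entity, relation), []))
        return f"error: unknown action {action!r}"


def scripted_policy(question_entity, history):
    """Stand-in for the learned policy: choose the next action from the history."""
    if not history:
        return ("get_relations", question_entity)
    last_action, last_obs = history[-1]
    if last_action[0] == "get_relations" and last_obs:
        return ("get_tails", last_action[1], last_obs[0])
    return ("answer", last_obs)


def run_episode(env, policy, question_entity, max_turns=5):
    """Alternate between policy actions and environment observations."""
    history = []
    for _ in range(max_turns):
        action = policy(question_entity, history)
        if action[0] == "answer":
            return action[1], history
        observation = env.step(*action)
        history.append((action, observation))
    return None, history


answer, trace = run_episode(KGEnvironment(TOY_KG), scripted_policy, "Ada")
print(answer, len(trace))
```

In this sketch the environment only understands generic graph operations, so swapping `TOY_KG` for a graph with a different relation vocabulary needs no change to the loop. Whether a learned policy behaves the same way is exactly what the transfer claim asserts.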

If this is right

Inference cost drops because fewer tokens are generated and a smaller base model suffices.
The same trained agent can be dropped onto new knowledge graphs without additional training or fine-tuning.
The need for separate planning, reasoning, and response modules is removed by the single learned policy.
Real-world KG-RAG deployments become more practical because the system generalizes across graph structures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same RL loop could be applied to other structured retrieval sources such as databases or document hierarchies.
Training once on a diverse collection of graphs might produce agents that handle entirely new domains zero-shot.
If the reward design proves robust, similar agentic RL training could replace multi-module pipelines in non-graph RAG settings.

Load-bearing premise

The reinforcement learning reward signal and the way retrieval actions are defined and scored are assumed to transfer across different knowledge-graph schemas without needing schema-specific redesign or retraining.

What would settle it

Train the agent on one family of knowledge graphs, then evaluate it on a second family whose relation vocabulary and connectivity patterns differ markedly; if accuracy falls sharply, the transfer claim does not hold.
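A rough sketch of how such a check could be scored, assuming Hit@1 as the metric. The predictions, gold answers, and the `DROP_THRESHOLD` cutoff are made up for illustration and are not results or settings from the paper.

```python
def hit_at_1(predictions, gold_answers):
    """Fraction of questions whose top-ranked prediction is in the gold set."""
    hits = sum(
        1 for pred, gold in zip(predictions, gold_answers) if pred and pred[0] in gold
    )
    return hits / len(gold_answers)


def relative_drop(in_domain, out_of_domain):
    """Relative accuracy loss when moving to the unseen graph family."""
    return (in_domain - out_of_domain) / in_domain


# Made-up data purely for illustration.
seen_preds = [["Paris"], ["Berlin"], ["Rome"], ["Madrid"]]
seen_gold = [{"Paris"}, {"Berlin"}, {"Rome"}, {"Lisbon"}]
unseen_preds = [["Q1"], ["Q7"], [], ["Q4"]]
unseen_gold = [{"Q1"}, {"Q2"}, {"Q3"}, {"Q4"}]

seen = hit_at_1(seen_preds, seen_gold)
unseen = hit_at_1(unseen_preds, unseen_gold)
drop = relative_drop(seen, unseen)

DROP_THRESHOLD = 0.2  # hypothetical cutoff for "falls sharply"
print(f"seen={seen:.2f} unseen={unseen:.2f} drop={drop:.1%} sharp={drop > DROP_THRESHOLD}")
```

Any real threshold would have to be chosen before looking at the out-of-domain numbers; the value here is only a placeholder.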

Figures

Figures reproduced from arXiv: 2509.26383 by Jinyeop Song, Julian Shun, Junhong Lin, Shicheng Liu, Song Wang, Yada Zhu.

**Figure 1.** Overview of KG-R1, a multi-turn agentic framework for KG-RAG trained with reinforcement learning. The framework enables cost-efficient inference and demonstrates strong cross-KG transferability.

**Figure 2.** Prior multi-module methods are costly and do not transfer well across KGs. Left: mean end-to-end generated tokens per query on WebQSP (Yih et al., 2016). Right: average Hit@1 over five out-of-training KGQA datasets (see Sec. 4.2). KG-R1 achieves both low token cost and strong cross-KG transferability. Despite improved reasoning accuracy, the real-world deployment of such workflows faces two key challenges…

**Figure 3.** KG-R1 framework: a single LLM agent undergoes a multi-turn generation–execution loop with a schema-agnostic KG retrieval server and responds with the final answer. (§3.2, KG-R1 Framework) KG-R1 casts KG-RAG as a multi-turn interaction with a KG interface (KG retrieval server). We prioritize two design principles. First, we design a single-agent architecture that simplifies deployment and enables efficient, low-…

**Figure 4.** Steady F1 improvement that plateaus after convergence during KG-R1 training on WebQSP and CWQ. The validation F1 scores track the training curve, indicating that the agent generalizes rather than memorizes. Results are reproducible across three random seeds.

**Figure 5.** KG-R1 error types with actual server error messages.

**Figure 6.** Training dynamics of Qwen 2.5b-it across 3 random seeds demonstrate reproducibility.

**Figure 7.** Single-query latency of KG-R1 on one NVIDIA H100. (a) Distribution of end-to-end latency; the dashed line marks the mean 6.38 s, and dotted lines indicate mean±1σ (5.39–7.36 s). (b) Cumulative time versus turn number across 500 queries; diamonds show per-turn means and the dashed trend denotes the average time per turn. The average maximum turn count is 4.2, and the near-linear growth indicates predictable…

**Figure 8.** Training curves of ablation studies for WebQSP, reporting F1 score across training steps.

**Figure 9.** Training curves of ablation studies for CWQ, reporting F1 score across training steps.

**Figure 10.** Example KG-R1 response on WebQSP, showing multi-step reasoning and verification for a person-children query. Blue denotes responses generated by the KG-R1 agent.

**Figure 11.** Example KG-R1 response in CWQ. Blue denotes responses generated by the KG-R1 agent.

**Figure 12.** Example KG-R1 response in SimpleQA. Blue denotes responses generated by the KG-R1 agent.

**Figure 13.** Example KG-R1 response in T-REx. Blue denotes responses generated by the KG-R1 agent.

**Figure 14.** Example KG-R1 response in QALD10en. Blue denotes responses generated by the KG-R1 agent.

**Figure 15.** Example KG-R1 response in GrailQA. Blue denotes responses generated by the KG-R1 agent.

**Figure 16.** Example KG-R1 response on MultiTQ, illustrating all four query functions and their usage. Blue denotes responses generated by the KG-R1 agent.
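As a quick consistency check on the latency figures quoted in the Figure 7 caption, the arithmetic can be worked through as below. The per-turn estimate is an assumption of this sketch: it divides the mean latency by the reported average turn count and presumes the near-linear growth starts from zero, which the caption does not state.

```python
mean_s = 6.38                   # mean end-to-end latency from the caption
lower_s, upper_s = 5.39, 7.36   # mean ± 1σ band from the caption
avg_turns = 4.2                 # "average maximum turn count" from the caption

sigma_s = (upper_s - lower_s) / 2    # half-width of the band
midpoint_s = (upper_s + lower_s) / 2  # should sit near the reported mean
per_turn_s = mean_s / avg_turns       # rough estimate only, see assumption above

print(f"sigma ≈ {sigma_s:.3f} s")
print(f"band midpoint ≈ {midpoint_s:.3f} s")
print(f"time per turn ≈ {per_turn_s:.2f} s")
```

The band midpoint agrees with the reported mean to within rounding, so the quoted interval is internally consistent with a standard deviation of just under one second.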

Original abstract

Knowledge-graph retrieval-augmented generation (KG-RAG) couples large language models (LLMs) with structured, verifiable knowledge graphs (KGs) to reduce hallucination and provide reasoning traces. However, current KG-RAG systems often rely on fixed pipelines of multiple LLM modules (e.g., planning, reasoning, and responding), which inflate inference costs and tie performance to specific graph schemas. To address this, we introduce KG-R1, an agentic framework that optimizes KG-RAG through reinforcement learning (RL). Unlike modular workflows, KG-R1 uses a single agent that interacts with KGs as its environment, learning to retrieve information at each step and incorporating it into its reasoning and generation in a unified process. Across Knowledge-Graph Question Answering (KGQA) benchmarks, KG-R1 demonstrates both efficiency and transferability: using Qwen 2.5-3B, KG-R1 improves answer accuracy with fewer generation tokens than prior multi-module workflow methods that use much larger foundation or fine-tuned models. Furthermore, KG-R1 exhibits strong plug-and-play capability: after training, it maintains accuracy on unseen KGs without retraining. These properties make KG-R1 a promising KG-RAG framework for real-world deployment. Our code is publicly available at github.com/junhongmit/KG-R1/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

KG-R1 gets efficiency gains on KGQA with a 3B model and claims zero-shot transfer, but the transfer result hinges on an unshown schema-invariant action space.


The paper's core move is to replace the usual multi-module KG-RAG pipeline with a single RL agent that treats the knowledge graph as its environment. That unification is the actual novelty; prior work kept planning, retrieval, and generation as separate LLM calls. The abstract reports that Qwen 2.5-3B under this setup beats larger models on accuracy while using fewer generation tokens, and that the same policy keeps accuracy on unseen graphs without retraining. Code release is a plus for anyone who wants to check the implementation.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces KG-R1, an agentic RL framework for KG-RAG in which a single LLM agent (Qwen 2.5-3B) treats the knowledge graph as an environment and learns to perform retrieval actions interleaved with reasoning and generation. It reports higher answer accuracy and lower token usage on KGQA benchmarks than prior multi-module pipelines that rely on larger foundation or fine-tuned models, together with plug-and-play transfer: the trained policy maintains accuracy on unseen KGs without retraining.

Significance. If the efficiency gains and schema-invariant transfer are substantiated, the work would be significant for practical KG-RAG deployment, as it reduces inference cost and eliminates per-graph retraining. The public code release supports reproducibility.

major comments (2)

Abstract and §3 (Method): the transfer claim requires that the action space, state representation, and reward encode no schema-specific structure. The manuscript supplies no explicit formulation of retrieval actions (e.g., whether they are defined over concrete relation vocabularies or abstract operations), leaving the generalization argument unsupported.
§4 (Experiments): accuracy and token-count improvements are stated without error bars, statistical significance tests, training curves, or ablations on reward design and hyperparameters. These omissions are load-bearing for both the efficiency and transfer claims.
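For the second comment, a minimal sketch of the kind of uncertainty reporting being asked for: a mean and standard deviation over seeds, plus a percentile bootstrap interval over per-question outcomes. All numbers are placeholders rather than values from the paper, and the function names are hypothetical.

```python
import random
import statistics


def mean_and_std(scores):
    """Sample mean and sample standard deviation across random seeds."""
    return statistics.mean(scores), statistics.stdev(scores)


def bootstrap_ci(correct_flags, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy over per-question 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(correct_flags)
    means = sorted(
        sum(rng.choices(correct_flags, k=n)) / n for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Placeholder numbers, not taken from the paper.
seed_f1 = [0.71, 0.74, 0.72]
mu, sd = mean_and_std(seed_f1)

flags = [1] * 70 + [0] * 30  # 70 of 100 questions answered correctly
lo, hi = bootstrap_ci(flags)

print(f"F1 over seeds: {mu:.3f} ± {sd:.3f}")
print(f"accuracy 95% CI: [{lo:.2f}, {hi:.2f}]")
```

With only three seeds the standard deviation is itself a noisy estimate, so the per-question bootstrap is usually the more informative of the two.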

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below.


Referee: Abstract and §3 (Method): the transfer claim requires that the action space, state representation, and reward encode no schema-specific structure. The manuscript supplies no explicit formulation of retrieval actions (e.g., whether they are defined over concrete relation vocabularies or abstract operations), leaving the generalization argument unsupported.

Authors: We agree that an explicit formulation is needed to support the transfer claim. The actions in KG-R1 are abstract operations (e.g., RetrieveNeighbors using relation embeddings rather than schema-specific tokens), states use schema-invariant embeddings, and rewards are task-performance based. We will add a formal definition of the action space, state representation, and reward in the revised §3. Revision: yes.

Referee: §4 (Experiments): accuracy and token-count improvements are stated without error bars, statistical significance tests, training curves, or ablations on reward design and hyperparameters. These omissions are load-bearing for both the efficiency and transfer claims.

Authors: We acknowledge that these elements are necessary to substantiate the claims. In the revised §4 we will add error bars from multiple runs, statistical significance tests, training curves, and ablations on reward design and hyperparameters. Revision: yes.

Circularity Check

0 steps flagged

Empirical RL training yields no circular derivation


The paper reports results from training an RL agent (KG-R1) on KGQA benchmarks and evaluating accuracy, token efficiency, and transfer to unseen KGs. No equations, derivations, or fitted-parameter predictions are described that would reduce reported accuracies to training inputs by construction. The central claims rest on experimental outcomes rather than self-referential definitions or self-citation chains that collapse the result.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on an RL environment whose reward function and action space are not specified in the abstract; these are treated as design choices that enable the reported transfer. No new physical or mathematical entities are postulated. The approach inherits standard RL assumptions about Markovian states and reward shaping.

free parameters (1)

RL reward function and hyperparameters
The reward that guides retrieval actions and answer quality is a free parameter chosen to make the agent learn the desired behavior; its exact form is not given in the abstract.
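To make concrete what such a design choice could look like, here is one hypothetical outcome-style reward: set-based answer F1 minus a small penalty for malformed retrieval calls. This is purely illustrative; the paper's actual reward is not given in the material summarized here, and `episode_reward` and `invalid_penalty` are assumptions of this sketch.

```python
def answer_f1(predicted, gold):
    """Set-based F1 between predicted and gold answer entities."""
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def episode_reward(predicted, gold, n_invalid_calls, invalid_penalty=0.1):
    """Hypothetical reward: answer F1 minus a penalty per malformed retrieval call."""
    return answer_f1(predicted, gold) - invalid_penalty * n_invalid_calls


# One correct entity out of two predicted, three gold answers, one bad call.
print(episode_reward(["a", "b"], ["b", "c", "d"], n_invalid_calls=1))
```

The penalty weight is exactly the kind of free parameter the ledger refers to: changing it shifts how strongly the agent is pushed toward well-formed calls versus answer quality.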

axioms (1)

Domain assumption: the KG can be treated as a Markov decision process environment where retrieval actions produce observable state changes.
This modeling choice is required for the single-agent RL formulation to apply.
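One standard way to write this assumption down is as an episodic MDP. The notation below is a generic textbook formulation, not taken from the paper; the symbols q (question), a_t (retrieval or answer action), o_t (observation returned by the graph), and the history-as-state choice are assumptions of this sketch.

```latex
\[
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
s_t = (q, a_1, o_1, \dots, a_{t-1}, o_{t-1}), \qquad
o_t = \mathrm{KG}(a_t),
\]
\[
J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=1}^{T} \gamma^{\,t-1} R(s_t, a_t) \right].
\]
```

Taking the full interaction history as the state is what makes the Markov property hold trivially: the next state depends only on the current state, the chosen action, and the graph's response to it.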

pith-pipeline@v0.9.0 · 5776 in / 1395 out tokens · 16835 ms · 2026-05-25T07:42:08.919187+00:00 · methodology


Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages · 12 internal anchors

[2] Program-guided temporal knowledge graph qa (prog-tqa). In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024. URL https://aclanthology.org/2024.lrec-main.1528.pdf

[3] Jinheon Baek, Alham Fikri Aji, and Amir Saffari. Knowledge-augmented language model prompting for zero-shot knowledge graph question answering, 2023. URL https://arxiv.org/abs/2306.04136

[4] Debayan Banerjee, Pranav Ajit Nair, Ricardo Usbeck, and Chris Biemann. Gett-qa: Graph embedding based t2t transformer for knowledge graph question answering. In ESWC 2023 Workshops (SemDeep-6), 2023. URL https://2023.eswc-conferences.org/wp-content/uploads/2023/05/paper_Banerjee_2023_GETT-QA.pdf

[5] Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. Large-scale Simple Question Answering with Memory Networks. arXiv preprint arXiv:1506.02075, 2015. URL https://arxiv.org/abs/1506.02075

[6] Elizabeth Boschee, Jennifer Lautenschlager, Sean O'Brien, Steve Shellman, James Starz, and Michael Ward. Icews coded event data, 2015. URL https://doi.org/10.7910/DVN/28075

[7] Debanjan Chaudhuri, Md Rashad Al Hasan Rony, and Jens Lehmann. Grounding dialogue systems via knowledge graph aware decoding with pre-trained transformers, 2021. URL https://arxiv.org/abs/2103.16289

[8] Ziyang Chen, Jinzhi Liao, and Xiang Zhao. Multi-granularity Temporal Question Answering over Knowledge Graphs. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 11417–11431, Toronto, Canada, July 2023. Association for Computational Linguistics. doi:10.18653/v1/2023.acl-long.637. URL ...

[9] Ziyang Chen, Dongfang Li, Xiang Zhao, Baotian Hu, and Min Zhang. Temporal knowledge question answering via abstract reasoning induction. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024. URL https://openreview.net/pdf?id=Yb64YLwWk_

[10] Yuri Chervonyi, Trieu H. Trinh, Miroslav Olšák, Xiaomeng Yang, Hoang Nguyen, Marcelo Menegali, Junehyuk Jung, Vikas Verma, Quoc V. Le, and Thang Luong. Gold-medalist performance in solving olympiad geometry with alphageometry2, 2025. URL https://arxiv.org/abs/2502.03544

[11] Jiaxi Cui, Munan Ning, Zongjian Li, Bohua Chen, Yang Yan, Hao Li, Bin Ling, Yonghong Tian, and Li Yuan. Chatlaw: A multi-agent collaborative legal assistant with knowledge graph enhanced mixture-of-experts large language model, 2024. URL https://arxiv.org/abs/2306.16092

[12] Jacopo D'Abramo, Andrea Zugarini, and Paolo Torroni. Dynamic few-shot learning for knowledge graph question answering. arXiv preprint arXiv:2407.01409, 2024

[13] Jacopo D'Abramo, Andrea Zugarini, and Paolo Torroni. Investigating large language models for text-to-sparql generation. In Weijia Shi, Greg Durrett, Hannaneh Hajishirzi, and Luke Zettlemoyer (eds.), Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing, pp. 66–80, Albuquerque, New Mexico, USA, may 2...

[14] DeepSeek-AI. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv:2501.12948, 2025. URL https://arxiv.org/abs/2501.12948

[15] Hady Elsahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon Hare, Frédérique Laforest, and Elena Simperl. T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018. European La...

[16] Encode. Uvicorn. https://www.uvicorn.org/, 2018. ASGI server for Python. Accessed September 24, 2025

[17] Shengxiang Gao, Jey Han Lau, and Jianzhong Qi. Beyond seen data: Improving kbqa generalization through schema-guided logical form generation (sg-kbqa), 2025. URL https://arxiv.org/abs/2502.12737

[18] Yu Gu, Sue Kase, Michelle Vanni, Brian Sadler, Percy Liang, Xifeng Yan, and Yu Su. Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases. In Proceedings of the Web Conference 2021, pp. 3375–3385, Ljubljana, Slovenia, 2021. ACM / IW3C2. doi:10.1145/3442381.3449992. URL https://doi.org/10.1145/3442381.3449992

[19] Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Stanislaw Jastrzebski, Sebastian Riedel, and Edouard Grave. Atlas: Reasoning over language models with retrieval. Journal of Machine Learning Research, 24, 2023. URL https://jmlr.org/papers/v24/22-1381.html

[20] Bowen Jin, Hansi Zeng, Zhenrui Yue, Jinsung Yoon, Sercan Ömer Arik, Dong Wang, Hamed Zamani, and Jiawei Han. Search-R1: Training llms to reason and leverage search engines with reinforcement learning. arXiv preprint arXiv:2503.09516, 2025. doi:10.48550/arXiv.2503.09516. URL https://arxiv.org/abs/2503.09516

[21] Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, and Ion Stoica. Efficient memory management for large language model serving with pagedattention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023

[22] Yunshi Lan and Jing Jiang. Query graph generation for answering multi-hop complex questions from knowledge bases. In Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 969–974, Online, July 2020. Association for Computational Linguistics. doi...

[23] Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, and Steven C. H. Hoi. Coderl: Mastering code generation through pretrained models and deep reinforcement learning, 2022. URL https://arxiv.org/abs/2207.01780

[24] Jaebok Lee and Hyeonjeong Shin. Sparkle: Enhancing sparql generation with direct kg integration in decoding, 2024. URL https://arxiv.org/abs/2407.01626

[25] Zichen Liu, Changyu Chen, Wenjun Li, Penghui Qi, Tianyu Pang, Chao Du, Wee Sun Lee, and Min Lin. Understanding r1-zero-like training: A critical perspective, 2025. URL https://arxiv.org/abs/2503.20783

[26] Linhao Luo, Yuan-Fang Li, Gholamreza Haffari, and Shirui Pan. Reasoning on graphs: Faithful and interpretable large language model reasoning. In ICLR, 2024. URL https://arxiv.org/abs/2310.01061

[27] OpenAI et al. OpenAI o1 System Card. arXiv:2412.16720, 2024. URL https://arxiv.org/abs/2412.16720

[28] Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, and Ryan Lowe. Training language models to follow instructions with human feedback,...

[29] Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-Tür, Gokhan Tur, and Heng Ji. ToolRL: Reward is all tool learning needs. arXiv preprint arXiv:2504.13958, 2025. doi:10.48550/arXiv.2504.13958. URL https://arxiv.org/abs/2504.13958

[30] Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. Toolllm: Facilitating large language models to master 16000+ real-world apis. In ICLR (OpenReview), 2024. URL https://arxiv.org/abs/2307.16789

[31] Qwen, An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li,...

[32] Sebastián Ramírez. Fastapi. https://fastapi.tiangolo.com/, 2018. Accessed September 24, 2025

[33] Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Yanghua Peng, Haibin Lin, and Chuan Wu. Hybridflow: A flexible and efficient rlhf framework. arXiv preprint arXiv:2409.19256, 2024

[34] Jiashuo Sun, Chengjin Xu, Lumingyuan Tang, Saizhuo Wang, Chen Lin, Yeyun Gong, Lionel M. Ni, Heung-Yeung Shum, and Jian Guo. Think-on-graph: Deep and responsible reasoning of large language model on knowledge graph. In ICLR, 2024. URL https://arxiv.org/abs/2307.07697

[35] Alon Talmor and Jonathan Berant. The web as a knowledge-base for answering complex questions. In NAACL, 2018. URL https://arxiv.org/abs/1803.06643

[36] Yuhang Tian, Dandan Song, Zhijing Wu, Changzhi Zhou, Hao Wang, Jun Yang, Jing Xu, Ruanmin Cao, and HaoYu Wang. Augmenting reasoning capabilities of LLMs with graph structures in knowledge base question answering. In Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (eds.), Findings of the Association for Computational Linguistics: EMNLP 2024, pp. 11967-...

[37] Ricardo Usbeck, Christina Unger, Andreas Both, Nandana Mihindukulasooriya, Gushem Tadesse, Dayana Spagnuelo, Filip Ilievski, Diego Moussallem, and Axel-Cyrille Ngonga Ngomo. QALD-10 – The 10th Challenge on Question Answering over Linked Data: Shifting from DBpedia to Wikidata as a KG for KGQA. Semantic Web, 14(1): 1–25, 2023. doi:10.3233/SW-222956....

[38] Hongru Wang, Cheng Qian, Wanjun Zhong, Xiusi Chen, Jiahao Qiu, Shijue Huang, Bowen Jin, Mengdi Wang, Kam-Fai Wong, and Heng Ji. Acting less is reasoning more! teaching model to act efficiently. arXiv preprint arXiv:2504.14870, 2025a. doi:10.48550/arXiv.2504.14870. URL https://arxiv.org/abs/2504.14870. Also referred to as "OTC: Optimal Tool Calls via Re...

[39] Shijie Wang, Wenqi Fan, Yue Feng, Shanru Lin, Xinyu Ma, Shuaiqiang Wang, and Dawei Yin. Knowledge graph retrieval-augmented generation for llm-based recommendation, 2025b. URL https://arxiv.org/abs/2501.02226

[40] Song Wang, Junhong Lin, Xiaojie Guo, Julian Shun, Jundong Li, and Yada Zhu. Reasoning of large language models over knowledge graphs with super-relations. arXiv:2503.22166, 2025c. URL https://arxiv.org/abs/2503.22166

[41] Hao Xiong, Zifeng Wang, Zhijian-huang, Hao tian jia, Yefan-huang, Cheng zhong xu, Zheng-li, Zhi hong chen, Zhi yuan liu, and Zhong zhen su. Knowledge-augmented language model for clinical question answering, 2023

[42] Liang Yao, Chengsheng Mao, and Yuan Luo. Kg-bert: Bert for knowledge graph completion, 2019. URL https://arxiv.org/abs/1909.03193

[43] Wen-tau Yih, Matthew Richardson, Chris Meek, Ming-Wei Chang, and Jina Suh. The value of semantic parse labeling for knowledge base question answering. In ACL, 2016. URL https://aclanthology.org/P16-2033.pdf

[44] Siliang Zeng, Quan Wei, William Brown, Oana Frunza, Yuriy Nevmyvaka, and Mingyi Hong. Reinforcing multi-turn reasoning in llm agents via turn-level credit assignment. arXiv preprint arXiv:2505.11821, 2025

[45] Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. Judging llm-as-a-judge with mt-bench and chatbot arena, 2023. URL https://arxiv.org/abs/2306.05685

[46] Xiangrong Zhu, Yuexiang Xie, Yi Liu, Yaliang Li, and Wei Hu. Knowledge graph-guided retrieval augmented generation, 2025. URL https://arxiv.org/abs/2502.06864
