
arxiv: 2605.25985 · v2 · pith:5ILNLXUCnew · submitted 2026-05-25 · 💻 cs.AI

Neural Scalable Symbolic Search Framework for Complex Logical Queries with Multiple Free Variables

Pith reviewed 2026-06-29 21:50 UTC · model grok-4.3

classification 💻 cs.AI
keywords complex query answering · knowledge graphs · neural symbolic search · joint ranking · existential first-order queries · multi-variable queries · approximate inference

The pith

NS3 approximates joint rankings of answer tuples for existential queries with multiple free variables on knowledge graphs by reducing them step by step over pruned domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces NS3 to address the intractability of ranking complete tuples for EFO_k queries, which ask for combinations of entities satisfying a logical pattern with k free variables. Marginal rankings over single variables serve as a poor stand-in for true joint tuple quality, so the method first solves simpler sub-queries to collect candidate entities. It then merges free variables into hypernodes, applies a dynamic budget B to limit their domains, and reduces the original k-variable query to a (k-1)-variable query over the smaller space. This process repeats until the query has only one free variable. The result is improved joint ranking accuracy on three standard knowledge graph datasets while marginal performance stays competitive, plus a new benchmark for evaluating multi-variable queries directly.
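The gap between marginal and joint rankings can be seen on a toy two-variable query. The sketch below is illustrative only: the score matrix is made up, and both the max-marginalization and the product used to combine marginals are assumptions chosen for the example rather than anything taken from the NS3 code.

```python
import itertools
import numpy as np

# Made-up joint scores for a query with two free variables (x, y).
# Row i is candidate x_i, column j is candidate y_j.
joint = np.array([[0.9, 0.1],
                  [0.1, 0.8]])

# Marginal score of each candidate: the best joint score it takes part in
# (max-marginalization, assumed here for illustration).
marg_x = joint.max(axis=1)
marg_y = joint.max(axis=0)

pairs = list(itertools.product(range(2), range(2)))

# True joint ordering of tuples.
joint_order = sorted(pairs, key=lambda p: joint[p], reverse=True)

# Proxy ordering built only from the two marginal scores.
proxy_order = sorted(pairs, key=lambda p: marg_x[p[0]] * marg_y[p[1]], reverse=True)

print("joint:", joint_order)
print("proxy:", proxy_order)
```

In this toy case the tuple (1, 1) is the second-best joint answer, yet the marginal proxy places it last, because the marginals cannot tell which x belongs with which y.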

Core claim

NS3 answers marginalized sub-queries to obtain necessary candidate sets, merges multiple free variables into hypernodes whose domains are pruned and controlled by a dynamic budget B, and progressively reduces an EFO_k query to an EFO_{k-1} query over a budgeted reduced domain, thereby approximating joint ranking of tuples in E^k without enumerating the full space.

What carries the argument

Hypernode merging combined with dynamic budget B pruning, which reduces an EFO_k query to an EFO_{k-1} query over a controlled candidate domain at each step.
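A minimal sketch of the merge-and-prune idea, under stated assumptions: `EDGE_SCORE`, `partial_score`, `merge_into_hypernode`, and `reduce_query` are hypothetical names, the scores are made up, and the fixed budget and left-to-right merge order stand in for the paper's dynamic budget and query-graph reduction, which are not described here in enough detail to reproduce.

```python
import heapq
import itertools

# Hypothetical pairwise link scores between entities; values are made up.
EDGE_SCORE = {
    ("a1", "b1"): 0.9,
    ("a2", "b2"): 0.8,
    ("b1", "c1"): 0.7,
    ("b2", "c2"): 0.95,
}

def partial_score(entities):
    """Product of link scores over consecutive entities; unseen links get 0.05."""
    score = 1.0
    for head, tail in zip(entities, entities[1:]):
        score *= EDGE_SCORE.get((head, tail), 0.05)
    return score

def merge_into_hypernode(domain_a, domain_b, budget):
    """Combine two node domains and keep only the `budget` best combinations."""
    combined = (a + b for a, b in itertools.product(domain_a, domain_b))
    return heapq.nlargest(budget, combined, key=partial_score)

def reduce_query(domains, budget):
    """Merge the first two nodes repeatedly: k variables, then k-1, down to 1."""
    nodes = [[(entity,) for entity in domain] for domain in domains]
    while len(nodes) > 1:
        hypernode = merge_into_hypernode(nodes[0], nodes[1], budget)
        nodes = [hypernode] + nodes[2:]
    return sorted(nodes[0], key=partial_score, reverse=True)

ranked = reduce_query([["a1", "a2"], ["b1", "b2"], ["c1", "c2"]], budget=2)
print(ranked)
```

With `budget=2` the sketch returns `('a2', 'b2', 'c2')` ahead of `('a1', 'b1', 'c1')`. With `budget=1` the first merge keeps only `('a1', 'b1')`, so the best full tuple is pruned before it can be scored, which is the kind of loss the budget has to guard against.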

If this is right

  • Joint ranking performance improves substantially across the three standard KG datasets.
  • Marginal accuracy on individual variables remains strong.
  • Queries with up to three free variables become evaluable without intractable enumeration.
  • The released joint-ranking benchmark enables direct measurement of tuple quality rather than marginal proxies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same budgeted reduction pattern could be tested on queries with four or more free variables by tuning the budget schedule.
  • The approach may connect to other approximate inference techniques that trade exactness for tractability in structured search problems.
  • If the pruning step preserves most high-quality tuples, the framework could support downstream tasks that consume ranked multi-entity answers rather than single entities.

Load-bearing premise

Candidate sets obtained from marginalized sub-queries remain rich enough after hypernode merging and dynamic budget pruning to support accurate joint ranking of the original tuples.

What would settle it

Compute exact joint rankings for a sample of EFO_2 and EFO_3 queries on one of the three datasets and measure how much NS3's ranked list deviates from the true ordering in precision or NDCG at small cutoffs.
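The comparison described above could be scored along the following lines. This is a sketch, not the paper's evaluation code: `precision_at_k` and `ndcg_at_k` are hypothetical helper names, the tuples and gains are made up, and the NDCG variant uses the common log2 discount.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked tuples that are true joint answers."""
    return sum(1 for t in ranked[:k] if t in relevant) / k

def ndcg_at_k(ranked, gain, k):
    """NDCG@k with a log2 discount; `gain` maps tuples to exact relevance."""
    def dcg(items):
        return sum(gain.get(t, 0.0) / math.log2(i + 2) for i, t in enumerate(items))
    ideal = sorted(gain, key=gain.get, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Made-up exact relevance for three candidate tuples of an EFO_2 query.
exact_gain = {("a", "b"): 1.0, ("c", "d"): 1.0, ("a", "d"): 0.0}
relevant = {t for t, g in exact_gain.items() if g > 0}

# Made-up approximate ranking that misplaces one true answer.
approx_ranked = [("a", "b"), ("a", "d"), ("c", "d")]

print(precision_at_k(approx_ranked, relevant, 2))
print(ndcg_at_k(approx_ranked, exact_gain, 2))
```

Small cutoffs matter here because budget pruning is most damaging when it removes tuples that belong near the top of the exact ordering.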

Figures

Figures reproduced from arXiv: 2605.25985 by Hang Yin, Shukai Zhao, Wei Zhang, Weizhi Fei, Yangqiu Song, Zihao Wang.

Figure 1
Figure 1: Visualization of EFO_2 in fraudulent activities. We present its logical formula and query graph. Notably, the answers to this query are tuples.
… set $\mathcal{E}$, a $k$-variable query has candidates in the Cartesian product $\mathcal{E}^k$. Under KG incompleteness, many tuples can be valid answers only after reasoning over missing facts, which makes exhaustive joint inference over $\mathcal{E}^k$ computationally prohibitive for realistic KGs. …
Figure 2
Figure 2: Visualization of inferring an EFO_1 query with existing neural symbolic search methods [55]. These methods gradually remove the edges connected with constant nodes, self-loop edges, and the edges connected with leaf nodes. The fuzzy vectors are updated accordingly, and the final fuzzy vector for the free variable can induce the predicted answer set.
… including MRR and HIT@10 [38]. Answering EFO_k queries …
Figure 3
Figure 3: Visualization to infer the joint ranking of …
Figure 4
Figure 4: Visualization of the query types in our scalable …
Figure 5
Figure 5: The performance of NS3 (J) and NS3 (M) varying with the domain budget …
original abstract

Complex Query Answering (CQA) is a fundamental knowledge representation and reasoning task over incomplete knowledge graphs (KGs). Answering existential first-order queries with $k$ free variables (i.e., $\text{EFO}_k$ queries) is a crucial yet challenging problem, as it requires ranking answer tuples in $\mathcal{E}^k$, where $\mathcal{E}$ denotes the entity set of a KG. This quickly becomes intractable as $k$ grows. Consequently, existing benchmarks and methods rely on marginal rankings over individual variables; however, marginal rankings are a poor proxy for the true joint ranking of tuples. Building on neural symbolic search for $\text{EFO}_1$ queries, we propose Neural Scalable Symbolic Search (NS3), a budgeted framework that approximates joint ranking without enumerating $\mathcal{E}^k$. NS3 (i) answers marginalized sub-queries to obtain necessary candidate sets, (ii) merges multiple free variables into hypernodes whose domains are pruned and controlled by a dynamic budget $B$, and (iii) progressively reduces an $\text{EFO}_k$ query to an $\text{EFO}_{k-1}$ query over a budgeted reduced domain. Across three standard KG datasets, NS3 substantially improves joint ranking performance while retaining strong marginal accuracy. We further release a joint-ranking benchmark that extends existing $\text{EFO}_1$ datasets to $k=3$, enabling systematic evaluation of multi-variable queries. Our code is provided in https://github.com/HKUST-KnowComp/NS3_KDD2026.
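A quick piece of arithmetic shows how fast the joint candidate space $\mathcal{E}^k$ grows. The entity count below is a hypothetical round number, not a figure from the paper, and `joint_space_size` is a name introduced only for this example.

```python
def joint_space_size(num_entities, k):
    """Number of candidate tuples in the Cartesian product E^k."""
    return num_entities ** k

num_entities = 15_000  # hypothetical KG size, chosen only for illustration

for k in (1, 2, 3):
    print(f"k={k}: {joint_space_size(num_entities, k):,} candidate tuples")
```

Going from one free variable to three moves the candidate count from thousands to trillions at this size, which is why exhaustive joint scoring is ruled out.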

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes Neural Scalable Symbolic Search (NS3), a budgeted reduction framework for answering EFO_k queries over knowledge graphs. It obtains candidate sets from marginalized sub-queries, merges free variables into hypernodes whose domains are pruned by a dynamic budget B, and iteratively reduces EFO_k to EFO_{k-1}. The central empirical claim is that NS3 substantially improves joint ranking of answer tuples while retaining strong marginal accuracy on three standard KG datasets; the authors also release code and a new joint-ranking benchmark extending existing EFO_1 data to k=3.

Significance. If the pruning step reliably retains the top joint tuples, the work would meaningfully advance scalable complex query answering by moving beyond marginal proxies. The provision of reproducible code and an explicit joint-ranking benchmark for k=3 is a concrete strength that supports systematic evaluation.

major comments (1)
  1. [Abstract / Method] Abstract and method description: the claim that marginalized candidate sets plus hypernode merging and B-pruning are sufficient to approximate true joint ranking rests on the unverified assumption that these sets contain the relevant high-joint-score tuples; no recall of the generated candidate sets against ground-truth joint answers, nor exhaustive small-k enumeration to quantify pruning loss, is reported. This directly bears on the joint-ranking improvement result.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. The concern about verifying that candidate sets retain high-joint-score tuples is well-taken and directly relevant to the core claim. We respond to the major comment below.

point-by-point responses
  1. Referee: [Abstract / Method] Abstract and method description: the claim that marginalized candidate sets plus hypernode merging and B-pruning are sufficient to approximate true joint ranking rests on the unverified assumption that these sets contain the relevant high-joint-score tuples; no recall of the generated candidate sets against ground-truth joint answers, nor exhaustive small-k enumeration to quantify pruning loss, is reported. This directly bears on the joint-ranking improvement result.

    Authors: We agree that the current manuscript does not report explicit recall of the marginalized candidate sets against ground-truth joint answers, nor exhaustive enumeration for small k to measure pruning loss. This verification would strengthen the empirical support for the approximation. In the revised version we will add a dedicated analysis: (i) recall@B of the hypernode candidate sets against the released k=3 joint-ranking benchmark ground truth, and (ii) exhaustive enumeration on small-k subsets (k=2) of the data to quantify any loss introduced by the dynamic budget B. These additions will be placed in the experimental section and will directly address whether the pruning step retains the relevant high-joint-score tuples. revision: yes
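The recall@B analysis promised in the rebuttal might be computed roughly as follows. This is a sketch under assumptions: `recall_at_budget` is a hypothetical helper, the candidate tuples, scores, and gold answers are made up, and the budget is treated as a fixed cutoff rather than the dynamic schedule the paper describes.

```python
def recall_at_budget(scored_candidates, gold_tuples, budget):
    """Share of gold joint answers that survive pruning to the top `budget` candidates."""
    if not gold_tuples:
        raise ValueError("gold_tuples must not be empty")
    best_first = sorted(scored_candidates, key=lambda pair: pair[1], reverse=True)
    kept = {candidate for candidate, _ in best_first[:budget]}
    return len(kept & gold_tuples) / len(gold_tuples)

# Made-up hypernode candidates with scores, and made-up ground-truth tuples.
candidates = [
    (("a", "b"), 0.9),
    (("a", "c"), 0.7),
    (("d", "b"), 0.4),
    (("d", "c"), 0.2),
]
gold = {("a", "b"), ("d", "c")}

for budget in (1, 2, 4):
    print(budget, recall_at_budget(candidates, gold, budget))
```

Sweeping the budget in this way would show how much headroom the pruning step leaves: recall that saturates at small B supports the approximation, while recall that only reaches 1.0 near the full domain does not.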

Circularity Check

0 steps flagged

Minor self-citation to prior EFO_1 neural symbolic search; central budgeted reduction for EFO_k is independent

full rationale

The paper explicitly builds on prior neural symbolic search for EFO_1 queries but introduces new steps (marginalized sub-queries, hypernode merging, dynamic budget B pruning) to reduce EFO_k to EFO_{k-1}. The joint-ranking improvement is reported as an empirical outcome on three KG datasets rather than a quantity forced by construction from fitted inputs or self-citations. No equation or claim reduces the reported performance gain to a renamed fit or to an unverified self-citation chain; the released joint-ranking benchmark further supplies independent evaluation content.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim rests on the assumption that sub-query candidate sets plus budgeted hypernode pruning preserve joint ranking quality; budget B is a tunable control parameter whose specific values are not detailed in the abstract.

free parameters (1)
  • budget B
    Dynamic budget controlling reduced domain size for hypernodes; value chosen to trade off computation and accuracy.
axioms (1)
  • domain assumption: Marginalized sub-queries produce candidate sets that contain all high-quality joint tuples.
    Invoked when NS3 answers marginalized sub-queries to obtain necessary candidate sets.
invented entities (1)
  • hypernode (no independent evidence)
    purpose: Merging multiple free variables into a single node whose domain is pruned by budget B.
    New construct introduced to enable progressive reduction of EFO_k to EFO_{k-1}.


Reference graph

Works this paper leans on

66 extracted references · 22 canonical work pages · 2 internal anchors

  1. [1]

    Erik Arakelyan, Daniel Daza, Pasquale Minervini, and Michael Cochez. 2020. Complex Query Answering with Neural Link Predictors. In International Conference on Learning Representations

  2. [2]

    Erik Arakelyan, Pasquale Minervini, and Isabelle Augenstein. 2023. Adapting Neural Link Predictors for Complex Query Answering. doi:10.48550/arXiv.2301.12313 arXiv:2301.12313 [cs]

  3. [3]

    Jiaxin Bai, Xin Liu, Weiqi Wang, Chen Luo, and Yangqiu Song. 2023. Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints. In Thirty-seventh Conference on Neural Information Processing Systems. https://openreview.net/forum?id=qQnO1HLQHe

  4. [4]

    Jiaxin Bai, Zhaobo Wang, Junfei Cheng, Dan Yu, Zerui Huang, Weiqi Wang, Xin Liu, Chen Luo, Yanming Zhu, Bo Li, et al. 2026. Intention knowledge graph construction for user intention relation modeling. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 466–484

  5. [5]

    Jiaxin Bai, Zihao Wang, Yukun Zhou, Hang Yin, Weizhi Fei, Qi Hu, Zheye Deng, Jiayang Cheng, Tianshi Zheng, Hong Ting Tsang, et al. 2025. Top ten challenges towards agentic neural graph databases. arXiv preprint arXiv:2501.14224 (2025)

  6. [6]

    Jiaxin Bai, Tianshi Zheng, and Yangqiu Song. 2023. Sequential query encoding for complex query answering on knowledge graphs. arXiv preprint arXiv:2302.13114 (2023)

  7. [7]

    Yushi Bai, Xin Lv, Juanzi Li, and Lei Hou. 2023. Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization. In Neural Scalable Symbolic Search Framework for Complex Logical Queries with Multiple Free Variables KDD 2026, August 9–13, 2026, Jeju Island, Republic of Korea. Proceedings of the 40th International Conference o...

  8. [8]

    Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. In Advances in Neural Information Processing Systems, Vol. 26. Curran Associates, Inc. https://papers.nips.cc/paper_files/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html

  9. [9]

    Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam Hruschka, and Tom Mitchell. 2010. Toward an architecture for never-ending language learning. In Proceedings of the AAAI conference on artificial intelligence, Vol. 24. 1306–1313. Issue: 1

  10. [10]

    Xuelu Chen, Ziniu Hu, and Yizhou Sun. 2022. Fuzzy logic based logical query answering on knowledge graphs. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 3939–3948. Issue: 4

  11. [11]

    Nurendra Choudhary, Nikhil Rao, Sumeet Katariya, Karthik Subbian, and Chandan Reddy. 2021. Probabilistic entity representation model for reasoning over knowledge graphs. Advances in Neural Information Processing Systems 34 (2021), 23440–23451

  12. [12]

    Nurendra Choudhary and Chandan K. Reddy. 2023. Complex Logical Reasoning over Knowledge Graphs Using Large Language Models. arXiv:2305.01157 [cs]

  13. [13]

    Weizhi Fei, Hao Shi, Jing Xu, Jingchen Peng, Jiazheng Li, Jingzhao Zhang, Bo Bai, Wei Han, Zhenyuan Chen, and Xueyan Niu. 2026. Scaling Knowledge Editing in LLMs to 100,000 Facts with Neural KV Database. In The Fourteenth International Conference on Learning Representations. https://openreview.net/forum?id=Z0CX62CSJQ

  14. [14]

    Weizhi Fei, Zihao Wang, Hang Yin, Shukai Zhao, Wei Zhang, and Yangqiu Song

  15. [15]

    Efficient and Scalable Neural Symbolic Search for Knowledge Graph Complex Query Answering. https://api.semanticscholar.org/CorpusID:278534407

  16. [16]

    Weizhi Fei, Zihao Wang, Hang Yin, Yang Duan, and Yangqiu Song. 2025. Extending Complex Logical Queries on Uncertain Knowledge Graphs. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher Pilehvar (Eds.). Association for Comp...

  17. [17]

    Mikhail Galkin, Jincheng Zhou, Bruno Ribeiro, Jian Tang, and Zhaocheng Zhu

  18. [18]

    A Foundation Model for Zero-shot Logical Query Reasoning. In The Thirty-eighth Annual Conference on Neural Information Processing Systems. https://openreview.net/forum?id=JRSyMBBJi6

  19. [19]

    Will Hamilton, Payal Bajaj, Marinka Zitnik, Dan Jurafsky, and Jure Leskovec

  20. [20]

    Embedding logical queries on knowledge graphs. Advances in neural information processing systems 31 (2018)

  21. [21]

    Yunjie He, Bo Xiong, Daniel Hernández, Yuqicheng Zhu, Evgeny Kharlamov, and Steffen Staab. 2025. Dage: Dag query answering via relational combinator with logical constraints. In Proceedings of the ACM on Web Conference 2025. 2514–2529

  22. [22]

    Christian Hirsch, John Hosking, and John Grundy. 2009. Interactive visualization tools for exploring the semantic graph of large knowledge spaces. (2009)

  23. [23]

    Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 (2020), 22118–22133

  24. [24]

    Mayank Kharbanda, Rajiv Ratn Shah, and Raghava Mutharaju. 2024. RConE: Rough Cone Embedding for Multi-Hop Logical Query Answering on Multi-Modal Knowledge Graphs. arXiv preprint arXiv:2408.11526 (2024)

  25. [25]

    Yufei Li, Yisen Gao, Jiaxin Bai, Jiaxuan Xiong, Haoyu Huang, Zhongwei Xie, Hong Ting Tsang, and Yangqiu Song. 2026. Towards Neural Graph Data Management. arXiv preprint arXiv:2603.05529 (2026)

  26. [26]

    Xueyuan Lin, Haihong E, Chengjin Xu, Gengxian Zhou, Haoran Luo, Tianyi Hu, Fenglong Su, Ningyuan Li, and Mingzhi Sun. 2023. TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph. In Thirty-seventh Conference on Neural Information Processing Systems. https://openreview.net/forum?id=oaGdsgB18L

  27. [27]

    Lihui Liu. 2025. Graph-O1: Monte Carlo Tree Search with Reinforcement Learning for Text-Attributed Graph Reasoning. arXiv preprint arXiv:2512.17912 (2025)

  28. [28]

    Lihui Liu. 2025. HyperKGR: Knowledge Graph Reasoning in Hyperbolic Space with Graph Neural Network Encoding Symbolic Path. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. 25188–25199

  29. [29]

    Lihui Liu. 2025. Monte Carlo Tree Search for Graph Reasoning in Large Language Model Agents. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management. 4966–4970

  30. [30]

    Lihui Liu, Jiayuan Ding, Subhabrata Mukherjee, and Carl Yang. 2026. Mixrag: Mixture-of-experts retrieval-augmented generation for textual graph understanding and question answering. In Proceedings of the ACM Web Conference 2026. 4350–4359

  31. [31]

    Lihui Liu, Boxin Du, Jiejun Xu, Yinglong Xia, and Hanghang Tong. 2022. Joint Knowledge Graph Completion and Question Answering. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1098–1108

  32. [32]

    Lihui Liu and Kai Shu. 2025. Unifying knowledge in agentic llms: Concepts, methods, and recent advancements. ACM SIGKDD Explorations Newsletter 27, 2 (2025), 88–96

  33. [33]

    Lihui Liu, Zihao Wang, and Hanghang Tong. 2025. Neural-symbolic reasoning over knowledge graphs: A survey from a query perspective. ACM SIGKDD Explorations Newsletter 27, 1 (2025), 124–136

  34. [34]

    Lihui Liu and Yuchen Yan. 2026. MORGAN: To Bridge Mixture of Experts and Spectral Graph Neural Network. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 40. 23783–23791

  35. [35]

    Francois Luus, Prithviraj Sen, Pavan Kapanipathi, Ryan Riegel, Ndivhuwo Makondo, Thabang Lebese, and Alexander Gray. 2021. Logic embeddings for complex query answering. arXiv preprint arXiv:2103.00418 (2021)

  36. [36]

    Jerry M Mendel. 1995. Fuzzy logic systems for engineering: a tutorial. Proc. IEEE 83, 3 (1995), 345–377

  37. [37]

    Rajeev Rastogi. 2012. Building knowledge bases from the web. In Proceedings of the 18th International Conference on Management of Data. 5–5

  38. [38]

    Steffen Remus, Manuel Kaufmann, Kathrin Ballweg, Tatiana von Landesberger, and Chris Biemann. 2017. Storyfinder: Personalized knowledge base construction and management by browsing the web. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. 2519–2522

  39. [39]

    Hongyu Ren, Mikhail Galkin, Michael Cochez, Zhaocheng Zhu, and Jure Leskovec

  40. [40]

    Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases. doi:10.48550/arXiv.2303.14617 arXiv:2303.14617 [cs]

  41. [41]

    H Ren, W Hu, and J Leskovec. 2020. Query2box: Reasoning Over Knowledge Graphs In Vector Space Using Box Embeddings. In International Conference on Learning Representations (ICLR)

  42. [42]

    Hongyu Ren and Jure Leskovec. 2020. Beta embeddings for multi-hop logical reasoning in knowledge graphs. Advances in Neural Information Processing Systems 33 (2020), 19716–19726

  43. [43]

    Tara Safavi and Danai Koutra. 2020. Codex: A comprehensive knowledge graph completion benchmark. arXiv preprint arXiv:2009.07810 (2020)

  44. [44]

    Baoxu Shi and Tim Weninger. 2018. Open-world knowledge graph completion. In Proceedings of the AAAI conference on artificial intelligence, Vol. 32

  45. [45]

    Kristina Toutanova and Danqi Chen. 2015. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd workshop on continuous vector space models and their compositionality. 57–66

  46. [46]

    Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon. 2015. Representing text for joint embedding of text and knowledge bases. In Proceedings of the 2015 conference on empirical methods in natural language processing. 1499–1509

  47. [47]

    Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard. 2016. Complex embeddings for simple link prediction. In International conference on machine learning. PMLR, 2071–2080

  48. [48]

    Zihao Wang, Weizhi Fei, Hang Yin, Yangqiu Song, Ginny Wong, and Simon See

  49. [49]

    Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport. In Findings of the Association for Computational Linguistics: ACL 2023, Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki (Eds.). Association for Computational Linguistics, Toronto, Canada, 13679–13696. doi:10.18653/v1/2023.findings-acl.864

  50. [50]

    Zihao Wang, Yangqiu Song, Ginny Wong, and Simon See. 2023. Logical Message Passing Networks with One-hop Inference on Atomic Formulas. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=SoyOsp7i_l

  51. [51]

    Zihao Wang, Hang Yin, Lihui Liu, Hanghang Tong, Yangqiu Song, Ginny Wong, and Simon See. 2026. $R^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval. arXiv preprint arXiv:2601.20844 (2026)

  52. [52]

    Zihao Wang, Hang Yin, and Yangqiu Song. 2021. Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 (Dec. 2021). https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/7eabe3a1649ffa2b3ff8c02ebfd5659f-Abstract-...

  53. [53]

    Tianle Xia, Liang Ding, Guojia Wan, Yibing Zhan, Bo Du, and Dacheng Tao

  54. [54]

    Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 39. Philadelphia, Pennsylvania, 12881–12889. doi:10.1609/aaai.v39i12.33405

  55. [55]

    Wenhan Xiong, Thien Hoang, and William Yang Wang. 2017. DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning. In EMNLP

  56. [56]

    Yao Xu, Shizhu He, Jiabei Chen, Zihao Wang, Yangqiu Song, Hanghang Tong, Guang Liu, Jun Zhao, and Kang Liu. 2024. Generate-on-Graph: Treat LLM as Both Agent and KG for Incomplete Knowledge Graph Question Answering. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Che...

  57. [57]

    Zezhong Xu, Wen Zhang, Peng Ye, Hui Chen, and Huajun Chen. 2022. Neural-Symbolic Entangled Framework for Complex Query Answering. doi:10.48550/arXiv.2209.08779 arXiv:2209.08779 [cs]

  58. [58]

    Dong Yang, Peijun Qing, Yang Li, Haonan Lu, and Xiaodong Lin. 2022. GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs. doi:10.48550/arXiv.2210.15578 arXiv:2210.15578 [cs]

  59. [59]

    Hang Yin, Zihao Wang, Weizhi Fei, and Yangqiu Song. 2025. EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (Toronto ON, Canada) (KDD ’25). Association for Computing Machinery, New York, NY, USA, 5876–5887. doi:10.1145/3711896.3737426

  60. [60]

    Hang Yin, Zihao Wang, and Yangqiu Song. 2024. Meta Operator for Complex Query Answering on Knowledge Graphs. http://arxiv.org/abs/2403.10110 arXiv:2403.10110 [cs]

  61. [61]

    Hang Yin, Zihao Wang, and Yangqiu Song. 2024. Rethinking Existential First Order Queries and their Inference on Knowledge Graphs. In The Twelfth International Conference on Learning Representations

  62. [62]

    Chongzhi Zhang, Zhiping Peng, Junhao Zheng, and Qianli Ma. 2024. Conditional Logical Message Passing Transformer for Complex Query Answering. arXiv:2402.12954 [cs.LG] https://arxiv.org/abs/2402.12954

  63. [63]

    Zhanqiu Zhang, Jie Wang, Jiajun Chen, Shuiwang Ji, and Feng Wu. 2021. Cone: Cone embeddings for multi-hop reasoning over knowledge graphs. Advances in Neural Information Processing Systems 34 (2021), 19172–19183

  64. [64]

    Tianshi Zheng, Jiaxin Bai, Yicheng Wang, Tianqing Fang, Yue Guo, Yauwai Yim, and Yangqiu Song. 2024. Clr-fact: Evaluating the complex logical reasoning capability of large language models over factual knowledge. arXiv preprint arXiv:2407.20564 (2024)

  65. [65]

    Tianshi Zheng, Jiazheng Wang, Zihao Wang, Jiaxin Bai, Hang Yin, Zheye Deng, Yangqiu Song, and Jianxin Li. 2025. Enhancing transformers for generalizable first-order logical entailment. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 5505–5524

  66. [66]

    Zhaocheng Zhu, Mikhail Galkin, Zuobai Zhang, and Jian Tang. 2022. Neural-Symbolic Models for Logical Queries on Knowledge Graphs. arXiv preprint arXiv:2205.10128 (2022).

A Evaluation and baseline details

A.1 Metrics of marginal ranking

Complex query answering aims to discover new answers to logical queries over incomplete answers. Consider an observed know...