Recognition: 1 theorem link
BoostTaxo: Zero-Shot Taxonomy Induction via Boosting-Style Agentic Reasoning and Constraint-Aware Calibration
Pith reviewed 2026-05-14 21:21 UTC · model grok-4.3
The pith
BoostTaxo induces taxonomies from domain terms using a boosting-style LLM framework in zero-shot settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BoostTaxo performs parent identification in a coarse-to-fine manner, combining retrieval-augmented definition refinement, hybrid parent candidate selection, candidate rating, and structure-aware score calibration. A lightweight LLM filters candidate parents, a large-scale LLM ranks the survivors, and structural features calibrate the resulting edge weights.
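Read literally, the claim above describes a two-stage filter-then-rank loop over candidate parents. A minimal sketch, assuming hypothetical scoring callables: `cheap_score`, `strong_score`, and `structure_prior` stand in for the lightweight LLM, the large-scale LLM, and the structural calibration, and none of them are the paper's actual interfaces.

```python
def induce_parents(terms, cheap_score, strong_score, structure_prior, k=5):
    """Coarse-to-fine parent identification (illustrative sketch only).

    cheap_score(term, cand)   -> float  # stand-in for the lightweight LLM
    strong_score(term, cand)  -> float  # stand-in for the large-scale LLM
    structure_prior(term, cand) -> float  # stand-in for structural calibration
    Returns a {term: chosen_parent} mapping.
    """
    parents = {}
    for term in terms:
        candidates = [t for t in terms if t != term]
        # Coarse stage: the cheap scorer keeps only the top-k candidates.
        shortlist = sorted(candidates,
                           key=lambda c: cheap_score(term, c),
                           reverse=True)[:k]
        # Fine stage: the strong scorer rates the shortlist, and the
        # structural prior calibrates each raw edge score.
        scored = {c: strong_score(term, c) * structure_prior(term, c)
                  for c in shortlist}
        parents[term] = max(scored, key=scored.get)
    return parents
```

The real framework also refines term definitions via retrieval before any scoring; that stage is omitted here for brevity.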
What carries the argument
Boosting-style agentic reasoning with hybrid LLM selection and structure-aware score calibration for parent identification.
Load-bearing premise
The combination of retrieval-augmented refinement, hybrid selection, and structure calibration will consistently produce reliable taxonomies without inheriting biases from the underlying LLMs.
What would settle it
A new benchmark dataset on which BoostTaxo underperforms existing zero-shot methods would falsify the claim of superior performance.
original abstract
Taxonomy induction is crucial for organizing concepts into explicit and interpretable semantic hierarchies. While existing methods have achieved promising results, their generalization, structural reliability, and efficiency remain limited, hindering their performance in zero-shot and large-scale scenarios. To overcome these limitations, we introduce BoostTaxo, a boosting-style LLM framework for zero-shot taxonomy induction. It takes a set of domain terms as inputs and performs parent identification in a coarse-to-fine manner, employing retrieval-augmented definition refinement, hybrid parent candidate selection, candidate rating, and structure-aware score calibration to improve taxonomy construction. Specifically, a lightweight LLM is used to efficiently filter candidate parents, while a large-scale LLM is employed to rank and score candidate parents for fine-grained parent selection. Structural features are further incorporated to calibrate candidate edge weights and enhance the reliability of the induced taxonomy. The unified BoostTaxo is evaluated on three public benchmark datasets, namely WordNet, DBLP, and SemEval-Sci, and achieves superior or comparable performance to state-of-the-art methods in zero-shot taxonomy induction. The ablation study validates the contribution of the hybrid parent candidate selection and the structure-aware score calibration to the overall performance. Further analysis investigates the impact of candidate selection size on taxonomy quality and presents representative case and failure studies, providing deeper insights into the effectiveness and limitations of the proposed framework.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BoostTaxo, a boosting-style LLM framework for zero-shot taxonomy induction that takes domain terms as input and performs coarse-to-fine parent identification via retrieval-augmented definition refinement, hybrid parent candidate selection (lightweight LLM for filtering, large-scale LLM for ranking), candidate rating, and structure-aware score calibration. It evaluates the unified approach on WordNet, DBLP, and SemEval-Sci benchmarks, claiming superior or comparable performance to SOTA methods, with ablations validating the hybrid selection and calibration components plus analysis of candidate size and case studies.
Significance. If the performance gains are shown to stem from the proposed pipeline rather than LLM pretraining leakage, the work could meaningfully advance reliable zero-shot taxonomy induction by demonstrating practical ways to combine lightweight and large LLMs with structural constraints for better generalization and reduced error propagation in hierarchy construction.
major comments (2)
- [Evaluation] Evaluation section (and abstract): the claim of superior/comparable zero-shot performance on WordNet, DBLP, and SemEval-Sci is load-bearing but vulnerable to pretraining contamination, as these long-standing public hierarchies are likely present in LLM training corpora; without decontamination experiments, post-cutoff held-out domains, or explicit checks that parent-ranking steps do not exploit memorized fragments, the results cannot isolate framework efficacy from leakage.
- [Ablation study] Ablation study: while it validates hybrid candidate selection and structure-aware calibration, the reported metrics lack error bars, statistical significance tests, or full baseline details (e.g., exact SOTA implementations and hyperparameter settings), making it difficult to assess whether the gains are robust or merely incremental.
minor comments (2)
- [Abstract] Abstract and method description: the term 'boosting-style' is used without a precise mapping to classical boosting mechanics (e.g., sequential error weighting), which could be clarified to avoid confusion with standard ensemble boosting.
- [Results] Figure and table captions: ensure all quantitative results (precision, recall, F1, or hierarchy metrics) are explicitly labeled with dataset splits and LLM versions used.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and will incorporate revisions to strengthen the evaluation and ablation sections.
point-by-point responses
-
Referee: [Evaluation] Evaluation section (and abstract): the claim of superior/comparable zero-shot performance on WordNet, DBLP, and SemEval-Sci is load-bearing but vulnerable to pretraining contamination, as these long-standing public hierarchies are likely present in LLM training corpora; without decontamination experiments, post-cutoff held-out domains, or explicit checks that parent-ranking steps do not exploit memorized fragments, the results cannot isolate framework efficacy from leakage.
Authors: We agree this is a valid and important concern for any LLM-based method evaluated on long-standing public benchmarks. Our framework is designed to reduce reliance on memorization through retrieval-augmented definition refinement, hybrid candidate selection (lightweight filtering followed by large-model ranking), and structure-aware calibration that enforces hierarchical constraints. These components encourage step-by-step reasoning over direct recall. Nevertheless, without explicit decontamination or post-cutoff experiments, full isolation of framework efficacy from leakage remains difficult. In the revised manuscript we will add a dedicated limitations subsection discussing pretraining contamination risks for the chosen benchmarks and, where feasible, report preliminary checks (e.g., membership inference on a small set of held-out terms or comparison against a synthetic domain). revision: yes
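One crude form of the membership-inference check proposed above is to compare verbatim model completions against gold benchmark edges. In this sketch, `predict_parent` is a hypothetical stand-in for a single model query; neither the function nor the threshold logic comes from the paper.

```python
def contamination_rate(gold_edges, predict_parent):
    """Crude memorization probe: the fraction of gold (child, parent)
    pairs the model reproduces verbatim when shown only the child term.
    A rate far above the framework's measured zero-shot accuracy would
    suggest benchmark leakage rather than reasoning."""
    hits = sum(
        1 for child, parent in gold_edges
        if predict_parent(child).strip().lower() == parent.strip().lower()
    )
    return hits / len(gold_edges)
```

A real check would also vary prompt phrasing and compare against a post-cutoff or synthetic taxonomy, as the rebuttal suggests.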
-
Referee: [Ablation study] Ablation study: while it validates hybrid candidate selection and structure-aware calibration, the reported metrics lack error bars, statistical significance tests, or full baseline details (e.g., exact SOTA implementations and hyperparameter settings), making it difficult to assess whether the gains are robust or merely incremental.
Authors: We thank the referee for highlighting this presentation gap. The ablation results were intended to isolate the contributions of hybrid selection and calibration, yet we acknowledge that the absence of error bars, significance testing, and exhaustive baseline documentation reduces interpretability. In the revision we will (1) rerun ablations with multiple seeds where stochasticity exists and report means with standard deviations, (2) add statistical significance tests (paired t-tests on F1 and precision@1), and (3) expand the appendix with exact model versions, prompt templates, temperature settings, and reproduction details for all baselines. revision: yes
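For the promised significance testing, a paired sign-flip permutation test over per-item metric differences is one distribution-free alternative to the paired t-tests named above. This is an illustrative sketch, not the authors' protocol.

```python
import random

def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided paired sign-flip permutation test on per-item metric
    differences. Returns (mean difference, approximate p-value)."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs) / len(diffs)
    extreme = 0
    for _ in range(n_resamples):
        # Under H0 the sign of each paired difference is exchangeable.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= abs(observed):
            extreme += 1
    # Add-one smoothing keeps the estimate away from an impossible zero.
    return observed, (extreme + 1) / (n_resamples + 1)
```

With ten or more evaluation items per system, this gives a usable p-value without any normality assumption on the metric differences.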
Circularity Check
No circularity in derivation; framework uses external LLMs and benchmarks
full rationale
The paper describes a boosting-style LLM pipeline (retrieval-augmented refinement, hybrid candidate selection, structure-aware calibration) for zero-shot taxonomy induction. No equations, fitted parameters, or self-definitions reduce the claimed outputs to the inputs by construction. Performance is assessed via external public benchmarks (WordNet, DBLP, SemEval-Sci) and ablations that isolate component contributions without renaming or self-referential closure. No load-bearing self-citations or uniqueness theorems imported from prior author work appear in the provided text. The derivation chain remains independent of the target results.
Axiom & Free-Parameter Ledger
axioms (1)
- [domain assumption] LLMs can reliably identify semantic parent concepts when provided with retrieval-augmented definitions and structure-aware score calibration.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction (tagged unclear)
The relation between the paper passage and the cited Recognition theorem is unclear.
BoostTaxo... employs retrieval-augmented definition refinement, hybrid parent candidate selection, candidate rating, and structure-aware score calibration... Maximum Spanning Arborescence... Chu–Liu/Edmonds algorithm
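The arborescence step quoted above can be made concrete: once edge scores are calibrated, the Chu–Liu/Edmonds algorithm extracts the maximum spanning arborescence, assigning each term exactly one parent. Below is a compact recursive sketch, assuming every node is reachable from the root; it is illustrative, not the paper's implementation.

```python
def max_arborescence(edges, root):
    """Chu-Liu/Edmonds: maximum-weight spanning arborescence rooted at
    `root`. edges: iterable of (parent, child, weight) with every node
    reachable from the root. Returns a {child: parent} mapping."""
    counter = [0]  # fresh names for contracted supernodes

    def solve(edges, root):
        # Each edge is (u, v, w, orig), where orig is the edge one level up.
        best = {}  # child -> highest-weight incoming edge
        for e in edges:
            u, v, w, _ = e
            if v == root or u == v:
                continue
            if v not in best or w > best[v][2]:
                best[v] = e
        # Follow best-parent pointers to look for a cycle.
        for start in best:
            seen, node = [], start
            while node in best and node not in seen:
                seen.append(node)
                node = best[node][0]
            if node in seen:  # the walk returned to itself: a cycle
                cycle = set(seen[seen.index(node):])
                break
        else:
            return list(best.values())  # acyclic: best edges form the answer
        # Contract the cycle into one supernode and recurse.
        counter[0] += 1
        sup = ("*", counter[0])
        contracted = []
        for e in edges:
            u, v, w, _ = e
            if u in cycle and v in cycle:
                continue
            if v in cycle:  # entering the cycle: discount the displaced edge
                contracted.append((u, sup, w - best[v][2], e))
            elif u in cycle:
                contracted.append((sup, v, w, e))
            else:
                contracted.append((u, v, w, e))
        chosen = solve(contracted, root)
        # Expand: keep all cycle edges except the one the entry displaces.
        result, entry = [], None
        for ce in chosen:
            if ce[1] == sup:
                entry = ce[3]
            result.append(ce[3])
        result.extend(best[v] for v in cycle if v != entry[1])
        return result

    wrapped = [(u, v, w, (u, v)) for u, v, w in edges]
    return {v: u for _, _, _, (u, v) in solve(wrapped, root)}
```

In the taxonomy setting, the weight of edge (parent, child) would be the calibrated score from the rating stage, and the resulting arborescence is by construction a tree, so cycles and multiple parents are ruled out structurally.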
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Hiexpan: Task-guided taxonomy construction by hierarchical tree expansion,
J. Shen, Z. Wu, D. Lei, C. Zhang, X. Ren, M. T. Vanni, B. M. Sadler, and J. Han, “Hiexpan: Task-guided taxonomy construction by hierarchical tree expansion,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 2180–2189
work page 2018
-
[2]
Setexpan: Corpus-based set expansion via context feature selection and rank ensemble,
J. Shen, Z. Wu, D. Lei, J. Shang, X. Ren, and J. Han, “Setexpan: Corpus-based set expansion via context feature selection and rank ensemble,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2017, pp. 288–304
work page 2017
-
[3]
Automatic taxonomy construction from keywords,
X. Liu, Y. Song, S. Liu, and H. Wang, “Automatic taxonomy construction from keywords,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012, pp. 1433–1441
work page 2012
-
[4]
An intent taxonomy for questions asked in web search,
B. B. Cambazoglu, L. Tavakoli, F. Scholer, M. Sanderson, and B. Croft, “An intent taxonomy for questions asked in web search,” in Proceedings of the 2021 Conference on Human Information Interaction and Retrieval, 2021, pp. 85–94
work page 2021
-
[5]
Efficiently answering technical questions—a knowledge graph approach,
S. Yang, L. Zou, Z. Wang, J. Yan, and J.-R. Wen, “Efficiently answering technical questions—a knowledge graph approach,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31, no. 1, 2017
work page 2017
-
[6]
Building taxonomy of web search intents for name entity queries,
X. Yin and S. Shah, “Building taxonomy of web search intents for name entity queries,” in Proceedings of the 19th International Conference on World Wide Web, 2010, pp. 1001–1010
work page 2010
-
[7]
Probase: A probabilistic taxonomy for text understanding,
W. Wu, H. Li, H. Wang, and K. Q. Zhu, “Probase: A probabilistic taxonomy for text understanding,” in Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, 2012, pp. 481–492
work page 2012
-
[8]
Understand short texts by harvesting and analyzing semantic knowledge,
W. Hua, Z. Wang, H. Wang, K. Zheng, and X. Zhou, “Understand short texts by harvesting and analyzing semantic knowledge,” IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 3, pp. 499–512, 2016
work page 2016
-
[9]
Taxonomy-aware multi-hop reasoning networks for sequential recommendation,
J. Huang, Z. Ren, W. X. Zhao, G. He, J.-R. Wen, and D. Dong, “Taxonomy-aware multi-hop reasoning networks for sequential recommendation,” in Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019, pp. 573–581
work page 2019
-
[10]
Enhancing recommendation with automated tag taxonomy construction in hyperbolic space,
Y. Tan, C. Yang, X. Wei, C. Chen, L. Li, and X. Zheng, “Enhancing recommendation with automated tag taxonomy construction in hyperbolic space,” in 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2022, pp. 1180–1192
work page 2022
-
[11]
Learning to rank hypernyms of financial terms using semantic textual similarity,
S. Ghosh, A. Chopra, and S. K. Naskar, “Learning to rank hypernyms of financial terms using semantic textual similarity,” SN Computer Science, vol. 4, no. 5, p. 610, 2023
work page 2023
-
[12]
Kepl: Knowledge enhanced prompt learning for chinese hypernym-hyponym extraction,
N. Ma, D. Wang, H. Bao, L. He, and S. Zheng, “Kepl: Knowledge enhanced prompt learning for chinese hypernym-hyponym extraction,” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 5858–5867
work page 2023
-
[13]
Constructing taxonomies from pre-trained language models,
C. Chen, K. Lin, and D. Klein, “Constructing taxonomies from pre-trained language models,” in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 4687–4700
work page 2021
-
[14]
Automatic acquisition of hyponyms from large text corpora,
M. A. Hearst, “Automatic acquisition of hyponyms from large text corpora,” in COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics, 1992
work page 1992
-
[15]
A semi-supervised method to learn and construct taxonomies using the web,
Z. Kozareva and E. Hovy, “A semi-supervised method to learn and construct taxonomies using the web,” in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010, pp. 1110–1118
work page 2010
-
[16]
Lexico-syntactic patterns for automatic ontology building,
C. Klaussner and D. Zhekova, “Lexico-syntactic patterns for automatic ontology building,” in Proceedings of the Second Student Research Workshop associated with RANLP 2011, 2011, pp. 109–114
work page 2011
-
[17]
Taxi at SemEval-2016 task 13: A taxonomy induction method based on lexico-syntactic patterns, substrings and focused crawling,
A. Panchenko, S. Faralli, E. Ruppert, S. Remus, H. Naets, C. Fairon, S. P. Ponzetto, and C. Biemann, “Taxi at SemEval-2016 task 13: A taxonomy induction method based on lexico-syntactic patterns, substrings and focused crawling,” in Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), 2016, pp. 1320–1327
work page 2016
-
[18]
Large language models for generative information extraction: A survey,
D. Xu, W. Chen, W. Peng, C. Zhang, T. Xu, X. Zhao, X. Wu, Y. Zheng, Y. Wang, and E. Chen, “Large language models for generative information extraction: A survey,” Frontiers of Computer Science, vol. 18, no. 6, p. 186357, 2024
work page 2024
-
[19]
How to unleash the power of large language models for few-shot relation extraction?
X. Xu, Y. Zhu, X. Wang, and N. Zhang, “How to unleash the power of large language models for few-shot relation extraction?” in Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP), 2023, pp. 190–200
work page 2023
-
[20]
Language models are few-shot learners,
T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., “Language models are few-shot learners,” Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020
work page 2020
-
[21]
Large language models are zero-shot text classifiers,
Z. Wang, Y. Pang, and Y. Lin, “Large language models are zero-shot text classifiers,” arXiv preprint arXiv:2312.01044, 2023
-
[22]
Gpt3mix: Leveraging large-scale language models for text augmentation,
K. M. Yoo, D. Park, J. Kang, S.-W. Lee, and W. Park, “Gpt3mix: Leveraging large-scale language models for text augmentation,” in Findings of the Association for Computational Linguistics: EMNLP 2021, 2021, pp. 2225–2239
work page 2021
-
[23]
A review of knowledge graph construction using large language models in transportation: Problems, methods, and challenges,
Y. Ling, Z. Qin, and Z. Ma, “A review of knowledge graph construction using large language models in transportation: Problems, methods, and challenges,” Transportation Research Part C: Emerging Technologies, vol. 183, p. 105428, 2026
work page 2026
-
[24]
Enhancing knowledge graph construction using large language models,
M. Trajanoska, R. Stojanov, and D. Trajanov, “Enhancing knowledge graph construction using large language models,” arXiv preprint arXiv:2305.04676, 2023
-
[25]
Prompting or fine-tuning? A comparative study of large language models for taxonomy construction,
B. Chen, F. Yi, and D. Varró, “Prompting or fine-tuning? A comparative study of large language models for taxonomy construction,” in 2023 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C). IEEE, 2023, pp. 588–596
work page 2023
-
[26]
Distilling hypernymy relations from language models: On the effectiveness of zero-shot taxonomy induction,
D. Jain and L. E. Anke, “Distilling hypernymy relations from language models: On the effectiveness of zero-shot taxonomy induction,” in Proceedings of the 11th Joint Conference on Lexical and Computational Semantics, 2022, pp. 151–156
work page 2022
-
[27]
Chain-of-layer: Iteratively prompting large language models for taxonomy induction from limited examples,
Q. Zeng, Y. Bai, Z. Tan, S. Feng, Z. Liang, Z. Zhang, and M. Jiang, “Chain-of-layer: Iteratively prompting large language models for taxonomy induction from limited examples,” in Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024, pp. 3093–3102
work page 2024
-
[28]
Chain-of-thought prompting elicits reasoning in large language models,
J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou, et al., “Chain-of-thought prompting elicits reasoning in large language models,” Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022
work page 2022
-
[29]
TinyLlama: An Open-Source Small Language Model
P. Zhang, G. Zeng, T. Wang, and W. Lu, “Tinyllama: An open-source small language model,” arXiv preprint arXiv:2401.02385, 2024
work page 2024
-
[30]
Towards reasoning ability of small language models,
G. Srivastava, S. Cao, and X. Wang, “Towards reasoning ability of small language models,” arXiv preprint arXiv:2502.11569, 2025
-
[31]
Beyond chinchilla-optimal: Accounting for inference in language model scaling laws,
N. Sardana, J. Portes, S. Doubov, and J. Frankle, “Beyond chinchilla-optimal: Accounting for inference in language model scaling laws,” arXiv preprint arXiv:2401.00448, 2023
-
[32]
Emergent Abilities of Large Language Models
J. Wei, Y. Tay, R. Bommasani, C. Raffel, B. Zoph, S. Borgeaud, D. Yogatama, M. Bosma, D. Zhou, D. Metzler, et al., “Emergent abilities of large language models,” arXiv preprint arXiv:2206.07682, 2022
work page 2022
-
[33]
Inferring concept hierarchies from text corpora via hyperbolic embeddings,
M. Le, S. Roller, L. Papaxanthos, D. Kiela, and M. Nickel, “Inferring concept hierarchies from text corpora via hyperbolic embeddings,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 3231–3241
work page 2019
-
[34]
Improving hypernymy detection with an integrated path-based and distributional method,
V. Shwartz, Y. Goldberg, and I. Dagan, “Improving hypernymy detection with an integrated path-based and distributional method,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016, pp. 2389–2398
work page 2016
-
[35]
Learning syntactic patterns for automatic hypernym discovery,
R. Snow, D. Jurafsky, and A. Ng, “Learning syntactic patterns for automatic hypernym discovery,” Advances in Neural Information Processing Systems, vol. 17, 2004
work page 2004
-
[36]
Learning semantic hierarchies via word embeddings,
R. Fu, J. Guo, B. Qin, W. Che, H. Wang, and T. Liu, “Learning semantic hierarchies via word embeddings,” in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2014, pp. 1199–1209
work page 2014
-
[37]
Learning term embeddings for taxonomic relation identification using dynamic weighting neural network,
L. A. Tuan, Y. Tay, S. C. Hui, and S. K. Ng, “Learning term embeddings for taxonomic relation identification using dynamic weighting neural network,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016, pp. 403–413
work page 2016
-
[38]
End-to-end reinforcement learning for automatic taxonomy induction,
Y. Mao, X. Ren, J. Shen, X. Gu, and J. Han, “End-to-end reinforcement learning for automatic taxonomy induction,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 2462–2472
work page 2018
-
[39]
Rebel: Relation extraction by end-to-end language generation,
P.-L. H. Cabot and R. Navigli, “Rebel: Relation extraction by end-to-end language generation,” in Findings of the Association for Computational Linguistics: EMNLP 2021, 2021, pp. 2370–2381
work page 2021
-
[40]
Unified structure generation for universal information extraction,
Y. Lu, Q. Liu, D. Dai, X. Xiao, H. Lin, X. Han, L. Sun, and H. Wu, “Unified structure generation for universal information extraction,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 5755–5772
work page 2022
-
[41]
Instruct and extract: Instruction tuning for on-demand information extraction,
Y. Jiao, M. Zhong, S. Li, R. Zhao, S. Ouyang, H. Ji, and J. Han, “Instruct and extract: Instruction tuning for on-demand information extraction,” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 10030–10051
work page 2023
-
[42]
Adelie: Aligning large language models on information extraction,
Y. Qi, H. Peng, X. Wang, B. Xu, L. Hou, and J. Li, “Adelie: Aligning large language models on information extraction,” in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024, pp. 7371–7387
work page 2024
-
[43]
Genres: Rethinking evaluation for generative relation extraction in the era of large language models,
P. Jiang, J. Lin, Z. Wang, J. Sun, and J. Han, “Genres: Rethinking evaluation for generative relation extraction in the era of large language models,” in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024, pp. 2820–2837
work page 2024
-
[44]
Towards robust universal information extraction: Dataset, evaluation, and solution,
J. Zhu, A. Shi, Z. Li, L. Bai, X. Jin, J. Guo, and X. Cheng, “Towards robust universal information extraction: Dataset, evaluation, and solution,” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025, pp. 28052–28070
work page 2025
-
[45]
Large language models are zero-shot reasoners,
T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa, “Large language models are zero-shot reasoners,” Advances in Neural Information Processing Systems, vol. 35, pp. 22199–22213, 2022
work page 2022
-
[46]
Pal: Program-aided language models,
L. Gao, A. Madaan, S. Zhou, U. Alon, P. Liu, Y. Yang, J. Callan, and G. Neubig, “Pal: Program-aided language models,” in International Conference on Machine Learning. PMLR, 2023, pp. 10764–10799
work page 2023
-
[47]
Reason-align-respond: Aligning llm reasoning with knowledge graphs for kgqa,
X. Shen, F. Wang, Z. Yang, B. Wang, W. Du, C. Zong, and R. Xia, “Reason-align-respond: Aligning llm reasoning with knowledge graphs for kgqa,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026
work page 2026
-
[48]
Optimum branchings,
J. Edmonds et al., “Optimum branchings,” Journal of Research of the National Bureau of Standards B, vol. 71, no. 4, pp. 233–240, 1967
work page 1967
-
[49]
Taxonomy construction of unseen domains via graph-based cross-domain knowledge transfer,
C. Shang, S. Dash, M. F. M. Chowdhury, N. Mihindukulasooriya, and A. Gliozzo, “Taxonomy construction of unseen domains via graph-based cross-domain knowledge transfer,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 2198–2208
work page 2020
discussion (0)