Taxon: Hierarchical Tax Code Prediction with Semantically Aligned LLM Expert Guidance

Chuanfei Xu; Jihang Li; Jing Wang; Qing Liu; Wei Wang; Zeyi Wen; Zulong Chen

arxiv: 2601.08418 · v2 · submitted 2026-01-13 · 💻 cs.LG · cs.AI

Taxon: Hierarchical Tax Code Prediction with Semantically Aligned LLM Expert Guidance

Jihang Li , Qing Liu , Zulong Chen , Jing Wang , Wei Wang , Chuanfei Xu , Zeyi Wen This is my paper

Pith reviewed 2026-05-16 14:39 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords tax code predictionhierarchical classificationmixture of expertssemantic consistencye-commerce automationcompliance managementlarge language modelsmulti-source training

0 comments

The pith

A mixture-of-experts model guided by distilled LLM semantics maps products to hierarchical tax codes with high accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Taxon as a framework that routes multi-modal product features through a feature-gating mixture-of-experts architecture while using a semantic consistency model distilled from large language models to check alignment with official tax definitions. It trains on a combination of curated tax databases, invoice logs, and merchant data to handle noisy supervision in real business records. The central goal is accurate placement of each product at the correct node in a multi-level national tax hierarchy, where mistakes create financial and regulatory problems. If the approach holds, e-commerce platforms gain reliable automation for invoicing and compliance without constant manual correction. An added step that reconstructs full hierarchical paths further boosts structural consistency and overall scores.

Core claim

Taxon integrates a feature-gating mixture-of-experts architecture that adaptively routes multi-modal features across taxonomy levels with a semantic consistency model distilled from large language models that verifies alignment between product titles and official tax definitions, trained via a multi-source pipeline of tax databases, invoice validation logs, and merchant registration data to deliver state-of-the-art performance on hierarchical tax code prediction and full-path reconstruction.

What carries the argument

Feature-gating mixture-of-experts architecture paired with an LLM-distilled semantic consistency model that routes features across levels and checks title-to-definition alignment.

If this is right

The model outperforms strong baselines on both the proprietary TaxCode dataset and public benchmarks.
Full hierarchical path reconstruction produces the highest overall F1 scores by improving structural consistency.
The system supports production deployment at volumes above 500,000 queries per day.
Interpretability improves because the semantic consistency checks link predictions directly to official definitions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same routing-plus-alignment pattern could transfer to other multi-level regulatory classification tasks such as customs codes or accounting categories.
Periodic retraining on fresh invoice logs might keep the model aligned after tax code updates without full retraining from scratch.
The semantic verification step could be applied independently to flag low-confidence predictions for human review in high-stakes compliance flows.
Extending the multi-source pipeline to include user-generated product descriptions might further reduce reliance on merchant registration data.

Load-bearing premise

The combination of curated tax databases, invoice logs, and merchant registration data supplies clean enough and representative enough supervision for the model to generalize to unseen products and tax updates.

What would settle it

A large drop in F1 scores when the model is tested on products from merchant categories or tax-rule revisions absent from the training sources.

Figures

Figures reproduced from arXiv: 2601.08418 by Chuanfei Xu, Jihang Li, Jing Wang, Qing Liu, Wei Wang, Zeyi Wen, Zulong Chen.

**Figure 2.** Figure 2: Overview of the proposed framework. The system [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of the training workflow, integrating hi [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Four-stage data processing and training pipeline. [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: Log-scale Cumulative distribution of prediction confi [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 7.** Figure 7: Structured requirements specifying input/output [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗

**Figure 6.** Figure 6: Role hinting the model to act as a tax expert and [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗

**Figure 9.** Figure 9: Integration of the proposed framework into Alibaba’s [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

**Figure 11.** Figure 11: Model performance at different levels of path com [PITH_FULL_IMAGE:figures/full_fig_p009_11.png] view at source ↗

**Figure 10.** Figure 10: Model performance at different levels of path com [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗

read the original abstract

Tax code prediction is a crucial yet underexplored task in automating invoicing and compliance management for large-scale e-commerce platforms. Each product must be accurately mapped to a node within a multi-level taxonomic hierarchy defined by national standards, where errors lead to financial inconsistencies and regulatory risks. This paper presents Taxon, a semantically aligned and expert-guided framework for hierarchical tax code prediction. Taxon integrates (i) a feature-gating mixture-of-experts architecture that adaptively routes multi-modal features across taxonomy levels, and (ii) a semantic consistency model distilled from large language models acting as domain experts to verify alignment between product titles and official tax definitions. To address noisy supervision in real business records, we design a multi-source training pipeline that combines curated tax databases, invoice validation logs, and merchant registration data to provide both structural and semantic supervision. Extensive experiments on the proprietary TaxCode dataset and public benchmarks demonstrate that Taxon achieves state-of-the-art performance, outperforming strong baselines. Further, an additional full hierarchical paths reconstruction procedure significantly improves structural consistency, yielding the highest overall F1 scores. Taxon has been deployed in production within Alibaba's tax service system, handling an average of over 500,000 tax code queries per day and reaching peak volumes above five million requests during business event with improved accuracy, interpretability, and robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Taxon combines gated MoE routing with LLM semantic alignment for hierarchical tax code prediction in e-commerce, but the SOTA claims rest on unshown experiments.

read the letter

The main contribution is a feature-gating mixture-of-experts that routes multi-modal inputs across taxonomy levels, paired with a distilled LLM model that checks semantic alignment between product titles and official tax definitions. They train on a mix of curated tax data, invoice logs, and merchant records to manage noise, then add a full-path reconstruction step for better structural consistency. The production deployment at Alibaba handling hundreds of thousands of daily queries is the clearest sign that the system runs at real scale for this compliance task. The multi-source pipeline and path reconstruction are sensible engineering choices for hierarchical prediction under messy supervision. The architecture description is clear and directly tied to the multi-level structure. The soft spots sit in the evidence. The abstract states that Taxon beats strong baselines and reaches highest F1 scores, yet supplies no numbers, baseline list, ablation results, or error breakdown. Without those details it is hard to judge whether the new routing and distillation pieces actually drive the gains or whether simpler hierarchical classifiers would suffice. The proprietary TaxCode dataset further limits external checks. If the full paper contains detailed tables and comparisons, that would strengthen the case; on the current write-up the performance claims cannot be verified. This work is aimed at applied researchers and engineers building compliance tools for large e-commerce platforms or similar domains with deep taxonomies. A reader interested in MoE routing for structured outputs or LLM guidance in noisy business data could extract practical ideas. I would send it for peer review so the experiments can be examined in full.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Taxon, a framework for hierarchical tax code prediction that integrates a feature-gating mixture-of-experts architecture for routing multi-modal features across taxonomy levels with a semantic consistency model distilled from LLMs to align product titles with official tax definitions. It employs a multi-source training pipeline combining curated tax databases, invoice validation logs, and merchant registration data to handle noisy supervision, followed by a full hierarchical paths reconstruction procedure. Experiments on the proprietary TaxCode dataset and public benchmarks claim state-of-the-art performance with highest overall F1 scores, and the system is reported as deployed in production at Alibaba handling over 500,000 queries per day.

Significance. If the empirical claims hold, the work provides practical value for automating tax compliance in large-scale e-commerce, where accurate hierarchical mapping reduces financial and regulatory risks. The combination of MoE routing with LLM-distilled semantic guidance and the post-processing reconstruction step represents a targeted engineering contribution for noisy real-world data. Production deployment with high query volume offers concrete evidence of robustness and interpretability beyond benchmark results.

major comments (2)

[§4.2] §4.2 (Experiments): The SOTA claim on the proprietary TaxCode dataset and public benchmarks is stated without reporting specific F1 scores, baseline details, ablation studies on the MoE gating or LLM distillation components, or error analysis by taxonomy depth; this prevents verification of the improvement margins and the contribution of the full-path reconstruction procedure.
[§3.1] §3.1 (Multi-source training pipeline): The assumption that combining curated tax databases, invoice logs, and merchant data yields sufficiently clean supervision for generalization is central to the claims but lacks quantitative characterization of label noise rates or distribution shift metrics between sources.

minor comments (2)

[Abstract] The abstract and §5 (Deployment) mention improved accuracy and robustness but do not define the exact evaluation protocol (e.g., micro/macro F1, hierarchical distance metrics) used for the reported results.
[§3.2] Notation for the semantic consistency model (e.g., how LLM outputs are distilled into the consistency loss) should be formalized with an equation in §3.2 to improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below and commit to revisions that will strengthen the empirical support and transparency of the results.

read point-by-point responses

Referee: [§4.2] §4.2 (Experiments): The SOTA claim on the proprietary TaxCode dataset and public benchmarks is stated without reporting specific F1 scores, baseline details, ablation studies on the MoE gating or LLM distillation components, or error analysis by taxonomy depth; this prevents verification of the improvement margins and the contribution of the full-path reconstruction procedure.

Authors: We agree that explicit numerical results and component-wise analysis are necessary to substantiate the SOTA claims. In the revised manuscript we will report the precise overall and per-level F1 scores for Taxon against all baselines on both the TaxCode dataset and the public benchmarks. We will add ablation tables that isolate the feature-gating MoE routing and the LLM-distilled semantic consistency model, and we will include an error analysis stratified by taxonomy depth that quantifies the contribution of the full hierarchical paths reconstruction step. revision: yes
Referee: [§3.1] §3.1 (Multi-source training pipeline): The assumption that combining curated tax databases, invoice logs, and merchant data yields sufficiently clean supervision for generalization is central to the claims but lacks quantitative characterization of label noise rates or distribution shift metrics between sources.

Authors: We acknowledge that quantitative characterization of label noise and distribution shifts would make the multi-source pipeline more convincing. In the revision we will add estimates of label noise rates obtained via cross-source consistency checks and will report distribution-shift metrics (e.g., feature-space divergence and semantic similarity scores) between the curated tax databases, invoice logs, and merchant registration data. These additions will directly support the claim that the combined supervision is sufficiently clean for generalization. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper describes an empirical ML architecture (feature-gating MoE plus LLM-distilled semantic consistency) trained on multi-source business data and evaluated via standard F1 metrics on proprietary and public benchmarks. No equations, derivations, or parameter-fitting steps are presented that could reduce a claimed prediction to its own inputs by construction. All performance claims rest on external experimental outcomes rather than self-referential definitions or self-citation chains, rendering the work self-contained against the listed circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the framework implicitly assumes reliable multi-source supervision and accurate LLM distillation.

pith-pipeline@v0.9.0 · 5554 in / 1176 out tokens · 32159 ms · 2026-05-16T14:39:47.954242+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

feature-gating mixture-of-experts architecture... semantic consistency model distilled from large language models... full hierarchical paths reconstruction procedure
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

hierarchical classification loss... auxiliary semantic consistency loss

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · 3 internal anchors

[1]

Goods and Services Tax Classification Catalogue,

S. T. A. of the People’s Republic of China, “Goods and Services Tax Classification Catalogue,” 2017. [Online]. Available: https: //fgk.chinatax.gov.cn/zcfgk/c100012/c5194763/content.html

work page 2017
[2]

Classifying Short Text for the Harmonized System with Convolutional Neural Networks,

J. Luppes, A. P. de Vries, and F. Hasibi, “Classifying Short Text for the Harmonized System with Convolutional Neural Networks,”Radboud University, 2019

work page 2019
[3]

An Ensemble-Based Approach for Assigning Text to Correct Harmonized System Code,

Shubham, A. Arya, S. Roy, and S. Jonnala, “An Ensemble-Based Approach for Assigning Text to Correct Harmonized System Code,” in2023 International Conference on Artificial Intelligence and Smart Communication (AISC), Jan 2023, pp. 35–41

work page 2023
[4]

Enhanced HS Code Classification for Import and Export Goods via Multiscale Attention and ERNIE-BiLSTM,

M. Liao, L. Huang, J. Zhang, L. Song, and B. Li, “Enhanced HS Code Classification for Import and Export Goods via Multiscale Attention and ERNIE-BiLSTM,”Applied Sciences, vol. 14, no. 22, 2024. [Online]. Available: https://www.mdpi.com/2076-3417/14/22/10267

work page 2024
[5]

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,”arXiv preprint arXiv:1810.04805, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[6]

Convolutional Neural Networks for Sentence Classification,

Y . Kim, “Convolutional Neural Networks for Sentence Classification,” inProceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Oct 2014

work page 2014
[7]

XLNet: Generalized Autoregressive Pretraining for Language Understanding,

Z. Yang, Z. Dai, Y . Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V . Le, “XLNet: Generalized Autoregressive Pretraining for Language Understanding,” inAdvances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch ´e-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, Inc., 2019. [Online]. Available:...

work page 2019
[8]

R. Y . Rubinstein and D. P. Kroese,The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-carlo Simulation (In- formation Science and Statistics). Berlin, Heidelberg: Springer-Verlag, 2004

work page 2004
[9]

Qwen2.5 Technical Report

Q. Team, “Qwen2.5 Technical Report,”arXiv preprint arXiv:2412.15115, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[10]

HDLTex: Hierarchical Deep Learning for Text Classification,

K. Kowsari, D. E. Brown, M. Heidarysafa, K. Jafari Meimandi, M. S. Gerber, and L. E. Barnes, “HDLTex: Hierarchical Deep Learning for Text Classification,” in2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), 2017, pp. 364–371

work page 2017
[11]

RCV1: A New Benchmark Collection for Text Categorization Research,

D. D. Lewis, Y . Yang, T. G. Rose, and F. Li, “RCV1: A New Benchmark Collection for Text Categorization Research,”J. Mach. Learn. Res., vol. 5, pp. 361–397, Dec 2004

work page 2004
[12]

The New York Times Annotated Corpus,

E. Sandhaus, “The New York Times Annotated Corpus,”Linguistic Data Consortium, Philadelphia, vol. 6, no. 12, p. e26752, 2008

work page 2008
[13]

Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification,

Z. Wang, P. Wang, L. Huang, X. Sun, and H. Wang, “Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification,” inProceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), S. Muresan, P. Nakov, and A. Villavicencio, Eds. Dublin, Ireland: Association ...

work page 2022
[14]

HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification,

H. Zhu, J. Wu, R. Liu, Y . Hou, Z. Yuan, S. Li, Y . Pan, and K. Xu, “HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification,” inProceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), K. Duh, H. Gome...

work page 2024
[15]

LH-Mix: Local Hierarchy Correlation Guided Mixup over Hierarchical Prompt Tuning,

F. Kong, R. Zhang, and Z. Wang, “LH-Mix: Local Hierarchy Correlation Guided Mixup over Hierarchical Prompt Tuning,” inProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V .1, ser. KDD ’25. New York, NY , USA: Association for Computing Machinery, 2025, pp. 636–646. [Online]. Available: https://doi.org/10.1145/3690624.3709326

work page doi:10.1145/3690624.3709326 2025
[16]

HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,

Z. Wang, P. Wang, T. Liu, B. Lin, Y . Cao, Z. Sui, and H. Wang, “HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y . Goldberg, Z. Kozareva, and Y . Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, Dec 2...

work page 2022
[17]

HyILR: Hyperbolic Instance-Specific Local Relationships for Hierarchical Text Classification,

A. Kumar and D. Toshniwal, “HyILR: Hyperbolic Instance-Specific Local Relationships for Hierarchical Text Classification,” inProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), J. Zhao, M. Wang, and Z. Liu, Eds. Vienna, Austria: Association for Computational Linguistics, Jul 2025, ...

work page 2025
[18]

Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN,

H. Peng, J. Li, Y . He, Y . Liu, M. Bao, L. Wang, Y . Song, and Q. Yang, “Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN,” inProceedings of the 2018 World Wide Web Conference, ser. WWW ’18. Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee, 2018, pp. 1063–1072. [Online...

work page doi:10.1145/3178876.3186005 2018
[19]

HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization,

K. Shimura, J. Li, and F. Fukumoto, “HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, E. Riloff, D. Chiang, J. Hockenmaier, and J. Tsujii, Eds. Brussels, Belgium: Association for Computational Linguistics, Oct 2018, pp. 8...

work page 2018
[20]

Hierarchical Transfer Learning for Multi-label Text Classification,

S. Banerjee, C. Akkaya, F. Perez-Sorrosal, and K. Tsioutsiouliklis, “Hierarchical Transfer Learning for Multi-label Text Classification,” inProceedings of the 57th Annual Meeting of the Association for Computational Linguistics, A. Korhonen, D. Traum, and L. M `arquez, Eds. Florence, Italy: Association for Computational Linguistics, Jul 2019, pp. 6295–630...

work page 2019
[21]

Hierarchical Multi-Label Classification Networks,

J. Wehrmann, R. Cerri, and R. Barros, “Hierarchical Multi-Label Classification Networks,” inProceedings of the 35th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, J. Dy and A. Krause, Eds., vol. 80. PMLR, 10–15 Jul 2018, pp. 5075–5084. [Online]. Available: https://proceedings.mlr. press/v80/wehrmann18a.html

work page 2018
[22]

Hierarchy-Aware Global Model for Hierarchical Text Classification,

J. Zhou, C. Ma, D. Long, G. Xu, N. Ding, H. Zhang, P. Xie, and G. Liu, “Hierarchy-Aware Global Model for Hierarchical Text Classification,” inProceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D. Jurafsky, J. Chai, N. Schluter, and J. Tetreault, Eds. Association for Computational Linguistics, Jul 2020, pp. 1106–1117. ...

work page 2020
[23]

Hierarchy-Aware Label Semantics Matching Network for Hierarchical Text Classification,

H. Chen, Q. Ma, Z. Lin, and J. Yan, “Hierarchy-Aware Label Semantics Matching Network for Hierarchical Text Classification,” inProceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), C. Zong, F. Xia, W. Li, and R. Navigli, Ed...

work page 2021
[24]

Constrained Sequence-to-Tree Generation for Hierarchical Text Classification,

C. Yu, Y . Shen, and Y . Mao, “Constrained Sequence-to-Tree Generation for Hierarchical Text Classification,” inProceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, ser. SIGIR ’22. New York, NY , USA: Association for Computing Machinery, 2022, pp. 1865–1869. [Online]. Available: https://doi.org/1...

work page doi:10.1145/3477495.3531765 2022
[25]

SGM: Sequence Generation Model for Multi-label Classification,

P. Yang, X. Sun, W. Li, S. Ma, W. Wu, and H. Wang, “SGM: Sequence Generation Model for Multi-label Classification,” inProceedings of the 27th International Conference on Computational Linguistics, E. M. Bender, L. Derczynski, and P. Isabelle, Eds. Santa Fe, New Mexico, USA: Association for Computational Linguistics, Aug 2018, pp. 3915–3926. [Online]. Avai...

work page 2018
[26]

Exploring Label Hierarchy in a Generative Way for Hierarchical Text Classification,

W. Huang, C. Liu, B. Xiao, Y . Zhao, Z. Pan, Z. Zhang, X. Yang, and G. Liu, “Exploring Label Hierarchy in a Generative Way for Hierarchical Text Classification,” inProceedings of the 29th International Conference on Computational Linguistics, N. Calzolari, C.-R. Huang, H. Kim, J. Pustejovsky, L. Wanner, K.-S. Choi, P.-M. Ryu, H.-H. Chen, L. Do- natelli, H...

work page 2022
[27]

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,

C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y . Zhou, W. Li, and P. J. Liu, “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,”J. Mach. Learn. Res., vol. 21, no. 1, Jan 2020

work page 2020
[28]

UMP-MG: A Uni- directed Message-Passing Multi-label Generation Model for Hierarchical Text Classification,

B. Ning, D. Zhao, X. Zhang, C. Wang, and S. Song, “UMP-MG: A Uni- directed Message-Passing Multi-label Generation Model for Hierarchical Text Classification,”Data Science and Engineering, vol. 8, no. 2, pp. 112–123, Jun 2023

work page 2023
[29]

Hierarchical text classification as sub-hierarchy sequence generation,

S. Im, G. Kim, H.-S. Oh, S. Jo, and D. H. Kim, “Hierarchical text classification as sub-hierarchy sequence generation,” inProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence,...

work page
[30]

Available: https://doi.org/10.1609/aaai.v37i11.26520

[Online]. Available: https://doi.org/10.1609/aaai.v37i11.26520

work page doi:10.1609/aaai.v37i11.26520
[31]

HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification,

V . Jain, M. Rungta, Y . Zhuang, Y . Yu, Z. Wang, M. Gao, J. Skolnick, and C. Zhang, “HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification,” inProceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Y . Graham and M. Purver, Eds. St. Julian’s, Malta: As...

work page 2024
[32]

AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification,

R. You, Z. Zhang, Z. Wang, S. Dai, H. Mamitsuka, and S. Zhu, “AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch ´e-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, I...

work page 2019
[33]

HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization,

Z. Deng, H. Peng, D. He, J. Li, and P. Yu, “HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization,” inProceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur, I. Beltagy, ...

work page 2021
[34]

Exploiting Global and Local Hierarchies for Hierarchical Text Classification,

T. Jiang, D. Wang, L. Sun, Z. Chen, F. Zhuang, and Q. Yang, “Exploiting Global and Local Hierarchies for Hierarchical Text Classification,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y . Goldberg, Z. Kozareva, and Y . Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, Dec ...

work page 2022
[35]

Enhancing Hierarchical Text Classification through Knowledge Graph Integration,

Y . Liu, K. Zhang, Z. Huang, K. Wang, Y . Zhang, Q. Liu, and E. Chen, “Enhancing Hierarchical Text Classification through Knowledge Graph Integration,” inFindings of the Association for Computational Linguistics: ACL 2023, A. Rogers, J. Boyd- Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics, Jul 2023, pp. 5797–5810. ...

work page 2023
[36]

HiTIN: Hierarchy- aware Tree Isomorphism Network for Hierarchical Text Classification,

H. Zhu, C. Zhang, J. Huang, J. Wu, and K. Xu, “HiTIN: Hierarchy- aware Tree Isomorphism Network for Hierarchical Text Classification,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics,...

work page 2023
[37]

Instances and Labels: Hierarchy-aware Joint Supervised Contrastive Learning for Hierarchical Multi-Label Text Classification,

S. C. L. Yu, J. He, V . G. Basulto, and J. Z. Pan, “Instances and Labels: Hierarchy-aware Joint Supervised Contrastive Learning for Hierarchical Multi-Label Text Classification,” inThe 2023 Conference on Empirical Methods in Natural Language Processing, 2023. [Online]. Available: https://openreview.net/forum?id=S0eqbM16k2

work page 2023
[38]

Hierarchy-Aware and Label Balanced Model for Hierarchical Text Classification,

J. Zhang, Y . Li, F. Shen, C. Xia, H. Tan, and Y . He, “Hierarchy-Aware and Label Balanced Model for Hierarchical Text Classification,”Knowledge-Based Systems, vol. 300, p. 112153, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S0950705124007871

work page 2024
[39]

Hierarchy-aware Biased Bound Margin Loss Function for Hierarchical Text Classification,

G. Kim, S. Im, and H.-S. Oh, “Hierarchy-aware Biased Bound Margin Loss Function for Hierarchical Text Classification,” inFindings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V . Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug 2024, pp. 7672–7682. [Online]. Available: https://aclant...

work page 2024
[40]

Utilizing Local Hierarchy with Adver- sarial Training for Hierarchical Text Classification,

Z. Wang, P. Wang, and H. Wang, “Utilizing Local Hierarchy with Adver- sarial Training for Hierarchical Text Classification,” inProceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), N. Calzo- lari, M.-Y . Kan, V . Hoste, A. Lenci, S. Sakti, and N. Xue, Eds. Torino, Italia:...

work page 2024
[41]

A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification,

J. Zhou, L. Zhang, Y . He, R. Fan, L. Zhang, and J. Wan, “A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification,” inProceedings of the 31st International Conference on Computational Linguistics, O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, and S. Schockaert, Eds. Abu Dhabi, UAE: Associ...

work page 2025
[42]

Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification,

K. Ji, Y . Lian, J. Gao, and B. Wang, “Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics, Jul 2023, pp. 2918–2933...

work page 2023
[43]

NER-guided Comprehensive Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,

F. Cai, D. Liu, Z. Zhang, G. Liu, X. Yang, and X. Fang, “NER-guided Comprehensive Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,” inProceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), N. Calzolari, M.-Y . Kan, V . Hoste, A. Lenci, S. Sakti, and N. X...

work page 2024
[44]

Retrieval-style In-context Learning for Few-shot Hierarchical Text Classification,

H. Chen, Y . Zhao, Z. Chen, M. Wang, L. Li, M. Zhang, and M. Zhang, “Retrieval-style In-context Learning for Few-shot Hierarchical Text Classification,”Transactions of the Association for Computational Linguistics, vol. 12, pp. 1214–1231, 2024. [Online]. Available: https://aclanthology.org/2024.tacl-1.67/

work page 2024
[45]

Dual prompt tuning based contrastive learning for hierarchical text classification,

S. Xiong, Y . Zhao, J. Zhang, L. Mengxiang, Z. He, X. Li, and S. Song, “Dual prompt tuning based contrastive learning for hierarchical text classification,” inFindings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V . Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug 2024, pp. 12 146–1...

work page 2024
[46]

TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision,

Y . Zhang, R. Yang, X. Xu, R. Li, J. Xiao, J. Shen, and J. Han, “TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision,” inProceedings of the ACM on Web Conference 2025, ser. WWW ’25. New York, NY , USA: Association for Computing Machinery, 2025, pp. 2032–2042. [Online]. Available: https://doi.org/10.114...

work page doi:10.1145/3696410.3714940 2025
[47]

Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification,

S. Chen, M. R. Bouadjenek, U. Naseem, B. Suleiman, S. Jameel, F. Salim, H. Hacid, and I. Razzak, “Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification,” inProceedings of the 31st International Conference on Computational Linguistics, O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, and S. Schockaert, Eds. Abu ...

work page 2025
[48]

KG- HTC: Integrating Knowledge Graphs into LLMs for Effective Zero- shot Hierarchical Text Classification,

Q. Zang, C. Zgrzendek, I. Tchappi, A. Khadangi, and J. Sedlmeir, “KG- HTC: Integrating Knowledge Graphs into LLMs for Effective Zero- shot Hierarchical Text Classification,”arXiv preprint arXiv:2505.05583, 2025

work page internal anchor Pith review arXiv 2025
[49]

Tax Classification of Invoice Details Based on Directed Heterogeneous Graph,

P. Zhao, Q. Zheng, B. Dong, J. Ruan, and M. Luo, “Tax Classification of Invoice Details Based on Directed Heterogeneous Graph,” in Proceedings of the 19th Chinese National Conference on Computational Linguistics, M. Sun, S. Li, Y . Zhang, and Y . Liu, Eds. Haikou, China: Chinese Information Processing Society of China, Oct 2020, pp. 771–782. [Online]. Ava...

work page 2020
[50]

Multimodal Approach for Harmonized System Code Prediction,

O. Amel, S. Stassin, S. A. Mahmoudi, and X. Siebert, “Multimodal Approach for Harmonized System Code Prediction,” inESANN 2023 Proceesdings. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2023

work page 2023
[51]

Harmonized system code classification using supervised contrastive learning with sentence bert and multiple negative ranking loss,

A. W. Anggoro, P. Corcoran, D. De Widt, and Y . Li, “Harmonized system code classification using supervised contrastive learning with sentence bert and multiple negative ranking loss,”Data Technologies and Applications, vol. 59, no. 2, pp. 276–301, 12 2024. [Online]. Available: https://doi.org/10.1108/DTA-01-2024-0052

work page doi:10.1108/dta-01-2024-0052 2024

[1] [1]

Goods and Services Tax Classification Catalogue,

S. T. A. of the People’s Republic of China, “Goods and Services Tax Classification Catalogue,” 2017. [Online]. Available: https: //fgk.chinatax.gov.cn/zcfgk/c100012/c5194763/content.html

work page 2017

[2] [2]

Classifying Short Text for the Harmonized System with Convolutional Neural Networks,

J. Luppes, A. P. de Vries, and F. Hasibi, “Classifying Short Text for the Harmonized System with Convolutional Neural Networks,”Radboud University, 2019

work page 2019

[3] [3]

An Ensemble-Based Approach for Assigning Text to Correct Harmonized System Code,

Shubham, A. Arya, S. Roy, and S. Jonnala, “An Ensemble-Based Approach for Assigning Text to Correct Harmonized System Code,” in2023 International Conference on Artificial Intelligence and Smart Communication (AISC), Jan 2023, pp. 35–41

work page 2023

[4] [4]

Enhanced HS Code Classification for Import and Export Goods via Multiscale Attention and ERNIE-BiLSTM,

M. Liao, L. Huang, J. Zhang, L. Song, and B. Li, “Enhanced HS Code Classification for Import and Export Goods via Multiscale Attention and ERNIE-BiLSTM,”Applied Sciences, vol. 14, no. 22, 2024. [Online]. Available: https://www.mdpi.com/2076-3417/14/22/10267

work page 2024

[5] [5]

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,”arXiv preprint arXiv:1810.04805, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[6] [6]

Convolutional Neural Networks for Sentence Classification,

Y . Kim, “Convolutional Neural Networks for Sentence Classification,” inProceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Oct 2014

work page 2014

[7] [7]

XLNet: Generalized Autoregressive Pretraining for Language Understanding,

Z. Yang, Z. Dai, Y . Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V . Le, “XLNet: Generalized Autoregressive Pretraining for Language Understanding,” inAdvances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch ´e-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, Inc., 2019. [Online]. Available:...

work page 2019

[8] [8]

R. Y . Rubinstein and D. P. Kroese,The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-carlo Simulation (In- formation Science and Statistics). Berlin, Heidelberg: Springer-Verlag, 2004

work page 2004

[9] [9]

Qwen2.5 Technical Report

Q. Team, “Qwen2.5 Technical Report,”arXiv preprint arXiv:2412.15115, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[10] [10]

HDLTex: Hierarchical Deep Learning for Text Classification,

K. Kowsari, D. E. Brown, M. Heidarysafa, K. Jafari Meimandi, M. S. Gerber, and L. E. Barnes, “HDLTex: Hierarchical Deep Learning for Text Classification,” in2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), 2017, pp. 364–371

work page 2017

[11] [11]

RCV1: A New Benchmark Collection for Text Categorization Research,

D. D. Lewis, Y . Yang, T. G. Rose, and F. Li, “RCV1: A New Benchmark Collection for Text Categorization Research,”J. Mach. Learn. Res., vol. 5, pp. 361–397, Dec 2004

work page 2004

[12] [12]

The New York Times Annotated Corpus,

E. Sandhaus, “The New York Times Annotated Corpus,”Linguistic Data Consortium, Philadelphia, vol. 6, no. 12, p. e26752, 2008

work page 2008

[13] [13]

Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification,

Z. Wang, P. Wang, L. Huang, X. Sun, and H. Wang, “Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification,” inProceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), S. Muresan, P. Nakov, and A. Villavicencio, Eds. Dublin, Ireland: Association ...

work page 2022

[14] [14]

HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification,

H. Zhu, J. Wu, R. Liu, Y . Hou, Z. Yuan, S. Li, Y . Pan, and K. Xu, “HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification,” inProceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), K. Duh, H. Gome...

work page 2024

[15] [15]

LH-Mix: Local Hierarchy Correlation Guided Mixup over Hierarchical Prompt Tuning,

F. Kong, R. Zhang, and Z. Wang, “LH-Mix: Local Hierarchy Correlation Guided Mixup over Hierarchical Prompt Tuning,” inProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V .1, ser. KDD ’25. New York, NY , USA: Association for Computing Machinery, 2025, pp. 636–646. [Online]. Available: https://doi.org/10.1145/3690624.3709326

work page doi:10.1145/3690624.3709326 2025

[16] [16]

HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,

Z. Wang, P. Wang, T. Liu, B. Lin, Y . Cao, Z. Sui, and H. Wang, “HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y . Goldberg, Z. Kozareva, and Y . Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, Dec 2...

work page 2022

[17] [17]

HyILR: Hyperbolic Instance-Specific Local Relationships for Hierarchical Text Classification,

A. Kumar and D. Toshniwal, “HyILR: Hyperbolic Instance-Specific Local Relationships for Hierarchical Text Classification,” inProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), J. Zhao, M. Wang, and Z. Liu, Eds. Vienna, Austria: Association for Computational Linguistics, Jul 2025, ...

work page 2025

[18] [18]

Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN,

H. Peng, J. Li, Y . He, Y . Liu, M. Bao, L. Wang, Y . Song, and Q. Yang, “Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN,” inProceedings of the 2018 World Wide Web Conference, ser. WWW ’18. Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee, 2018, pp. 1063–1072. [Online...

work page doi:10.1145/3178876.3186005 2018

[19] [19]

HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization,

K. Shimura, J. Li, and F. Fukumoto, “HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, E. Riloff, D. Chiang, J. Hockenmaier, and J. Tsujii, Eds. Brussels, Belgium: Association for Computational Linguistics, Oct 2018, pp. 8...

work page 2018

[20] [20]

Hierarchical Transfer Learning for Multi-label Text Classification,

S. Banerjee, C. Akkaya, F. Perez-Sorrosal, and K. Tsioutsiouliklis, “Hierarchical Transfer Learning for Multi-label Text Classification,” inProceedings of the 57th Annual Meeting of the Association for Computational Linguistics, A. Korhonen, D. Traum, and L. M `arquez, Eds. Florence, Italy: Association for Computational Linguistics, Jul 2019, pp. 6295–630...

work page 2019

[21] [21]

Hierarchical Multi-Label Classification Networks,

J. Wehrmann, R. Cerri, and R. Barros, “Hierarchical Multi-Label Classification Networks,” inProceedings of the 35th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, J. Dy and A. Krause, Eds., vol. 80. PMLR, 10–15 Jul 2018, pp. 5075–5084. [Online]. Available: https://proceedings.mlr. press/v80/wehrmann18a.html

work page 2018

[22] [22]

Hierarchy-Aware Global Model for Hierarchical Text Classification,

J. Zhou, C. Ma, D. Long, G. Xu, N. Ding, H. Zhang, P. Xie, and G. Liu, “Hierarchy-Aware Global Model for Hierarchical Text Classification,” inProceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D. Jurafsky, J. Chai, N. Schluter, and J. Tetreault, Eds. Association for Computational Linguistics, Jul 2020, pp. 1106–1117. ...

work page 2020

[23] [23]

Hierarchy-Aware Label Semantics Matching Network for Hierarchical Text Classification,

H. Chen, Q. Ma, Z. Lin, and J. Yan, “Hierarchy-Aware Label Semantics Matching Network for Hierarchical Text Classification,” inProceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), C. Zong, F. Xia, W. Li, and R. Navigli, Ed...

work page 2021

[24] [24]

Constrained Sequence-to-Tree Generation for Hierarchical Text Classification,

C. Yu, Y . Shen, and Y . Mao, “Constrained Sequence-to-Tree Generation for Hierarchical Text Classification,” inProceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, ser. SIGIR ’22. New York, NY , USA: Association for Computing Machinery, 2022, pp. 1865–1869. [Online]. Available: https://doi.org/1...

work page doi:10.1145/3477495.3531765 2022

[25] [25]

SGM: Sequence Generation Model for Multi-label Classification,

P. Yang, X. Sun, W. Li, S. Ma, W. Wu, and H. Wang, “SGM: Sequence Generation Model for Multi-label Classification,” inProceedings of the 27th International Conference on Computational Linguistics, E. M. Bender, L. Derczynski, and P. Isabelle, Eds. Santa Fe, New Mexico, USA: Association for Computational Linguistics, Aug 2018, pp. 3915–3926. [Online]. Avai...

work page 2018

[26] [26]

Exploring Label Hierarchy in a Generative Way for Hierarchical Text Classification,

W. Huang, C. Liu, B. Xiao, Y . Zhao, Z. Pan, Z. Zhang, X. Yang, and G. Liu, “Exploring Label Hierarchy in a Generative Way for Hierarchical Text Classification,” inProceedings of the 29th International Conference on Computational Linguistics, N. Calzolari, C.-R. Huang, H. Kim, J. Pustejovsky, L. Wanner, K.-S. Choi, P.-M. Ryu, H.-H. Chen, L. Do- natelli, H...

work page 2022

[27] [27]

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,

C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y . Zhou, W. Li, and P. J. Liu, “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,”J. Mach. Learn. Res., vol. 21, no. 1, Jan 2020

work page 2020

[28] [28]

UMP-MG: A Uni- directed Message-Passing Multi-label Generation Model for Hierarchical Text Classification,

B. Ning, D. Zhao, X. Zhang, C. Wang, and S. Song, “UMP-MG: A Uni- directed Message-Passing Multi-label Generation Model for Hierarchical Text Classification,”Data Science and Engineering, vol. 8, no. 2, pp. 112–123, Jun 2023

work page 2023

[29] [29]

Hierarchical text classification as sub-hierarchy sequence generation,

S. Im, G. Kim, H.-S. Oh, S. Jo, and D. H. Kim, “Hierarchical text classification as sub-hierarchy sequence generation,” inProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence,...

work page

[30] [30]

Available: https://doi.org/10.1609/aaai.v37i11.26520

[Online]. Available: https://doi.org/10.1609/aaai.v37i11.26520

work page doi:10.1609/aaai.v37i11.26520

[31] [31]

HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification,

V . Jain, M. Rungta, Y . Zhuang, Y . Yu, Z. Wang, M. Gao, J. Skolnick, and C. Zhang, “HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification,” inProceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Y . Graham and M. Purver, Eds. St. Julian’s, Malta: As...

work page 2024

[32] [32]

AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification,

R. You, Z. Zhang, Z. Wang, S. Dai, H. Mamitsuka, and S. Zhu, “AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alch ´e-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, I...

work page 2019

[33] [33]

HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization,

Z. Deng, H. Peng, D. He, J. Li, and P. Yu, “HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization,” inProceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur, I. Beltagy, ...

work page 2021

[34] [34]

Exploiting Global and Local Hierarchies for Hierarchical Text Classification,

T. Jiang, D. Wang, L. Sun, Z. Chen, F. Zhuang, and Q. Yang, “Exploiting Global and Local Hierarchies for Hierarchical Text Classification,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y . Goldberg, Z. Kozareva, and Y . Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, Dec ...

work page 2022

[35] [35]

Enhancing Hierarchical Text Classification through Knowledge Graph Integration,

Y . Liu, K. Zhang, Z. Huang, K. Wang, Y . Zhang, Q. Liu, and E. Chen, “Enhancing Hierarchical Text Classification through Knowledge Graph Integration,” inFindings of the Association for Computational Linguistics: ACL 2023, A. Rogers, J. Boyd- Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics, Jul 2023, pp. 5797–5810. ...

work page 2023

[36] [36]

HiTIN: Hierarchy- aware Tree Isomorphism Network for Hierarchical Text Classification,

H. Zhu, C. Zhang, J. Huang, J. Wu, and K. Xu, “HiTIN: Hierarchy- aware Tree Isomorphism Network for Hierarchical Text Classification,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics,...

work page 2023

[37] [37]

Instances and Labels: Hierarchy-aware Joint Supervised Contrastive Learning for Hierarchical Multi-Label Text Classification,

S. C. L. Yu, J. He, V . G. Basulto, and J. Z. Pan, “Instances and Labels: Hierarchy-aware Joint Supervised Contrastive Learning for Hierarchical Multi-Label Text Classification,” inThe 2023 Conference on Empirical Methods in Natural Language Processing, 2023. [Online]. Available: https://openreview.net/forum?id=S0eqbM16k2

work page 2023

[38] [38]

Hierarchy-Aware and Label Balanced Model for Hierarchical Text Classification,

J. Zhang, Y . Li, F. Shen, C. Xia, H. Tan, and Y . He, “Hierarchy-Aware and Label Balanced Model for Hierarchical Text Classification,”Knowledge-Based Systems, vol. 300, p. 112153, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S0950705124007871

work page 2024

[39] [39]

Hierarchy-aware Biased Bound Margin Loss Function for Hierarchical Text Classification,

G. Kim, S. Im, and H.-S. Oh, “Hierarchy-aware Biased Bound Margin Loss Function for Hierarchical Text Classification,” inFindings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V . Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug 2024, pp. 7672–7682. [Online]. Available: https://aclant...

work page 2024

[40] [40]

Utilizing Local Hierarchy with Adver- sarial Training for Hierarchical Text Classification,

Z. Wang, P. Wang, and H. Wang, “Utilizing Local Hierarchy with Adver- sarial Training for Hierarchical Text Classification,” inProceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), N. Calzo- lari, M.-Y . Kan, V . Hoste, A. Lenci, S. Sakti, and N. Xue, Eds. Torino, Italia:...

work page 2024

[41] [41]

A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification,

J. Zhou, L. Zhang, Y . He, R. Fan, L. Zhang, and J. Wan, “A Novel Negative Sample Generation Method for Contrastive Learning in Hierarchical Text Classification,” inProceedings of the 31st International Conference on Computational Linguistics, O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, and S. Schockaert, Eds. Abu Dhabi, UAE: Associ...

work page 2025

[42] [42]

Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification,

K. Ji, Y . Lian, J. Gao, and B. Wang, “Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computational Linguistics, Jul 2023, pp. 2918–2933...

work page 2023

[43] [43]

NER-guided Comprehensive Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,

F. Cai, D. Liu, Z. Zhang, G. Liu, X. Yang, and X. Fang, “NER-guided Comprehensive Hierarchy-aware Prompt Tuning for Hierarchical Text Classification,” inProceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), N. Calzolari, M.-Y . Kan, V . Hoste, A. Lenci, S. Sakti, and N. X...

work page 2024

[44] [44]

Retrieval-style In-context Learning for Few-shot Hierarchical Text Classification,

H. Chen, Y . Zhao, Z. Chen, M. Wang, L. Li, M. Zhang, and M. Zhang, “Retrieval-style In-context Learning for Few-shot Hierarchical Text Classification,”Transactions of the Association for Computational Linguistics, vol. 12, pp. 1214–1231, 2024. [Online]. Available: https://aclanthology.org/2024.tacl-1.67/

work page 2024

[45] [45]

Dual prompt tuning based contrastive learning for hierarchical text classification,

S. Xiong, Y . Zhao, J. Zhang, L. Mengxiang, Z. He, X. Li, and S. Song, “Dual prompt tuning based contrastive learning for hierarchical text classification,” inFindings of the Association for Computational Linguistics: ACL 2024, L.-W. Ku, A. Martins, and V . Srikumar, Eds. Bangkok, Thailand: Association for Computational Linguistics, Aug 2024, pp. 12 146–1...

work page 2024

[46] [46]

TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision,

Y . Zhang, R. Yang, X. Xu, R. Li, J. Xiao, J. Shen, and J. Han, “TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision,” inProceedings of the ACM on Web Conference 2025, ser. WWW ’25. New York, NY , USA: Association for Computing Machinery, 2025, pp. 2032–2042. [Online]. Available: https://doi.org/10.114...

work page doi:10.1145/3696410.3714940 2025

[47] [47]

Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification,

S. Chen, M. R. Bouadjenek, U. Naseem, B. Suleiman, S. Jameel, F. Salim, H. Hacid, and I. Razzak, “Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification,” inProceedings of the 31st International Conference on Computational Linguistics, O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. D. Eugenio, and S. Schockaert, Eds. Abu ...

work page 2025

[48] [48]

KG- HTC: Integrating Knowledge Graphs into LLMs for Effective Zero- shot Hierarchical Text Classification,

Q. Zang, C. Zgrzendek, I. Tchappi, A. Khadangi, and J. Sedlmeir, “KG- HTC: Integrating Knowledge Graphs into LLMs for Effective Zero- shot Hierarchical Text Classification,”arXiv preprint arXiv:2505.05583, 2025

work page internal anchor Pith review arXiv 2025

[49] [49]

Tax Classification of Invoice Details Based on Directed Heterogeneous Graph,

P. Zhao, Q. Zheng, B. Dong, J. Ruan, and M. Luo, “Tax Classification of Invoice Details Based on Directed Heterogeneous Graph,” in Proceedings of the 19th Chinese National Conference on Computational Linguistics, M. Sun, S. Li, Y . Zhang, and Y . Liu, Eds. Haikou, China: Chinese Information Processing Society of China, Oct 2020, pp. 771–782. [Online]. Ava...

work page 2020

[50] [50]

Multimodal Approach for Harmonized System Code Prediction,

O. Amel, S. Stassin, S. A. Mahmoudi, and X. Siebert, “Multimodal Approach for Harmonized System Code Prediction,” inESANN 2023 Proceesdings. Louvain-la-Neuve (Belgium): Ciaco - i6doc.com, 2023

work page 2023

[51] [51]

Harmonized system code classification using supervised contrastive learning with sentence bert and multiple negative ranking loss,

A. W. Anggoro, P. Corcoran, D. De Widt, and Y . Li, “Harmonized system code classification using supervised contrastive learning with sentence bert and multiple negative ranking loss,”Data Technologies and Applications, vol. 59, no. 2, pp. 276–301, 12 2024. [Online]. Available: https://doi.org/10.1108/DTA-01-2024-0052

work page doi:10.1108/dta-01-2024-0052 2024