Taxon: Hierarchical Tax Code Prediction with Semantically Aligned LLM Expert Guidance
Pith reviewed 2026-05-16 14:39 UTC · model grok-4.3
The pith
A mixture-of-experts model guided by distilled LLM semantics maps products to hierarchical tax codes with high accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Taxon combines two components: a feature-gating mixture-of-experts architecture that adaptively routes multi-modal features across taxonomy levels, and a semantic consistency model, distilled from large language models, that verifies alignment between product titles and official tax definitions. Both are trained with a multi-source pipeline drawing on tax databases, invoice validation logs, and merchant registration data, and the paper reports state-of-the-art performance on hierarchical tax code prediction and full-path reconstruction.
What carries the argument
A feature-gating mixture-of-experts architecture that routes features across taxonomy levels, paired with an LLM-distilled semantic consistency model that checks title-to-definition alignment.
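To make the routing half of this mechanism concrete, here is a minimal sketch of a softly gated mixture-of-experts with one gate per taxonomy level. It is a generic illustration, not the paper's architecture: the class name `GatedMoELevel`, the layer sizes, the tanh experts, and the random initialisation are all assumptions, and no training loop is shown.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class GatedMoELevel:
    """Hypothetical single taxonomy level: a gate mixes expert outputs, a head scores classes."""

    def __init__(self, n_features, n_experts, hidden, n_classes, rng):
        def rand(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.gate = rand(n_experts, n_features)
        self.experts = [rand(hidden, n_features) for _ in range(n_experts)]
        self.head = rand(n_classes, hidden)

    def forward(self, features):
        # The gate turns the input features into one mixing weight per expert.
        weights = softmax(matvec(self.gate, features))
        expert_out = [[math.tanh(v) for v in matvec(e, features)] for e in self.experts]
        # Weighted sum of expert outputs, one value per hidden dimension.
        mixed = [
            sum(w * out[j] for w, out in zip(weights, expert_out))
            for j in range(len(expert_out[0]))
        ]
        return weights, softmax(matvec(self.head, mixed))


rng = random.Random(0)
# Two levels with 2 and 6 classes; each level owns its gate, so routing can differ by depth.
levels = [GatedMoELevel(4, 3, 5, n_classes, rng) for n_classes in (2, 6)]
features = [0.5, -1.0, 0.25, 2.0]
outputs = [level.forward(features) for level in levels]
```

Each entry of `outputs` holds the expert weights and the class probabilities for one level. Giving every level its own gate is one plausible reading of "routes features across taxonomy levels"; the paper may implement the gating differently.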
If this is right
- The model outperforms strong baselines on both the proprietary TaxCode dataset and public benchmarks.
- Full hierarchical path reconstruction produces the highest overall F1 scores by improving structural consistency.
- The system supports production deployment at volumes above 500,000 queries per day.
- Interpretability improves because the semantic consistency checks link predictions directly to official definitions.
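The structural-consistency point can be illustrated with a small sketch. The summary does not describe how the paper's reconstruction procedure works, so the version below assumes the simplest possible variant: trust the predicted leaf and rebuild its ancestors from a child-to-parent map. The taxonomy codes and the `PARENT` table are invented for illustration.

```python
def reconstruct_path(leaf, parent):
    """Walk parent links from a predicted leaf up to the root, then reverse."""
    path = []
    seen = set()
    node = leaf
    while node is not None:
        if node in seen:
            raise ValueError(f"cycle in taxonomy at {node!r}")
        seen.add(node)
        path.append(node)
        node = parent.get(node)
    return path[::-1]


# Hypothetical toy taxonomy: child code -> parent code (the root maps to None).
PARENT = {"1": None, "101": "1", "102": "1", "10101": "101"}

# Independent per-level predictions; the level-2 code is not an ancestor of the leaf.
level_predictions = ["1", "102", "10101"]
repaired = reconstruct_path(level_predictions[-1], PARENT)
print(repaired)  # ['1', '101', '10101']
```

By construction the repaired path is always a valid root-to-leaf chain, which is why a step of this kind can raise path-level scores even when the per-level classifiers are unchanged.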
Where Pith is reading between the lines
- The same routing-plus-alignment pattern could transfer to other multi-level regulatory classification tasks such as customs codes or accounting categories.
- Periodic fine-tuning on fresh invoice logs might keep the model aligned after tax code updates without retraining from scratch.
- The semantic verification step could be applied independently to flag low-confidence predictions for human review in high-stakes compliance flows.
- Extending the multi-source pipeline to include user-generated product descriptions might further reduce reliance on merchant registration data.
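The human-review idea in the list above could look like the following sketch. It assumes the semantic consistency model exposes a title-to-definition alignment score in [0, 1]; the field name `consistency`, the 0.6 threshold, and the example records are hypothetical and not taken from the paper.

```python
def route_for_review(predictions, threshold=0.6):
    """Split predictions into auto-accepted and human-review queues by consistency score."""
    accepted, review = [], []
    for pred in predictions:
        # Scores below the (assumed) threshold go to a human reviewer.
        (review if pred["consistency"] < threshold else accepted).append(pred)
    return accepted, review


# Invented example batch; tax codes and scores are placeholders.
batch = [
    {"title": "stainless steel kettle", "tax_code": "10901", "consistency": 0.93},
    {"title": "gift bundle", "tax_code": "10304", "consistency": 0.41},
]
accepted, review = route_for_review(batch)
```

In practice the threshold would need to be calibrated against audited labels, trading reviewer workload against the cost of a wrong code.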
Load-bearing premise
The combination of curated tax databases, invoice logs, and merchant registration data supplies clean enough and representative enough supervision for the model to generalize to unseen products and tax updates.
What would settle it
A large drop in F1 scores when the model is tested on products from merchant categories or tax-rule revisions absent from the training sources.
Original abstract
Tax code prediction is a crucial yet underexplored task in automating invoicing and compliance management for large-scale e-commerce platforms. Each product must be accurately mapped to a node within a multi-level taxonomic hierarchy defined by national standards, where errors lead to financial inconsistencies and regulatory risks. This paper presents Taxon, a semantically aligned and expert-guided framework for hierarchical tax code prediction. Taxon integrates (i) a feature-gating mixture-of-experts architecture that adaptively routes multi-modal features across taxonomy levels, and (ii) a semantic consistency model distilled from large language models acting as domain experts to verify alignment between product titles and official tax definitions. To address noisy supervision in real business records, we design a multi-source training pipeline that combines curated tax databases, invoice validation logs, and merchant registration data to provide both structural and semantic supervision. Extensive experiments on the proprietary TaxCode dataset and public benchmarks demonstrate that Taxon achieves state-of-the-art performance, outperforming strong baselines. Further, an additional full hierarchical path reconstruction procedure significantly improves structural consistency, yielding the highest overall F1 scores. Taxon has been deployed in production within Alibaba's tax service system, handling an average of over 500,000 tax code queries per day and reaching peak volumes above five million requests during business events, with improved accuracy, interpretability, and robustness.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Taxon, a framework for hierarchical tax code prediction that integrates a feature-gating mixture-of-experts architecture for routing multi-modal features across taxonomy levels with a semantic consistency model distilled from LLMs to align product titles with official tax definitions. It employs a multi-source training pipeline combining curated tax databases, invoice validation logs, and merchant registration data to handle noisy supervision, followed by a full hierarchical path reconstruction procedure. The authors report state-of-the-art performance and the highest overall F1 scores on the proprietary TaxCode dataset and public benchmarks, and state that the system is deployed in production at Alibaba, handling over 500,000 queries per day.
Significance. If the empirical claims hold, the work provides practical value for automating tax compliance in large-scale e-commerce, where accurate hierarchical mapping reduces financial and regulatory risks. The combination of MoE routing with LLM-distilled semantic guidance and the post-processing reconstruction step represents a targeted engineering contribution for noisy real-world data. Production deployment with high query volume offers concrete evidence of robustness and interpretability beyond benchmark results.
major comments (2)
- [§4.2] §4.2 (Experiments): The SOTA claim on the proprietary TaxCode dataset and public benchmarks is stated without reporting specific F1 scores, baseline details, ablation studies on the MoE gating or LLM distillation components, or error analysis by taxonomy depth; this prevents verification of the improvement margins and the contribution of the full-path reconstruction procedure.
- [§3.1] §3.1 (Multi-source training pipeline): The assumption that combining curated tax databases, invoice logs, and merchant data yields sufficiently clean supervision for generalization is central to the claims but lacks quantitative characterization of label noise rates or distribution shift metrics between sources.
minor comments (2)
- [Abstract] The abstract and §5 (Deployment) mention improved accuracy and robustness but do not define the exact evaluation protocol (e.g., micro/macro F1, hierarchical distance metrics) used for the reported results.
- [§3.2] Notation for the semantic consistency model (e.g., how LLM outputs are distilled into the consistency loss) should be formalized with an equation in §3.2 to improve reproducibility.
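The evaluation-protocol comment matters because micro- and macro-averaged F1 can diverge sharply on imbalanced label sets, which tax taxonomies typically are. The sketch below computes both for single-label predictions; the labels and counts are invented, and nothing here reflects the paper's actual protocol.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when the class never appears."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # true class gains a false negative
    labels = set(gold) | set(pred)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Invented example: the rare class "B" is always misclassified as the frequent class "A".
gold = ["A", "A", "A", "A", "B", "C"]
pred = ["A", "A", "A", "A", "A", "C"]
micro, macro = micro_macro_f1(gold, pred)
print(round(micro, 3), round(macro, 3))  # 0.833 0.63
```

Micro F1 stays high because the frequent class dominates the counts, while macro F1 drops because the missed rare class contributes a zero. A reported "overall F1" is hard to interpret without knowing which average was used.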
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the major comments point by point below and commit to revisions that will strengthen the empirical support and transparency of the results.
- Referee: [§4.2] §4.2 (Experiments): The SOTA claim on the proprietary TaxCode dataset and public benchmarks is stated without reporting specific F1 scores, baseline details, ablation studies on the MoE gating or LLM distillation components, or error analysis by taxonomy depth; this prevents verification of the improvement margins and the contribution of the full-path reconstruction procedure.
  Authors: We agree that explicit numerical results and component-wise analysis are necessary to substantiate the SOTA claims. In the revised manuscript we will report the precise overall and per-level F1 scores for Taxon against all baselines on both the TaxCode dataset and the public benchmarks. We will add ablation tables that isolate the feature-gating MoE routing and the LLM-distilled semantic consistency model, and we will include an error analysis stratified by taxonomy depth that quantifies the contribution of the full hierarchical path reconstruction step.
  Revision: yes
- Referee: [§3.1] §3.1 (Multi-source training pipeline): The assumption that combining curated tax databases, invoice logs, and merchant data yields sufficiently clean supervision for generalization is central to the claims but lacks quantitative characterization of label noise rates or distribution shift metrics between sources.
  Authors: We acknowledge that quantitative characterization of label noise and distribution shifts would make the multi-source pipeline more convincing. In the revision we will add estimates of label noise rates obtained via cross-source consistency checks and will report distribution-shift metrics (e.g., feature-space divergence and semantic similarity scores) between the curated tax databases, invoice logs, and merchant registration data. These additions will directly support the claim that the combined supervision is sufficiently clean for generalization.
  Revision: yes
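A cross-source consistency check of the kind the authors promise could be sketched as follows. The source names, product IDs, and codes are invented, and the pairwise disagreement rate is only a proxy for label noise: two sources can agree on a wrong code, so the figure is a lower-bound-style signal rather than a true noise estimate.

```python
from itertools import combinations


def disagreement_rates(labels_by_source):
    """For each pair of sources, the share of shared products whose tax codes differ."""
    rates = {}
    for a, b in combinations(sorted(labels_by_source), 2):
        shared = labels_by_source[a].keys() & labels_by_source[b].keys()
        if not shared:
            continue  # no overlap, so no estimate for this pair
        differing = sum(labels_by_source[a][k] != labels_by_source[b][k] for k in shared)
        rates[(a, b)] = differing / len(shared)
    return rates


# Hypothetical product -> tax code labels from three sources.
sources = {
    "tax_db": {"p1": "101", "p2": "102", "p3": "103"},
    "invoice_logs": {"p1": "101", "p2": "105", "p4": "104"},
    "merchant_reg": {"p2": "102", "p3": "103"},
}
rates = disagreement_rates(sources)
```

Reporting these rates alongside the overlap sizes would let a reader judge how much of the training signal is corroborated by more than one source.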
Circularity Check
No significant circularity in derivation chain
The paper describes an empirical ML architecture (feature-gating MoE plus LLM-distilled semantic consistency) trained on multi-source business data and evaluated via standard F1 metrics on proprietary and public benchmarks. No equations, derivations, or parameter-fitting steps are presented that could reduce a claimed prediction to its own inputs by construction. All performance claims rest on external experimental outcomes rather than self-referential definitions or self-citation chains, rendering the work self-contained against the listed circularity patterns.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: feature-gating mixture-of-experts architecture... semantic consistency model distilled from large language models... full hierarchical paths reconstruction procedure
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: hierarchical classification loss... auxiliary semantic consistency loss
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.