pith. sign in

arxiv: 2512.18994 · v2 · submitted 2025-12-22 · 💻 cs.CV

Dual-Margin Embedding for Fine-Grained Long-Tailed Plant Taxonomy

Pith reviewed 2026-05-16 20:28 UTC · model grok-4.3

classification 💻 cs.CV
keywords plant taxonomyfine-grained recognitionlong-tailed learningembedding learningdual-margin objectiveopen-world classificationbiodiversity monitoring
0
0 comments X

The pith

TaxoNet uses a dual-margin embedding objective to reshape decision boundaries for better fine-grained plant taxonomy under long-tailed imbalance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TaxoNet as an embedding learning method that combines fine-grained species discrimination with handling of severe class imbalance in plant images. Its dual-margin objective adjusts boundaries to give rare taxa stronger geometric support while keeping similar species separable. This matters for real biodiversity monitoring because ecological datasets routinely mix fine details, imbalance, domain shifts, and unknown taxa. Tests across urban tree photos, broad natural observations, and herbarium sheets show consistent gains over baselines. If the method works as described, automated tools become more reliable for conservation work in open-world conditions.

Core claim

TaxoNet is an embedding learning framework with a theoretically grounded dual-margin objective that reshapes class decision boundaries under class imbalance to improve fine-grained discrimination while strengthening rare-class representation geometry.

What carries the argument

The dual-margin objective in embedding space, which simultaneously widens separation for fine-grained classes and tightens representation for rare classes.

If this is right

  • TaxoNet produces higher accuracy than multimodal foundation models on Google Auto-Arborist, iNaturalist Plantae, and NAFlora-Mini collections.
  • The method improves rare-class geometry without sacrificing performance on common classes.
  • Open-world performance holds when spatiotemporal shifts and previously unseen taxa are present.
  • The framework applies directly to other hierarchical, imbalanced fine-grained image tasks in ecology.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same margin adjustment could be tested on non-plant domains such as insect or bird fine-grained datasets with similar imbalance.
  • Explicit use of the taxonomic hierarchy during margin calculation might further reduce confusion between close relatives.
  • Scaling the approach to millions of images would test whether the dual-margin formulation stays stable at web scale.

Load-bearing premise

The dual-margin objective remains effective when fine-grained similarity, long-tailed imbalance, domain shift, and unseen taxa all appear together in the same dataset.

What would settle it

Run TaxoNet and standard embedding baselines on a held-out long-tailed plant dataset with many rare species; if rare-class accuracy shows no gain or drops, the central claim is false.

Figures

Figures reproduced from arXiv: 2512.18994 by Cheng Yaw Low, Heejoon Koo, Jaewoo Park, Meeyoung Cha.

Figure 1
Figure 1. Figure 1: The proposed Open-World Ecological Taxonomy Chal￾lenge, which organizes general, unique and deployment-level chal￾lenges according to realistic ecological scenarios. The challenges targetted in this work are shown with their corresponding problem settings—for example, the open-set task (C1) involves recognizing both known and unknown taxa during inference; and so on. gets biodiversity loss, alongside SDG 1… view at source ↗
Figure 2
Figure 2. Figure 2: Schematic comparison of softmax-based losses and the [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Embedding norm distribution for 200 highest and lowest [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Success and failure cases on Auto-Arborist and iNat [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Zero-shot chain-of-thought (CoT) prompt template used to evaluate MLLMs, instructing the models to perform hierarchical [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
read the original abstract

Taxonomic classification of ecological families, genera, and species underpins biodiversity monitoring and conservation. Existing computer vision methods typically address fine-grained recognition and long-tailed learning in isolation. However, additional challenges such as spatiotemporal domain shift, hierarchical taxonomic structure, and previously unseen taxa often co-occur in real-world deployment, leading to brittle performance under open-world conditions. We propose TaxoNet, an embedding learning framework with a theoretically grounded dual-margin objective that reshapes class decision boundaries under class imbalance to improve fine-grained discrimination while strengthening rare-class representation geometry. We evaluate TaxoNet in open-world settings that capture co-occurring recognition challenges. Leveraging diverse plant datasets, including Google Auto-Arborist (urban tree imagery), iNaturalist (Plantae observations across heterogeneous ecosystems), and NAFlora-Mini (herbarium collections), we demonstrate that TaxoNet consistently outperforms strong baselines, including multimodal foundation models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces TaxoNet, an embedding learning framework for fine-grained plant taxonomy classification that incorporates a dual-margin objective claimed to be theoretically grounded. This objective is designed to reshape class decision boundaries under long-tailed imbalance, improving fine-grained discrimination and rare-class representation geometry while addressing co-occurring challenges such as spatiotemporal domain shift, hierarchical structure, and unseen taxa in open-world settings. Evaluations on Google Auto-Arborist, iNaturalist, and NAFlora-Mini datasets report consistent outperformance over baselines including multimodal foundation models.

Significance. If the dual-margin objective can be shown to be theoretically grounded with explicit derivations and if the reported gains are supported by ablations isolating its contribution, the work would offer a unified approach to multiple real-world challenges in ecological computer vision. The choice of diverse plant datasets spanning urban, ecosystem, and herbarium imagery strengthens potential applicability to biodiversity monitoring, provided the open-world handling is rigorously validated.

major comments (2)
  1. [§3] §3 (Dual-Margin Objective): The abstract asserts that the dual-margin objective is 'theoretically grounded' and reshapes boundaries under class imbalance, yet no derivation, proof sketch, or explicit reduction to the loss terms is provided; without this, it is impossible to verify whether the objective introduces hidden dependencies on fitted hyperparameters or reduces to standard margin losses.
  2. [§4] §4 (Experiments and Ablations): The evaluation claims consistent outperformance on three datasets and handling of open-world unseen taxa, but provides no ablation isolating the dual-margin term, no error analysis stratified by class frequency or taxonomic level, and no details on how spatiotemporal shift or hierarchical structure is explicitly modeled or tested; these omissions leave the central claim that the framework successfully addresses co-occurring challenges unsupported.
minor comments (2)
  1. [§3] Notation for the dual-margin loss (Eq. 3 or equivalent) uses symbols that are not defined until later sections; a consolidated notation table would improve readability.
  2. [§2] The related-work section could more explicitly contrast the proposed dual-margin approach with recent hierarchical or open-set embedding methods to clarify novelty.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important areas for strengthening the theoretical presentation and empirical validation of TaxoNet. We address each major comment point by point below and have revised the manuscript to incorporate the suggested improvements.

read point-by-point responses
  1. Referee: [§3] §3 (Dual-Margin Objective): The abstract asserts that the dual-margin objective is 'theoretically grounded' and reshapes boundaries under class imbalance, yet no derivation, proof sketch, or explicit reduction to the loss terms is provided; without this, it is impossible to verify whether the objective introduces hidden dependencies on fitted hyperparameters or reduces to standard margin losses.

    Authors: We acknowledge that the current manuscript does not contain an explicit derivation or proof sketch of the dual-margin objective. The objective was constructed from geometric considerations of margin-based separation in embedding space to counteract long-tailed imbalance, but these steps were not formalized in §3. In the revised manuscript we will add a dedicated subsection with a step-by-step derivation showing the reduction from the standard margin loss, the role of the two margin parameters, and an analysis of their hyperparameter sensitivity. This addition will make the theoretical grounding verifiable. revision: yes

  2. Referee: [§4] §4 (Experiments and Ablations): The evaluation claims consistent outperformance on three datasets and handling of open-world unseen taxa, but provides no ablation isolating the dual-margin term, no error analysis stratified by class frequency or taxonomic level, and no details on how spatiotemporal shift or hierarchical structure is explicitly modeled or tested; these omissions leave the central claim that the framework successfully addresses co-occurring challenges unsupported.

    Authors: We agree that the experimental section would benefit from targeted ablations and stratified analyses. The revised manuscript will include: (i) an ablation that isolates the dual-margin term by comparing the full objective against its single-margin and standard-contrastive variants; (ii) error breakdowns stratified by class frequency (head/medium/tail) and taxonomic rank (family/genus/species); and (iii) explicit description of how the embedding framework and open-world evaluation protocol address spatiotemporal shift and hierarchy (via the loss geometry and the unseen-taxa test split). These additions will directly support the claim that the framework handles the co-occurring challenges. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes TaxoNet with a dual-margin objective stated as theoretically grounded for reshaping boundaries under imbalance. The provided abstract and evaluation description report empirical gains on Auto-Arborist, iNaturalist, and NAFlora-Mini over baselines including multimodal models, with no visible equations reducing by construction to fitted hyperparameters, self-definitional loops, or load-bearing self-citations. The derivation chain appears self-contained, relying on the proposed objective and external dataset validations rather than renaming or smuggling inputs as outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the dual-margin objective is presented as theoretically grounded without visible derivation or parameter list.

pith-pipeline@v0.9.0 · 5459 in / 1069 out tokens · 29881 ms · 2026-05-16T20:28:33.602567+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · 3 internal anchors

  1. [1]

    M. P. Barajas-Barbosa, D. Craven, P. Weigelt, et al. Global patterns of vascular plant alpha diversity.Nat. Commun., 13 (1):1–9, 2022. 3

  2. [2]

    The auto arborist dataset: a large-scale benchmark for multiview urban for- est monitoring under domain shift

    Sara Beery, Guanhang Wu, Trevor Edwards, Filip Pavetic, Bo Majewski, Shreyasee Mukherjee, Stanley Chan, John Mor- gan, Vivek Rathod, and Jonathan Huang. The auto arborist dataset: a large-scale benchmark for multiview urban for- est monitoring under domain shift. InProceedings of the IEEE/CVF conference on computer vision and pattern recog- nition, pages ...

  3. [3]

    Botanic gardens are vital for delivering the kunming-montreal global biodiversity framework.Bio- logical Diversity, 1(3-4):120–123, 2024

    Stephen Blackmore. Botanic gardens are vital for delivering the kunming-montreal global biodiversity framework.Bio- logical Diversity, 1(3-4):120–123, 2024. 2

  4. [4]

    Learning imbalanced datasets with label- distribution-aware margin loss.Advances in neural informa- tion processing systems, 32, 2019

    Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, and Tengyu Ma. Learning imbalanced datasets with label- distribution-aware margin loss.Advances in neural informa- tion processing systems, 32, 2019. 2, 6, 7

  5. [5]

    Howard, and Serge J

    Yin Cui, Yang Song, Chen Sun, Andrew G. Howard, and Serge J. Belongie. Large scale fine-grained categorization and domain-specific transfer learning. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4109–4118, 2018. 2

  6. [6]

    Class-balanced loss based on effective number of samples

    Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, and Serge Belongie. Class-balanced loss based on effective number of samples. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9268–9277,

  7. [7]

    Arcface: Additive angular margin loss for deep face recognition

    Jiankang Deng, Jia Guo, Niannan Xue, and Stefanos Zafeiriou. Arcface: Additive angular margin loss for deep face recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4690– 4699, 2019. 3

  8. [8]

    Anantha Kumar Duraiappah and Deborah Rogers. The in- tergovernmental platform on biodiversity and ecosystem ser- vices: opportunities for the social sciences.Innovation: The European Journal of Social Science Research, 24(3):217– 224, 2011. 1

  9. [9]

    The world checklist of vascular plants, a continuously updated resource for exploring global plant diversity.Scientific Data, 8(1): 1–10, 2021

    Rafa ¨el Govaerts, Eimear Nic Lughadha, et al. The world checklist of vascular plants, a continuously updated resource for exploring global plant diversity.Scientific Data, 8(1): 1–10, 2021. 3

  10. [10]

    Aug- mix: A simple method to improve robustness and uncertainty under data shift

    Dan Hendrycks, Norman Mu, Ekin Dogus Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. Aug- mix: A simple method to improve robustness and uncertainty under data shift. InInternational Conference on Learning Representations, 2020. 5, 6, 1

  11. [11]

    GPT-4o System Card

    Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. Gpt-4o system card.arXiv preprint arXiv:2410.21276, 2024. 6, 8

  12. [12]

    Survey on deep learning with class imbalance.Journal of big data, 6 (1):1–54, 2019

    Justin M Johnson and Taghi M Khoshgoftaar. Survey on deep learning with class imbalance.Journal of big data, 6 (1):1–54, 2019. 2

  13. [13]

    Next visit diagnosis prediction via medical code-centric multimodal contrastive ehr modelling with hi- erarchical regularisation

    Heejoon Koo. Next visit diagnosis prediction via medical code-centric multimodal contrastive ehr modelling with hi- erarchical regularisation. InFindings of the Association for Computational Linguistics: EACL 2024, pages 41–55, 2024. 5

  14. [14]

    Gist: Generating image-specific text for fine-grained object classification.arXiv preprint arXiv:2307.11315, 2023

    Kathleen M Lewis, Emily Mu, Adrian V Dalca, and John Guttag. Gist: Generating image-specific text for fine-grained object classification.arXiv preprint arXiv:2307.11315, 2023. 3

  15. [15]

    Focal loss for dense object detection

    Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Doll´ar. Focal loss for dense object detection. InPro- ceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017. 2

  16. [16]

    Decoupled Weight Decay Regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization.arXiv preprint arXiv:1711.05101, 2017. 1

  17. [17]

    Slackedface: Learn- ing a slacked margin for low-resolution face recognition

    Cheng Yaw Low, Jacky Chen Long Chai, Jaewoo Park, Kyeongjin Ann, and Meeyoung Cha. Slackedface: Learn- ing a slacked margin for low-resolution face recognition. In Proc. of the BMVC, 2023. 4

  18. [18]

    Cheng Yaw Low, Meeyoung Cha, Jana W ¨aldchen, and Kr- ishna P. Gummadi. Open-set classification for rare and un- known urban tree taxa. InInternational Conference on In- formation Technology for Social Good (GoodIT ’25), pages 1–7, Antwerp, Belgium, 2025. ACM. 2

  19. [19]

    Magface: A universal representation for face recognition and quality assessment

    Qiang Meng, Shichao Zhao, Zhida Huang, and Feng Zhou. Magface: A universal representation for face recognition and quality assessment. In2021 IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pages 14220– 14229, 2021. 4, 5, 7

  20. [20]

    sweater",

    Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Himanshu Jain, Andreas Veit, and Sanjiv Kumar. Long-tail learning via logit adjustment.arXiv preprint arXiv:2007.07314, 2020. 2, 6, 7

  21. [21]

    Divergent angular representation for open set image recogni- tion.IEEE Transactions on Image Processing, 31:176–189,

    Jaewoo Park, Cheng Yaw Low, and Andrew Beng Jin Teoh. Divergent angular representation for open set image recogni- tion.IEEE Transactions on Image Processing, 31:176–189,

  22. [22]

    Naflora-1m: Continental-scale high-resolution fine-grained plant classification dataset.Jour- nal of Data-centric Machine Learning Research, 2024

    John Park, Riccardo de Lutio, Brendan Rappazzo, Barbara Ambrose, Fabian Michelangeli, Kimberly Watson, Serge Be- longie, and Damon Little. Naflora-1m: Continental-scale high-resolution fine-grained plant classification dataset.Jour- nal of Data-centric Machine Learning Research, 2024. 1, 6

  23. [23]

    Global biodiversity scenarios for the year 2100.science, 287(5459):1770–1774, 2000

    Osvaldo E Sala, FIII Stuart Chapin, Juan J Armesto, Eric Berlow, Janine Bloomfield, Rodolfo Dirzo, Elisabeth Huber- Sanwald, Laura F Huenneke, Robert B Jackson, Ann Kinzig, et al. Global biodiversity scenarios for the year 2100.science, 287(5459):1770–1774, 2000. 1

  24. [24]

    Biodiversity and the 2030 agenda for sustainable development

    SCBD. Biodiversity and the 2030 agenda for sustainable development. Technical report, Secretariat of the Convention on Biological Diversity, 2017. 1

  25. [25]

    Toward open set recogni- tion.IEEE transactions on pattern analysis and machine intelligence, 35(7):1757–1772, 2012

    Walter J Scheirer, Anderson de Rezende Rocha, Archana Sapkota, and Terrance E Boult. Toward open set recogni- tion.IEEE transactions on pattern analysis and machine intelligence, 35(7):1757–1772, 2012. 3

  26. [26]

    Role play with large language models.Nature, 623(7987):493– 498, 2023

    Murray Shanahan, Kyle McDonell, and Laria Reynolds. Role play with large language models.Nature, 623(7987):493– 498, 2023. 8

  27. [27]

    Smith and S

    J. Smith and S. Patel. Open-set classification strategies for long-term acoustic biodiversity monitoring.Journal of the Acoustical Society of America, 151(6):4028–4042, 2024. 3

  28. [28]

    Fine-grained visual prompt learning of vision-language mod- els for image recognition

    Hongbo Sun, Xiangteng He, Jiahuan Zhou, and Yuxin Peng. Fine-grained visual prompt learning of vision-language mod- els for image recognition. InProceedings of the 31st ACM International Conference on Multimedia, pages 5828–5836,

  29. [29]

    Gemini: A Family of Highly Capable Multimodal Models

    Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean- Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, Katie Millican, et al. Gemini: a family of highly capable multimodal models.arXiv preprint arXiv:2312.11805, 2023. 6, 8

  30. [30]

    The inaturalist species classification and detection dataset

    Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, and Serge Belongie. The inaturalist species classification and detection dataset. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 8769–8778,

  31. [31]

    Automated plant species identification—trends and future directions.PLoS computational biology, 14(4): e1005993, 2018

    Jana W ¨aldchen, Michael Rzanny, Marco Seeland, and Patrick M¨ader. Automated plant species identification—trends and future directions.PLoS computational biology, 14(4): e1005993, 2018. 1, 3

  32. [32]

    Normface: L2 hypersphere embedding for face verification

    Feng Wang, Jiancheng Cheng, Weiyang Liu, and Haijun Liu. Normface: L2 hypersphere embedding for face verification. InProceedings of the 25th ACM International Conference on Multimedia (ACM MM), pages 1041–1049, 2017. 3, 6

  33. [33]

    Cosface: Large margin cosine loss for deep face recognition

    Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. Cosface: Large margin cosine loss for deep face recognition. InPro- ceedings of the IEEE conference on computer vision and pattern recognition, pages 5265–5274, 2018. 3, 6, 7

  34. [34]

    Bioclip: A vision-language foundation model for the tree of life.Nature Communications,

    Jiahui Wang, Yutong Li, et al. Bioclip: A vision-language foundation model for the tree of life.Nature Communications,

  35. [35]

    Wang and Q

    Y. Wang and Q. Zhao. Open-set fish species recognition with non-parametric methods.Sensors, 25(5):1570, 2023. 3

  36. [36]

    Chain-of- thought prompting elicits reasoning in large language mod- els.Advances in neural information processing systems, 35: 24824–24837, 2022

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. Chain-of- thought prompting elicits reasoning in large language mod- els.Advances in neural information processing systems, 35: 24824–24837, 2022. 8

  37. [37]

    Deep long-tailed learning: A survey.IEEE transactions on pattern analysis and machine intelligence, 45(9):10795–10816, 2023

    Yifan Zhang, Bingyi Kang, Bryan Hooi, Shuicheng Yan, and Jiashi Feng. Deep long-tailed learning: A survey.IEEE transactions on pattern analysis and machine intelligence, 45(9):10795–10816, 2023. 2 Towards AI-Guided Open-World Ecological Taxonomic Classification Supplementary Material

  38. [38]

    Training Pipeline TaxoNet introduces a minimal-overhead extension to stan- dard training: oversampling an additional𝑏tail-class ex- amples on top of the initial batch size𝐵, where typically 𝐵 > 𝑏. From the augmented batch of𝐵+𝑏samples,only the first𝐵samples are retained through norm-guided sampling, while the remaining𝑏samples, primarily corresponding to ...

  39. [39]

    Implementation Details Datasets.Dataset statistics are summarized in Table 9. The regional subsets of Auto-Arborist (AA) exhibit the most pronounced class imbalance; for example, in AA-Central, the largest genus class contains 6,269 training examples, while the smallest contains only 6 (see Table 10). Model Backbone.All models, including our implementa- t...

  40. [40]

    For classes with only a single test sample, misclassifying that sample results in a 100% drop in recall

    Key Hyperparameters Long-tailed classification is particularly sensitive to the number of test examples per class. For classes with only a single test sample, misclassifying that sample results in a 100% drop in recall. In addition to rank-1 accuracy (R@1) and macro-averaged recall, we also report precision and F1 for a more comprehensive evaluation. Base...

  41. [41]

    Additional Results: MLLMs and VLFMs To complement Table 5 in the main manuscript, we re- veal class-level recall for TaxoNet and multimodal founda- tion models. Whereas the main table reports only macro- averaged recall across head, between, and tail classes, the expanded results in Tables 10 and 11 expose per-class per- formance and variability that are ...

  42. [42]

    Prompt Templates We provide the prompt template used for zero-shot chain- of-thought (CoT) reasoning with GPT-4.0 and Gemini-2.5. We also evaluate a CoT variant augmented with Wikipedia- curated taxon descriptions, but omit it here for compactness, as the substantially longer prompts offer only marginal per- formance gains and likely introduce reasoning n...

  43. [43]

    Replace the angle-bracketed fields with your actual reasoning and predictions

  44. [44]

    Do not include any commentary, formatting, markdown, or extra text outside of the JSON object

  45. [45]

    a photo ofQuercus robur

    Always select exactly one genus and one species. Figure 5. Zero-shot chain-of-thought (CoT) prompt template used to evaluate MLLMs, instructing the models to perform hierarchical reasoning by first predicting the genus and then refining the prediction to the species level. This approach is inspired by sequential diagnosis prediction utilizing medical onto...