
arxiv: 2605.17773 · v1 · pith:UVGGQ2DInew · submitted 2026-05-18 · 💻 cs.CV

PlantPose: Universal Plant Skeleton Estimation via Tree-constrained Graph Generation

Pith reviewed 2026-05-20 12:49 UTC · model grok-4.3

classification 💻 cs.CV
keywords plant skeleton estimation · graph generation · tree constraints · computer vision · smart agriculture · pose estimation · generalization

The pith

PlantPose estimates variable plant branching structures from images by generating graphs while enforcing tree topology during training on mixed real and synthetic data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Plant skeletons differ from human ones because they lack fixed topology and instead form arbitrary branching trees. The work introduces a method that learns to generate graph representations of these structures from images while inserting classical graph algorithms into the training process to guarantee the output remains a valid tree. A large training collection mixes photographs, synthetic renders, sketches, and abstract drawings so the learned model encounters many visual styles and plant categories. If the approach holds, accurate skeleton extraction would become feasible for previously unseen plant species or image types without retraining or post-processing fixes.

Core claim

The central claim is that combining learning-based graph generation with traditional graph algorithms to enforce tree constraints inside the training loop, together with training on a curated mix of real-world, synthetic, sketch, and abstract plant images, produces robust and topologically consistent skeleton estimates across multiple domains including out-of-domain cases.

What carries the argument

Tree-constrained graph generation, which augments a learned graph predictor with classical algorithms that correct outputs to valid trees during every training step.
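
As a rough illustration of what "correcting an output to a valid tree" can look like, the sketch below runs Kruskal's algorithm over predicted edge confidences and keeps only edges that do not close a cycle. The paper passage quoted further down this page mentions Kruskal's MST algorithm as implemented in NetworkX; this version is pure Python so it runs without dependencies. It is not the authors' code: the function name `kruskal_tree`, the score dictionary, and ranking edges by descending confidence are assumptions made here for illustration.

```python
def kruskal_tree(num_nodes, edge_scores):
    """Return spanning-tree edges that maximise total edge confidence.

    edge_scores maps (u, v) node-index pairs to a confidence in [0, 1].
    (Hypothetical input format, chosen for this sketch.)
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Highest confidence first; equivalent to an MST on cost = 1 - score.
    for (u, v), _score in sorted(edge_scores.items(), key=lambda kv: -kv[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Toy prediction: a triangle (0, 1, 2) plus a pendant node 3.
scores = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.6, (2, 3): 0.7}
tree_edges = kruskal_tree(4, scores)
```

In the toy input, the weakest edge of the triangle is the one dropped. If the predicted graph is disconnected, the result is a forest rather than a single tree, so connectivity has to be handled separately.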

If this is right

  • Skeleton estimates remain trees even when the input image shows complex or occluded branches.
  • A single model handles both photographic and simplified drawing inputs without domain-specific retraining.
  • Topological errors that normally require separate post-processing steps are reduced by the integrated constraint mechanism.
  • The same trained weights apply to previously unseen plant categories and visual styles.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same training strategy of mixing concrete and abstract depictions could be tested on other variable-topology structures such as river networks or blood vessels.
  • Integrating the method with 3D reconstruction from multiple views might produce consistent volumetric plant models.
  • Failure cases on highly heterogeneous data could point to the need for additional constraint types beyond trees.

Load-bearing premise

Enforcing tree constraints inside the training loop plus training on a curated mix of real, synthetic, sketch, and abstract images is sufficient to produce topologically consistent outputs on arbitrary new plant image distributions.

What would settle it

Run the trained model on a fresh collection of plant photographs that contain branching patterns or image styles absent from the training mix and observe whether the outputs contain cycles, disconnected components, or large topological errors.
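
The check described above can be sketched as a small topology report over a predicted skeleton. This is a minimal illustration under assumptions made here: nodes are indexed `0..num_nodes-1`, edges are undirected pairs, and the name `topology_report` is hypothetical. It counts connected components with a depth-first search and derives the number of independent cycles from the identity cycles = E - V + C.

```python
from collections import defaultdict


def topology_report(num_nodes, edges):
    """Count components and independent cycles of an undirected graph."""
    adjacency = defaultdict(list)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    seen = set()
    components = 0
    for start in range(num_nodes):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    # Cyclomatic number of an undirected graph: E - V + C.
    cycles = len(edges) - num_nodes + components
    return {
        "components": components,
        "cycles": cycles,
        "is_tree": components == 1 and cycles == 0,
    }


valid = topology_report(4, [(0, 1), (1, 2), (2, 3)])
broken = topology_report(4, [(0, 1), (1, 2), (0, 2)])
```

Here `valid` describes a path on four nodes, while `broken` contains one cycle and leaves node 3 disconnected. Aggregating such reports over a held-out image set would give the cycle and component counts that the test calls for.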

Original abstract

Accurate estimation of plant skeletal structures (e.g., branching structures) from images is essential for smart agriculture and plant science. Unlike human skeletons with fixed topology, plant skeleton estimation presents a unique challenge, i.e., estimating arbitrary tree graphs from images. To address this problem, we introduce PlantPose, a universal plant skeleton estimator via tree-constrained graph generation. PlantPose combines learning-based graph generation with traditional graph algorithms to enforce tree constraints during the training loop. To enhance the model's generalization capability, we curate a large and diverse dataset comprising real-world and synthetic plant images, along with simplified representations (e.g., sketches and abstract drawings). This dataset enables the generalized model to adapt to diverse input styles and categories of plant images while preserving topological consistency. Our approach demonstrates robust and accurate plant skeleton estimation across multiple domains, including previously unseen out-of-domain scenarios. Further analyses highlight the method's strengths and limitations in handling complex, heterogeneous data distributions. All implementations and datasets are available at https://github.com/huntorochi/PlantPose/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper introduces PlantPose, a universal plant skeleton estimator that combines learning-based graph generation with traditional graph algorithms to enforce tree constraints during the training loop. It curates a diverse dataset of real-world, synthetic, sketch, and abstract plant images to promote generalization across styles and categories, claiming robust and accurate estimation of arbitrary tree graphs on multiple domains including previously unseen out-of-domain scenarios while preserving topological consistency. Code and datasets are released at the provided GitHub link.

Significance. If the central claims are supported by the experiments, the work could meaningfully advance plant phenotyping and smart agriculture by offering a generalizable approach to variable-topology skeleton estimation, where fixed-topology methods like human pose estimation do not apply. The multi-style training strategy and explicit integration of graph algorithms for constraints represent a practical strength, and the public release of implementations supports reproducibility and further research.

major comments (1)
  1. [Abstract] The method description states that tree constraints are enforced 'during the training loop' via combination with traditional graph algorithms, but the central claim of topological consistency on arbitrary out-of-domain images requires clarification on whether equivalent enforcement (e.g., cycle removal, connectivity enforcement, or MST projection) occurs at inference. If the generative model can produce cycles or disconnected components on unseen inputs without post-processing, the topological guarantees do not necessarily extend beyond the training distribution.
minor comments (1)
  1. [Abstract] The abstract asserts robust performance across domains but does not preview any quantitative metrics, ablation studies, or failure cases; these should be summarized early to allow readers to assess the strength of the generalization claims.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed and constructive review. The feedback has helped us clarify key aspects of our method. We address the major comment point-by-point below and have updated the manuscript to improve clarity and precision.

Point-by-point responses
  1. Referee: [Abstract] The method description states that tree constraints are enforced 'during the training loop' via combination with traditional graph algorithms, but the central claim of topological consistency on arbitrary out-of-domain images requires clarification on whether equivalent enforcement (e.g., cycle removal, connectivity enforcement, or MST projection) occurs at inference. If the generative model can produce cycles or disconnected components on unseen inputs without post-processing, the topological guarantees do not necessarily extend beyond the training distribution.

    Authors: We appreciate this observation and agree that explicit clarification is warranted. Tree constraints are enforced during training by incorporating classical graph algorithms (e.g., MST projection and cycle detection) directly into the graph generation objective and loss computation; this shapes the learned distribution toward valid tree structures. At inference, the model generates graphs directly from the trained parameters without mandatory post-processing, as the training procedure encourages outputs that are already topologically consistent. Our experiments across out-of-domain styles (sketches, abstract drawings, and unseen plant categories) confirm that generated skeletons exhibit no cycles or disconnected components, with quantitative topology metrics reported in the results section. To address the referee's concern, we have revised the abstract and added a dedicated paragraph in the method section describing the inference procedure, including an optional lightweight projection step for edge cases while noting that it was not required in our evaluations. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in the derivation chain

Full rationale

The paper presents PlantPose as a hybrid approach that combines a learning-based graph generation model with separate traditional graph algorithms to enforce tree constraints only during the training loop, then evaluates generalization on a curated multi-domain dataset. This structure relies on external graph-algorithm steps and empirical training rather than on any self-referential definition in which the target output (e.g., tree topology) is defined in terms of itself or of a fitted parameter renamed as a prediction. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled in via citation are described in the abstract or method outline. The central claim of robust out-of-domain performance is framed as an empirical result of the training regimen and dataset diversity, not as a mathematical reduction to the inputs by construction. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; the central claim rests on the unstated premise that tree constraints can be injected into the training loop without destroying gradient flow and that the collected real/synthetic/sketch/abstract images adequately sample the space of plant appearances. No free parameters, axioms, or invented entities are explicitly named.
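
To make the gradient-flow premise concrete: the paper passage quoted in the Lean-theorem section of this page describes a non-differentiable MST projection followed by a selective feature suppression (SFS) layer. The SFS details are not given on this page, so the sketch below is not that layer. It only illustrates one generic way a discrete projection can coexist with gradients: compute a binary tree mask outside the differentiable path, treat it as a constant, and multiply it into the edge scores. All names (`tree_mask`, `suppress`, `grad_wrt_scores`) and the hand-written gradient are assumptions for illustration.

```python
def tree_mask(num_nodes, scores):
    """Return {edge: 1.0 or 0.0} marking edges kept by a max-score spanning tree."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mask = {edge: 0.0 for edge in scores}
    # Kruskal-style greedy pass; this step is discrete and has no gradient.
    for (u, v), _score in sorted(scores.items(), key=lambda kv: -kv[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            mask[(u, v)] = 1.0
    return mask


def suppress(scores, mask):
    """Zero out non-tree edges; the mask is treated as a constant."""
    return {edge: scores[edge] * mask[edge] for edge in scores}


def grad_wrt_scores(mask, upstream):
    """With the mask held fixed, d(score * mask) / d(score) = mask."""
    return {edge: upstream[edge] * mask[edge] for edge in mask}


scores = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.6}
mask = tree_mask(3, scores)
kept = suppress(scores, mask)
grads = grad_wrt_scores(mask, {edge: 1.0 for edge in scores})
```

Under this scheme, tree edges receive the upstream gradient unchanged and suppressed edges receive none. Whether that is enough signal to train the edge predictor is exactly the unstated premise flagged above.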

pith-pipeline@v0.9.0 · 5715 in / 1181 out tokens · 35312 ms · 2026-05-20T12:49:47.341468+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

    PlantPose combines learning-based graph generation with traditional graph algorithms to enforce tree constraints during the training loop... we propose to project an unconstrained graph into a tree graph by a non-differentiable MST algorithm during each training loop. Our selective feature suppression (SFS) layer then converts the inferred unconstrained graph to the MST-based tree graph in a differentiable manner

  • IndisputableMonolith/Foundation/ArithmeticFromLogic.lean embed_injective unclear

    We use Kruskal’s MST algorithm implemented in NetworkX... the output graph often violates the required constraints

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages · 2 internal anchors

  1. [1]

    Computers and Electronics in Agriculture207, 107736 (2023)

    Gentilhomme, T., Villamizar, M., Corre, J., Odobez, J.-M.: Towards smart pruning: ViNet, a deep-learning approach for grapevine structure estimation. Computers and Electronics in Agriculture207, 107736 (2023)

  2. [2]

    New Phytologist 212(1), 269–281 (2016)

    Cabrera-Bosquet, L., Fournier, C., Brichet, N., Welcker, C., Suard, B., Tardieu, F.: High-throughput estimation of incident light, light interception and radiation- use efficiency of thousands of plants in a phenotyping platform. New Phytologist 212(1), 269–281 (2016)

  3. [3]

    Frontiers in Plant Science10, 248 (2019)

    Sheng, W., Wen, W., Xiao, B., Guo, X., Du, J.J., Wang, C., Wang, Y.: An accu- rate skeleton extraction approach from 3d point clouds of maize plants. Frontiers in Plant Science10, 248 (2019)

  4. [4]

    In: Proceedings of European Conference on Computer Vision (ECCV) Workshops (2020)

    Gaillard, M., Miao, C., Schnable, J., Benes, B.: Sorghum segmentation by skeleton extraction. In: Proceedings of European Conference on Computer Vision (ECCV) Workshops (2020)

  5. [5]

    Computers and Electronics in Agriculture187, 106310 (2021)

    Miao, T., Zhu, C., Xu, T., Yang, T., Li, N., Zhou, Y., Deng, H.: Automatic stem- leaf segmentation of maize shoots using three-dimensional point cloud. Computers and Electronics in Agriculture187, 106310 (2021)

  6. [6]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2017)

    Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2D pose esti- mation using part affinity fields. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2017)

  7. [7]

    In: Proceedings of European Conference on Computer Vision (ECCV), pp

    He, S., Bastani, F., Jagwani, S., Alizadeh, M., Balakrishnan, H., Chawla, S., Elshrif, M.M., Madden, S., Sadeghi, M.A.: Sat2Graph: Road graph extraction through graph-tensor encoding. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 51–67 (2020)

  8. [8]

    IEEE Robotics and Automation Letters, 1–8 (2023)

    Xu, Z., Liu, Y., Sun, Y., Liu, M., Wang, L.: RNGDet++: Road network graph detection by transformer with instance segmentation and multi-scale features enhancement. IEEE Robotics and Automation Letters, 1–8 (2023)

  9. [9]

    IEEE Robotics and Automation Letters6, 1097–1104 (2021)

    Xu, Z., Sun, Y., Liu, M.: iCurb: Imitation learning-based detection of road curbs using aerial images for autonomous driving. IEEE Robotics and Automation Letters6, 1097–1104 (2021)

  10. [10]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Acuna, D., Ling, H., Kar, A., Fidler, S.: Efficient interactive annotation of segmen- tation datasets with polygon-RNN++. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 859–868 (2018)

  11. [11]

    In: Proceedings of European Conference on Computer Vision 61 (ECCV), pp

    Li, W., Lu, Y., Zheng, K., Liao, H., Lin, C., Luo, J., Cheng, C.-T., Xiao, J., Lu, L., Kuo, C.-F.,et al.: Structured landmark detection via topology-adapting deep graph learning. In: Proceedings of European Conference on Computer Vision 61 (ECCV), pp. 266–283 (2020)

  12. [12]

    In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp

    Li, W., Zhao, W., Zhong, H., He, C., Lin, D.: Joint semantic-geometric learning for polygonal building segmentation. In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp. 1958–1965 (2021)

  13. [13]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Ling, H., Gao, J., Kar, A., Chen, W., Fidler, S.: Fast interactive object annotation with curve-GCN. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5252–5261 (2019)

  14. [14]

    In: Proceedings of NeurIPS Workshop on Graph Representation Learning (2019)

    Belli, D., Kipf, T.: Image-conditioned graph generation for road network extrac- tion. In: Proceedings of NeurIPS Workshop on Graph Representation Learning (2019)

  15. [15]

    Chemistry-Methods 2(1), 202100069 (2022)

    Khokhlov, I., Krasnov, L., Fedorov, M.V., Sosnin, S.: Image2SMILES: Transformer-based molecular optical recognition engine. Chemistry-Methods 2(1), 202100069 (2022)

  16. [16]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Li, R., Zhang, S., He, X.: SGTR: End-to-end scene graph generation with trans- former. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19486–19496 (2022)

  17. [17]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Liang, J., Homayounfar, N., Ma, W.-C., Xiong, Y., Hu, R., Urtasun, R.: Poly- Transform: Deep polygon transformer for instance segmentation. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9128–9137 (2020)

  18. [18]

    In: Proceedings of European Conference on Computer Vision (ECCV), pp

    Shit, S., Koner, R., Wittmann, B., Paetzold, J., Ezhov, I., Li, H., Pan, J., Shar- ifzadeh, S., Kaissis, G., Tresp, V.,et al.: Relationformer: A unified framework for image-to-graph generation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 422–439 (2022)

  19. [19]

    In: Proceedings of International Joint Conferences on Artificial Intelligence (IJCAI) (2021)

    Kotary, J., Fioretto, F., Van Hentenryck, P., Wilder, B.: End-to-end con- strained optimization learning: A survey. In: Proceedings of International Joint Conferences on Artificial Intelligence (IJCAI) (2021)

  20. [20]

    In: Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2025)

    Liu, X., Santo, H., Toda, Y., Okura, F.: TreeFormer: Single-view plant skeleton estimation via tree-constrained graph generation. In: Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2025)

  21. [21]

    Breeding Science 72(1), 31–47 (2022)

    Okura, F.: 3D modeling and reconstruction of plants and trees: A cross-cutting review across computer graphics, vision, and plant phenotyping. Breeding Science 72(1), 31–47 (2022)

  22. [22]

    Remote Sensing12(22) (2020) 62

    Ai, M., Yao, Y., Hu, Q., Wang, Y., Wang, W.: An automatic tree skeleton extrac- tion approach based on multi-view slicing using terrestrial LiDAR scans data. Remote Sensing12(22) (2020) 62

  23. [23]

    ACM Transactions on Graphics (TOG)26(4), 19 (2007)

    Xu, H., Gossett, N., Chen, B.: Knowledge and heuristic-based modeling of laser- scanned trees. ACM Transactions on Graphics (TOG)26(4), 19 (2007)

  24. [24]

    Plant Phenomics2020(2020)

    Wu, S., Wen, W., Wang, Y., Fan, J., Wang, C., Gou, W., Guo, X.: MVS-Pheno: A portable and low-cost phenotyping platform for maize shoots using multiview stereo 3D reconstruction. Plant Phenomics2020(2020)

  25. [25]

    Applications in Plant Sciences2(8), 1400005 (2014)

    Bucksch, A.: A practical introduction to skeletons for the plant sciences. Applications in Plant Sciences2(8), 1400005 (2014)

  26. [26]

    Remote Sensing11(18) (2019)

    Du, S., Lindenbergh, R., Ledoux, H., Stoter, J., Nan, L.: AdTree: Accurate, detailed, and automatic modelling of laser-scanned trees. Remote Sensing11(18) (2019)

  27. [27]

    ACM Transactions on Graphics (TOG)32(4), 65 (2013)

    Huang, H., Wu, S., Cohen-Or, D., Gong, M., Zhang, H., Li, G., Chen, B.: L1- medial skeleton of point cloud. ACM Transactions on Graphics (TOG)32(4), 65 (2013)

  28. [28]

    Plant Physiology 181(4), 1425–1440 (2019)

    Ziamtsov, I., Navlakha, S.: Machine learning approaches to improve three basic plant phenotyping tasks using three-dimensional point clouds. Plant Physiology 181(4), 1425–1440 (2019)

  29. [29]

    Frontiers in Plant Science11(2020)

    Chaudhury, A., Godin, C.: Skeletonization of plant point cloud data using stochastic optimization framework. Frontiers in Plant Science11(2020)

  30. [30]

    In: Proceedings of Workshop on Computer Vision Problems in Plant Phenotyping (CVPPP), pp

    Giuffrida, V., Minervini, M., Tsaftaris, S.: Learning to count leaves in rosette plants. In: Proceedings of Workshop on Computer Vision Problems in Plant Phenotyping (CVPPP), pp. 1–1113 (2015)

  31. [31]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Isokane, T., Okura, F., Ide, A., Matsushita, Y., Yagi, Y.: Probabilistic plant mod- eling via multi-view image-to-image translation. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2906–2915 (2018)

  32. [32]

    IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)45(9), 11169–11183 (2023)

    Cong, Y., Yang, M., Rosenhahn, B.: RelTR: Relation transformer for scene graph generation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)45(9), 11169–11183 (2023)

  33. [33]

    In: Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), pp

    Abdelkarim, S., Agarwal, A., Achlioptas, P., Chen, J., Huang, J., Li, B., Church, K.W., Elhoseiny, M.: Exploring long tail visual relationship recognition with large vocabulary. In: Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), pp. 15901–15910 (2020)

  34. [34]

    In: Proceedings of ACM International Conference on Multimedia (MM), pp

    Chiou, M.-J., Ding, H., Yan, H., Wang, C., Zimmermann, R., Feng, J.: Recov- ering the unbiased scene graphs from the biased ones. In: Proceedings of ACM International Conference on Multimedia (MM), pp. 1581–1590 (2021) 63

  35. [35]

    In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp

    Sharifzadeh, S., Baharlou, S.M., Schmitt, M., Sch¨ utze, H., Tresp, V.: Improving scene graph classification by exploiting knowledge from texts. In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp. 2189–2197 (2022)

  36. [36]

    In: Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), pp

    Pennington, J., Socher, R., Manning, C.D.: Glove: Global vectors for word rep- resentation. In: Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)

  37. [37]

    Learning Deep Generative Models of Graphs

    Li, Y., Vinyals, O., Dyer, C., Pascanu, R., Battaglia, P.: Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324 (2018)

  38. [38]

    In: Advances in Neural Information Processing Systems (NeurIPS) (2018)

    Liu, Q., Allamanis, M., Brockschmidt, M., Gaunt, A.L.: Constrained graph vari- ational autoencoders for molecule design. In: Advances in Neural Information Processing Systems (NeurIPS) (2018)

  39. [39]

    In: Proceedings of International Conference on Machine Learning (ICML) (2021)

    Luo, Y., Yan, K., Ji, S.: Graphdf: A discrete flow model for molecular graph gener- ation. In: Proceedings of International Conference on Machine Learning (ICML) (2021)

  40. [40]

    IEEE Transactions on Visualization and Computer Graphics (2023)

    Zhou, X., Li, B., Benes, B., Fei, S., Pirk, S.: Deeptree: Modeling trees with situated latents. IEEE Transactions on Visualization and Computer Graphics (2023)

  41. [41]

    In: Proceedings of International Conference on Learning Representations (ICLR) (2014)

    Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: Proceedings of International Conference on Learning Representations (ICLR) (2014)

  42. [42]

    In: Proceedings on NeurIPS Workshop on Bayesian Deep Learning (2016)

    Kipf, T.N., Welling, M.: Variational graph auto-encoders. In: Proceedings on NeurIPS Workshop on Bayesian Deep Learning (2016)

  43. [43]

    In: Proceedings of European Conference on Computer Vision (ECCV), pp

    Yang, J., Ang, Y.Z., Guo, Z., Zhou, K., Zhang, W., Liu, Z.: Panoptic scene graph generation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 178–196 (2022)

  44. [44]

    In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), pp

    Zang, C., Wang, F.: Moflow: A invertible flow model for generating molecular graphs. In: Proceedings of ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), pp. 617–626 (2020)

  45. [45]

    In: Proceedings of International Conference on Learning Representations (ICLR) (2022)

    Ahn, S., Chen, B., Wang, T., Song, L.: Spanning tree-based graph genera- tion for molecules. In: Proceedings of International Conference on Learning Representations (ICLR) (2022)

  46. [46]

    In: Proceedings of International Conference on Machine Learning (ICML), pp

    Jin, W., Barzilay, R., Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation. In: Proceedings of International Conference on Machine Learning (ICML), pp. 2323–2332 (2018)

  47. [47]

    In: Proceedings of International Conference on Machine Learning (ICML) (2020) 64

    Jin, W., Barzilay, R., Jaakkola, T.: Hierarchical generation of molecular graphs using structural motifs. In: Proceedings of International Conference on Machine Learning (ICML) (2020) 64

  48. [48]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Xu, D., Zhu, Y., Choy, C.B., Fei-Fei, L.: Scene graph generation by iterative message passing. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5410–5419 (2017)

  49. [49]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Tang, K., Zhang, H., Wu, B., Luo, W., Liu, W.: Learning to compose dynamic tree structures for visual contexts. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6619–6628 (2019)

  50. [50]

    In: Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), pp

    Zhong, Y., Shi, J., Yang, J., Xu, C., Li, Y.: Learning to generate scene graph from natural language supervision. In: Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1823–1834 (2021)

  51. [51]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Zhang, Y., Pan, Y., Yao, T., Huang, R., Mei, T., Chen, C.-W.: Learning to generate language-supervised and open-vocabulary scene graph using pre-trained visual-semantic space. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2915–2924 (2023)

  52. [52]

    In: Pro- ceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

    Li, L.H., Zhang, P., Zhang, H., Yang, J., Li, C., Zhong, Y., Wang, L., Yuan, L., Zhang, L., Hwang, J.-N.,et al.: Grounded language-image pre-training. In: Pro- ceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10965–10975 (2022)

  53. [53]

    In: Advances in Neural Information Processing Systems (NeurIPS), vol

    Agrawal, A., Amos, B., Barratt, S., Boyd, S., Diamond, S., Kolter, J.Z.: Differen- tiable convex optimization layers. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 32 (2019)

  54. [54]

    In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp

    Wilder, B., Dilkina, B., Tambe, M.: Melding the data-decisions pipeline: Decision- focused learning for combinatorial optimization. In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp. 1658–1665 (2019)

  55. [55]

    In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp

    Ferber, A., Wilder, B., Dilkina, B., Tambe, M.: Mipaal: Mixed integer program as a layer. In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), pp. 1504–1511 (2020)

  56. [56]

    Networks26(4), 231–241 (1995)

    Myung, Y.-S., Lee, C.-H., Tcha, D.-W.: On the generalized minimum spanning tree problem. Networks26(4), 231–241 (1995)

  57. [57]

    European Journal of Operational Research283(1), 1–15 (2020)

    Pop, P.C.: The generalized minimum spanning tree problem: An overview of formulations, solution procedures and latest advances. European Journal of Operational Research283(1), 1–15 (2020)

  58. [58]

    In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

    Xie, C., Tan, M., Gong, B., Wang, J., Yuille, A.L., Le, Q.V.: Adversarial exam- ples improve image recognition. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

  59. [59]

    IEEE Trans- actions on Medical Imaging37(6), 1440–1453 (2018)

    Gupta, H., Jin, K.H., Nguyen, H.Q., McCann, M.T., Unser, M.: CNN-based 65 projected gradient descent for consistent CT image reconstruction. IEEE Trans- actions on Medical Imaging37(6), 1440–1453 (2018)

  60. [60]

    In: Proceedings of International Conference on Learning Representations (ICLR) (2018)

    Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: Proceedings of International Conference on Learning Representations (ICLR) (2018)

  61. [61]

    In: Proceedings of European Conference on Computer Vision (ECCV), pp

    Yoshida, M., Torii, A., Okutomi, M., Endo, K., Sugiyama, Y., Taniguchi, R.-i., Nagahara, H.: Joint optimization for compressive video sensing and reconstruc- tion under hardware constraints. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 634–649 (2018)

  62. [62]

    IEEE Reviews in Biomedical Engineering16, 225–240 (2021)

    Bohlender, S., ¨Oks¨ uz, I., Mukhopadhyay, A.: A survey on shape-constraint deep learning for medical image segmentation. IEEE Reviews in Biomedical Engineering16, 225–240 (2021)

  63. [63]

    IEEE Transactions on Medical Imaging 37(5), 1081–1091 (2017)

    Li, Y., Ho, C.P., Toulemonde, M., Chahal, N., Senior, R., Tang, M.-X.: Fully automatic myocardial segmentation of contrast echocardiography sequence using random forests guided by shape model. IEEE Transactions on Medical Imaging 37(5), 1081–1091 (2017)

  64. [64]

    In: Proceedings of International Conference on Learning Representations (ICLR) (2021)

    Zhu, X., Su, W., Lu, L., Li, B., Wang, X., Dai, J.: Deformable DETR: Deformable transformers for end-to-end object detection. In: Proceedings of International Conference on Learning Representations (ICLR) (2021)

  65. [65]

    Layer Normalization

    Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization. arXiv preprint arXiv:1607.06450 (2016)

  66. [66]

    Proceedings of the American Mathematical Society7(1), 48– 50 (1956)

    Kruskal, J.B.: On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society7(1), 48– 50 (1956)

  67. [67]

    ACM Transactions on Graphics (TOG)39(5), 1–13 (2020)

    Guo, J., Jiang, H., Benes, B., Deussen, O., Zhang, X., Lischinski, D., Huang, H.: Inverse procedural modeling of branching structures by inferring L-systems. ACM Transactions on Graphics (TOG)39(5), 1–13 (2020)

  68. [68]

    Lindenmayer, A.: Mathematical models for cellular interactions in development I. Filaments with one-sided inputs. Journal of Theoretical Biology 18(3), 280–299 (1968)

  69. [69]

    Li, L., Hu, W., Lu, J., Zhang, C.: Leaf vein segmentation with self-supervision. Computers and Electronics in Agriculture (2022)

  70. [70]

    Wang, P., Chang, J., Deng, W., Liu, B., Lai, H., Hou, Z., Dong, L., Chen, Q., Zhou, Y., Zhang, Z., Liu, H., Ruan, J.: MIPDB: A maize image-phenotype database with multi-angle and multi-time characteristics. bioRxiv (2024) 66

  71. [71]

    He, S., Bastani, F., Abbar, S., Alizadeh, M., Balakrishnan, H., Chawla, S., Madden, S.: RoadRunner: Improving the precision of road network inference from GPS trajectories. In: Proceedings of ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 3–12 (2018)

  72. [72]

    Vineet, V., Harish, P., Patidar, S., Narayanan, P.: Fast minimum spanning tree for large graphs on the GPU. In: Proceedings of Conference on High Performance Graphics (HPG), pp. 167–171 (2009)

  73. [73]

    Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. In: Proceedings of European Conference on Computer Vision (ECCV), pp. 11–16 (2016)

  74. [74]

    He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)

  75. [75]

    Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2117–2125 (2017)

  76. [76]

    Lyu, X., Cheng, L., Zhang, S.: The reta benchmark for retinal vascular tree analysis. Scientific Data 9(1), 397 (2022)

  77. [77]

    Bastani, F., He, S., Abbar, S., Alizadeh, M., Balakrishnan, H., Chawla, S., Madden, S., DeWitt, D.: Roadtracer: Automatic extraction of road networks from aerial images. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4720–4728 (2018)

  78. [78]

    Xu, Z., Liu, Y., Gan, L., Sun, Y., Wu, X., Liu, M., Wang, L.: Rngdet: Road network graph detection by transformer in aerial images. IEEE Transactions on Geoscience and Remote Sensing 60, 1–12 (2022)

  79. [79]

    Xu, Z., Liu, Y., Sun, Y., Liu, M., Wang, L.: Rngdet++: Road network graph detection by transformer with instance segmentation and multi-scale features enhancement. arXiv preprint arXiv:2209.10150 (2023)

  80. [80]

    Yin, P., Li, K., Cao, X., Yao, J., Liu, L., Bai, X., Zhou, F., Meng, D.: Towards satellite image road graph extraction: A global-scale dataset and a novel method. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1527–1537 (2025)

Showing first 80 references.