3D Reconstruction and Knowledge Distillation to Improve Multi-View Image Models to Explore Spike Volume Estimation in Wheat
Pith reviewed 2026-05-21 05:30 UTC · model grok-4.3
The pith
Distilling geometric knowledge from 3D reconstructions into 2D image Transformers improves wheat spike volume estimation accuracy while cutting inference time from 160 ms to 1.4 ms per spike.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We train a rigid-invariant point cloud network using distance-based histogram features to obtain pose-robust geometric representations, combine it with a multi-view image-based regulated Transformer in an ensemble, and distill the ensemble into a purely image-based student model using either feature-based or label-based distillation. The two distilled models reduce mean absolute error from 654.31 mm³ to 639.93 mm³ and 644.62 mm³, raise correlation from 0.76 to 0.77 and 0.82, cut inference time from 160 ms to 1.4 ms per spike, mitigate volume-dependent bias, and reshape the image model's latent space toward geometry-aware shape.
What carries the argument
Knowledge distillation from an ensemble of a rigid-invariant 3D point cloud network and a multi-view regulated Transformer into a purely image-based student model.
If this is right
- The final model performs volume estimation from images only, with no 3D data or reconstruction required at inference time.
- Distillation reduces volume-dependent bias in the estimates compared with the non-distilled image model.
- The method produces a geometry-aware latent representation inside an otherwise standard image Transformer.
- Inference becomes fast enough for routine high-throughput field phenotyping of many spikes.
Where Pith is reading between the lines
- The same distillation pattern could be tested on other plant organs or crops where 3D shape informs important traits but only 2D capture is practical at scale.
- If the 3D teacher is replaced by a different geometric representation, such as voxel or mesh features, the student might retain similar speed gains with altered accuracy.
- Extending the regulated Transformer to additional views or temporal sequences might further strengthen the geometric signal transferred during distillation.
Load-bearing premise
The 3D point cloud reconstructions and their rigid-invariant histogram features supply accurate geometric knowledge that transfers usefully to the 2D student without adding reconstruction biases or errors.
What would settle it
Running the distilled image model on a fresh collection of wheat spikes imaged outdoors under motion or lighting conditions different from the training set and finding equal or higher volume estimation error than the non-distilled baseline would show the transferred geometric knowledge does not generalize.
Figures
read the original abstract
Accurate estimation of wheat spike volume is important for yield component analysis and stress resilience assessment, yet field-based measurement remains challenging. Active 3D sensing methods such as Light Detection and Ranging (LiDAR) or time-of-flight (ToF) are sensitive to plant motion or poorly suited to outdoor conditions, while 3D reconstructions are computationally expensive. Direct 2D image processing would offer computational advantages, but image-based models lack explicit geometric information. We therefore propose a hybrid 2D-3D approach with knowledge distillation during training while enabling efficient image-only inference. First, we train a rigid-invariant point cloud network using distance-based histogram features to obtain pose-robust geometric representations. We then combine the 3D model with a proposed multi-view image-based regulated Transformer (RT) in an ensemble architecture. Finally, we distill the ensemble knowledge into a purely image-based student model using either feature-based or label-based distillation. The two distilled RTs reduce the mean absolute error (MAE) from 654.31 mm$^3$ of the non-distilled RT to 639.93 mm$^3$ and 644.62 mm$^3$, and increase correlation from 0.76 to 0.77 and 0.82, respectively. At the same time, inference time is reduced from 160 ms to 1.4 ms per spike. Distillation further mitigates volume-dependent bias and reshapes the latent representation of the image model toward a geometry-aware shape. Our results demonstrate that 3D-informed training of a 2D Transformer allows for scalable and efficient spike volume estimation for high-throughput field phenotyping.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a hybrid 2D-3D pipeline for wheat spike volume estimation: a rigid-invariant point cloud network is trained on distance-based histogram features, combined with a multi-view regulated Transformer (RT) in an ensemble, and the ensemble knowledge is distilled (via feature- or label-based methods) into a pure image-based student model. This enables accurate yet fast 2D-only inference. The abstract reports that the two distilled RT variants lower MAE from 654.31 mm³ (non-distilled RT) to 639.93 mm³ and 644.62 mm³, raise correlation from 0.76 to 0.77 and 0.82, cut inference time from 160 ms to 1.4 ms per spike, and mitigate volume-dependent bias.
Significance. If the central claims hold, the work demonstrates a practical route to scalable, high-throughput field phenotyping by transferring geometric knowledge from expensive 3D reconstructions to lightweight 2D models. The concrete numerical gains (specific MAE and correlation deltas) together with the >100× inference speedup constitute a tangible engineering contribution for agricultural computer vision. The rigid-invariant histogram features and dual distillation strategies are presented as reproducible design choices that could be adopted or extended by others.
major comments (2)
- [Abstract] Abstract: the 3D point cloud teacher’s own accuracy (MAE, correlation, or reconstruction error on the same test spikes) is never reported. Without these numbers it is impossible to determine whether the observed MAE reductions (654.31 mm³ → 639.93/644.62 mm³) and correlation gains are attributable to successful geometric knowledge transfer or to other factors such as ensemble averaging or regularization.
- [Abstract] Abstract / Methods description: no quantitative ablation or invariance test is supplied for the rigid-invariant histogram features under realistic field conditions (pose variation, occlusion, lighting). The central claim that these features constitute a “reliable, bias-free teacher” therefore rests on an unverified assumption; if the features themselves degrade or introduce systematic bias, the distillation benefit cannot be credited to geometric knowledge.
minor comments (1)
- [Abstract] The abstract states that distillation “mitigates volume-dependent bias” but does not define the metric or show the corresponding plots or statistical test; a brief clarification or supplementary figure would strengthen the claim.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We have addressed each major comment point by point below, providing clarifications and committing to revisions where the manuscript can be strengthened without misrepresenting our results.
read point-by-point responses
-
Referee: [Abstract] Abstract: the 3D point cloud teacher’s own accuracy (MAE, correlation, or reconstruction error on the same test spikes) is never reported. Without these numbers it is impossible to determine whether the observed MAE reductions (654.31 mm³ → 639.93/644.62 mm³) and correlation gains are attributable to successful geometric knowledge transfer or to other factors such as ensemble averaging or regularization.
Authors: We acknowledge that explicitly reporting the standalone performance of the 3D point cloud teacher (MAE, correlation, and reconstruction error) on the identical test spikes would help isolate the contribution of geometric knowledge transfer from ensemble effects. These metrics were computed during our experiments but omitted from the abstract and main text to emphasize the distilled student model's efficiency gains. In the revised manuscript, we will add the teacher's test-set results in a new table or results subsection for direct comparison, enabling clearer attribution of the observed improvements. revision: yes
-
Referee: [Abstract] Abstract / Methods description: no quantitative ablation or invariance test is supplied for the rigid-invariant histogram features under realistic field conditions (pose variation, occlusion, lighting). The central claim that these features constitute a “reliable, bias-free teacher” therefore rests on an unverified assumption; if the features themselves degrade or introduce systematic bias, the distillation benefit cannot be credited to geometric knowledge.
Authors: We agree that a dedicated quantitative ablation or invariance test for the distance-based histogram features under field-like variations would provide stronger empirical support for their reliability. The methods section describes their design for pose robustness, but no explicit numerical evaluation (e.g., error under rotations, simulated occlusions, or lighting changes) is included. We will add such an analysis—either in the main text or as supplementary material—reporting quantitative metrics on feature stability to substantiate the assumption and rule out systematic bias. revision: yes
Circularity Check
Empirical test-set metrics with no reduction to fitted inputs or self-citation chains
full rationale
The paper's central claims rest on reported improvements in MAE and correlation for distilled image-only models versus a non-distilled baseline, measured on held-out test spikes. These are standard empirical outcomes from applying knowledge distillation (feature-based or label-based) between independently trained 3D point-cloud and 2D regulated Transformer models. No equations reduce a prediction to a fitted parameter by construction, no uniqueness theorems or ansatzes are smuggled via self-citation, and the histogram features and distillation losses are not defined circularly in terms of the final volume estimates. The approach is self-contained against external benchmarks, with only minor implicit hyperparameter choices for distillation weights that do not force the reported gains.
Axiom & Free-Parameter Ledger
free parameters (1)
- distillation loss weights and temperature
axioms (1)
- domain assumption 3D point-cloud reconstructions provide reliable ground-truth geometry for the teacher model
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
rigid-invariant point cloud network using distance-based histogram features to obtain pose-robust geometric representations
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Regulated Transformer ... single image predictor ... global predictor ... Gaussian parameters
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
TheOo Mon and Nay ZarAung. Vision based volume esti- mation method for automatic mango grading system.Biosys- tems Engineering, 198:338–349, 2020. 1
work page 2020
-
[2]
Xuhua Dong, Woo-Young Kim, Zheng Yu, Ju-Youl Oh, Reza Ehsani, and Kyeong-Hwan Lee. Improved voxel-based vol- ume estimation and pruning severity mapping of apple trees during the pruning period.Computers and Electronics in Agriculture, 219:108834, 2024
work page 2024
-
[3]
Automated Stock V olume Estimation Using UA V-RGB Imagery.Sensors, 24(23):7559, 2024
Anurupa Goswami, Unmesh Khati, Ishan Goyal, Anam Sabir, and Sakshi Jain. Automated Stock V olume Estimation Using UA V-RGB Imagery.Sensors, 24(23):7559, 2024
work page 2024
-
[4]
Qinghua Su, Naoshi Kondo, Minzan Li, Hong Sun, and Di- mas Firmanda Al Riza. Potato feature prediction based on machine vision and 3D model rebuilding.Computers and Electronics in Agriculture, 137:41–51, 2017
work page 2017
-
[5]
Ali Bulent Koc. Determination of watermelon volume using ellipsoid approximation and image processing.Postharvest Biology and Technology, 45(3):366–371, 2007
work page 2007
-
[6]
G. P. Moreda, J. Ortiz-Ca ˜navate, F. J. Garc ´ıa-Ramos, and M. Ruiz-Altisent. Non-destructive technologies for fruit and vegetable size determination – A review.Journal of Food Engineering, 92(2):119–136, 2009
work page 2009
-
[7]
Zhiping Xie, Junhao Wang, Yufei Yang, Peixuan Mao, Jial- ing Guo, and Manyu Sun. Image processing based modeling for Rosa roxburghii fruits mass and volume estimation.Sci- entific Reports, 14(1):15507, 2024. 1
work page 2024
-
[8]
Jonas Anderegg, Norbert Kirchgessner, Helge Aasen, Olivia Zumsteg, Beat Keller, Radek Zenkl, Achim Walter, and An- dreas Hund. Thermal imaging can reveal variation in stay- green functionality of wheat canopies under temperate con- ditions.Frontiers in Plant Science, 15, 2024. 1
work page 2024
-
[9]
Rachakonda, Bala Muralikrishnan, and Daniel S
Prem K. Rachakonda, Bala Muralikrishnan, and Daniel S. Sawyer. Sources of Errors in Structured Light 3D Scanners. NIST, 2019. 1
work page 2019
-
[10]
Time-of-Flight Cameras: Principles, Methods and Applica- tions
Miles Hansard, Seungkyu Lee, Ouk Choi, and Radu Horaud. Time-of-Flight Cameras: Principles, Methods and Applica- tions. Springer, London, 2013. 2
work page 2013
-
[11]
Joan Ramon Rosell Polo, Ricardo Sanz, Jordi Llorens, Jaume Arn ´o, Alexandre Escol `a, Manel Ribes-Dasi, Joan Masip, Ferran Camp, Felip Gr `acia, Francesc Solanelles, Tom`as Pallej `a, Luis Val, Santiago Planas, Emilio Gil, and Jordi Palac´ın. A tractor-mounted scanning LIDAR for the non-destructive measurement of vegetative volume and sur- face area of t...
work page 2009
-
[12]
Jose A. Jimenez-Berni, David M. Deery, Pablo Rozas- Larraondo, Anthony (Tony) G. Condon, Greg J. Rebetzke, Richard A. James, William D. Bovill, Robert T. Furbank, and Xavier R. R. Sirault. High Throughput Determination of Plant Height, Ground Cover, and Above-Ground Biomass in Wheat with LiDAR.Frontiers in Plant Science, 9, 2018. 2
work page 2018
-
[13]
What Can We Learn from Depth Camera Sensor Noise?Sensors, 22(14):5448, 2022
Azmi Haider and Hagit Hel-Or. What Can We Learn from Depth Camera Sensor Noise?Sensors, 22(14):5448, 2022. 2
work page 2022
-
[14]
Mohit Gupta, Qi Yin, and Shree K. Nayar. Structured Light in Sunlight. In2013 IEEE International Conference on Com- puter Vision, pages 545–552, Sydney, Australia, 2013. 2
work page 2013
-
[15]
Mathieu Dassot, Thi ´ery Constant, and Meriem Fournier. The use of terrestrial LiDAR technology in forest science: Appli- cation fields, benefits and challenges.Annals of Forest Sci- ence, 68(5):959–974, 2011. 2
work page 2011
-
[16]
Lukas Roth, Mar ´ıa Xos ´e Rodr ´ıguez-´Alvarez, Fred van Eeuwijk, Hans-Peter Piepho, and Andreas Hund. Phenomics data processing: A plot-level model for repeated measure- ments to extract the timing of key stages and quantities at defined time points.Field Crops Research, 274:1–17, 2021. 2
work page 2021
-
[17]
Lukas Roth, Moritz Camenzind, Helge Aasen, Lukas Kro- nenberg, Christoph Barendregt, Karl-Heinz Camp, Achim Walter, Norbert Kirchgessner, and Andreas Hund. Repeated Multiview Imaging for Estimating Seedling Tiller Counts of Wheat Genotypes Using Drones.Plant Phenomics, 2020: 1–20, 2020
work page 2020
-
[18]
S ´ebastien Dandrifosse, Elias Ennadifi, Alexis Carlier, Bernard Gosselin, Benjamin Dumont, and Beno ˆıt Merca- toris. Deep learning for wheat ear segmentation and ear den- sity measurement: From heading to maturity.Computers and Electronics in Agriculture, 199:107161, 2022. 2
work page 2022
-
[19]
ImageNet classification with deep convolutional neural net- works
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural net- works. InAdvances in Neural Information Processing Sys- tems. Curran Associates, Inc., 2012. 2
work page 2012
-
[20]
Badhon, Curtis Pozniak, Benoit de Solan, Andreas Hund, Scott C
Etienne David, Simon Madec, Pouria Sadeghi-Tehran, Helge Aasen, Bangyou Zheng, Shouyang Liu, Norbert Kirchgess- ner, Goro Ishikawa, Koichi Nagasawa, Minhajul A. Badhon, Curtis Pozniak, Benoit de Solan, Andreas Hund, Scott C. Chapman, Fr ´ed´eric Baret, Ian Stavness, and Wei Guo. Global Wheat Head Detection (GWHD) Dataset: A Large and Diverse Dataset of Hi...
work page 2020
-
[21]
YOLOv12: Attention-Centric Real-Time Object Detectors
Yunjie Tian, Qixiang Ye, and David Doermann. YOLOv12: Attention-Centric Real-Time Object Detectors.arXiv preprint arXiv:2502.12524, 2025. 2
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[22]
Ranjan Sapkota and Manoj Karkee. Comparing YOLOv11 and YOLOv8 for instance segmentation of occluded and non-occluded immature green fruits in complex orchard en- vironment.arXiv preprint arXiv:2410.19869, 2025. 2
-
[23]
Daiwei Zhang, Joaquin Gajardo, Tomislav Medic, Isinsu Katircioglu, Mike Boss, Norbert Kirchgessner, Achim Wal- ter, and Lukas Roth. Wheat3DGS: In-Field 3D Recon- struction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting. In2025 IEEE/CVF Con- ference on Computer Vision and Pattern Recognition Work- shops (CVPRW), pages 5360–53...
work page 2025
-
[24]
Transfer learning in agriculture: A review.Artificial Intelligence Review, 58(4):97, 2025
Md Ismail Hossen, Mohammad Awrangjeb, Shirui Pan, and Abdullah Al Mamun. Transfer learning in agriculture: A review.Artificial Intelligence Review, 58(4):97, 2025. 2
work page 2025
-
[25]
Ismail Oztel, Gozde Yolcu, and Cemil Oz. Performance Comparison of Transfer Learning and Training from Scratch Approaches for Deep Facial Expression Recognition. In 9 2019 4th International Conference on Computer Science and Engineering (UBMK), pages 1–6, 2019. 2
work page 2019
-
[26]
Paolo Didier Alfano, Vito Paolo Pastore, Lorenzo Rosasco, and Francesca Odone. Top-tuning: A study on transfer learn- ing for an efficient alternative to fine tuning for image clas- sification with fast kernel methods.Image and Vision Com- puting, 142:104894, 2024. 2
work page 2024
-
[27]
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. InProceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016. 2
work page 2016
-
[28]
Sakshi Gandotra, Rita Chhikara, and Anuradha Dhull. Smart farming: Real-time rice yield forecasting on mobile devices using lightweight CNN-LSTM.Smart Agricultural Technol- ogy, 13:101664, 2026. 2
work page 2026
-
[29]
I. Abourabia, S. Ounacer, M.Y . Elghoumari, S. Ardchir, and M. Azzouazi. A hybrid machine learning and deep learning framework for agricultural yield prediction using irrigation data.International Journal on Technical and Physical Prob- lems of Engineering, 17(64):436–449, 2025. 2
work page 2025
-
[30]
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, Jakob Uszkoreit, and Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.arXiv preprint arXiv:2010.11929, 2021. 2
work page internal anchor Pith review Pith/arXiv arXiv 2010
-
[31]
End- to-End Object Detection with Transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End- to-End Object Detection with Transformers. InComputer Vision – ECCV 2020, pages 213–229, Cham, 2020. Springer International Publishing. 2
work page 2020
-
[32]
Vi- sion Transformers for Dense Prediction
Rene Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. Vi- sion Transformers for Dense Prediction. In2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 12159–12168, Montreal, QC, Canada, 2021. 2
work page 2021
-
[33]
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab, Timoth ´ee Darcet, Th ´eo Moutakanni, Huy V o, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mah- moud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Herv ´e Je- gou, Julien Mairal, ...
work page internal anchor Pith review Pith/arXiv arXiv
-
[34]
Oriane Sim ´eoni, Huy V . V o, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Micha ¨el Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timoth´ee Darcet, Th´eo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie,...
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[35]
A tale of two features: Stable diffusion complements DINO for zero-shot semantic correspondence
Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Pola- nia Cabrera, Varun Jampani, Deqing Sun, and Ming-Hsuan Yang. A tale of two features: Stable diffusion complements DINO for zero-shot semantic correspondence. InAdvances in Neural Information Processing Systems, pages 45533– 45547. Curran Associates, Inc., 2023. 2
work page 2023
-
[36]
Aran Nayebi, Rishi Rajalingham, Mehrdad Jazayeri, and Guangyu Robert Yang. Neural foundations of mental simula- tion: Future prediction of latent representations on dynamic scenes. InAdvances in Neural Information Processing Sys- tems, pages 70548–70561. Curran Associates, Inc., 2023. 2
work page 2023
-
[37]
Bing Han, Chen Zhu, Dong Han, Rui Yu, Songliang Cao, Jianhui Wu, Scott Chapman, Zijian Wang, Bangyou Zheng, Wei Guo, Marie Weiss, Benoit de Solan, Andreas Hund, Lukas Roth, Kirchgessner Norbert, Andrea Visioni, Yufeng Ge, Wenjuan Li, Alexis Comar, Dong Jiang, De- jun Han, Fred Baret, Yanfeng Ding, Hao Lu, and Shouyang Liu. FoMo4Wheat: Toward reliable crop...
-
[38]
Long Short-Term Memory.Neural Computation, 9(8):1735–1780, 1997
Sepp Hochreiter and J ¨urgen Schmidhuber. Long Short-Term Memory.Neural Computation, 9(8):1735–1780, 1997. 2
work page 1997
-
[39]
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Shaojie Bai, J. Zico Kolter, and Vladlen Koltun. An Empirical Evaluation of Generic Convolutional and Recur- rent Networks for Sequence Modeling.arXiv preprint arXiv:1803.01271, 2018. 2
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[40]
Yuanhang Su and C.-C. Jay Kuo. On Extended Long Short- term Memory and Dependent Bidirectional Recurrent Neural Network.Neurocomputing, 356:151–161, 2019. 2
work page 2019
-
[41]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszko- reit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. InAdvances in Neu- ral Information Processing Systems. Curran Associates, Inc.,
-
[42]
Yue Wang and Yuanyuan Zha. Comparison of transformer, LSTM and coupled algorithms for soil moisture prediction in shallow-groundwater-level areas with interpretability analy- sis.Agricultural Water Management, 305:109120, 2024. 2
work page 2024
-
[43]
Marco Castangia, Lina Maria Medina Grajales, Alessandro Aliberti, Claudio Rossi, Alberto Macii, Enrico Macii, and Edoardo Patti. Transformer neural networks for interpretable flood forecasting.Environmental Modelling & Software, 160:105581, 2023. 2
work page 2023
-
[44]
Yinghua Wang, Songtao Hu, He Ren, Wanneng Yang, and Ruifang Zhai. 3DPhenoMVS: A Low-Cost 3D Tomato Phenotyping Pipeline Using 3D Reconstruction Point Cloud Based on Multiview Images.Agronomy, 12(8):1865, 2022. 2
work page 2022
-
[45]
Xin Yang, Xuqi Lu, Pengyao Xie, Ziyue Guo, Hui Fang, Haowei Fu, Xiaochun Hu, Zhenbiao Sun, and Haiyan Cen. PanicleNeRF: Low-Cost, High-Precision In-Field Phenotyp- ing of Rice Panicles with Smartphone.Plant Phenomics, 6: 0279, 2024
work page 2024
-
[46]
Hong-Beom Choi, Jae-Kun Park, Soo Hyun Park, and Taek Sung Lee. NeRF-based 3D reconstruction pipeline for acquisition and analysis of tomato crop morphology.Fron- tiers in Plant Science, 15, 2024
work page 2024
-
[47]
Peng Shen, Xueyao Jing, Wenzhe Deng, Hanyue Jia, and Tingting Wu. PlantGaussian: Exploring 3D Gaussian splat- ting for cross-time, cross-scene, and realistic 3D plant vi- 10 sualization and beyond.The Crop Journal, 13(2):607–618,
-
[48]
NeRF: Neural Radiance Field in 3D Vision: A Comprehensive Review (Updated Post-Gaussian Splatting)
Kyle Gao, Yina Gao, Hongjie He, Dening Lu, Linlin Xu, and Jonathan Li. NeRF: Neural Radiance Field in 3D Vision: A Comprehensive Review (Updated Post-Gaussian Splatting). arXiv preprint arXiv:2210.00379, 2026. 2
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[49]
Sheng Wu, Weiliang Wen, Wenbo Gou, Xianju Lu, Wenqi Zhang, Chenxi Zheng, Zhiwei Xiang, Liping Chen, and Xinyu Guo. A miniaturized phenotyping platform for in- dividual plants using multi-view stereo 3D reconstruction. Frontiers in Plant Science, 13, 2022
work page 2022
-
[50]
Jiajia Li, Keyi Zhu, Qianwen Zhang, Dong Chen, Qi Sun, and Zhaojian Li. Object-centric 3D Gaussian splatting for strawberry plant reconstruction and phenotyping.Smart Agricultural Technology, 13:101810, 2026. 2
work page 2026
-
[51]
Cristian Bucilu ˇa, Rich Caruana, and Alexandru Niculescu- Mizil. Model compression. InProceedings of the 12th ACM SIGKDD International Conference on Knowledge Discov- ery and Data Mining, pages 535–541, New York, NY , USA,
-
[52]
Association for Computing Machinery. 2
-
[53]
Seyed Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, and Hassan Ghasemzadeh. Im- proved Knowledge Distillation via Teacher Assistant.Pro- ceedings of the AAAI Conference on Artificial Intelligence, 34(04):5191–5198, 2020. 3
work page 2020
-
[54]
Distilling the Knowledge in a Neural Network
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distill- ing the Knowledge in a Neural Network.arXiv preprint arXiv:1503.02531, 2015. 3
work page internal anchor Pith review Pith/arXiv arXiv 2015
-
[55]
An Efficient Method of Training Small Models for Regres- sion Problems with Knowledge Distillation
Makoto Takamoto, Yusuke Morishita, and Hitoshi Imaoka. An Efficient Method of Training Small Models for Regres- sion Problems with Knowledge Distillation. In2020 IEEE Conference on Multimedia Information Processing and Re- trieval (MIPR), pages 67–72, 2020. 3
work page 2020
-
[56]
Cross Modal Distillation for Supervision Transfer
Saurabh Gupta, Judy Hoffman, and Jitendra Malik. Cross Modal Distillation for Supervision Transfer. In2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2827–2836, Las Vegas, NV , USA, 2016. 3
work page 2016
-
[57]
3D-to-2D Distillation for Indoor Scene Parsing
Zhengzhe Liu, Xiaojuan Qi, and Chi-Wing Fu. 3D-to-2D Distillation for Indoor Scene Parsing. In2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4462–4472, 2021. 3
work page 2021
-
[58]
Shivam Pande, Avinandan Banerjee, Saurabh Kumar, Bi- plab Banerjee, and Subhasis Chaudhuri. An Adversarial Ap- proach to Discriminative Modality Distillation for Remote Sensing Image Classification. In2019 IEEE/CVF Interna- tional Conference on Computer Vision Workshop (ICCVW), pages 4571–4580, 2019. 3
work page 2019
-
[59]
DistillBEV: Boosting Multi-Camera 3D Ob- ject Detection with Cross-Modal Knowledge Distillation
Zeyu Wang, Dingwen Li, Chenxu Luo, Cihang Xie, and Xi- aodong Yang. DistillBEV: Boosting Multi-Camera 3D Ob- ject Detection with Cross-Modal Knowledge Distillation. In 2023 IEEE/CVF International Conference on Computer Vi- sion (ICCV), pages 8603–8612, Paris, France, 2023. 3
work page 2023
-
[60]
RadOcc: Learning cross-modality occupancy knowledge through rendering assisted distillation
Haiming Zhang, Xu Yan, Dongfeng Bai, Jiantao Gao, Pan Wang, Bingbing Liu, Shuguang Cui, and Zhen Li. RadOcc: Learning cross-modality occupancy knowledge through rendering assisted distillation. InProceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artifi- cial Intelligence a...
work page 2024
-
[61]
Masked Generative Distillation
Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, and Chun Yuan. Masked Generative Distillation. In Computer Vision – ECCV 2022, pages 53–69, Cham, 2022. Springer Nature Switzerland. 3
work page 2022
-
[62]
A Simple and Generic Framework for Feature Distillation via Channel-wise Transformation
Ziwei Liu, Yongtao Wang, Xiaojie Chu, Nan Dong, Shengx- iang Qi, and Haibin Ling. A Simple and Generic Framework for Feature Distillation via Channel-wise Transformation. In 2023 IEEE/CVF International Conference on Computer Vi- sion Workshops (ICCVW), pages 1121–1130, Paris, France,
work page 2023
-
[63]
Norbert Kirchgessner, Frank Liebisch, Kang Yu, Johannes Pfeifer, Michael Friedli, Andreas Hund, and Achim Walter. The ETH field phenotyping platform FIP: A cable-suspended multi-sensor system.Functional plant biology: FPB, 44(1): 154–168, 2016. 3
work page 2016
-
[64]
Olivia Zumsteg, Nico Graf, Aaron Haeusler, Norbert Kirchgessner, Nicola Storni, Lukas Roth, and Andreas Hund. Fine-Tuned Vision Transformers Capture Complex Wheat Spike Morphology for V olume Estimation from RGB Im- ages.arXiv preprint arXiv:2506.18060, 2025. 3
-
[65]
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman R¨adle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junt- ing Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao- Yuan Wu, Ross Girshick, Piotr Doll´ar, and Christoph Feicht- enhofer. SAM 2: Segment Anything in Images and Videos. arXiv preprint arXiv:...
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[66]
Descriptor-Free Multi-view Region Matching for Instance-Wise 3D Reconstruction
Takuma Doi, Fumio Okura, Toshiki Nagahara, Yasuyuki Matsushita, and Yasushi Yagi. Descriptor-Free Multi-view Region Matching for Instance-Wise 3D Reconstruction. In Computer Vision – ACCV 2020, pages 581–599, Cham,
work page 2020
-
[67]
Springer International Publishing. 3
-
[68]
Usha Nandini Raghavan, R ´eka Albert, and Soundar Kumara. Near linear time algorithm to detect community structures in large-scale networks.Physical Review E, 76(3):036106,
-
[69]
OpenMVS: Multi-view stereo reconstruction library
Dan Cernea. OpenMVS: Multi-view stereo reconstruction library. 2020. 4
work page 2020
-
[70]
Point Transformer V3: Simpler, Faster, Stronger
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xi- hui Liu, Yu Qiao, Wanli Ouyang, Tong He, and Hengshuang Zhao. Point Transformer V3: Simpler, Faster, Stronger. In 2024 IEEE/CVF Conference on Computer Vision and Pat- tern Recognition (CVPR), pages 4840–4851, 2024. 4, 8
work page 2024
-
[71]
Qi Charles, Hao Su, Mo Kaichun, and Leonidas J
R. Qi Charles, Hao Su, Mo Kaichun, and Leonidas J. Guibas. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 77–85, Hon- olulu, HI, 2017. 4, 8
work page 2017
-
[72]
Panoptic Mapping with Fruit Completion and Pose Estimation for Horticultural Robots
Yue Pan, Federico Magistri, Thomas L ¨abe, Elias Marks, Claus Smitt, Chris McCool, Jens Behley, and Cyrill Stach- niss. Panoptic Mapping with Fruit Completion and Pose Estimation for Horticultural Robots. In2023 IEEE/RSJ In- ternational Conference on Intelligent Robots and Systems (IROS), pages 4226–4233, Detroit, MI, USA, 2023. 8 11
work page 2023
-
[73]
DeepSDF: Learning Continuous Signed Distance Functions for Shape Represen- tation
Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. DeepSDF: Learning Continuous Signed Distance Functions for Shape Represen- tation. In2019 IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), pages 165–174, Long Beach, CA, USA, 2019. 8
work page 2019
-
[74]
MemDistill: Distilling LiDAR Knowledge into Memory for Camera-Only 3D Object Detection
Donghyeon Kwon, Youngseok Yoon, Hyeongseok Son, and Suha Kwak. MemDistill: Distilling LiDAR Knowledge into Memory for Camera-Only 3D Object Detection. InProceed- ings of the IEEE/CVF International Conference on Com- puter Vision, pages 6828–6838, 2025. 8
work page 2025
-
[75]
INVITE Consortium. INnovations in plant VarIety Testing in Europe to foster the introduction of new varieties better adapted to varying biotic and abiotic conditions and to more sustainable crop management practices|INVITE|Project|Fact Sheet|H2020. https://cordis.europa.eu/project/id/817970, n.d. 13
- [76]
-
[77]
Pyrender.https://github.com/ mmatl/pyrender, 2019
Matthew Matl. Pyrender.https://github.com/ mmatl/pyrender, 2019. 13
work page 2019
-
[78]
OpenMVG: Open Multiple View Geometry
Pierre Moulon, Pascal Monasse, Romuald Perrot, and Re- naud Marlet. OpenMVG: Open Multiple View Geometry. In Reproducible Research in Pattern Recognition, pages 60–74, Cham, 2017. Springer International Publishing. 13 12 3D Reconstruction and Knowledge Distillation to Improve Multi-View Image Models to Explore Spike V olume Estimation in Wheat Supplementa...
work page 2017
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.