Occlusion Handling by Pushing for Enhanced Fruit Detection

Andrea Cherubini; Dana Kulić; Ege Gursoy

arxiv: 2604.06341 · v1 · submitted 2026-04-07 · 💻 cs.RO

Pith reviewed 2026-05-10 18:38 UTC · model grok-4.3

classification 💻 cs.RO

keywords fruit detection · occlusion handling · agricultural robotics · branch pushing · 3D Hough transform · depth estimation · robot arm

The pith

A robot arm pushes occluding branches aside after estimating hidden fruit depth and locating the branch in 3D to improve visibility for detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Fruits in orchards are frequently hidden by branches and leaves, which blocks robot cameras and prevents accurate localization or picking. The paper describes a method that first detects the fruit in an RGB image, then uses a deep learning model to complete its depth in the occluded region. Classic image processing determines the push direction, while a 3D extension of the Hough transform identifies the main occluding branch in the point cloud. The robot arm then physically pushes that branch to clear the view. Real-world tests on apples, lemons, and oranges under varying light show the occlusion is cleared and fruit visibility increases.
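The branch-selection step can be illustrated with a minimal sketch. It is not the authors' procedure: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, and the rule "pick the in-front segment whose image projection passes closest to the centre of the occluded fruit region" are all assumptions made for illustration.

```python
import math

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (x, y, z), z > 0, to pixels."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

def point_segment_distance(p, a, b):
    """Euclidean distance from 2D point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection parameter so the closest point stays on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def select_occluding_branch(segments_3d, occluded_center_px, fruit_depth, intrinsics):
    """Index of the candidate branch segment most likely to hide the fruit.

    Hypothetical rule: among segments lying in front of the fruit, choose the
    one whose image projection is nearest to the occluded-region centre.
    Returns None when no segment lies in front of the fruit.
    """
    fx, fy, cx, cy = intrinsics
    best_index, best_distance = None, math.inf
    for index, (p0, p1) in enumerate(segments_3d):
        nearest_z = min(p0[2], p1[2])
        # Skip segments touching the camera plane or entirely at/behind the fruit.
        if nearest_z <= 0.0 or nearest_z >= fruit_depth:
            continue
        a = project(p0, fx, fy, cx, cy)
        b = project(p1, fx, fy, cx, cy)
        distance = point_segment_distance(occluded_center_px, a, b)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index
```

A real system would probably score overlap with the whole occluded region rather than a single centre pixel; the single-point rule keeps the sketch short.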

Core claim

The paper claims that a four-step perception pipeline lets a robot arm execute a targeted push that removes the occlusion and raises fruit visibility: detect the occluded fruit in the RGB image, complete its depth with a generative model, compute a push direction with classic image processing, and isolate the responsible branch with a 3D Hough transform on the point cloud. The abstract reports successful results across three fruit types and varying lighting conditions.

What carries the argument

The 3D extension of the Hough transform, which detects straight-line segments in the point cloud to locate and select the single branch primarily responsible for the occlusion.
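Since the paper's formulation is not given here, the following is a hedged sketch of one common way to vote for 3D lines, not the authors' extension. A line is parametrized by a unit direction `b` on the half-sphere and by the 2D coordinates of its foot point in the plane through the origin perpendicular to `b`. The direction sampling, the cell size, and all function names are assumptions, and the sketch returns only the single strongest infinite line, not segments.

```python
import math
from collections import Counter

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def half_sphere_directions(n_theta=6, n_phi=12):
    """Coarse set of unit directions on the upper half-sphere (z >= 0)."""
    dirs = [(0.0, 0.0, 1.0)]
    for i in range(1, n_theta + 1):
        theta = 0.5 * math.pi * i / n_theta  # polar angle measured from +z
        for j in range(n_phi):
            if i == n_theta and j >= n_phi // 2:
                continue  # on the equator, b and -b describe the same line
            phi = 2.0 * math.pi * j / n_phi
            dirs.append((math.sin(theta) * math.cos(phi),
                         math.sin(theta) * math.sin(phi),
                         math.cos(theta)))
    return dirs

def plane_basis(b):
    """Orthonormal basis (u, v) of the plane perpendicular to unit vector b."""
    k = min(range(3), key=lambda i: abs(b[i]))  # axis least aligned with b
    helper = tuple(1.0 if i == k else 0.0 for i in range(3))
    u = _unit(_cross(b, helper))
    return u, _cross(b, u)

def hough_line_3d(points, cell=0.05, n_theta=6, n_phi=12):
    """Return (direction, anchor_point, votes) of the strongest 3D line."""
    dirs = half_sphere_directions(n_theta, n_phi)
    bases = [plane_basis(b) for b in dirs]
    votes = Counter()
    for p in points:
        for k, (u, v) in enumerate(bases):
            # The foot point p - (p.b) b has plane coordinates (p.u, p.v),
            # because u and v are both perpendicular to b.
            key = (k, round(_dot(p, u) / cell), round(_dot(p, v) / cell))
            votes[key] += 1
    (k, ix, iy), count = votes.most_common(1)[0]
    u, v = bases[k]
    anchor = tuple(ix * cell * u[i] + iy * cell * v[i] for i in range(3))
    return dirs[k], anchor, count
```

The accumulator is a sparse `Counter`, so memory grows with the number of occupied cells, not with the full parameter grid. Finer direction sampling and a cell size matched to sensor noise would be needed on real RGB-D data.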

If this is right

Fruit detection rates rise because previously hidden surfaces become visible after the push.
The same pipeline would apply to apples, lemons, and oranges, the three fruits tested.
Operation would hold up under changed lighting, as the perception tests under different lighting conditions suggest.
The arm could execute the computed push on a real tree, as the physical demonstration indicates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach could be added to existing vision-based harvesters to handle dense canopies in which a static camera cannot see the fruit.
Repeated pushes on the same branch might be avoided by updating the 3D map after each action.
The method may extend to other occluded objects such as vegetables or tools if the generative model is retrained on their appearance.

Load-bearing premise

The generative model correctly reconstructs the hidden fruit depth and the 3D Hough transform isolates the correct main occluding branch.

What would settle it

A controlled scene with a known fruit and single occluding branch where pushing produces no measurable gain in visible fruit area or depth completeness.
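The "measurable gain in visible fruit area" in this test can be made concrete with a small sketch. The mask layout (nested lists of 0/1 values, one estimated full-fruit mask and one visible-fruit mask per image) and the function names are assumptions for illustration; the paper does not define this metric here.

```python
def visible_fraction(full_fruit_mask, visible_mask):
    """Share of the estimated full fruit region that is actually visible.

    Both masks are same-sized nested lists of truthy/falsy pixel values.
    """
    total = 0
    visible = 0
    for full_row, visible_row in zip(full_fruit_mask, visible_mask):
        for full_px, visible_px in zip(full_row, visible_row):
            if full_px:
                total += 1
                if visible_px:
                    visible += 1
    if total == 0:
        raise ValueError("full fruit mask is empty")
    return visible / total

def visibility_gain(before, after):
    """Gain in visible fraction between two (full_mask, visible_mask) pairs.

    A value near zero after a push would count against the method.
    """
    return visible_fraction(*after) - visible_fraction(*before)
```

The same function measures depth completeness if `visible_mask` is replaced by a mask of fruit pixels that have a valid depth reading.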

Figures

Figures reproduced from arXiv: 2604.06341 by Andrea Cherubini, Dana Kulić, Ege Gursoy.

**Figure 2.** Camera field of view (left), with corresponding image

**Figure 3.** Outline of the fruit estimation method. Left to right:

**Figure 4.** Line segments (grey) and their extremities (black

**Figure 5.** Field of view from the camera (left) and Image (right).

**Figure 6.** Line detection before (left) and after (right) removing

**Figure 8.** From left to right: scene, segmentation results, and

**Figure 9.** Robot branch pushing demonstration. lines lbranch, pushing line lpush and pushing point ppush. The outcomes indicate that branches are generally accurately predicted, although there are instances of non-existent branch lines being detected. The selected pushing lines lpush and pushing points ppush appear reasonable from an empirical standpoint

Original abstract

In agricultural robotics, effective observation and localization of fruits present challenges due to occlusions caused by other parts of the tree, such as branches and leaves. These occlusions can result in false fruit localization or impede the robot from picking the fruit. The objective of this work is to push away branches that block the fruit's view to increase their visibility. Our setup consists of an RGB-D camera and a robot arm. First, we detect the occluded fruit in the RGB image and estimate its occluded part via a deep learning generative model in the depth space. The direction to push to clear the occlusions is determined using classic image processing techniques. We then introduce a 3D extension of the 2D Hough transform to detect straight line segments in the point cloud. This extension helps detect tree branches and identify the one mainly responsible for the occlusion. Finally, we clear the occlusion by pushing the branch with the robot arm. Our method uses a combination of deep learning for fruit appearance estimation, classic image processing for push direction determination, and 3D Hough transform for branch detection. We validate our perception methods through real data under different lighting conditions and various types of fruits (i.e. apple, lemon, orange), achieving improved visibility and successful occlusion clearance. We demonstrate the practical application of our approach through a real robot branch pushing demonstration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a robotic occlusion-handling system for fruit detection in agriculture. An RGB-D camera and robot arm are used to detect occluded fruits in RGB images, estimate their occluded depth via a deep learning generative model, compute a push direction via classic image processing, detect the primary occluding branch via a 3D extension of the Hough transform on the point cloud, and clear the occlusion by pushing with the arm. Validation consists of real-world RGB-D experiments on apples, lemons, and oranges under varying lighting, claiming improved visibility and successful clearance, plus a robot demonstration.

Significance. If the perception components prove reliable, the work could offer a practical contribution to agricultural robotics by combining deep learning depth estimation with classical 3D branch detection and active manipulation to mitigate occlusions. The multi-fruit, multi-lighting real-robot demonstration is a positive aspect. However, the absence of any quantitative metrics or isolated component evaluations limits assessment of robustness and generalizability.

major comments (2)

[Abstract] The central claim of 'achieving improved visibility and successful occlusion clearance' on real data for apples, lemons, oranges under different lighting rests on unquantified performance of the generative depth model and 3D Hough branch detector. No error metrics, success rates, failure cases, or isolated evaluations of these load-bearing components are reported, undermining attribution of clearance success to the method.
[Abstract] The 3D Hough transform extension is described only at a high level with no equations, implementation details, or validation of its accuracy in identifying the dominant occluder amid RGB-D noise and complex branching.

minor comments (1)

[Abstract] The push-direction determination via 'classic image processing techniques' is not specified, making the pipeline hard to reproduce.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that additional quantitative metrics and implementation details would strengthen the manuscript and will revise accordingly.

Referee: [Abstract] The central claim of 'achieving improved visibility and successful occlusion clearance' on real data for apples, lemons, oranges under different lighting rests on unquantified performance of the generative depth model and 3D Hough branch detector. No error metrics, success rates, failure cases, or isolated evaluations of these load-bearing components are reported, undermining attribution of clearance success to the method.

Authors: We acknowledge that the current manuscript relies on qualitative demonstration of improved visibility and successful clearance without reporting quantitative metrics or component-wise evaluations. In the revision we will add error metrics for the generative depth model (e.g., MAE on held-out depth data), success rates for occlusion clearance across the tested fruits and lighting conditions, and a discussion of observed failure cases. Where isolated component tests were performed during development, we will report them; otherwise we will note the limitation and focus on end-to-end results.
revision: yes

Referee: [Abstract] The 3D Hough transform extension is described only at a high level with no equations, implementation details, or validation of its accuracy in identifying the dominant occluder amid RGB-D noise and complex branching.

Authors: We agree that the 3D Hough extension requires more detail. The revised manuscript will include the mathematical formulation of the 3D line-segment voting procedure, discretization parameters, and post-processing steps used to select the dominant occluder. We will also report any accuracy validation performed on the collected point clouds (e.g., comparison against manually annotated branches) and discuss robustness to sensor noise.
revision: yes
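The metrics promised in the two responses could look like the sketch below. This is an assumption about how such an evaluation might be set up, not something the manuscript or rebuttal specifies: the nested-list depth maps, the occlusion mask, the choice of angular error for comparing a detected branch line against a manual annotation, and every function name are hypothetical.

```python
import math

def masked_depth_mae(predicted, ground_truth, mask):
    """Mean absolute depth error over pixels where mask is truthy."""
    total_error = 0.0
    count = 0
    for pred_row, gt_row, mask_row in zip(predicted, ground_truth, mask):
        for pred, gt, keep in zip(pred_row, gt_row, mask_row):
            if keep:
                total_error += abs(pred - gt)
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return total_error / count

def angular_error_deg(direction_a, direction_b):
    """Angle in degrees between two line directions, ignoring their sign.

    A line has no orientation, so the result lies between 0 and 90.
    """
    dot = abs(sum(a * b for a, b in zip(direction_a, direction_b)))
    norm_a = math.sqrt(sum(a * a for a in direction_a))
    norm_b = math.sqrt(sum(b * b for b in direction_b))
    cosine = min(1.0, dot / (norm_a * norm_b))  # guard against rounding above 1
    return math.degrees(math.acos(cosine))

def success_rate(outcomes):
    """Fraction of trials marked successful in a list of booleans."""
    if not outcomes:
        raise ValueError("no trials recorded")
    return sum(1 for ok in outcomes if ok) / len(outcomes)
```

Restricting the MAE to the occluded region (via `mask`) isolates the generative model's contribution, since the visible pixels already have measured depth.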

Circularity Check

0 steps flagged

No circularity; experimental pipeline validated on real data without self-referential reductions

The paper presents an integrated robotic system combining a deep learning generative model for occluded fruit depth estimation, classic image processing for push direction, and a 3D Hough transform extension for branch detection in point clouds. Validation occurs solely through real-world RGB-D experiments on apples, lemons, and oranges under varying lighting, reporting improved visibility and successful clearance. No equations, parameter fittings, derivations, or self-citations appear in the abstract or described method that reduce any claim to its inputs by construction. The approach is self-contained as an empirical demonstration rather than a theoretical chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. Standard computer-vision assumptions (RGB-D calibration, branch rigidity) are implicit but not detailed.

pith-pipeline@v0.9.0 · 5540 in / 1018 out tokens · 27601 ms · 2026-05-10T18:38:26.725541+00:00 · methodology

