arxiv: 2606.26955 · v1 · pith:CWRVTGW4new · submitted 2026-06-25 · 💻 cs.RO

RobOralScan: Learning Active Intraoral Scanning for Robotic Dental Reconstruction

Pith reviewed 2026-06-26 05:16 UTC · model grok-4.3

classification 💻 cs.RO
keywords intraoral scanning · reinforcement learning · robotic dentistry · active scanning · geometric memory · coverage learning · sim-to-real transfer · dental reconstruction

The pith

Reinforcement learning lets a robot autonomously control an intraoral scanner by accumulating geometric memory of scan history and using tooth-specific coverage rewards.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a policy trained in simulation can select scanner motions to build a complete dental model inside the confined oral space. It does this by feeding the policy a tri-state geometric memory that tracks observed, unobserved, and boundary regions, plus rewards that penalize poor coverage on individual teeth. The claimed result is higher overall surface coverage and fewer gaps than random or scripted motions would produce, although the abstract itself reports no baseline comparison. A sympathetic reader would care because manual full-arch scanning is slow and error-prone, and automation could reduce operator fatigue while improving reconstruction consistency. The work also reports that the same policy works on physical hardware without retraining.

Core claim

RobOralScan trains a reinforcement learning policy that chooses relative scanner motions from a geometric memory observation (accumulated partial scans represented as tri-state voxels) and robot proprioception. Tooth-wise coverage learning combines a coverage-aware reward with progressive training so the policy improves global coverage while reducing variance across individual teeth. In simulated evaluation this yields 92.58 percent average coverage, 88.45 percent lower-tail per-tooth coverage, a Chamfer distance of 0.00838, and a normalized AUC of 0.6674, with the scan criterion met in eight of ten episodes. The paper also reports zero-shot transfer to a physical robot-scanner setup.

What carries the argument

Geometric memory-based observation space that converts partial scan observations into a tri-state geometric representation, combined with tooth-wise coverage learning that adds coverage-aware rewards and progressive training.
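
To make the tri-state idea concrete, here is a minimal sketch of how partial scans might be folded into such a memory. It is an illustration only, not the paper's implementation: the state codes, the function name update_memory, the voxel grid parameters, and the reading of "boundary" as unobserved voxels that touch an observed voxel are all assumptions.

```python
import numpy as np

# Hypothetical state codes; the paper's actual encoding is not given in the abstract.
UNOBSERVED, OBSERVED, BOUNDARY = 0, 1, 2

def update_memory(memory, points, origin, voxel_size):
    """Fold one partial scan (N x 3 points) into a tri-state voxel grid.

    Assumption: "boundary" means an unobserved voxel that shares a face
    with an observed voxel (a frontier the policy could scan next).
    """
    idx = np.floor((np.asarray(points, dtype=float) - origin) / voxel_size).astype(int)
    shape = np.array(memory.shape)
    inside = np.all((idx >= 0) & (idx < shape), axis=1)  # drop points outside the grid
    idx = idx[inside]

    observed = memory == OBSERVED  # fresh boolean copy of the scan history
    observed[idx[:, 0], idx[:, 1], idx[:, 2]] = True

    # Mark face-neighbours of observed voxels, axis by axis, without wraparound.
    near = np.zeros_like(observed)
    for axis in range(3):
        lo = [slice(None)] * 3
        hi = [slice(None)] * 3
        lo[axis] = slice(None, -1)
        hi[axis] = slice(1, None)
        near[tuple(hi)] |= observed[tuple(lo)]
        near[tuple(lo)] |= observed[tuple(hi)]

    out = np.full(memory.shape, UNOBSERVED, dtype=np.int8)
    out[near] = BOUNDARY
    out[observed] = OBSERVED  # observed always wins over boundary
    return out
```

Calling update_memory once per scan frame and passing the returned grid back in accumulates history, which is the property the observation space relies on.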

If this is right

  • The policy produces closed-loop scan control that adapts to accumulating observations without human input.
  • Tooth-wise rewards reduce uneven coverage that would otherwise leave some teeth incompletely reconstructed.
  • Progressive training allows the policy to first master local coverage before optimizing global uniformity.
  • Zero-shot sim-to-real transfer indicates the learned motions are robust to moderate domain shift.
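
The tooth-wise reward idea in the list above can be sketched as follows. This is a hypothetical reward shape, not the one used in the paper: the function name toothwise_reward, the use of the minimum per-tooth coverage as the "tail" term, and the tail_weight parameter are assumptions chosen to show how a per-tooth term can discourage uneven coverage.

```python
import numpy as np

def toothwise_reward(prev_cov, curr_cov, tail_weight=0.5):
    """Step reward from per-tooth coverage fractions in [0, 1].

    Hypothetical form: gain in mean coverage plus a weighted gain in the
    worst-covered tooth, so neglecting one tooth is not rewarded.
    """
    prev_cov = np.asarray(prev_cov, dtype=float)
    curr_cov = np.asarray(curr_cov, dtype=float)
    global_gain = curr_cov.mean() - prev_cov.mean()
    tail_gain = curr_cov.min() - prev_cov.min()
    return float(global_gain + tail_weight * tail_gain)
```

Under this form, two steps with the same mean gain are ranked differently: the step that lifts the worst-covered tooth earns the extra tail term, while the step that only improves an already well-covered tooth does not.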

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same memory-plus-coverage approach could be tested on other narrow-cavity scanning tasks such as ear or nasal endoscopy.
  • If coverage metrics remain stable across a wider range of dental arch shapes, the method might reduce the need for patient-specific retraining.
  • Adding explicit uncertainty estimates to the geometric memory could further lower the chance of missing hidden surfaces.

Load-bearing premise

The simulation environment accurately reproduces the occlusions, sensor noise, and motion dynamics of real intraoral scanning so that policies transfer to physical robots without retraining or failure.

What would settle it

Running the trained policy on a physical robot-scanner setup and measuring whether average coverage falls below 85 percent or the scan criterion is met in fewer than six of ten trials.
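
The criterion above reduces to a simple check over real-world trials. The sketch below uses a hypothetical function name and input layout, and assumes per-trial coverage could be measured on the physical setup at all.

```python
def transfer_claim_fails(coverages, successes, min_avg_coverage=0.85, min_successes=6):
    """Return True if the trials contradict the zero-shot transfer claim.

    coverages: per-trial surface coverage as fractions in [0, 1].
    successes: per-trial booleans, True if the scan criterion was met.
    The thresholds mirror the 85 percent and six-of-ten figures stated above.
    """
    avg_coverage = sum(coverages) / len(coverages)
    return avg_coverage < min_avg_coverage or sum(successes) < min_successes
```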

Figures

Figures reproduced from arXiv: 2606.26955 by Gihyun Baek, Haeun Yun, Jinhyung Lee, Sehyun Hwang, Siwon Kim, Sungho Moon, Sunghoon Im.

Figure 1: System overview of RobOralScan. Local scan observations are accumulated into a tri-state …
Figure 2: Zero-shot real-world deployment of RobOralScan on a Franka Research 3 (FR3) arm …

Abstract

Intraoral scanning is widely used for digital optical impressions in prosthodontic, implant, and orthodontic treatment, but full-arch and long-span scanning remain labor-intensive tasks with limited automation. In the confined oral cavity, operators must continuously adjust scanner motion while accumulating narrow field-of-view observations, making reconstruction quality sensitive to missing tooth surfaces and operator workload. We propose RobOralScan, which, to the best of our knowledge, is the first reinforcement learning (RL)-based pipeline for robotic automatic intraoral scanning. RobOralScan introduces a geometric memory-based observation space that accumulates partial scan observations into a tri-state geometric representation, allowing the policy to reason over scan history and insufficiently observed regions. It further introduces tooth-wise coverage learning, combining coverage-aware reward signals and a progressive training scheme to improve global reconstruction coverage while reducing uneven coverage across individual teeth. The learned policy selects relative scanner motions from accumulated geometric memory and robot proprioception for closed-loop scan control within the oral workspace. RobOralScan achieves a Chamfer Distance of 0.00838, an average coverage of 92.58%, a lower-tail per-tooth coverage of 88.45%, and a normalized AUC of 0.6674, completing the scan criterion in 8 of 10 evaluation episodes. Furthermore, zero-shot sim-to-real experiments demonstrate its practical feasibility on a physical robot-scanner setup.
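
For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute a Chamfer distance between point clouds and a normalized area under a coverage curve. The paper's exact definitions (squared or unsquared distances, units, and how the AUC is normalized) are not stated in the abstract, so both functions and their conventions are assumptions.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N x 3) and b (M x 3).

    Assumed convention: mean squared nearest-neighbour distance, summed
    over both directions. Brute force, so suitable for small clouds only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # N x M squared distances
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def normalized_auc(coverage):
    """Area under a coverage-vs-step curve, divided by the maximum possible area.

    Assumes coverage values in [0, 1] sampled at unit step spacing, so a
    policy at full coverage from the first step would score 1.0.
    """
    c = np.asarray(coverage, dtype=float)
    area = ((c[1:] + c[:-1]) / 2.0).sum()  # trapezoid rule with unit spacing
    return float(area / (len(c) - 1))
```

Under the assumed normalization, a higher AUC means coverage rises earlier in the episode, which is why it complements the final coverage number.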

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes RobOralScan as the first RL-based pipeline for robotic automatic intraoral scanning. It introduces a geometric memory-based observation space that accumulates partial scans into a tri-state representation and a tooth-wise coverage learning scheme with coverage-aware rewards and progressive training. The policy selects relative scanner motions from geometric memory and proprioception. In simulation, it reports 92.58% average coverage, 88.45% lower-tail per-tooth coverage, 0.00838 Chamfer distance, normalized AUC of 0.6674, and scan completion in 8 of 10 episodes; the abstract further claims that zero-shot sim-to-real transfer demonstrates practical feasibility on a physical robot-scanner setup.

Significance. If the simulation-to-real transfer holds with comparable quantitative performance, the work would represent a meaningful step toward automating labor-intensive full-arch intraoral scanning in confined workspaces, potentially reducing operator workload while improving reconstruction consistency via history-aware RL policies.

major comments (2)
  1. [Abstract] The central claim of practical feasibility rests on zero-shot sim-to-real transfer, yet the abstract (and by extension the evaluation) provides no quantitative real-world metrics such as coverage percentage, lower-tail per-tooth coverage, or Chamfer distance on the physical robot-scanner setup, nor any ablation of sensor models or occlusion handling; this directly undermines the transfer assertion without additional evidence.
  2. [Evaluation] Evaluation section (implied by reported metrics): The headline simulation results (92.58% coverage, 0.00838 Chamfer, 8/10 episodes) lack accompanying training curves, ablation studies on the geometric memory or tooth-wise reward components, or error analysis of failure cases, making it impossible to verify that the reported performance stems from the proposed observation space and learning scheme rather than environment specifics.
minor comments (1)
  1. [Abstract] The abstract states the method is 'to the best of our knowledge, the first RL-based pipeline' without referencing prior RL or active-view-selection work in robotics or medical imaging; a brief related-work paragraph would strengthen this positioning.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for major revision. We address each major comment below with proposed revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract] The central claim of practical feasibility rests on zero-shot sim-to-real transfer, yet the abstract (and by extension the evaluation) provides no quantitative real-world metrics such as coverage percentage, lower-tail per-tooth coverage, or Chamfer distance on the physical robot-scanner setup, nor any ablation of sensor models or occlusion handling; this directly undermines the transfer assertion without additional evidence.

    Authors: We agree that the abstract overstates the strength of the zero-shot sim-to-real claim by not qualifying the real-world results. The real-world experiments were performed as a qualitative proof-of-concept to show that the policy transfers without retraining, with success judged by visual inspection of scan completion rather than automated metrics (which require unavailable ground-truth meshes in the physical setting). We will revise the abstract to explicitly state that all quantitative metrics are simulation-based and that real-world results demonstrate feasibility via successful scan completion in the physical setup without providing numerical coverage or Chamfer values.

    revision: yes

  2. Referee: [Evaluation] Evaluation section (implied by reported metrics): The headline simulation results (92.58% coverage, 0.00838 Chamfer, 8/10 episodes) lack accompanying training curves, ablation studies on the geometric memory or tooth-wise reward components, or error analysis of failure cases, making it impossible to verify that the reported performance stems from the proposed observation space and learning scheme rather than environment specifics.

    Authors: The current version reports only final aggregate metrics without training curves, component ablations, or failure-case analysis. This limits the ability to attribute gains specifically to the geometric memory and tooth-wise progressive rewards. We will incorporate ablation studies isolating the geometric memory and tooth-wise reward scheme, add training curves, and provide a brief error analysis of the two unsuccessful episodes in the revised manuscript (or supplementary material if space-constrained).

    revision: yes

Circularity Check

0 steps flagged

No circularity: RL policy and metrics are empirically derived, not self-defined or fitted by construction.

Full rationale

The paper introduces an RL pipeline using geometric memory observation space and tooth-wise coverage rewards, with performance (92.58% coverage, 0.00838 Chamfer) reported as outcomes of training and evaluation episodes. No equations, parameters, or uniqueness theorems are presented that reduce the claimed results to inputs by definition. Self-citations are absent from the provided text; the sim-to-real claim is an empirical assertion without load-bearing self-referential derivation. The evaluation remains independent of the method definition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities can be extracted or audited.

pith-pipeline@v0.9.1-grok · 5800 in / 1031 out tokens · 19502 ms · 2026-06-26T05:16:03.918725+00:00 · methodology

