PRISM-VO: Scale-Aware Visual Odometry Using Photometric Plenoptic Bundle Adjustment

Aymeric Fleith; Daniel Cremers; Julian Zirbel; Niclas Zeller

arxiv: 2607.00176 · v1 · pith:P6B7M2J7new · submitted 2026-06-30 · 💻 cs.CV

Pith reviewed 2026-07-02 19:25 UTC · model grok-4.3

keywords: visual odometry, plenoptic camera, bundle adjustment, photometric optimization, metric scale, SLAM, depth prior, focused plenoptic

The pith

Plenoptic bundle adjustment recovers reliable metric scale in visual odometry from a single sensor.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents PRISM-VO as a sparse photometric visual odometry method built around a plenoptic bundle adjustment that jointly refines camera poses and inverse depths inside a sliding window. It aims to show that explicit modeling of the plenoptic projection, fused with geometric depth computed from one image and temporal photometric constraints, removes the scale ambiguity that affects ordinary monocular systems. A reader would care because the approach needs no extra sensors and no special initialization procedure, yet still produces drift-resistant trajectories that stay in metric units. Experiments on indoor and outdoor scenes indicate it exceeds earlier plenoptic methods and matches other optimization or learning pipelines while preserving correct scale.

Core claim

By combining geometric depth priors extracted directly from each plenoptic image with multi-view photometric residuals inside a joint optimization of poses and inverse depths, PRISM-VO produces accurate, drift-resilient motion estimates that recover consistent metric scale using only a single focused plenoptic camera.

What carries the argument

Photometric plenoptic bundle adjustment, which models the plenoptic projection to optimize poses and inverse depths together.
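One generic way to write a joint cost of this kind is sketched below. It is only an illustration: every symbol is an assumption chosen for readability rather than the paper's notation, and the exact residual definitions, robust norm, and weighting should be taken from the paper itself.

```latex
E\bigl(\{\xi_k\}, \{\rho_p\}\bigr)
  = \sum_{p} \sum_{j \in \mathcal{O}(p)}
      \Bigl\| I_j\bigl(\mathbf{u}_{p \to j}(\xi_{i(p)}, \xi_j, \rho_p)\bigr)
              - I_{i(p)}(\mathbf{u}_p) \Bigr\|_{\gamma}
  \;+\; \eta \sum_{p} \frac{(\rho_p - \hat{\rho}_p)^2}{\sigma_p^2}
```

Here \xi_k are the camera poses in the window, \rho_p is the inverse depth of point p hosted in keyframe i(p), \mathcal{O}(p) is the set of frames observing p, \hat{\rho}_p and \sigma_p^2 are the single-image plenoptic depth estimate and its variance, and \eta balances the two terms. In a sketch like this, only the second term carries metric information; the photometric term alone would leave the global scale free.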

If this is right

Metric-scale reconstructions become available without auxiliary sensors or manual initialization.
Drift remains low across long sequences because depth priors and photometric constraints reinforce each other.
Performance exceeds prior plenoptic visual-odometry algorithms on both indoor and outdoor data.
Results rival other optimization-based and learning-based visual-odometry systems while retaining explicit metric scale.
The sliding-window formulation supports real-time operation on a single plenoptic stream.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same depth-prior fusion idea could be tested on other light-field or multi-lens cameras to see whether metric scale generalizes beyond the focused plenoptic design.
If the plenoptic depth prior degrades in low-texture or distant scenes, hybrid use with occasional IMU measurements might restore robustness without losing the single-sensor advantage.
Because scale is recovered explicitly, the output trajectories could be fed directly into metric mapping or planning modules that currently require stereo or RGB-D input.

Load-bearing premise

Geometric depth values computed from a single plenoptic image stay accurate and stable enough to serve as priors that keep the overall scale correct when fused with temporal constraints.

What would settle it

Run the optimizer on a sequence in which the plenoptic depth estimates contain systematic bias or high variance; the resulting trajectories should then exhibit growing scale drift or outright failure to match ground-truth metric distances.
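A check of this kind could be scripted roughly as follows. This is a minimal sketch, not the paper's evaluation protocol: it assumes two time-aligned lists of 3D positions in metres, and the function names and the windowing scheme are hypothetical.

```python
import math


def path_length(points):
    """Sum of Euclidean distances between consecutive 3D positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def segment_scale_ratios(estimated, ground_truth, window):
    """Ratio of estimated to ground-truth path length over consecutive windows.

    Both inputs are assumed to be time-aligned lists of (x, y, z) positions.
    A ratio near 1.0 means metric scale is preserved in that window; ratios
    that trend away from 1.0 along the sequence indicate scale drift.
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("trajectories must have the same number of poses")
    ratios = []
    for start in range(0, len(estimated) - window, window):
        est_len = path_length(estimated[start:start + window + 1])
        gt_len = path_length(ground_truth[start:start + window + 1])
        if gt_len > 0.0:  # skip windows where the camera did not move
            ratios.append(est_len / gt_len)
    return ratios
```

Comparing the ratios for a run with clean depth priors against a run with deliberately biased or noisy priors would show whether scale stays near 1.0 or wanders as the sequence progresses.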

Figures

Figures reproduced from arXiv: 2607.00176 by Aymeric Fleith, Daniel Cremers, Julian Zirbel, Niclas Zeller.

**Figure 1.** Metric 3D reconstruction of a 150 m long sequence (seq_007) from the dataset [39] using PRISM-VO. The zoom shows the accumulated drift over the whole sequence. The images below are examples of raw plenoptic images from the sequence.

**Figure 2.** Representation of the model of the focused plenoptic camera in Galilean mode.

**Figure 3.** Results of depth estimation on plenoptic images based on [36] and [15].

**Figure 4.** Overview of the PRISM-VO algorithm pipeline: image processing, tracking in the front-end, and optimization by plenoptic bundle adjustment in the back-end.

**Figure 5.** Point selection and residual definitions. (5a) Totally focused image. (5b) Corresponding virtual depth map. (5c) Selection of points when adding a new frame. Green: valid depth used in the bundle adjustment; cyan: selected points with depth information available but unused due to a high uncertainty; magenta: selected points without depth. Representation of photometric (5d) and depth (5e) residuals, where…

**Figure 6.** Evolution of the inverse virtual depth residual weight factor η on seq_004 from the dataset [40]. Fig. 6a: Local maxima occur with nearby geometry, increasing depth influence, while local minima correspond to distant scenes dominated by photometric consistency. Fig. 6b: Corresponding representative frames are shown.

**Figure 7.** Point clouds and trajectories estimated by PRISM-VO on sequences from the dataset [40]. On the left is the sequence seq_002 (200 m long / outdoor) and on the right is seq_009 (30 m long / indoor). The zoomed views show the accumulated drift and the ground-truth trajectory (front part in orange, back part in red).

**Figure 8.** Point clouds and trajectories estimated by PRISM-VO on sequences from the LiFMCR dataset [16] with the associated totally focused image (scenes from left to right: 01_Plants, 02_Bike, 04_Electronics). The camera poses are shown in blue.

**Figure 9.** Trajectories from a challenging sequence where PRISM-VO succeeds while other monocular methods fail. (a) Focus Pl (b) Depth Pl (c) RGB RS (d) Depth RS

**Figure 11.** Cumulative plots obtained on the dataset [40] for VO/SLAM.

**Figure 12.** Cumulative error distributions for the ablation study. Curves compare the pinhole baseline (1), plenoptic model with depth initialization (2), and the full PRISM-VO system with variance-weighted depth and adaptive balancing (3). Results show progressive improvement, highlighting the benefit of tightly integrating plenoptic data.

Original abstract

We introduce PRISM-VO, a novel pure optimization-based sparse photometric visual odometry framework for focused plenoptic cameras. The core of PRISM-VO is a novel photometric plenoptic bundle adjustment which jointly optimizes camera poses and inverse depth values of points in a sliding window. By combining geometric depth from a single plenoptic image with temporal multi-view constraints, PRISM-VO achieves accurate and drift-resilient motion estimation. Through explicit modeling of the plenoptic projection, PRISM-VO provides reliable metric-scale reconstructions, overcoming the scale ambiguity of monocular SLAM algorithms. Importantly, our approach relies solely on a single plenoptic sensor and avoids complex initialization, as depth priors are computed directly from plenoptic imaging. Experiments show that PRISM-VO outperforms the current state-of-the-art plenoptic visual odometry method on indoor and outdoor scenes. The proposed approach rivals other optimization- and learning-based methods while accurately and reliably recovering a metric scale of the scene. Project page: https://prism-vo.github.io/

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PRISM-VO adds explicit plenoptic projection modeling to photometric bundle adjustment to recover metric scale from one sensor.

The paper introduces a sliding-window photometric bundle adjustment built for focused plenoptic cameras. It jointly refines poses and inverse depths by feeding geometric depth computed from a single plenoptic image into the same optimization that uses temporal photometric residuals.

What is actually new is the way the plenoptic projection is written directly into the bundle adjustment cost. Prior plenoptic VO work is cited, but this formulation appears to be the first that keeps both the geometric depth prior and the multi-view photometric terms inside one sparse optimization.

The experiments claim better accuracy than the previous plenoptic VO baseline on indoor and outdoor sequences and competitive numbers against other monocular and learning-based methods while preserving metric scale. That is the practical claim worth checking.

The soft spot is exactly the one the stress-test note flags: the abstract gives no error model for the single-image depth, no weighting schedule between geometric and photometric terms, and no ablation on how depth noise at range or in low texture affects long-term scale drift. Without those numbers it is hard to know whether the claimed scale resilience comes from the plenoptic prior or from the photometric constraints alone. The full paper presumably contains the equations and plots; the abstract alone does not let a reader verify the central assumption.
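The reason the source of scale matters can be illustrated with a toy projection. The sketch below is not taken from the paper: it uses a plain pinhole model with identity rotation and made-up numbers, purely to show that multiplying scene depth and camera translation by the same factor leaves the projected pixel unchanged, so constraints built only on where points reproject cannot fix metric scale by themselves.

```python
def project(point, translation, focal=500.0):
    """Pinhole projection of a 3D point seen from a camera shifted by `translation`.

    Rotation is kept as identity to keep the sketch short.
    """
    x, y, z = (p - t for p, t in zip(point, translation))
    return (focal * x / z, focal * y / z)


def scaled(vec, s):
    """Multiply every component of a 3-vector by the scalar s."""
    return tuple(s * v for v in vec)


point = (0.4, -0.2, 3.0)        # hypothetical scene point in metres
translation = (0.1, 0.0, 0.05)  # hypothetical camera motion in metres

s = 2.5  # arbitrary global scale factor
original = project(point, translation)
rescaled = project(scaled(point, s), scaled(translation, s))
# `original` and `rescaled` agree up to floating-point rounding.
```

Any term that pins a point's depth to a metric value, such as a plenoptic depth prior, breaks this invariance, which is why the weighting between the two residual types is the quantity worth inspecting.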

This is for people who already work with plenoptic or light-field hardware and need a metric VO pipeline that stays inside one sensor. A serious referee should see it because the modeling is concrete and the claimed results are on real data, even if the scale-stability argument needs tighter evidence.

Referee Report

1 major / 0 minor

Summary. The manuscript presents PRISM-VO, a sparse photometric visual odometry framework for focused plenoptic cameras. Its core contribution is a photometric plenoptic bundle adjustment that jointly optimizes camera poses and inverse depth values within a sliding window. By integrating geometric depth priors computed from a single plenoptic image with temporal multi-view photometric constraints, the method claims to deliver accurate, drift-resilient motion estimation and reliable metric-scale scene reconstructions using only one plenoptic sensor, without external initialization or additional sensors. Experiments are stated to demonstrate outperformance over prior plenoptic VO methods on indoor and outdoor scenes while rivaling other optimization- and learning-based approaches in metric scale recovery.

Significance. If the central claims hold after verification, the work would offer a meaningful advance by resolving monocular scale ambiguity through explicit plenoptic projection modeling and single-image depth priors, potentially enabling metric VO in sensor-constrained settings. No machine-checked proofs, reproducible code releases, or parameter-free derivations are described in the provided text.

major comments (1)

[Abstract] Abstract (final paragraph): The central claim that combining single-image plenoptic geometric depth with temporal photometric constraints produces drift-resilient metric trajectories is load-bearing, yet the text provides no error model for plenoptic depth, no weighting schedule between geometric and photometric residuals, and no ablation on how depth variance propagates into pose drift. This directly matches the stress-test concern that scale can wander when plenoptic depth noise exceeds photometric constraint strength, particularly at range or in low texture.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the single major comment below.

Referee: [Abstract] Abstract (final paragraph): The central claim that combining single-image plenoptic geometric depth with temporal photometric constraints produces drift-resilient metric trajectories is load-bearing, yet the text provides no error model for plenoptic depth, no weighting schedule between geometric and photometric residuals, and no ablation on how depth variance propagates into pose drift. This directly matches the stress-test concern that scale can wander when plenoptic depth noise exceeds photometric constraint strength, particularly at range or in low texture.

Authors: We agree that the presentation of the central claim can be strengthened by making the supporting analysis more explicit. The manuscript describes the plenoptic depth prior computation (Section 3.2) and its integration into the joint photometric-geometric bundle adjustment (Section 5, Equation 7), with the weighting factor between the two residual types set proportionally to the inverse depth variance. However, we did not provide a dedicated error propagation derivation or ablation study. We will revise the manuscript to add (i) an explicit error model for the single-image plenoptic depth estimates, (ii) the precise weighting schedule used in the optimizer, and (iii) an ablation examining how depth variance affects pose drift at varying ranges and texture levels. revision: yes
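The effect of weighting a depth prior by its inverse variance can be seen in a scalar toy problem. The sketch below is an assumption-laden illustration, not the optimizer described in the rebuttal: it fuses one hypothetical multi-view inverse-depth estimate with one prior by minimising a two-term quadratic in closed form, and all names and the `eta` parameter are invented for the example.

```python
def fuse_inverse_depth(rho_multiview, w_multiview, rho_prior, var_prior, eta=1.0):
    """Closed-form minimiser of
        w_multiview * (rho - rho_multiview)**2 + eta * (rho - rho_prior)**2 / var_prior

    The prior's weight is eta / var_prior, so a confident prior (small
    variance) pulls the result towards rho_prior, while an uncertain one
    leaves the multi-view estimate almost unchanged.
    """
    if var_prior <= 0.0:
        raise ValueError("var_prior must be positive")
    w_prior = eta / var_prior
    return (w_multiview * rho_multiview + w_prior * rho_prior) / (w_multiview + w_prior)


confident = fuse_inverse_depth(0.5, 1.0, 0.4, var_prior=0.01)   # close to the prior, 0.4
uncertain = fuse_inverse_depth(0.5, 1.0, 0.4, var_prior=100.0)  # close to the multi-view value, 0.5
```

An ablation of the kind the referee asks for would vary the prior variance in the real system and report how pose drift responds.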

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external plenoptic geometry as independent input.

The paper introduces a photometric plenoptic bundle adjustment that jointly optimizes poses and inverse depths while incorporating geometric depth priors computed directly from single plenoptic images. This scale source is presented as an external sensor property rather than a fitted parameter or self-referential quantity. No equations or claims reduce the metric-scale output to a definition or prediction derived from the optimization itself. No load-bearing self-citations or uniqueness theorems from prior author work are invoked to force the result. The central claim remains self-contained against the plenoptic camera model as an independent prior.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. Typical bundle-adjustment parameters such as window size or weighting factors are expected but not detailed.

Reference graph

Works this paper leans on

[1] Al Assaad, M., Bazeille, S., Cudel, C.: Indirect visual odometry with a light-field camera. Intelligent Systems with Applications (ISWA) 28, 200600 (2025). https://doi.org/10.1016/j.iswa.2025.200600

[2] Bok, Y., Jeon, H.G., Kweon, I.S.: Geometric calibration of micro-lens-based light field cameras using line features. Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 39(2), 287–300 (2017). https://doi.org/10.1109/tpami.2016.2541145

[3] Campos, C., Elvira, R., Rodríguez, J.J.G., Montiel, J.M., Tardós, J.D.: ORB-SLAM3: An accurate open-source library for visual, visual-inertial, and multi-map SLAM. Transactions on Robotics (T-RO) 37(6), 1874–1890 (2021). https://doi.org/10.1109/TRO.2021.3075644

[4] Dansereau, D.G., Mahon, I., Pizarro, O., Williams, S.B.: Plenoptic flow: Closed-form visual odometry for light field cameras. In: International Conference on Intelligent Robots and Systems (IROS). pp. 4455–4462. IEEE (2011). https://doi.org/10.1109/IROS.2011.6095080

[5] Dansereau, D.G., Pizarro, O., Williams, S.B.: Decoding, calibration and rectification for lenslet-based plenoptic cameras. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1027–1034. IEEE (2013). https://doi.org/10.1109/cvpr.2013.137

[6] Darwish, W., Bolsee, Q., Munteanu, A.: Plenoptic camera calibration based on sub-aperture images. In: International Conference on Image Processing (ICIP). pp. 3527–3531. IEEE (2019). https://doi.org/10.1109/icip.2019.8803473

[7] Digumarti, S.T., Daniel, J., Ravendran, A., Griffiths, R., Dansereau, D.G.: Unsupervised learning of depth estimation and visual odometry for sparse light field cameras. In: International Conference on Intelligent Robots and Systems (IROS). pp. 278–285. IEEE (2021). https://doi.org/10.1109/IROS51168.2021.9636570

[8] Dong, F., Ieng, S.H., Savatier, X., Etienne-Cummings, R., Benosman, R.: Plenoptic cameras in real-time robotics. International Journal of Robotics Research (IJRR) 32(2), 206–217 (2013). https://doi.org/10.1177/0278364912469420

[9] Dury, S., Bonatto, D., Sancho, J., Juarez, E., Teratani, M., Lafruit, G.: Structure-from-motion in the micro-image domain for uncalibrated plenoptic 2.0 cameras. International Journal of Computer Vision (IJCV) 134(1), 34 (2026). https://doi.org/10.1007/s11263-025-02612-2

[10] Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 40(3), 611–625 (2018). https://doi.org/10.1109/TPAMI.2017.2658577

[11] Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: Large-scale direct monocular SLAM. In: European Conference on Computer Vision (ECCV). pp. 834–849. Springer (2014). https://doi.org/10.1007/978-3-319-10605-2_54

[12] Fachada, S., Bonatto, D., Lafruit, G., Teratani, M.: Micro-image domain view synthesizer for free navigation with focused plenoptic cameras. Transactions on Multimedia 27, 7179–7191 (2025). https://doi.org/10.1109/TMM.2025.3590906

[13] Fachada, S., Bonatto, D., Losfeld, A., Lafruit, G., Teratani, M.: Pattern-free plenoptic 2.0 camera calibration. In: International Workshop on Multimedia Signal Processing (MMSP). pp. 1–6. IEEE (2022). https://doi.org/10.1109/mmsp55362.2022.9949312

[14] Fachada, S., Losfeld, A., Senoh, T., Lafruit, G., Teratani, M.: A calibration method for sub-aperture views of plenoptic 2.0 camera arrays. In: International Workshop on Multimedia Signal Processing (MMSP). pp. 1–6. IEEE (2021). https://doi.org/10.1109/mmsp53017.2021.9733556

[15] Fleith, A., Ahmed, D., Cremers, D., Zeller, N.: LiFCal: Online light field camera calibration via bundle adjustment. In: DAGM German Conference on Pattern Recognition (GCPR). pp. 120–136. Springer (2024). https://doi.org/10.1007/978-3-031-85187-2_8

[16] Fleith, A., Zirbel, J., Cremers, D., Zeller, N.: LiFMCR: Dataset and benchmark for light field multi-camera registration. In: International Symposium on Visual Computing (ISVC). Springer Nature Switzerland (2026). https://doi.org/10.1007/978-3-032-14492-8_35

[17] Hahne, C., Aggoun, A., Velisavljevic, V., Fiebig, S., Pesch, M.: Refocusing distance of a standard plenoptic camera. Optics Express 24(19), 21521–21540 (2016). https://doi.org/10.1364/OE.24.021521

[18] Kaveti, P., Singh, H.: A light field front-end for robust SLAM in dynamic environments. arXiv preprint arXiv:2012.10714 (2020). https://doi.org/10.48550/arXiv.2012.10714

[19] Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: International Symposium on Mixed and Augmented Reality (ISMAR). pp. 225–. IEEE (2007). https://doi.org/10.1109/ISMAR.2007.4538852

[20] Labussière, M., Teulière, C., Bernardin, F., Ait-Aider, O.: Blur-aware calibration of multi-focus plenoptic cameras. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2542–2551. IEEE (2020). https://doi.org/10.1109/cvpr42600.2020.00262

[21] Labussière, M., Teulière, C., Bernardin, F., Ait-Aider, O.: Leveraging blur information for plenoptic camera calibration. International Journal of Computer Vision (IJCV) 130(7), 1655–1677 (2022). https://doi.org/10.1007/s11263-022-01582-z

[22] Lasheras-Hernandez, B., Strobl, K.H., Izquierdo, S., Bodenmüller, T., Triebel, R., Civera, J.: Single-shot metric depth from focused plenoptic cameras. In: International Conference on Robotics and Automation (ICRA). pp. 9566–9573. IEEE (2025). https://doi.org/10.1109/ICRA55743.2025.11128276

[23] Lumsdaine, A., Georgiev, T.: The focused plenoptic camera. In: International Conference on Computational Photography (ICCP). pp. 1–8. IEEE (2009). https://doi.org/10.1109/iccphot.2009.5559008

[24] Lumsdaine, A., Georgiev, T., et al.: Full-resolution light field rendering. Indiana University and Adobe Systems, Tech. Rep 91, 92 (2008)

[25] Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: A versatile and accurate monocular SLAM system. Transactions on Robotics (T-RO) 31(5), 1147–1163 (2015). https://doi.org/10.1109/TRO.2015.2463671

[26] Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: Dense tracking and mapping in real-time. In: International Conference on Computer Vision (ICCV). pp. 2320–2327. IEEE (2011). https://doi.org/10.1109/ICCV.2011.6126513

[27] Noury, C.A., Teulière, C., Dhome, M.: Light-field camera calibration from raw images. In: International Conference on Digital Image Computing: Techniques and Applications (DICTA). pp. 1–8. IEEE (2017). https://doi.org/10.1109/DICTA.2017.8227459

[28] O'Brien, S., Trumpf, J., Ila, V., Mahony, R.: Calibrating light-field cameras using plenoptic disc features. In: International Conference on 3D Vision (3DV). pp. 286–. IEEE (2018). https://doi.org/10.1109/3dv.2018.00041

[29] Perwass, C., Wietzke, L.: Single-lens 3D camera with extended depth of field. In: Human Vision and Electronic Imaging (HVEI). vol. 8291, pp. 45–59. SPIE (2012). https://doi.org/10.1117/12.909882

[30] Teed, Z., Deng, J.: DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Advances in Neural Information Processing Systems (NeurIPS) 34, 16558–16569 (2021)

[31] Teed, Z., Lipson, L., Deng, J.: Deep patch visual odometry. Advances in Neural Information Processing Systems (NeurIPS) 36, 39033–39051 (2023)

[32] Von Stumberg, L., Cremers, D.: DM-VIO: Delayed marginalization visual-inertial odometry. Robotics and Automation Letters (RA-L) 7(2), 1408–1415 (2022). https://doi.org/10.1109/LRA.2021.3140129

[33] Wang, Y., Wang, L., Liang, Z., Yang, J., An, W., Guo, Y.: Occlusion-aware cost constructor for light field depth estimation. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 19777–19786. IEEE (2022). https://doi.org/10.1109/cvpr52688.2022.01919

[34] Xiao, Z., Liu, Y., Gao, R., Xiong, Z.: CutMIB: Boosting light field super-resolution via multi-view image blending. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1672–1682. IEEE (2023). https://doi.org/10.1109/cvpr52729.2023.00167

[35] Zeller, N., Noury, C.A., Quint, F., Teulière, C., Stilla, U., Dhome, M.: Metric calibration of a focused plenoptic camera based on a 3D calibration target. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Annals) III–3, 449–456 (2016). https://doi.org/10.5194/isprsannals-iii-3-449-2016

[36] Zeller, N., Quint, F., Stilla, U.: Establishing a probabilistic depth map from focused plenoptic cameras. In: International Conference on 3D Vision (3DV) (2015). https://doi.org/10.1109/3DV.2015.18

[37] Zeller, N., Quint, F., Stilla, U.: Depth estimation and camera calibration of a focused plenoptic camera for visual odometry. ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) 118, 83–100 (2016). https://doi.org/10.1016/j.isprsjprs.2016.04.010

[38] Zeller, N., Quint, F., Stilla, U.: From the calibration of a light-field camera to direct plenoptic odometry. Journal of Selected Topics in Signal Processing (JSTSP) 11(7), 1004–1019 (2017). https://doi.org/10.1109/jstsp.2017.2737965

[39] Zeller, N., Quint, F., Stilla, U.: Scale-awareness of light-field camera-based visual odometry. In: European Conference on Computer Vision (ECCV). pp. 732–747. Springer (2018). https://doi.org/10.1007/978-3-030-01237-3_44

[40] Zeller, N., Quint, F., Stilla, U.: A synchronized stereo and plenoptic visual odometry dataset. arXiv preprint arXiv:1807.09372 (2018). https://doi.org/10.48550/arXiv.1807.09372

[41] Zhao, Y., Li, H., Mei, D., Shi, S.: Metric calibration of unfocused plenoptic cameras for three-dimensional shape measurement. Optical Engineering 59(7), 073104–073104 (2020). https://doi.org/10.1117/1.oe.59.7.073104

[42] Zhou, P., Cai, W., Yu, Y., Zhang, Y., Zhou, G.: A two-step calibration method of lenslet-based light field cameras. Optics and Lasers in Engineering 115, 190–196 (2019). https://doi.org/10.1016/j.optlaseng.2018.11.024

[1] [1]

Intelligent Systems with Applications (ISWA)28, 200600 (2025).https: //doi.org/10.1016/j.iswa.2025.200600

Al Assaad, M., Bazeille, S., Cudel, C.: Indirect visual odometry with a light-field camera. Intelligent Systems with Applications (ISWA)28, 200600 (2025).https: //doi.org/10.1016/j.iswa.2025.200600

work page doi:10.1016/j.iswa.2025.200600 2025

[2] [2]

Transactions on Pattern Analysis and Machine Intelligence (TPAMI)39(2), 287–300 (2017).https://doi.org/10.1109/tpami

Bok, Y., Jeon, H.G., Kweon, I.S.: Geometric calibration of micro-lens-based light field cameras using line features. Transactions on Pattern Analysis and Machine Intelligence (TPAMI)39(2), 287–300 (2017).https://doi.org/10.1109/tpami. 2016.2541145

work page doi:10.1109/tpami 2017

[3] Campos, C., Elvira, R., Rodríguez, J.J.G., Montiel, J.M., Tardós, J.D.: ORB-SLAM3: An accurate open-source library for visual, visual-inertial, and multi-map SLAM. Transactions on Robotics (T-RO) 37(6), 1874–1890 (2021). https://doi.org/10.1109/TRO.2021.3075644

[4] Dansereau, D.G., Mahon, I., Pizarro, O., Williams, S.B.: Plenoptic flow: Closed-form visual odometry for light field cameras. In: International Conference on Intelligent Robots and Systems (IROS). pp. 4455–4462. IEEE (2011). https://doi.org/10.1109/IROS.2011.6095080

[5] Dansereau, D.G., Pizarro, O., Williams, S.B.: Decoding, calibration and rectification for lenslet-based plenoptic cameras. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1027–1034. IEEE (2013). https://doi.org/10.1109/cvpr.2013.137

[6] Darwish, W., Bolsee, Q., Munteanu, A.: Plenoptic camera calibration based on sub-aperture images. In: International Conference on Image Processing (ICIP). pp. 3527–3531. IEEE (2019). https://doi.org/10.1109/icip.2019.8803473

[7] Digumarti, S.T., Daniel, J., Ravendran, A., Griffiths, R., Dansereau, D.G.: Unsupervised learning of depth estimation and visual odometry for sparse light field cameras. In: International Conference on Intelligent Robots and Systems (IROS). pp. 278–285. IEEE (2021). https://doi.org/10.1109/IROS51168.2021.9636570

[8] Dong, F., Ieng, S.H., Savatier, X., Etienne-Cummings, R., Benosman, R.: Plenoptic cameras in real-time robotics. International Journal of Robotics Research (IJRR) 32(2), 206–217 (2013). https://doi.org/10.1177/0278364912469420

[9] Dury, S., Bonatto, D., Sancho, J., Juarez, E., Teratani, M., Lafruit, G.: Structure-from-motion in the micro-image domain for uncalibrated plenoptic 2.0 cameras. International Journal of Computer Vision (IJCV) 134(1), 34 (2026). https://doi.org/10.1007/s11263-025-02612-2

[10] Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 40(3), 611–625 (2018). https://doi.org/10.1109/TPAMI.2017.2658577

[11] Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: Large-scale direct monocular SLAM. In: European Conference on Computer Vision (ECCV). pp. 834–849. Springer (2014). https://doi.org/10.1007/978-3-319-10605-2_54

[12] Fachada, S., Bonatto, D., Lafruit, G., Teratani, M.: Micro-image domain view synthesizer for free navigation with focused plenoptic cameras. Transactions on Multimedia 27, 7179–7191 (2025). https://doi.org/10.1109/TMM.2025.3590906

[13] Fachada, S., Bonatto, D., Losfeld, A., Lafruit, G., Teratani, M.: Pattern-free plenoptic 2.0 camera calibration. In: International Workshop on Multimedia Signal Processing (MMSP). pp. 1–6. IEEE (2022). https://doi.org/10.1109/mmsp55362.2022.9949312

[14] Fachada, S., Losfeld, A., Senoh, T., Lafruit, G., Teratani, M.: A calibration method for sub-aperture views of plenoptic 2.0 camera arrays. In: International Workshop on Multimedia Signal Processing (MMSP). pp. 1–6. IEEE (2021). https://doi.org/10.1109/mmsp53017.2021.9733556

[15] Fleith, A., Ahmed, D., Cremers, D., Zeller, N.: LiFCal: Online light field camera calibration via bundle adjustment. In: DAGM German Conference on Pattern Recognition (GCPR). pp. 120–136. Springer (2024). https://doi.org/10.1007/978-3-031-85187-2_8

[16] Fleith, A., Zirbel, J., Cremers, D., Zeller, N.: LiFMCR: Dataset and benchmark for light field multi-camera registration. In: International Symposium on Visual Computing (ISVC). Springer Nature Switzerland (2026). https://doi.org/10.1007/978-3-032-14492-8_35

[17] Hahne, C., Aggoun, A., Velisavljevic, V., Fiebig, S., Pesch, M.: Refocusing distance of a standard plenoptic camera. Optics Express 24(19), 21521–21540 (2016). https://doi.org/10.1364/OE.24.021521

[18] Kaveti, P., Singh, H.: A light field front-end for robust SLAM in dynamic environments. arXiv preprint arXiv:2012.10714 (2020). https://doi.org/10.48550/arXiv.2012.10714

[19] Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: International Symposium on Mixed and Augmented Reality (ISMAR). pp. 225–. IEEE (2007). https://doi.org/10.1109/ISMAR.2007.4538852

[21] Labussière, M., Teulière, C., Bernardin, F., Ait-Aider, O.: Blur-aware calibration of multi-focus plenoptic cameras. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2542–2551. IEEE (2020). https://doi.org/10.1109/cvpr42600.2020.00262

[22] Labussière, M., Teulière, C., Bernardin, F., Ait-Aider, O.: Leveraging blur information for plenoptic camera calibration. International Journal of Computer Vision (IJCV) 130(7), 1655–1677 (2022). https://doi.org/10.1007/s11263-022-01582-z

[23] Lasheras-Hernandez, B., Strobl, K.H., Izquierdo, S., Bodenmüller, T., Triebel, R., Civera, J.: Single-shot metric depth from focused plenoptic cameras. In: International Conference on Robotics and Automation (ICRA). pp. 9566–9573. IEEE (2025). https://doi.org/10.1109/ICRA55743.2025.11128276

[24] Lumsdaine, A., Georgiev, T.: The focused plenoptic camera. In: International Conference on Computational Photography (ICCP). pp. 1–8. IEEE (2009). https://doi.org/10.1109/iccphot.2009.5559008

[25] Lumsdaine, A., Georgiev, T., et al.: Full-resolution light field rendering. Indiana University and Adobe Systems, Tech. Rep 91, 92 (2008)

[26] Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: A versatile and accurate monocular SLAM system. Transactions on Robotics (T-RO) 31(5), 1147–1163 (2015). https://doi.org/10.1109/TRO.2015.2463671

[27] Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: Dense tracking and mapping in real-time. In: International Conference on Computer Vision (ICCV). pp. 2320–2327. IEEE (2011). https://doi.org/10.1109/ICCV.2011.6126513

[28] Noury, C.A., Teulière, C., Dhome, M.: Light-field camera calibration from raw images. In: International Conference on Digital Image Computing: Techniques and Applications (DICTA). pp. 1–8. IEEE (2017). https://doi.org/10.1109/DICTA.2017.8227459

[29] O’Brien, S., Trumpf, J., Ila, V., Mahony, R.: Calibrating light-field cameras using plenoptic disc features. In: International Conference on 3D Vision (3DV). pp. 286–. IEEE (2018). https://doi.org/10.1109/3dv.2018.00041

[31] Perwass, C., Wietzke, L.: Single-lens 3D camera with extended depth of field. In: Human Vision and Electronic Imaging (HVEI). vol. 8291, pp. 45–59. SPIE (2012). https://doi.org/10.1117/12.909882

[32] Teed, Z., Deng, J.: DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Advances in Neural Information Processing Systems (NeurIPS) 34, 16558–16569 (2021)

[33] Teed, Z., Lipson, L., Deng, J.: Deep patch visual odometry. Advances in Neural Information Processing Systems (NeurIPS) 36, 39033–39051 (2023)

[34] Von Stumberg, L., Cremers, D.: DM-VIO: Delayed marginalization visual-inertial odometry. Robotics and Automation Letters (RA-L) 7(2), 1408–1415 (2022). https://doi.org/10.1109/LRA.2021.3140129

[35] Wang, Y., Wang, L., Liang, Z., Yang, J., An, W., Guo, Y.: Occlusion-aware cost constructor for light field depth estimation. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 19777–19786. IEEE (2022). https://doi.org/10.1109/cvpr52688.2022.01919

[36] Xiao, Z., Liu, Y., Gao, R., Xiong, Z.: CutMIB: Boosting light field super-resolution via multi-view image blending. In: Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1672–1682. IEEE (2023). https://doi.org/10.1109/cvpr52729.2023.00167

[37] Zeller, N., Noury, C.A., Quint, F., Teulière, C., Stilla, U., Dhome, M.: Metric calibration of a focused plenoptic camera based on a 3D calibration target. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Annals) III–3, 449–456 (2016). https://doi.org/10.5194/isprsannals-iii-3-449-2016

[38] Zeller, N., Quint, F., Stilla, U.: Establishing a probabilistic depth map from focused plenoptic cameras. In: International Conference on 3D Vision (3DV) (2015). https://doi.org/10.1109/3DV.2015.18

[39] Zeller, N., Quint, F., Stilla, U.: Depth estimation and camera calibration of a focused plenoptic camera for visual odometry. ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) 118, 83–100 (2016). https://doi.org/10.1016/j.isprsjprs.2016.04.010

[40] Zeller, N., Quint, F., Stilla, U.: From the calibration of a light-field camera to direct plenoptic odometry. Journal of Selected Topics in Signal Processing (JSTSP) 11(7), 1004–1019 (2017). https://doi.org/10.1109/jstsp.2017.2737965

[41] Zeller, N., Quint, F., Stilla, U.: Scale-awareness of light-field camera-based visual odometry. In: European Conference on Computer Vision (ECCV). pp. 732–747. Springer (2018). https://doi.org/10.1007/978-3-030-01237-3_44

[42] Zeller, N., Quint, F., Stilla, U.: A synchronized stereo and plenoptic visual odometry dataset. arXiv preprint arXiv:1807.09372 (2018). https://doi.org/10.48550/arXiv.1807.09372

[43] Zhao, Y., Li, H., Mei, D., Shi, S.: Metric calibration of unfocused plenoptic cameras for three-dimensional shape measurement. Optical Engineering 59(7), 073104–073104 (2020). https://doi.org/10.1117/1.oe.59.7.073104

[44] Zhou, P., Cai, W., Yu, Y., Zhang, Y., Zhou, G.: A two-step calibration method of lenslet-based light field cameras. Optics and Lasers in Engineering 115, 190–196 (2019). https://doi.org/10.1016/j.optlaseng.2018.11.024