Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC

Alessio Tonioni; Luigi Di Stefano; Oscar Rahnama; Philip H. S. Torr; Simon Walker; Stuart Golodetz; Thomas Joy; Tommaso Cavallari

arxiv: 1907.07745 · v1 · pith:MRYWD3MTnew · submitted 2019-07-17 · 💻 cs.CV · eess.IV · eess.SP


Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr

Pith reviewed 2026-05-24 20:15 UTC · model grok-4.3

classification 💻 cs.CV · eess.IV · eess.SP

keywords stereo depth estimation · FPGA · real-time vision · embedded systems · KITTI dataset · SGM · ELAS · power efficiency


The pith

A hybrid FPGA-CPU chip computes dense stereo depth at over 50 frames per second with 8.7 percent error while drawing only 5 watts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to obtain accurate dense depth from stereo images on embedded hardware that must stay under a strict power limit. It does so by splitting the work of two established stereo algorithms between the FPGA and CPU parts of a single chip so that each runs the parts it handles best. A reader would care because this combination reaches accuracy levels previously seen only on high-power systems while meeting real-time and power constraints typical of mobile robots or drones.

Core claim

The central claim is that a novel stereo method combining the best features of SGM and ELAS on an FPGA-CPU hybrid SoC produces highly accurate dense depth in real time, reaching an 8.7 percent error rate on the KITTI 2015 benchmark at over 50 FPS with a total power draw of only 5 W.
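The headline figures imply a per-frame time and energy budget. The short sketch below works that arithmetic out. It assumes the 5 W figure is the average total draw while running at the stated frame rate, which the abstract does not spell out, and the function name `per_frame_budget` is purely illustrative. Because the paper reports "over 50 FPS", the values are upper bounds per frame.

```python
def per_frame_budget(fps: float, power_w: float) -> dict:
    """Derive per-frame time and energy from a frame rate and a power draw.

    Assumption (not stated in the source): power_w is the average total
    power while the pipeline runs at the given frame rate.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return {
        "frame_time_ms": 1000.0 / fps,       # time available per frame
        "energy_per_frame_j": power_w / fps,  # joules spent per frame
        "frames_per_joule": fps / power_w,    # efficiency figure
    }


# Figures quoted in the abstract: more than 50 FPS at 5 W.
budget = per_frame_budget(fps=50.0, power_w=5.0)
for name, value in budget.items():
    print(f"{name}: {value:.3f}")
```

At exactly 50 FPS and 5 W this gives a 20 ms frame time, 0.1 J per frame, and 10 frames per joule; any higher frame rate at the same power tightens the time budget and lowers the energy per frame.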

What carries the argument

Partitioning of SGM and ELAS processing steps across the FPGA fabric and CPU cores of the hybrid SoC so that memory-intensive or iterative operations run where they are efficient.

If this is right

Real-time dense depth becomes available on platforms limited to a few watts of power.
Stereo pipelines no longer need to sacrifice accuracy to fit within FPGA resource or timing limits.
Embedded vision systems can now use depth maps whose quality approaches that of full desktop implementations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same split of regular and irregular computation steps could be tried on other hybrid chips for different vision tasks.
Power budgets that once ruled out dense depth sensing may now support it, changing the design space for battery-powered robots.
Multiple such depth pipelines might run concurrently on one low-power SoC if the reported margins hold.

Load-bearing premise

The SGM and ELAS components can be split between FPGA and CPU without large drops in accuracy or violations of real-time timing on the chosen chip.

What would settle it

Running the described implementation on the target SoC and measuring an error rate above 8.7 percent on KITTI 2015, a frame rate below 50 FPS, or a power draw above 5 W would falsify the performance result.
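To make the accuracy half of that test concrete, here is a minimal sketch of an outlier-rate computation in the style of the KITTI 2015 stereo benchmark, where a pixel counts as wrong if its disparity error exceeds both 3 px and 5% of the true disparity. This page does not say which KITTI metric the 8.7% figure refers to, so treating it as this outlier rate is an assumption. The function name and the convention that a ground-truth value of 0 marks a missing pixel are also illustrative, and a practical implementation would vectorise the loops.

```python
def kitti2015_outlier_rate(d_est, d_gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid ground-truth pixels whose disparity is an outlier.

    A pixel is an outlier when its absolute error exceeds abs_thresh pixels
    AND exceeds rel_thresh times the ground-truth disparity.
    Assumed convention: ground-truth values <= 0 mean "no measurement".
    """
    valid = 0
    outliers = 0
    for row_est, row_gt in zip(d_est, d_gt):
        for est, gt in zip(row_est, row_gt):
            if gt <= 0:
                continue  # skip pixels without ground truth
            valid += 1
            err = abs(est - gt)
            if err > abs_thresh and err > rel_thresh * gt:
                outliers += 1
    if valid == 0:
        raise ValueError("no valid ground-truth pixels")
    return 100.0 * outliers / valid


# Toy 2x2 example: three valid pixels, one of which is an outlier.
gt = [[10.0, 100.0], [0.0, 50.0]]
est = [[14.0, 104.0], [7.0, 50.5]]
rate = kitti2015_outlier_rate(est, gt)
print(f"outlier rate: {rate:.2f}%")
```

In the toy data, the 4 px error at true disparity 10 is an outlier, while the same 4 px error at true disparity 100 is not, because it stays under the 5% relative threshold.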

Figures

Figures reproduced from arXiv: 1907.07745 by Alessio Tonioni, Luigi Di Stefano, Oscar Rahnama, Philip H. S. Torr, Simon Walker, Stuart Golodetz, Thomas Joy, Tommaso Cavallari.

**Figure 1.** Overview of our approach. First, we use Fast R3SGM (see §I-A1) to compute disparity images for the input stereo pair (in raster and reverse-raster order). We then flip the right result and perform a left-right consistency check to obtain an accurate but sparse disparity map for the left input image (see §I-A2). Next, as ELAS [11] does, we perform support checking (see §I-B1) to remove points whose disparit…

**Figure 2.** Comparing the implicit biases that exist in raster and…

**Figure 3.** An example showing the effects of performing L/R…

**Figure 4.** The results of performing a consolidating consistency…

**Figure 5.** The results of performing a redundancy check on the…

**Figure 6.** The plane priors produced by constructing a Delaunay…
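The left-right consistency check mentioned in the Figure 1 caption can be illustrated with a textbook version of the technique. This is a sketch only: it does not reproduce the paper's variant (which flips a reverse-raster right result first), and the integer disparities, the tolerance of 1, and the `INVALID` marker are all assumptions made for the example.

```python
INVALID = -1  # hypothetical marker for pixels rejected by the check


def left_right_check(disp_left, disp_right, tol=1):
    """Keep a left disparity only if the right image agrees with it.

    For a left pixel (x, y) with disparity d, the matching right pixel is
    (x - d, y). The value is kept when the right disparity there differs
    from d by at most tol; otherwise the pixel is marked INVALID.
    Disparities are assumed to be non-negative integers.
    """
    height = len(disp_left)
    width = len(disp_left[0]) if height else 0
    sparse = [[INVALID] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            if d < 0:
                continue  # already invalid in the input
            xr = x - d  # column of the corresponding right-image pixel
            if 0 <= xr < width and abs(d - disp_right[y][xr]) <= tol:
                sparse[y][x] = d
    return sparse


# One-row toy example.
left = [[2, 1, 1, 3]]
right = [[1, 1, 0, 0]]
result = left_right_check(left, right)
print(result)
```

The first pixel is rejected because its match falls outside the image, and the last because the two views disagree by 2; the surviving values form the kind of accurate but sparse map the caption describes.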

Original abstract

Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of the more advanced algorithms, e.g. those that rely on unpredictable access to memory or are highly iterative in nature, are difficult to deploy efficiently on FPGAs, and thus the depth quality that can be achieved is limited. In this paper, we leverage an FPGA-CPU chip to propose a novel, sophisticated, stereo approach that combines the best features of SGM and ELAS-based methods to compute highly accurate dense depth in real time. Our approach achieves an 8.7% error rate on the challenging KITTI 2015 dataset at over 50 FPS, with a power consumption of only 5W.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper delivers a working hybrid FPGA-CPU SGM-ELAS stereo system with 8.7% KITTI error at >50 FPS and 5W, but the claim that the split preserves accuracy rests on unshown details about data movement and timing.


The one thing to know is that this paper gives a concrete FPGA-CPU hybrid stereo pipeline that combines SGM and ELAS and reports 8.7% error on KITTI 2015 at over 50 FPS while using only 5W. That set of numbers on a standard benchmark plus the power figure is the usable takeaway for anyone who needs embedded depth under tight constraints. What is new is the specific partitioning: they map the unpredictable-memory or iterative parts of ELAS to the CPU and run the rest on the FPGA. The work does well by shipping actual measured performance instead of just simulation or theory, and by staying within existing algorithms rather than claiming a new core method.

The soft spot is exactly the one the stress-test flags. The abstract asserts that the hybrid split keeps the accuracy of the combined pipeline without major loss or timing violations, yet it supplies no ablation, no comparison of hybrid error versus full-CPU error, and no breakdown of FPGA-CPU data movement or synchronization cost. If the full paper contains those measurements and shows the hybrid version stays close to the accuracy of the unsplit method, the central claim holds; if those checks are missing or weak, the 8.7% figure cannot be read as strong evidence that the architecture itself succeeds. The rest of the paper looks like straightforward engineering with no obvious circularity or invented entities.

This is for readers who build real-time stereo on power-limited platforms such as robots or drones. A person looking for practical low-power depth solutions will get value from the numbers and the partitioning strategy. It deserves a serious referee because the result is concrete, the hardware target is realistic, and the benchmark is public.

Referee Report

1 major / 0 minor

Summary. The paper proposes a hybrid FPGA-CPU stereo depth system that combines SGM and ELAS by mapping unpredictable-memory and iterative ELAS components to the CPU while running the remainder on the FPGA, claiming an 8.7% error rate on KITTI 2015 at >50 FPS with 5 W power draw.

Significance. If the hybrid partitioning demonstrably preserves the accuracy of the combined SGM+ELAS pipeline without introducing latency or synchronization artifacts that violate the real-time bound on the target SoC, the result would be a meaningful contribution to embedded vision, showing how advanced stereo algorithms can be deployed at low power without the usual accuracy trade-offs.

major comments (1)

[Abstract] Abstract (and any corresponding method section): the headline 8.7% KITTI 2015 error is presented as evidence that the FPGA-CPU split succeeds, yet no ablation, timing profile, or disparity-map comparison is supplied to show that off-loading the iterative ELAS stages to the CPU preserves accuracy or meets the >50 FPS bound on the specific SoC; without this quantitative support the central claim cannot be evaluated.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment below.


Referee: [Abstract] Abstract (and any corresponding method section): the headline 8.7% KITTI 2015 error is presented as evidence that the FPGA-CPU split succeeds, yet no ablation, timing profile, or disparity-map comparison is supplied to show that off-loading the iterative ELAS stages to the CPU preserves accuracy or meets the >50 FPS bound on the specific SoC; without this quantitative support the central claim cannot be evaluated.

Authors: We agree that the abstract and method section would benefit from explicit quantitative support for the hybrid partitioning. The results section of the manuscript reports the end-to-end accuracy and frame rate achieved on the target SoC, but does not include a dedicated ablation isolating the effect of CPU off-loading. In the revised version we will add an ablation comparing the hybrid system to a pure-FPGA baseline, detailed per-stage timing profiles, and side-by-side disparity-map visualizations to confirm that accuracy and the >50 FPS bound are preserved. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical hardware result with no derivation chain


The paper reports an empirical implementation result (8.7% KITTI error at >50 FPS on 5W FPGA-CPU SoC) obtained by partitioning SGM and ELAS components. No equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or described approach. The central claim rests on measured hardware performance against an external benchmark (KITTI 2015), which is falsifiable outside any internal construction. This matches the default case of a self-contained empirical paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no free parameters, axioms, or invented entities are identifiable from the provided text.

pith-pipeline@v0.9.0 · 5731 in / 1037 out tokens · 18497 ms · 2026-05-24T20:15:38.250712+00:00 · methodology


Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 2 internal anchors

[1] T. Whelan, R. F. Salas-Moreno, B. Glocker, A. J. Davison, and S. Leutenegger, “ElasticFusion: Real-Time Dense SLAM and Light Source Estimation,” IJRR, vol. 35, no. 14, pp. 1697–1716, 2016

[2] V. A. Prisacariu, O. Kähler, S. Golodetz, M. Sapienza, T. Cavallari, P. H. S. Torr, and D. W. Murray, “InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure,” arXiv preprint arXiv:1708.00783v1, 2017

[3] S. Golodetz*, T. Cavallari*, N. A. Lord*, V. A. Prisacariu, D. W. Murray, and P. H. S. Torr, “Collaborative Large-Scale Dense 3D Reconstruction with Online Inter-Agent Pose Optimisation,” TVCG, vol. 24, no. 11, 2018

[4] J. Shotton, B. Glocker, C. Zach, S. Izadi, A. Criminisi, and A. Fitzgibbon, “Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images,” in CVPR, 2013, pp. 2930–2937

[5] T. Cavallari, S. Golodetz*, N. A. Lord*, J. Valentin, L. D. Stefano, and P. H. S. Torr, “On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation,” in CVPR, 2017, pp. 4457–4466

[6] T. Cavallari*, S. Golodetz*, N. A. Lord*, J. Valentin*, V. A. Prisacariu, L. D. Stefano, and P. H. S. Torr, “Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade,” arXiv preprint arXiv:1810.12163, 2018

[7] S. L. Hicks, I. Wilson, L. Muhammed, J. Worsfold, S. M. Downes, and C. Kennard, “A Depth-Based Head-Mounted Visual Display to Aid Navigation in Partially Sighted Individuals,” PLoS ONE, 2013

[8] D. Scharstein and R. Szeliski, “A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms,” International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7–42, 2002

[9] B. Tippetts, D. J. Lee, K. Lillywhite, and J. Archibald, “Review of Stereo Vision Algorithms and their Suitability for Resource-Limited Systems,” Journal of Real-Time Image Processing, vol. 11, no. 1, pp. 5–25, 2016

[10] H. Hirschmuller, “Stereo Processing by Semiglobal Matching and Mutual Information,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 328–341, 2008

[11] A. Geiger, M. Roser, and R. Urtasun, “Efficient Large-Scale Stereo Matching,” in Computer Vision–ACCV 2010. Springer, 2010, pp. 25–38

[12] O. Rahnama, T. Cavallari*, S. Golodetz*, S. Walker, and P. H. S. Torr, “R3SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems,” in FPT, 2018

[13] S. K. Gehrig and C. Rabe, “Real-Time Semi-Global Matching on the CPU,” in Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on. IEEE, 2010, pp. 85–92

[14] C. Banz, H. Blume, and P. Pirsch, “Real-Time Semi-Global Matching Disparity Estimation on the GPU,” in Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on. IEEE, 2011, pp. 514–521

[15] D. Hernandez-Juarez, A. Chacón, A. Espinosa, D. Vázquez, J. C. Moure, and A. M. López, “Embedded real-time stereo estimation via Semi-Global Matching on the GPU,” Procedia Computer Science, vol. 80, 2016

[16] S. Perri, F. Frustaci, F. Spagnolo, and P. Corsonello, “Design of Real-Time FPGA-based Embedded System for Stereo Vision,” in Circuits and Systems (ISCAS), 2018 IEEE International Symposium on. IEEE, 2018, pp. 1–5

[17] O. Rahnama, A. Makarov, and P. Torr, “Real-time depth processing for embedded platforms,” in Real-Time Image and Video Processing 2017, vol. 10223. International Society for Optics and Photonics, 2017, p. 102230N

[18] L. Zhang, K. Zhang, T. S. Chang, G. Lafruit, G. K. Kuzmanov, and D. Verkest, “Real-time high-definition stereo matching on FPGA,” in Proceedings of the 19th ACM/SIGDA International Symposium on Field Programmable Gate Arrays. ACM, 2011, pp. 55–64

[19] S. Perri, F. Frustaci, F. Spagnolo, and P. Corsonello, “Stereo vision architecture for heterogeneous systems-on-chip,” Journal of Real-Time Image Processing, pp. 1–23, 2018

[20] C. Ttofis and T. Theocharides, “High-quality real-time hardware stereo matching based on guided image filtering,” in Proceedings of the Conference on Design, Automation & Test in Europe. European Design and Automation Association, 2014, p. 356

[21] M. Dehnavi and M. Eshghi, “FPGA based real-time on-road stereo vision system,” Journal of Systems Architecture, vol. 81, pp. 32–43, 2017

[22] D. Zha, X. Jin, and T. Xiang, “A real-time global stereo-matching on FPGA,” Microprocessors and Microsystems, vol. 47, pp. 419–428, 2016

[23] S. K. Gehrig, F. Eberli, and T. Meyer, “A Real-Time Low-Power Stereo Vision Engine Using Semi-Global Matching,” in International Conference on Computer Vision Systems. Springer, 2009, pp. 134–143

[24] C. Banz, S. Hesselbarth, H. Flatt, H. Blume, and P. Pirsch, “Real-Time Stereo Vision System using Semi-Global Matching Disparity Estimation: Architecture and FPGA-Implementation,” in Embedded Computer Systems (SAMOS), 2010 International Conference on. IEEE, 2010, pp. 93–101

[25] S. Mattoccia and M. Poggi, “A passive RGBD sensor for accurate and real-time depth sensing self-contained into an FPGA,” in Proceedings of the 9th International Conference on Distributed Smart Cameras. ACM, 2015, pp. 146–151

[26] D. Honegger, H. Oleynikova, and M. Pollefeys, “Real-time and Low Latency Embedded Computer Vision Hardware Based on a Combination of FPGA and Mobile CPU,” in Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on. IEEE, 2014, pp. 4930–4935

[27] W. Wang, J. Yan, N. Xu, Y. Wang, and F.-H. Hsu, “Real-Time High-Quality Stereo Vision System in FPGA,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 10, pp. 1696–1708, 2015

[28] O. Rahnama, D. Frost, O. Miksik, and P. H. Torr, “Real-Time Dense Stereo Matching With ELAS on FPGA-Accelerated Embedded Devices,” IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 2008–2015, 2018

[29] M. Menze, C. Heipke, and A. Geiger, “Joint 3D Estimation of Vehicles and Scene Flow,” in ISPRS Workshop on Image Sequence Analysis (ISA), 2015

[30] ——, “Object Scene Flow,” ISPRS Journal of Photogrammetry and Remote Sensing (JPRS), 2018

[31] A. Kuzmin, D. Mikushin, and V. Lempitsky, “End-to-end Learning of Cost-Volume Aggregation for Real-time Dense Stereo,” in MLSP, 2017
