Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Tom Joy, Luigi Di Stefano, Simon Walker and Philip H. S. Torr
Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of the more advanced algorithms, e.g., those that rely on unpredictable access to memory or are highly iterative in nature, are difficult to deploy efficiently on FPGAs. As a result, the depth quality that can be achieved is limited. In this brief, we propose a novel, sophisticated stereo approach that leverages an FPGA-CPU chip and combines the best features of semi-global matching and ELAS-based methods to compute highly accurate dense depth in real time. Our approach achieves an 8.7% error rate on the challenging KITTI 2015 dataset at over 50 frames/s, with a power consumption of only 5 W.