A fast stereo matching algorithm suitable for embedded real-time systems
|Article code||Publication year||English article|
|7257||2010||23-page PDF|
Publisher: Elsevier - ScienceDirect
Journal : Computer Vision and Image Understanding, Volume 114, Issue 11, November 2010, Pages 1180–1202
In this paper, the challenge of fast stereo matching for embedded systems is tackled. Limited resources, e.g. memory and processing power, and, most importantly, the real-time requirements of embedded systems for robotic applications, do not permit the use of most sophisticated stereo matching approaches. The strengths and weaknesses of different matching approaches have been analyzed and a well-suited solution has been found in a Census-based stereo matching algorithm. The novelty of the algorithm used is the explicit adaptation and optimization of the well-known Census transform with respect to embedded real-time systems in software. The most important change in comparison with the classic Census transform is the use of a sparse Census mask, which halves the processing time with nearly unchanged matching quality. This is due to the fact that large sparse Census masks perform better than small dense masks with the same processing effort. Evidence for this assumption is given by the results of experiments with different mask sizes. Another contribution of this work is the presentation of a complete stereo matching system with its correlation-based core algorithm, the detailed analysis and evaluation of the results, and the optimized high-speed realization on different embedded and PC platforms. The algorithm handles difficult areas for stereo matching, such as areas with low texture, very well in comparison to state-of-the-art real-time methods. It can successfully eliminate false positives to provide reliable 3D data. The system is robust, easy to parameterize and offers high flexibility. It also achieves high performance on several systems, including resource-limited ones, without losing the good quality of stereo matching. A detailed performance analysis of the algorithm is given for optimized reference implementations on various commercial off-the-shelf (COTS) platforms, e.g. a PC, a DSP and a GPU, reaching a frame rate of up to 75 fps for 640 × 480 images and 50 disparities.
The matching quality and processing time are compared to other algorithms on the Middlebury stereo evaluation website, reaching a middle rank in quality and a top rank in performance. Additional evaluation is done by comparing the results with a very fast and well-known sum of absolute differences algorithm using several Middlebury datasets and real-world scenarios.
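The idea of a sparse Census mask described above can be illustrated with a minimal pure-Python sketch. The function names, the checkerboard sampling pattern, and the "darker than centre" bit convention are assumptions made for illustration; the abstract does not specify the exact mask layout used in the paper.

```python
def census_transform(image, radius, sparse=False):
    """Census-transform a 2D list of intensities.

    Returns (bit_strings, bits_per_pixel). Each mask pixel contributes one
    bit: 1 if it is darker than the centre pixel, else 0.
    With sparse=True only every second mask pixel is compared (checkerboard
    pattern). This pattern is an illustrative assumption, not necessarily
    the mask used in the paper.
    """
    height, width = len(image), len(image[0])
    offsets = [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if (dy, dx) != (0, 0) and (not sparse or (dy + dx) % 2 == 0)
    ]
    result = [[0] * width for _ in range(height)]
    # Border pixels without a full mask are left at 0.
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            centre = image[y][x]
            bits = 0
            for dy, dx in offsets:
                bits = (bits << 1) | int(image[y + dy][x + dx] < centre)
            result[y][x] = bits
    return result, len(offsets)


def hamming_distance(a, b):
    """Matching cost between two Census bit strings: number of differing bits."""
    return bin(a ^ b).count("1")


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
dense, dense_bits = census_transform(image, radius=1)
sparse, sparse_bits = census_transform(image, radius=1, sparse=True)
print(dense_bits, sparse_bits)  # 8 comparisons versus 4 per pixel
```

In this sketch the sparse variant performs half as many comparisons per pixel as the dense one for the same mask size, which is the source of the processing-time saving the abstract refers to; the matching-quality claim itself rests on the paper's experiments, not on this example.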
For modern mobile robot platforms, dependable and embedded perception modules are important for successful autonomous operations like navigation, visual servoing, or grasping. In particular, 3D information about the area around the robot is crucial for reliable operation in human environments. State-of-the-art sensors such as laser scanners or time-of-flight sensors deliver 3D information that is either coarse or of low resolution in time and space. Stereo vision is a technology that is well suited for delivering a precise description within its field of view. Stereo is a purely passive technology that primarily uses only two cameras and a processing unit to do the matching and 3D reconstruction. However, for extracting dense and reliable 3D information from the observed scene, stereo matching algorithms are computationally intensive and require high-end hardware resources. Integrating such an algorithm in an embedded system, which is limited in resources, scale, and energy consumption, is a delicate task. The real-time requirements of most robot applications complicate the realization of such a vision system as well. The key to success in realizing a reliable embedded real-time-capable stereo vision system is the careful design of the core algorithm. The trade-off between execution time and matching quality is difficult and must be handled with care. Kopetz's definition of the term real-time, which means that a task has to be finished within an a priori defined time frame, is extended in this work: additionally, a fast (at least 10 fps), constant, and scene-independent processing time is demanded. In this paper the challenge of fast stereo matching suitable for embedded real-time systems is tackled. An adapted, high-speed and high-quality stereo matching algorithm especially optimized for embedded systems is presented.
Furthermore, an evaluation of the results using the Middlebury stereo evaluation website and real-world scenarios is given, and experimental results of reference implementations on a Personal Computer (PC), a Digital Signal Processor (DSP) and a Graphics Processing Unit (GPU) are presented. The remainder of this paper is organized as follows: Section 2 summarizes the fundamentals of stereo vision and the state of the art in stereo matching algorithms. Section 3 gives a detailed description of the proposed real-time stereo engine. The algorithm's parameters are analyzed in detail in Section 4, and Section 5 shows the reference implementations on a PC, a DSP and a GPU. Finally, Section 6 presents evaluation results of our algorithm, and Section 7 concludes the paper and gives an outlook on future research.
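The extended real-time definition given in the introduction (a fixed deadline, at least 10 fps, and a constant processing time) could be checked against measured frame times roughly as follows. This is a sketch only: the function name, the dictionary keys, and especially the jitter tolerance `max_jitter_s` are hypothetical, since the text does not quantify what "constant" means.

```python
def check_realtime(frame_times_s, deadline_s, min_fps=10.0, max_jitter_s=0.005):
    """Check measured per-frame processing times (in seconds).

    deadline_s   : a priori defined time frame per task (classic definition)
    min_fps      : minimum frame rate; 10 fps is the figure named in the text
    max_jitter_s : allowed spread between fastest and slowest frame; this
                   tolerance is an assumption, the text gives no number
    """
    worst = max(frame_times_s)
    best = min(frame_times_s)
    return {
        "deadline_met": worst <= deadline_s,
        "fast_enough": 1.0 / worst >= min_fps,
        "constant": worst - best <= max_jitter_s,
    }


# Hypothetical measurements for frames of differing scene content.
print(check_realtime([0.0130, 0.0140, 0.0135], deadline_s=0.02))
```

Scene independence would show up in such a check as a small spread across frames with very different content, which is why the worst and best case are compared rather than the mean.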
Conclusion
After a general overview of stereo matching algorithms and systems, an algorithm for fast, Census-based stereo matching on embedded systems is presented in this paper. Because of the gain in processing time and the insignificant loss in quality, a sparse Census transform is used. The algorithm has been implemented on a PC, a GPU and a DSP. All implementations, aside from the plain software, reach real-time performance; the GPU is by far the fastest but has the highest power consumption. The resulting disparity maps are evaluated on the Middlebury stereo website and perform well in comparison to other real-time algorithms. Especially in terms of processing time, the proposed algorithm outperforms the other real-time algorithms. The algorithm and its corresponding reference implementations have several strengths. First of all, the proposed algorithm achieves high performance on several systems, including resource-limited ones, without losing the good quality of stereo matching. The algorithm itself is robust, easy to parameterize, and delivers good matching quality under real-world conditions. The implementations offer high flexibility in terms of image dimensions, disparity range, image bit depth and frame rates, enabling the use of a wide variety of camera hardware. As a pure software solution for embedded and non-embedded systems, it is able to run on a broad spectrum of COTS platforms, which enables cost-efficient stereo sensing systems as well as the integration of additional functionality in existing platforms. An improvement of the proposed algorithm would be of interest especially at disparity discontinuities, object borders and textureless areas. Therefore, an integration of advanced matching techniques, such as global optimization approaches, will be investigated to improve the quality of the algorithm. Furthermore, a more sophisticated cost aggregation strategy could lead to better results.
Indoor environments often suffer from difficult lighting conditions, so high dynamic range cameras will be used in the future to capture stereo image pairs. A processing time improvement for DSPs can be achieved by the use of upcoming multi-core DSPs.
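To make the role of cost aggregation concrete, the following sketch matches two Census-transformed images using the Hamming distance as cost, a plain square-window sum as aggregation, and a winner-take-all disparity choice. The function name, the window shape, and the absence of any confidence or consistency filtering are assumptions for illustration; this is not the paper's optimized implementation.

```python
def match_census(left, right, max_disparity, agg_radius=1):
    """Winner-take-all disparity map from two Census-transformed images.

    left, right : 2D lists of Census bit strings (ints) of equal size
    agg_radius  : radius of the square aggregation window (an assumed,
                  deliberately simple aggregation strategy)
    """
    height, width = len(left), len(left[0])
    invalid = float("inf")

    def cost(y, x, d):
        # Hamming distance between left pixel x and right pixel x - d.
        if x - d < 0:
            return invalid  # candidate falls outside the right image
        return bin(left[y][x] ^ right[y][x - d]).count("1")

    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cost, best_d = invalid, 0
            for d in range(max_disparity + 1):
                total = 0
                # Sum the per-pixel costs over the window, clipped at borders.
                for wy in range(max(0, y - agg_radius), min(height, y + agg_radius + 1)):
                    for wx in range(max(0, x - agg_radius), min(width, x + agg_radius + 1)):
                        total += cost(wy, wx, d)
                if total < best_cost:
                    best_cost, best_d = total, d
            disparity[y][x] = best_d
    return disparity


# Synthetic example: the left image is the right image shifted by 2 pixels.
right = [[1 << x for x in range(8)] for _ in range(3)]
left = [[row[x - 2] if x >= 2 else 0 for x in range(8)] for row in right]
print(match_census(left, right, max_disparity=3)[1])
```

A more sophisticated aggregation strategy, as suggested in the conclusion, would replace the inner window sum; the cost function and the winner-take-all step could stay as they are.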