A heuristic algorithm for visual tracking of deformable objects
Article code | Publication year | Length of the English article |
---|---|---|
8021 | 2011 | 14 pages (PDF) |
Publisher: Elsevier - ScienceDirect
Journal: Journal of Visual Communication and Image Representation, Volume 22, Issue 6, August 2011, Pages 465–478
Abstract
Many vision problems require fast and accurate tracking of objects in dynamic scenes. These problems can be formulated as exploration problems and thus expressed as a search through a state space. However, they are hard to solve because the search covers a space of transformations corresponding to all possible motions and deformations. In this paper, we propose a heuristic algorithm that searches the space of transformations to compute the target's 2D motion. Three features are combined to compute the motion efficiently: (1) a quality of function match based on a holistic similarity measurement, (2) the Kullback–Leibler measure as a heuristic to guide the search process and (3) incorporation of target dynamics into the search process to compute the most promising search alternatives. Once the 2D motion has been calculated, the resulting value of the quality of function match is used to verify template updates. A template is updated only when the target object has evolved to a transformed shape that is dissimilar to the current shape. In addition, a short-term memory subsystem is included to recover previous views of the target object. The paper includes experimental evaluations with video streams that illustrate the efficiency of the approach and its suitability for real-time vision-based tasks in unrestricted environments.
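To make the Kullback–Leibler measure mentioned in the abstract concrete, the following minimal Python sketch computes the divergence between two discrete distributions. It assumes that both distributions are given as histograms with the same number of bins. The function name `kl_divergence`, the `eps` clamp and the use of histograms are assumptions made for this illustration, because the abstract does not state which quantities the paper compares.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return D_KL(p || q) in nats for two discrete distributions.

    p, q -- sequences of non-negative weights with the same length;
            they are normalised internally so that each sums to 1.
    eps  -- assumption for this sketch: floor applied to q to avoid log(0).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("distributions must have positive total mass")

    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i = p_i / p_sum
        q_i = max(q_i / q_sum, eps)
        if p_i > 0:  # bins with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / q_i)
    return total
```

The measure is zero when both histograms are identical and grows as they diverge. It is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.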
Introduction
Template tracking is a basic task in visual systems whose main goal is to detect and track a moving object of interest in a dynamic vision context, given one or several explicit templates that represent the target object. If an active vision approach is considered, it is also desirable that the tracking process keeps the object of interest centered in the image by moving the sensor adequately [1] and [2]. At present, there are still obstacles to achieving all-purpose and robust trackers. Four main issues must be addressed in order to carry out effective template tracking:

(1) Real-time performance. Real-time template tracking is a critical task in many computer vision applications such as vision-based interface tasks [3], visual surveillance [35], traffic control [36], navigation tasks for autonomous robots [37], gesture-based human–computer interaction [38], perceptual intelligence applications [4], virtual and augmented reality systems [39] or applications from the “looking at people” domain [5]. Moreover, in real-time applications not all system resources can be allocated to tracking, because other high-level tasks such as trajectory interpretation and reasoning may also be required. Therefore, the computational cost of a tracker should be as low as possible to make real-time performance feasible on general-purpose hardware.

(2) Initialisation. Many template-based tracking approaches rely on manual initialisation. Some approaches assume that the template which represents the target object is correctly aligned in the first frame [6]. Other approaches select the reference templates from a hand-drawn prototype template, e.g., an ellipse outline for faces [7] and [8], or extract them from a set of examples such as level appearance [9] or outlines [10] and [11].
Moreover, the condensation algorithm [10] also requires training with the object moving over an uncluttered background to learn the motion model parameters before it can be applied to the real scene. However, these selection processes restrict the use of such approaches in many practical embedded applications. Therefore, quick and transparent initialisations without user participation are required.

(3) Matching. Template matching is the process in which a reference template T(k) is searched for in an input image I(k) to determine whether it occurs and where it is located. Over the last decade, different approaches based on searching the space of transformations using a similarity measurement have been proposed for template-based matching. Some of them explicitly establish point correspondences between two shapes and subsequently find a transformation that aligns these shapes [12] and [13]. The iteration of these two steps involves the use of algorithms such as iterated closest points (ICP) [14] and [15] or shape context matching [13]. However, these methods require a good initial alignment in order to converge, particularly when the image contains a cluttered background. Other approaches are based on searching the space of transformations using Hausdorff matching [16]; they rely on an exhaustive search that subdivides the space of transformations in order to find the transformation that matches the template position in the current image. Similar techniques have also been used for tracking selected targets in natural scenes [30] and for person tracking using an autonomous robot [31]. However, neither heuristic functions nor target dynamics have been incorporated into the search process, which increases the computational cost of tracking.

(4) Updating. The underlying assumption behind several template tracking approaches is that the appearance of the object remains the same throughout the entire video [17], [18] and [19].
This assumption is generally reasonable for a certain period of time, and a naive solution to this problem is to update the template every frame [30], [31] and [32] or every n frames [33] with a new template extracted from the current image. However, small errors can be introduced in the location of the template each time it is updated, so that the template gradually drifts away from the object [20]. Matthews et al. [20] propose a solution to this problem. However, their approach only addresses objects whose visibility does not change while they are being tracked.

In this paper, a template-based solution for fast and accurate tracking of moving objects is proposed. The main contributions are: (1) an A∗ search algorithm that uses the Kullback–Leibler measurement as a heuristic to guide the search process for efficient matching of the target position, (2) dynamic update of the search space in each image, whose dimension is determined by the target dynamics, dramatically reducing the number of possible search alternatives, (3) updating templates only when the target object has evolved to a new shape that is significantly dissimilar to the current template, in order to solve the drift problem and (4) representation of illustrative views of the target shape evolution through a short-term memory subsystem. As a result, the first two contributions provide a fast algorithm that operates over a space of transformations to compute the target's 2D motion, and the other two contributions provide robust tracking because accurate template updating can be performed.
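The first two contributions can be pictured with a small, generic A∗ search over 2D translations. The sketch below is not the paper's implementation: the function name `a_star_translation`, the 4-connected grid of integer shifts, the unit step cost and the `max_shift` window are all assumptions chosen for illustration. The heuristic is passed in as a callable, so a Kullback–Leibler based estimate could be plugged in, and `start` and `max_shift` stand in for the prediction and window size that the target dynamics would provide.

```python
import heapq


def a_star_translation(start, is_goal, heuristic, max_shift):
    """A* over integer 2D translations (dx, dy) on a 4-connected grid.

    start     -- initial translation, e.g. a position predicted from dynamics
    is_goal   -- callable(state) -> bool, e.g. "the match is good enough"
    heuristic -- callable(state) -> float, estimated remaining cost
    max_shift -- half-size of the square search window centred on `start`

    Returns (state, path_cost) for the first goal found, or None if the
    window is exhausted. Every step has unit cost (assumption of this sketch).
    """
    open_heap = [(heuristic(start), 0, start)]  # entries are (f, g, state)
    best_g = {start: 0}

    while open_heap:
        _, g, state = heapq.heappop(open_heap)
        if g > best_g.get(state, float("inf")):
            continue  # stale heap entry, a cheaper path was found later
        if is_goal(state):
            return state, g

        x, y = state
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            # Restrict the search to the window around the start state.
            if abs(nxt[0] - start[0]) > max_shift or abs(nxt[1] - start[1]) > max_shift:
                continue
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None
```

A smaller `max_shift` bounds the number of states that can ever be expanded, which is the effect the text attributes to sizing the search space from the target dynamics. A more informative heuristic reduces how many of those states are actually visited.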
In addition to these contributions, the paper also contains a number of experimental evaluations and comparisons:
• A direct comparison of the performance of conventional search approaches [16], which work by subdividing the space of transformations, and the proposed A∗ search approach, which incorporates target dynamics and a heuristic to guide the search process, demonstrating that the A∗ search-based approach is faster.
• An empirical comparison of a continuous template updating approach like that proposed in [30], [31] and [32] and the template updating approach proposed in this paper, demonstrating that not updating templates in every frame and using a dynamic short-term memory subsystem lead to more robust tracking.
• An analysis of the time required to compute the proposed template matching and updating approach, illustrating that the time needed to track targets in video streams is within real-time requirements.

The structure of this paper is as follows: the problem formulation is presented in Section 2. In Section 3, the heuristic algorithm for computing the target position is described. The problem of updating the reference template is detailed in Section 4. Experimental results are provided in Section 5, and Section 6 concludes the paper.
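The template updating policy described in the introduction (update only when the shape has changed significantly, and keep earlier views in a short-term memory) can be sketched as follows. This is a hypothetical illustration only: the class name `TemplateMemory`, the `capacity` and `update_threshold` parameters, the caller-supplied `dissimilarity` function and the rule for recalling a stored view are assumptions, not details taken from the paper.

```python
from collections import deque


class TemplateMemory:
    """Current template plus a bounded short-term memory of earlier views."""

    def __init__(self, initial_template, capacity=5, update_threshold=0.5):
        self.current = initial_template
        self.memory = deque(maxlen=capacity)  # oldest views are dropped first
        self.update_threshold = update_threshold

    def step(self, candidate, dissimilarity):
        """Decide what to do with the shape observed in the current frame.

        candidate     -- shape extracted at the matched target position
        dissimilarity -- callable(a, b) -> float, lower means more similar
        Returns "kept", "recalled" or "updated".
        """
        # Small change: keep the current template, which avoids the drift
        # caused by replacing the template in every frame.
        if dissimilarity(self.current, candidate) <= self.update_threshold:
            return "kept"

        # Large change: first try to recover a previously seen view.
        for i, view in enumerate(list(self.memory)):
            if dissimilarity(view, candidate) <= self.update_threshold:
                del self.memory[i]
                self.memory.append(self.current)
                self.current = view
                return "recalled"

        # Genuinely new shape: remember the old template and adopt the new one.
        self.memory.append(self.current)
        self.current = candidate
        return "updated"
```

In a tracker, `dissimilarity` would be derived from the match quality computed during the search. Here it is left as a parameter so that the update logic can be read on its own.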
Conclusion
This paper is concerned with fast and accurate tracking of arbitrary shapes in video streams without any assumption about the speed and trajectory of the objects. The described approach does not need an a priori 2D template of the object to be tracked. The approach is based on decomposing the transformation between frames of a 2D object moving in 3D space into two parts: (1) a 2D motion corresponding to the new target position and (2) a 2D shape change. An A∗ search framework in the space of transformations is proposed to compute the target motion efficiently, using the Kullback–Leibler measurement as a heuristic to guide the search process. The most promising initial search alternatives are computed through the incorporation of target dynamics. The 2D shape change is captured with 2D templates that evolve with time. These templates are only updated when the target object has evolved to a new shape. Also, the representative temporal variations of the target shape are stored in a short-term memory subsystem. The proposed template-based tracking system has been tested, and it has been shown empirically that: (1) the computational cost of visual tracking for an arbitrary target shape is directly related to the set of transformations.
Real-time performance on general-purpose hardware can be achieved when an A∗ search strategy and an adjustable-size set of transformations are used, (2) the proposed heuristic search is on average three times faster than previous search strategies with similar features, allowing real-time performance, (3) although abrupt motions cannot be predicted by an alpha-beta filtering approach, the tracker adapted well to the non-stationary character of a person’s movement, which alternates abruptly between slow and fast motion, as in the People sequence, (4) target shape evolution introduces, in certain situations, views that cannot be matched in the current image I(k) using the A∗ heuristic search. These situations arise when the target shape is represented by only a few sparse edge points and its size is extremely reduced because it disappears from and reappears in the current image I(k), as illustrated in the Car sequence. In these situations, the use of a color cue is required in order to avoid losing the target and (5) updating templates using the combined results of the quality of function match value and a short-term memory leads to accurate template-based tracking.

This work leaves a number of open possibilities that may be worth further research. Among others, it may be interesting to consider: (i) further processing of the input images in order to reduce illumination sensitivity, by using an anisotropic diffusion filter instead of the Gaussian filter used by the classical Canny detector in order to obtain better edge detection, and by reformulating the Canny edge detector dynamically, replacing the hysteresis step with a dynamic threshold process in order to reduce the blinking of edges across successive frames and consequently generate more stable edge sequences.
The study of known models for varying illumination such as [41] can also be interesting as a supplement to the previous approach, (ii) the study of more sophisticated filtering association approaches such as multiple hypothesis tracking [34] for reducing the initial search space and (iii) the inclusion of more perceptual pathways (e.g. color information) in order to make tracking more robust when the template is represented by a reduced set of edge points.
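The conclusion refers to an alpha-beta filter for predicting the target dynamics. As a reminder of how such a filter works, here is a minimal one-dimensional Python sketch in its textbook form. The class name `AlphaBetaFilter`, the default gains and the unit time step are assumptions for illustration and are not values reported in the paper.

```python
class AlphaBetaFilter:
    """Textbook alpha-beta filter for one coordinate (position and velocity)."""

    def __init__(self, x0, v0=0.0, alpha=0.5, beta=0.1, dt=1.0):
        self.x = x0        # estimated position
        self.v = v0        # estimated velocity
        self.alpha = alpha  # gain applied to the position correction
        self.beta = beta    # gain applied to the velocity correction
        self.dt = dt        # time between frames

    def predict(self):
        """Predicted position for the next frame (constant-velocity model)."""
        return self.x + self.dt * self.v

    def update(self, measurement):
        """Correct the state with a measured position and return the residual."""
        x_pred = self.predict()
        residual = measurement - x_pred
        self.x = x_pred + self.alpha * residual
        self.v = self.v + (self.beta / self.dt) * residual
        return residual
```

For 2D target motion, one filter per image axis can be used. The value returned by `predict()` gives a starting point for the search in the next frame. A large residual after an abrupt change of motion signals that the constant-velocity prediction was poor, which is consistent with the remark above that abrupt motions cannot be predicted by this kind of filter.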