Iterative learning of grasp adaptation through human corrections
|Paper code||Publication year||English paper||Persian translation||Word count|
|27471||2012||17-page PDF||Available to order||Not calculated|
Publisher : Elsevier - Science Direct
Journal : Robotics and Autonomous Systems, Volume 60, Issue 1, January 2012, Pages 55–71
In the context of object interaction and manipulation, one characteristic of a robust grasp is its ability to comply with external perturbations applied to the grasped object while still maintaining the grasp. In this work, we introduce an approach for grasp adaptation which learns a statistical model to adapt hand posture solely based on the perceived contact between the object and fingers. Using a multi-step learning procedure, the model dataset is built by first demonstrating an initial hand posture, which is then physically corrected by a human teacher pressing on the fingertips, exploiting compliance in the robot hand. The learner then replays the resulting sequence of hand postures, to generate a dataset of posture–contact pairs that are not influenced by the touch of the teacher. A key feature of this work is that the learned model may be further refined by repeating the correction–replay steps. Alternatively, the model may be reused in the development of new models, characterized by the contact signatures of a different object. Our approach is empirically validated on the iCub robot. We demonstrate grasp adaptation in response to changes in contact, and show successful model reuse and improved adaptation with additional rounds of model refinement.
Object interaction and manipulation is a challenging topic within robotics research. When a detailed model of the object shape and surface properties is known, one can reason about grasp optimality. However, the prior knowledge requirement is extensive: object properties like the mass distribution or surface texture can be difficult to obtain (for example, requiring force sensors or accurate tactile sensing), and how these properties change as the object is manipulated can be difficult to predict. When detailed information about the object shape and surface properties is not known, compromises like grasp sub-optimality and a strong reliance on accurate runtime sensing must be made. Object manipulation becomes even more challenging within the context of dynamic interactions, when the grasp on the object is not static.
In this work, the target behavior is grasp adaptation; that is, the ability to be intentionally responsive to external forces so as to comply smoothly with external perturbations, all while maintaining contact with the object (Fig. 1(a)). The use of force or impedance feedback controllers offers robust solutions to the goal of maintaining contact with an object; however, most works do not consider the additional goal of being intentionally compliant and following perturbations. Smooth compliance in response to object perturbations when grasping necessitates a tight coordination between all fingers, else the grasped object might fall from the hand. Moreover, this coordination is typically ensured by a good knowledge of the hand kinematics and of the object shape. To tackle this issue, rather than handcraft the coordination patterns across all fingers for each novel object, we adopt a learning approach based on human demonstration. The coordination patterns thus are extracted from a set of good example grasps.
The use of demonstration learning is motivated further by the high dimensionality of the task state-space, due to the number of degrees of freedom in the fingers and the sensory signals at play. Showing by example can simplify the specification of coordinated postures between all of the fingers. If the examples are shown kinesthetically, by physically touching the robot to move its fingers, demonstration also allows the teacher to provide the robot with an intuitive notion of force.
Our work takes the approach of learning a statistical model able to predict a desired hand posture and fingertip pressure from the current signature of the contact perceived at the robot’s fingertips. The approach depends on tactile sensing at the fingertips and human demonstration to provide an example set of feasible grasps. The approach does not require any kinematic or dynamic model of the hand or object, unlike model-based manipulation approaches. Such requirements of a detailed model and, consequently, precise sensing capabilities can in practice be an issue for many robotic platforms. Instead, the use of a probabilistic model allows for the encapsulation of the intrinsic non-linear mapping between the noisy tactile data and joint information, obtained directly from example grasps.
The dataset of examples is built both from human demonstration, and from self-demonstration by the robot after correction by a human teacher. In particular, our model derives from a multi-step learning procedure that iteratively builds a training dataset from a combination of teacher demonstration, teacher correction and learner replay (Fig. 1(b)). Corrections are accomplished by having the teacher directly act on the fingers of the robot. In contrast to other demonstration mechanisms like vision systems or data gloves, we suggest that directly acting on the fingers allows the human to detect the forces applied to the grasped object, and thus to achieve a better demonstration of the applied forces.
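As a rough illustration of the contact-to-posture mapping described above, the following Python sketch fits a single joint Gaussian over (contact signature, posture) pairs and predicts the posture as the conditional mean given a new contact value. It is a deliberately simplified stand-in and not the model used in this work: the scalar contact value, the function names `fit_joint_gaussian` and `predict_posture`, and the toy data are all assumptions made for illustration. A fingertip pressure target could be appended to the posture vector as one more output dimension.

```python
from statistics import fmean

def fit_joint_gaussian(contacts, postures):
    """Estimate the moments of a joint Gaussian over (contact, posture).

    contacts: list of scalar contact signatures (hypothetical encoding).
    postures: list of equal-length joint-angle vectors.
    """
    n = len(contacts)
    dims = len(postures[0])
    mu_x = fmean(contacts)
    mu_y = [fmean(p[d] for p in postures) for d in range(dims)]
    var_x = sum((x - mu_x) ** 2 for x in contacts) / n
    if var_x == 0.0:
        raise ValueError("contact signatures must vary to fit the mapping")
    cov_yx = [
        sum((x - mu_x) * (p[d] - mu_y[d]) for x, p in zip(contacts, postures)) / n
        for d in range(dims)
    ]
    return {"mu_x": mu_x, "mu_y": mu_y, "var_x": var_x, "cov_yx": cov_yx}

def predict_posture(model, contact):
    """Conditional mean E[posture | contact] of the fitted joint Gaussian."""
    offset = contact - model["mu_x"]
    return [
        m + (c / model["var_x"]) * offset
        for m, c in zip(model["mu_y"], model["cov_yx"])
    ]

# Toy data (assumed): two joint angles that vary linearly with the contact value.
toy_contacts = [0.0, 0.5, 1.0, 1.5]
toy_postures = [[10.0, 5.0], [20.0, 3.0], [30.0, 1.0], [40.0, -1.0]]
toy_model = fit_joint_gaussian(toy_contacts, toy_postures)
toy_prediction = predict_posture(toy_model, 1.0)
```

A single Gaussian can only capture a linear relation between contact and posture; a mixture of several such components would be one way to represent the non-linear mapping mentioned in the text.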
The dataset also is built iteratively, as the teacher interactively corrects the robot’s executions and thus refines the learned behavior. A key distinction in our work when compared to other iterative demonstration learning approaches is the focus on perturbations, which possibly take the execution far from what has been shown in the demonstration set. Our novel formulation for avoiding over-generalization also ensures that the robot’s response is always valid with respect to the example dataset. Our corrections furthermore aim not only to improve upon a demonstrated behavior, but also to explicitly show additional flexibility and adaptation beyond an executed pose.
Our approach is empirically validated on the iCub robot, building contact models for multiple objects of different shapes and sizes. The effectiveness of the iterative learning procedure is confirmed by measuring an increase across models in the joint ranges encompassed by a given model, as well as in the smoothness of the adaptation and the fingers’ ability to maintain contact with the object when faced with perturbations. Although we overlook the analytical force-closure constraint during model training, we show that the grasps learned using our approach do in fact satisfy the constraint of force-closure. The benefit of self-replay following teacher correction furthermore is demonstrated.
The following section provides an overview of the related literature that supports and motivates this work. Section 3 then formally introduces our approach to iteratively learn an adaptation model, along with the details of the control method for grasp adaptation. Hardware specifications and the experimental setup are detailed in Section 4, and results on the iCub humanoid in Section 5. Section 6 concludes with a summary and discussion of contributions, and directions for future work.
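The idea of keeping the response valid with respect to the example dataset, mentioned in the introduction above, can be pictured with a simple familiarity check. The Python sketch below is an assumption-laden illustration and not the formulation of this work: it treats the contact signature as a scalar, declares a reading familiar when it lies within `max_sigma` standard deviations of the training contacts, and otherwise holds the current posture. The names `make_gate` and `select_target`, the threshold, and the `predict` callable are hypothetical.

```python
from statistics import fmean, pstdev

def make_gate(train_contacts, max_sigma=2.0):
    """Build a test for whether a contact reading resembles the training data."""
    mu = fmean(train_contacts)
    sigma = pstdev(train_contacts)

    def is_familiar(contact):
        # Familiar means within max_sigma standard deviations of the mean.
        return abs(contact - mu) <= max_sigma * sigma

    return is_familiar

def select_target(contact, current_posture, predict, is_familiar):
    """Follow the model only for familiar readings; otherwise hold posture.

    contact may be None to represent a missing fingertip contact.
    """
    if contact is None or not is_familiar(contact):
        return list(current_posture)
    return predict(contact)

# Toy usage with an assumed predictor that maps a contact to two joint angles.
gate = make_gate([0.0, 0.5, 1.0, 1.5])
followed = select_target(1.0, [1.0, 2.0], lambda c: [10.0 * c, 20.0 * c], gate)
held = select_target(5.0, [1.0, 2.0], lambda c: [10.0 * c, 20.0 * c], gate)
```

Holding the current posture is only one possible fallback; the point of the sketch is that the decision to follow a perturbation is taken from the statistics of the demonstrations rather than from a hand-tuned rule.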
Conclusion
We have introduced a probabilistic approach for grasp adaptation, which learns a model to adapt hand posture solely based on the sensor signature of the contact. A statistical model able to predict a target hand posture and contact magnitude, given the current contact normal direction, is learned from a dataset built over multiple steps under human supervision. In particular, an initial hand posture is first demonstrated to the learner, then physically corrected by a human teacher, and finally the resulting sequence of postures is replayed by the learner as a form of self-demonstration. We contribute an empirical validation of our approach on the iCub robot. To provide tactile corrections, the teacher presses on the fingertips, thus exploiting partial compliance in the robot hand. Through this programming-by-demonstration methodology, we were able to teach a robot to perform the task by providing it not only with an implicit knowledge of the necessary kinematics for adaptation, but also with an intuitive notion of force. Our results confirmed successful grasp adaptation in response to changes in contact for multiple objects.
Our approach furthermore allows for the modification of a learned model, within two contexts. The first is to refine the model to improve adaptation performance, by repeating the correction–replay steps. The second is to reuse a model in the development of a new model, characterized by the contact signatures of a different object. In both cases the teacher provides tactile corrections as the learner executes with an existing model of the task, thus exploiting the fact that corrections are easier to provide when the learner is already doing part of the job of actuation on its own, and building upon domain knowledge already present within the robot system. Both successful model reuse and improved adaptation with additional rounds of model refinement have been shown.
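The correction–replay cycle summarized above can be sketched as a small dataset-building loop. The Python code below is a minimal illustration under stated assumptions, not the implementation used on the iCub: postures are plain lists of joint angles, `toy_sensor` is a hypothetical stand-in for the tactile reading taken while the robot holds a posture on its own, and the function names are invented for this example. The one property it is meant to show is that only pairs recorded during the learner's own replay enter the dataset, so the data are not influenced by the touch of the teacher.

```python
def replay(postures, sense_contact):
    """Learner re-executes the postures alone and records (contact, posture)."""
    return [(sense_contact(p), list(p)) for p in postures]

def correction_replay_round(dataset, corrected_postures, sense_contact):
    """One refinement round: append replayed pairs to the existing dataset."""
    return list(dataset) + replay(corrected_postures, sense_contact)

def toy_sensor(posture):
    # Hypothetical tactile reading: here simply the mean joint angle.
    return sum(posture) / len(posture)

# Round 1: postures resulting from the initial demonstration and correction.
first_round = [[10.0, 20.0], [12.0, 22.0]]
dataset = correction_replay_round([], first_round, toy_sensor)

# Round 2: the teacher corrects an execution of the learned model; the new
# postures are replayed and added, which is how refinement grows the dataset.
second_round = [[15.0, 25.0]]
dataset = correction_replay_round(dataset, second_round, toy_sensor)
```

Reusing a model for a different object would follow the same loop, with the second round recorded while the robot grasps the new object.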
Importantly, this iterative approach allowed us to progressively reduce the complexity of teaching the robot to perform a task that uses a large number of degrees of freedom. The probabilistic task model that we learn is formulated to take advantage of the statistical data encoding in several important contexts. The first is to avoid over-generalization within the input space, by handling unreliable contact signature signals that might result from a missing contact between the object and one or more fingers, for example. The second is to follow a perturbation only when the hand is in a posture that is near to what was seen within the demonstration dataset, and to otherwise counteract the perturbation in favor of maintaining posture stability. In short, the demonstration data thus is used not only to determine the reaction of the robot to environmental changes, but also to determine when grasps are infeasible or input signals are poor, by exploiting a probabilistic representation which captures the inherent variability in the data. A third advantage is to avoid the need for a detailed model of the hand kinematics and object geometry, by implicitly encapsulating this information into a model built from sensory data only. In contrast to model-based methods that require precise force sensing, actuation and a detailed environment model, which can be an impediment and impractical on many robotic platforms, our learning approach was capable of extracting the non-linearities inherent to such problems with a compact probabilistic model. Our approach thus contributes to the challenging area of object interaction and manipulation within the context of dynamic environments, when contact with the object is changing due to large perturbations.
Some limitations of this work include the following. The input space of our regression formulation is not sufficiently rich to disambiguate different hand postures that produce the same contact signature (i.e. contact normal direction ϕ), and so a model must be learned for each object individually. Also, the sensing capabilities of our robot platform have restricted our approach to the development of fingertip manipulation paradigms only. A tactile sensor with greater coverage or finer resolution would allow for manipulations that engage the entire hand. Improving this sensory capability would also allow our approach to be applicable on a larger set of objects. A tactile sensor with greater coverage and resolution also might provide additional object information useful for defining an input space that is sufficiently rich to disambiguate different hand postures that produce the same contact signature. To tackle this latter issue, enhancing our prediction method to select the best grasp from a multi-modal distribution is a very interesting research question that is left for future work. Since our approach implicitly encapsulates the hand kinematics and object information, it is unlikely that a learned model would generalize directly to the addition or removal of one or more fingers. Nevertheless, models developed under our approach have been shown to be capable of handling the loss of sensory feedback from a finger. We therefore expect that one round of correction should be sufficient to learn, from the reuse of an existing model, a new model for a smaller number of fingers. If instead one or more fingers are added to the effector, the prior knowledge of the existing model would allow the teacher to focus on correcting the additional fingers only.
There are many other promising directions in which to continue this work. The first is to integrate the adaptive contact models with our prior work, which incorporated tactile corrections on the iCub arms, with the result of a complete tactile teaching interface for learning full hand–arm manipulation behaviors interactively via demonstration.
One also might reason about the dynamics of the contact signatures, as they change over time. Integrating such information with the hand as well as arm posture adaptation would allow for increasingly complex responses to dynamic interactions with objects. For instance, our approach also assumes that the position of each finger on the object should remain roughly fixed throughout adaptation. Extending our work to incorporate finger repositioning techniques used for explicit object manipulation would certainly enhance the general applicability of our method. At a more technical level, a more advanced model of finger actuation could be incorporated, for example that takes cable friction into consideration. We expect that an improved actuation model would have a significant impact on the success of the learned behavior, as the performance of a grasping system depends heavily on the actuation controller. Similarly, though the use of an impedance controller would require knowledge of the dynamic parameters of the manipulator and very precise force sensing capabilities, with such a controller our approach could be applied on a larger variety of robots, especially on those that do not have the intrinsic mechanical slack that we took advantage of in order to provide corrections. A final area of interest would be to combine our grasp adaptation approach with a model-based approach that can optimally plan an initial grasp and also recover from a loss of contact produced by too strong a perturbation.