Multi-domain collaborative exploration mechanisms for search expansion in an agent-based filtering framework
Article code | Publication year | Length of English article |
---|---|---|
20088 | 2007 | 14 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Electronic Commerce Research and Applications, Volume 6, Issue 4, Winter 2007, Pages 399–412
English abstract
Novice users often lack the domain knowledge needed to create good queries for searching information online. To alleviate this, exploration techniques have been used to increase the diversity of search results, so that potentially relevant items are returned alongside those explicitly requested. Most existing approaches, such as collaborative filtering, do not allow the level of exploration to be controlled. Consequently, the search results can be very different from what is expected. We propose an exploration strategy that performs intelligent query processing by first searching for usable old queries and then utilising them to adapt the current query, with the aim of making the adapted query more relevant to the user’s areas of interest. We applied the proposed strategy in the implementation of a personal information assistant (PIA), which was set up for user evaluation for 3 months. The experimental results showed that the proposed exploration method outperformed collaborative filtering and the mutation and crossover methods by around 25% in terms of the elimination of off-topic results.
English introduction
Unless it is clearly understood what kind of information is being searched for, the chance that today’s search engines will offer the “right” recommendations is low. The precise specification of information needs is imperative, but it is difficult for ordinary users to create queries that are both compact and sufficiently descriptive. The underlying reasons include the following: (1) Users often need to articulate their thoughts related to their current interests while specifying their queries. (2) They also need to be familiar with query syntax, which can be complex when the information sought is very specific. (3) They often need to possess at least some general knowledge about the information to be retrieved.
Although a number of online portals provide information search services, most of these problems hinder users from exploring the information space effectively. For example, it makes no sense to search a database of books for an author’s family details, as only “bulk” author names can be found there. The lack of exploratory query processing in most of today’s search engines is the main reason why queries return too many or too few results [2]. In addition, the search results returned are often unrelated to the information needs that users had in mind.
Users typically do not consider that a failure to get the expected results may be due to their poorly formed queries. As a result, they may lose confidence in a particular search engine and switch to another one that better satisfies their information needs. Since users’ habits regarding the creation of queries cannot easily be influenced, intelligent processing techniques delivered by a filtering engine are necessary.
One such technique is called intelligent query answering [3], where the underlying intent of a query is analysed in order to provide not only exactly what is being asked for, but also to explore potentially relevant information areas. A user who is looking for a nice holiday destination might also want information about car rental, even though the user has not asked for it. An intelligent query system can deduce this, for example by knowing that car rental information is the extra information most frequently requested together with a particular holiday offer. The provided results will then be much more diverse and have a better chance of satisfying both the stated and the implicit needs of the user who created the query.
Even though providing extra information may be useful, providing too much will lower the precision of the results and may have a negative effect. A user can be overwhelmed by irrelevant results and, after this bad experience, decide not to use the search engine in the future. Thus, the level of exploration should be carefully controlled. Conventional exploration techniques such as collaborative filtering normally ignore the extent of exploration, which leads to the delivery of results unrelated to the actual information needs. Although recommending items that are liked by similar users sounds logical, these items will have low usability if they are too far from the context of the query. For example, weather information about cities could be welcome when sightseeing recommendations are requested, but not when prices for travel packages are being compared.
The deployed exploration mechanisms should be able to control the automatic adaptation of the specified information needs so that they stay within users’ preferences. A promising technique for performing controlled exploration is the automatic transformation of an actual query based on past queries. Novice users can benefit from the past queries posted by like-minded experts.
To do so, one first needs to identify past queries that are related to the currently posted one, and then utilise the extended context formed by those past queries to achieve controlled exploration. For example, a query about tools for designing multi-agent systems might lack important keywords that can be found in other queries previously posted by domain experts. Such an application of query adaptation to achieve exploration is fully transparent, in the sense that a query is modified without the user being aware of it. This transparency is essential for integrating exploration techniques into a multi-agent framework that contains different filtering agents.
In the remaining sections, after critically reviewing the drawbacks of other attempts to provide exploration, one scenario will be used to illustrate the exploration challenge. Our core contribution is presented in Section 4. It shows how similar past queries can be found, and how they can be used to carefully adapt the current query, both by adding new attributes that might be important but are missing and by adapting the existing attributes so that they have more realistic values. We conclude with implementation details and experimental results.
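The adaptation step described in the introduction (adding missing attributes and adjusting existing ones based on similar past queries) can be illustrated with a minimal sketch. It assumes that queries are dictionaries mapping attribute names to numeric values, that the similar past queries have already been selected, and that a hypothetical `exploration_rate` in [0, 1] controls how far the current query moves toward the neighbourhood. None of these names or choices come from the paper.

```python
from statistics import mean


def adapt_query(query, neighbours, exploration_rate=0.3):
    """Blend a query with the mean attribute values of similar past queries.

    query: dict of attribute name -> numeric value (assumed representation)
    neighbours: list of such dicts, already selected as similar past queries
    exploration_rate: hypothetical knob in [0, 1]; 0 leaves the query unchanged
    """
    if not 0.0 <= exploration_rate <= 1.0:
        raise ValueError("exploration_rate must be in [0, 1]")

    adapted = dict(query)
    # Collect every attribute that appears in at least one past query.
    attributes = {name for past in neighbours for name in past}

    for name in attributes:
        expected = mean(past[name] for past in neighbours if name in past)
        if name in query:
            # Existing attribute: move its value toward the neighbourhood mean.
            adapted[name] = (
                (1 - exploration_rate) * query[name] + exploration_rate * expected
            )
        elif exploration_rate > 0:
            # Missing attribute: borrow the neighbourhood's expected value.
            adapted[name] = expected
    return adapted
```

With `exploration_rate=0` the query is returned unchanged, which is one simple way to make the level of exploration explicit and controllable.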
English conclusion
A filtering framework without exploration will tend to recommend only similar items, without providing much in the way of serendipitous discovery. The goal of this paper was to provide solutions to the challenges in exploration caused by the unpredictable context of the queries. We proposed an exploration approach that first selects the old queries that are related, and then mines important features from them that can improve the actual query. We adopted a weighted combination of Euclidean distance and the Jaccard index, and found it to be around 15% more accurate than the individual distance functions in measuring the discrepancy between queries. The ability of the proposed approach to carefully explore related information areas was demonstrated by comparing it with collaborative filtering and with the mutation and crossover approaches. It outperformed these other approaches by delivering around 25% fewer off-topic results.
In spite of the superior performance of our exploration approach, we note two limitations. The first relates to deciding which old queries are good and useful enough to be included in the similarity neighbourhood. The current realisation in Section 4 specifies that the actual query alone guides the selection process. The old queries chosen by such a selection policy will be very similar to the actual one. Unfortunately, their high similarity may reduce their usefulness for improving the actual query, as it is possible that nothing new can be learned from them. A future improvement should modify the selection policy to ensure greater diversity within the similarity neighbourhood. One algorithm that guarantees diversity is known as the aggregate creation of a neighbourhood. Its basic idea is to use not only the actual query, but also the already chosen past queries, when the unprocessed past queries are evaluated. Such aggregate formation of a neighbourhood will provide more attributes that can substitute for the missing attributes in the actual query.
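The weighted combination of Euclidean distance and the Jaccard index mentioned above can be sketched as follows. The source does not give the exact formula, so everything here is an assumption for illustration: queries are dictionaries of numeric attributes, the Euclidean part is computed over shared attributes and squashed into [0, 1) so that both terms are on a comparable scale, the Jaccard part compares the sets of attribute names, and `alpha` is a hypothetical mixing weight.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance over the attributes that both queries specify."""
    shared = a.keys() & b.keys()
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


def jaccard_distance(a, b):
    """One minus the Jaccard index of the two attribute-name sets."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0  # two empty queries are treated as identical
    return 1.0 - len(a.keys() & b.keys()) / len(union)


def combined_distance(a, b, alpha=0.5):
    """Weighted mix of value discrepancy and attribute-set discrepancy.

    alpha is a hypothetical weight in [0, 1]; the d / (1 + d) squashing is an
    assumption made here, not a detail taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    d = euclidean_distance(a, b)
    return alpha * d / (1.0 + d) + (1.0 - alpha) * jaccard_distance(a, b)
```

The Euclidean term captures how far apart the values of common attributes are, while the Jaccard term captures how much the two queries differ in which attributes they specify at all.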
The second limitation concerns the representation of the formed neighbourhood, which is obtained by computing the expected values and weights of attributes. Though such a representation is efficient, it reduces flexibility when processing different attributes. Where outliers are present, we expect that the computed expected values and weights will not be as representative as desired. For such attributes, a more suitable summarisation can be based on a medoids principle [3], where the most important values and weights are used. Unfortunately, the drawback of this representation lies in its increased complexity and its impact on the usage of the summarised information.
Our future work will concentrate on finding ways of handling these limitations. Additionally, the combined distance function might become self-adjusting in determining the optimal weights for a particular query. The value of the exploration rate can be personalised by learning from the habits of users. As the number of processed queries and the cost of adaptation increase, a critical topic for further investigation will be how effective clustering techniques should be applied to past queries. Though the results presented here are an initial step towards more intelligent exploration in multi-agent filtering frameworks, we advocate additional efforts to lay a more solid foundation for the comprehensive study of more intelligent filtering services.
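The effect of outliers on an expected-value summary, and why a medoid-style summary can be more representative, can be seen in a small sketch. The source does not detail its medoids principle, so this uses the standard notion of a medoid (the observed value with the smallest total distance to all other values); the attribute values are made up for illustration.

```python
from statistics import mean


def medoid(values):
    """Return the observed value with the smallest total distance to the rest."""
    if not values:
        raise ValueError("values must not be empty")
    return min(values, key=lambda c: sum(abs(c - v) for v in values))


# Hypothetical price values from a neighbourhood of past queries,
# with one outlier (1000.0).
prices = [100.0, 110.0, 120.0, 130.0, 1000.0]

expected_value = mean(prices)  # pulled upwards by the outlier
representative = medoid(prices)  # stays among the typical values
```

Here the mean is 292.0, a value that no past query comes close to, while the medoid is 120.0. The trade-off noted above is visible too: the medoid needs pairwise comparisons, so it costs more to compute than a running mean.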