When a query is passed to multiple search engines, each search engine returns a ranked list of documents. Researchers have demonstrated that combining these results in the form of a “metasearch engine” significantly improves coverage and search effectiveness. This paper proposes a linear programming model for optimizing the combined ranked list produced by a given group of Web search engines for an issued query. A numerical illustration shows the advantages of the proposed method.
The World Wide Web (WWW) is a primary source of information in almost every area. Searching is a key activity on the Web, and the major search engines such as Google, Live, Yahoo, etc. are the most frequently used tools for locating specific information on the vast expanse of the WWW. Several attempts have been reported in the literature to compare, rank and measure the performance of major search engines (Diaz et al., 2005, Diaz et al., 2007, Emrouznejad, 2008, Emrouznejad and Amin, 2010 and Jansen and Spink, 2006). Many researchers have demonstrated that combining the results of multiple search engines in the form of a metasearch engine can significantly improve search effectiveness (Bar-Ilan et al., 2006, Spink et al., 2006, Spoerri, 2007 and Vaughan, 2004). Spink et al. (2006) studied the dispersion and overlap between the results of the major Web search engines. Spoerri (2007) investigated ranking effects in search engine results, while Vaughan (2004), Mowshowitz and Kawaguchi (2005) and Bar-Ilan et al. (2006) compared the results of several search engines. Comparisons of different search engines show that only 45% of the relevant results are likely to be located by a single search engine, and therefore combining the results of different search engines can significantly improve result quality (Keyhanipour, Moshiri, Kazemian, Piroozmand, & Lucas, 2007). All of the above studies concluded that search engines use different methods and that the same material may be ranked differently by each of them. Hence, without a sophisticated method for combining the results, it would be impossible to find the best related information for queries submitted to multiple search engines on the Web.
Consequently, finding relevant data on the Web in a timely and cost-effective way is a problem of wide interest, and many believe that employing a single general-purpose search engine for all data on the Web is unrealistic (Höchstötter and Lewandowski, 2009, Lempel and Moran, 2004, Meng et al., 2002 and Mowshowitz and Kawaguchi, 2005). Moreover, researchers have demonstrated that combining the results of different search engines produces a significant improvement in coverage and search effectiveness (Diaz et al., 2007 and Höchstötter and Lewandowski, 2009). A metasearch engine is a system that supports unified access to multiple existing Web search engines. When a query is passed to a metasearch engine, the query is sent to a set of search engines; the metasearch engine then extracts the results from the returned pages and aggregates them into a single ranked list (Diaz et al., 2005, Diaz et al., 2007 and Emrouznejad and Amin, 2010). Keyhanipour et al. (2007) and Emrouznejad (2008) used the ordered weighted averaging (OWA) operator to aggregate the results of Web search engines. No study reported in the literature optimizes the search engine results for a specific query using mathematical optimization theory. This paper aims to introduce a linear programming (LP) model to combine the results obtained in a metasearch. In summary, the method first ranks the documents returned by each search engine for a specific query and then uses state-of-the-art linear programming to combine these ranks and find the optimal rank for each document in the search engine results. The originality of this study is that, for the first time, the optimal results of Web search engines are analyzed using linear programming. In addition, the proposed model finds the score of each document retrieved from a search engine using an optimization model, without including a subjective procedure. The rest of this paper is organized as follows. Section 2 gives a brief explanation of linear programming in general.
Section 3 introduces an LP model for finding the optimal list of search engine results. This is followed by a numerical illustration in Section 4. A discussion of the results and the advantages of using the proposed model is given in Section 5. Finally, remarks and conclusions are given in Section 6.
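To make the aggregation step of a metasearch engine concrete, the following Python sketch fuses several ranked lists with an OWA-style operator, in the spirit of the OWA-based aggregation cited above. It is only an illustration under stated assumptions and is not the LP model proposed in this paper: the engine names, document identifiers, positional scoring rule and weight vector are all hypothetical.

```python
from typing import Dict, List, Sequence


def positional_scores(ranking: Sequence[str], depth: int) -> Dict[str, float]:
    """Score the top `depth` documents of one engine: rank 1 -> 1.0, rank `depth` -> 1/depth.

    This positional rule is an assumption made for illustration only.
    """
    return {doc: (depth - pos) / depth for pos, doc in enumerate(ranking[:depth])}


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights are applied to the values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def aggregate(rankings: Dict[str, List[str]], weights: Sequence[float], depth: int) -> List[str]:
    """Fuse the per-engine rankings into a single ranked list of documents."""
    if len(weights) != len(rankings):
        raise ValueError("one OWA weight is needed per search engine")
    per_engine = [positional_scores(r, depth) for r in rankings.values()]
    docs = {doc for scores in per_engine for doc in scores}
    # A document that an engine did not return gets a score of 0 from that engine.
    fused = {doc: owa([s.get(doc, 0.0) for s in per_engine], weights) for doc in docs}
    # Sort by fused score (descending); break ties alphabetically for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical results of three engines for one query (top 3 documents each).
rankings = {
    "engine_a": ["d1", "d2", "d3"],
    "engine_b": ["d2", "d1", "d4"],
    "engine_c": ["d2", "d3", "d1"],
}
weights = [0.5, 0.3, 0.2]  # hand-picked OWA weights, assumed for this sketch

merged = aggregate(rankings, weights, depth=3)
print(merged)  # ['d2', 'd1', 'd3', 'd4']
```

The weight vector in this sketch is chosen by hand, which is exactly the kind of subjective choice that an optimization-based scoring of documents is intended to avoid.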