بهره برداری در مقابل اکتشاف: انتخاب یک منبع در محیط اطلاعات ناقص
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|20039||2004||18 صفحه PDF||سفارش دهید||محاسبه نشده|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Decision Support Systems, Volume 38, Issue 1, October 2004, Pages 1–18
An agent operating in the real world must often choose between maximizing its expected utility according to its current knowledge about the world and trying to learn more about the world, since this may improve its future gains. This problem is known as the trade-off between exploitation and exploration. In this research, we consider this problem in the context of electronic commerce. An agent intends to buy a particular product (goods or service). There are several potential suppliers of this product, but they differ in their quality and in the price charged. The buyer cannot observe the average quality of each product, but he has some knowledge about the quality of previous goods purchased from the suppliers. On the one hand, the buyer is motivated to buy the goods from the supplier with the highest expected product quality, deducting the product price. However, when buying from a lesser known supplier, the buyer can learn about its quality and this can help him in the future, when he will purchase more products of this type. We show the similarity of the suppliers problem to the k-armed bandit problem, and we suggest solving the suppliers problem by evaluating Gittins indices and choosing the supplier with the optimal index. We demonstrate how Gittins indices are calculated in real world situations, where deals of different magnitudes may exist, and where product prices may vary. Finally, we consider the existence of suppliers with no history and suggest how the original Gittins indices can be adapted in order to consider this extension.
An agent in an electronic market often has to choose among several suppliers of a product or a service. The suppliers differ in the mean quality of the products they sell. The agent does not know with certainty the mean quality of each supplier, but he has some knowledge about qualities of previous items sold by each suppliers. We consider situations where the agent needs to repeatedly buy items and he would like to buy high quality items for the lowest possible prices. The problem of unknown quality of items appears also in traditional markets, but is much more pronounced in e-commerce since the buyer cannot view the product before purchasing it and cannot form a personal impression of the supplier  and . When automated agents represent the buyer, this problem is even more pronounced. Thus, there is considerable uncertainty about the quality of the goods, delivery time and the reliability of the supplier. The price of the goods cannot identify its quality, since unknown suppliers may exist, which may sell goods of high quality, but charge low prices since they are not known. Consider, for example, the market for books. The largest online supplier of books is Amazon , but there are other online suppliers that may sell the same books at a reduced price and may also provide a better quality product, if quality is measured as delivery time, since a smaller company has fewer transactions, while the well-known supplier may be backlogged with orders, and delivery may sometimes be delayed. Similarly, empirical tests  show that price differences exist in the electronic market of airline ticket offerings. In situations where the unknown supplier may sell superior goods at a lower price, the agents have to use an intelligent strategy in order to choose the most beneficial deals. Under these circumstances, the history of past transactions with this supplier is crucial information in evaluating the quality of the products or services it sells. On the one hand, the best option for the agent is to chose the supplier which maximizes his expected utility, i.e., with a low price and high average quality. However, there may be situations in which the agent will prefer to buy from an unknown supplier, in order to explore the quality of this supplier, and this may provide future benefits by buying from better suppliers. The dilemma of the buyer is whether to choose the best known supplier or to try other suppliers in order to learn about their quality, in order to improve future gains. This dilemma is called in the literature the trade-off between exploitation and exploration. For this kind of problems, Gittins  suggests a method for calculating an index for each alternative, which considers the expected gains from choosing it, taking into account future gains from obtaining additional information. Gittins proves that the alternative with the highest calculated index is the optimal choice. However, there are several adaptations that are necessary in order to apply Gittins indices to real world situations such as those in the above examples. In this paper, we consider the problem of applying the Gittins technique to real problems of choosing among alternatives and we demonstrate it on the problem of choosing an online supplier. The paper is organized as follows. In Section 2, we discuss related work and, in Section 3, the formal model is presented. A theoretical background about Gittins indices is presented in Section 3.1 and how to solve the supplier problem using Gittins indices is explained in Section 3.2. Finally, in Section 4, we provide conclusions and suggestions for future extensions.
نتیجه گیری انگلیسی
In this research, we discuss the issue of solving problems which involve a trade-off between exploration and exploitation. We suggest using deterministic methods for solving these problems, in cases where the problem can be described formally, and some required properties about the parameters exist. To illustrate this idea, we introduce the problem of an agent deciding from which supplier to buy certain goods in the context of electronic commerce, where each supplier provides goods with different expected qualities, and the expected quality is unknown to the agent. The agent only has information about the quality of goods produced by the suppliers in the past. The agent has to decide from which supplier to buy, and whether to buy from the best known supplier, or to buy from an unknown supplier in order to learn about it. We show that this problem is a special case of the “multi-armed bandit” problem  and suggest using Gittins allocation indices  for solving it. The index of a given supplier will indicate its value and the supplier with the highest index will be chosen. We first consider variations where the price of each supplier is constant and the size of the goods sold is constant for all goods sold, and we proceed with additional details which appear in a real world variation of the suppliers problem. First, we consider the fact that the agent is likely to buy the goods in time periods which are not known in advance and we suggest how Gittins indices for this case can be calculated. Second, we consider the case of suppliers with different prices and suggest how the prices will be included in the indices calculation. Then we consider situations where the deals differ in their size. As the size of a deal increases, the value of the present deal becomes more important compared to the value of future deals. We suggest how to take into account the size of the deal when calculating the Gittins indices. We also consider the decision problem when there are new suppliers, with an unknown history, or with a history of a single event. We suggest techniques for enabling the choice of new suppliers. Finally, we consider a situation where the item's price of each supplier varies over time. In this case, the superiority of a supplier decreases over time and is terminated after the final time. We suggest how to solve the suppliers problem when considering changing prices.