محاسبات با داده های غیر جبری: مدیریت زمان صبر برای سیستم های نقطه به نقطه
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|6568||2008||14 صفحه PDF||سفارش دهید||محاسبه نشده|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Computer Communications, Volume 31, Issue 3, 25 February 2008, Pages 629–642
One of the unusual challenges faced by peer-to-peer algorithms as opposed to classical distributed algorithms is that these have to compute with data non-determinism where there is no guarantee that data from a particular node will be delivered in time or delivered at all. Each distributed component not only has to work in such environment, but also has to decide how long to wait for peers to respond. A large wait can inordinately delay entire computation. A smaller wait can reduce the coverage of the algorithm and drastically degenerate the quality of final result produced. In this paper, we investigate a set of wait management schemas which tries to minimize the overall completion time and maximize the quality of solution. We present detail analyses of the schemas for various scenarios including large power law networks and forms of non-responsiveness. The work promises dramatic improvement in peer-to-peer search and other computations.
P2P has emerged as an important networking paradigm in recent years. The fabric of peer-to-peer systems is founded on various distributed algorithms such as searching, packet routing, scheduling, shortest path, k-best path, network connectivity, spanning tree, etc. computed over various overlay networks. An interesting question if it is distinct from classical parallel and distributed computing systems (PDCS) paradigm? It seems one of the distinction lies in the treatment of unreliability of distributed nodes. Unlike classical distributed systems nodes and client servers programs built on them, it is primarily based on peers who can voluntarily join and leave the network. Peers can volunteer or deny letting others use its resources. Thus the distributed algorithms face scenarios not common in classical network such as dead-beat node, unreliable or busy peer, missing messages, authentication failure, selective cooperation/non-cooperation etc. We call such a computing environment made out of the distributed moody and autonomous entity as moody and autonomous environment (MAE). It seems MAE offers a new challenge that is neither solved by the data-deterministic classical distributed systems or by the flavor of reliability assured by classical communication primitives (such as TCP, or message passing). It is true that some primitives such as TCP make an extra effort to recover the lost messages if possible. Thus, classical distributed systems have safely assumed perfect data arrival in their design and corresponding performance characterization. There now clearly seems to be a gap. Peer-to-peer systems nodes have to abandon the notion of perfect data and thus perfect computing. We call it data non-determinism. The situation is unavoidable especially when the system consists of thousands or millions of autonomous entities – such as peer-to-peer systems or real life social systems. It gives rise to some interesting new problems. What if even after the recovery effort the communication returns unsuccessfully? Currently, each individual algorithm needs to be separately programmed how an entity in MAE would proceed if the network (such as TCP or UDP) does not return the value for which it has been waiting for. A data-deterministic algorithm can be quite efficient in a classical distributed system environment but it can have drastically reduced performance (or even may not work at all) – when faced with a MAE. The distributed component in MAE needs to decide some new issues such as how long it should wait for data in a particular step? If it waits longer whether the solution provided by the algorithm will improve or not? It has to address how the quality of solution would be affected if some peers use data offered by some other peers received from previous exploration of the network. Painstakingly, each of the distributed algorithms needs to be custom built for the solutions to moodiness’. In this paper, we study several strategies for wait management in a peer-to-peer system so that distributed algorithms (such as search or any other computation) can effectively operate on MAE environment in non-data deterministic situation.
نتیجه گیری انگلیسی
One of the unusual challenges faced by peer-to-peer algorithms as opposed to classical distributed algorithms are that these have to work with data non-determinism where there is no guarantee that data from a particular node will be delivered in time or delivered at all. Each distributed component not only has to work in such environment, but also has to decide how long to wait for peers to respond. A large wait can inordinately delay the entire computation. A smaller wait can reduce the coverage of the algorithm and drastically degenerate the quality of final result produced. The paper brings attention to an important fact that the performance of distributed algorithms design for conventional ‘ideal’ networks drastically deteriorates in MAE environment. To our knowledge, this is one of the first works to investigate this particular aspect of the paradigm. Wait time management for peers is the natural technique to address the unreliable responsiveness posed by an MAE entity. In this paper, we investigate a set of wait management schemas which tries to minimize the overall completion time and maximize the quality of solution. We present detail analyses of the schemas under various scenarios; including varying P2P like power law network topologies, various moodiness characteristics, and under varying application objectives. It seems dramatic performance improvement is possible in various search and other maintenance algorithms used in peer-to-peer systems. Clearly, it is also possible to design newer schemas. For example, in this paper we did not discuss hybrid schemas: (such as Hybrid of MCD between PCD or hybrid between static and PLD. The later is more realistic for P2P applications. It is interesting to note that the comparative difference between the Oracle schema and the observed best performing schemas presented shows there are still room for improvement. Even the best schemas use various constants and factors. In our experiment, we selected those after extensively looking into performance of individual schema in various scenarios – and reported the best one found so far for each. Clearly, more work can be carried out to develop provisions which can make at-least some of the schema parameters adaptive to network conditions.