ترتیب بندی مجدد معامله
|کد مقاله||سال انتشار||مقاله انگلیسی||ترجمه فارسی||تعداد کلمات|
|9280||2010||21 صفحه PDF||سفارش دهید||14135 کلمه|
Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)
Journal : Data & Knowledge Engineering, Volume 69, Issue 1, January 2010, Pages 29–49
Traditional workload management methods mainly focus on the current system status while information about the interaction between queued and running transactions is largely ignored. This paper proposes using transaction reordering, a workload management method that considers both the current system status and information about the interaction between queued and running transactions, to improve the transaction throughput in an RDBMS. Our main idea is to reorder the transaction sequence submitted to the RDBMS to minimize resource contention and to maximize resource sharing. The advantages of the transaction reordering method are demonstrated through experiments with three commercial RDBMSs.
Traditional workload management methods mainly focus on the current system status  and . For example, in a typical RDBMS, the load controller only allows a certain number of complex queries to run concurrently. Also, if the system is in the danger of thrashing (i.e., admitting more transactions for execution will lead to excessive overhead and severe performance degradation ), the load controller may choose not to run any new transactions. To support modern applications, users are continually requiring higher performance from RDBMSs. To meet this requirement, it is natural to ask whether or not we can use information about the interaction between queued and running transactions to improve the existing workload management methods. The answer to this question is “yes.” In fact, in many instances, it is possible to improve the throughput of an RDBMS through the utilization of such information. More specifically, we can improve the throughput of an RDBMS that is processing a sequence of transactions by reordering these transactions before submitting them for execution. This is due to opportunities for either resource sharing among multiple transactions (e.g., sharing data in the buffer pool, or perhaps even sharing intermediate computations common to several transactions) or lowering resource contention (e.g., avoiding lock conflicts). Information about the interaction between queued and running transactions is essential in capturing such opportunities. There are two main reasons why transaction reordering might be effective. The first is system independent – for example, it might be that a reordering of a transaction sequence truly eliminates some intrinsic lock conflicts between adjacent transactions and/or makes resource sharing possible. The second is system dependent – for example, a system may have a particular implementation of buffer management or concurrency control that renders one order of transactions superior to another. Even reordering to exploit system dependent opportunities is useful. Commercial RDBMSs are large, complex pieces of code, and changes in functionality can require a very long design-implement-test-release cycle. In many cases it may be far simpler to do some reordering of transactions outside of the RDBMS before submitting them to the RDBMS for execution than it would be to change, say, the concurrency control subsystem of the RDBMS. This is especially true for database application developers who are unable to change the database engine. This paper presents a general transaction reordering framework, which utilizes both the current system status and information about the interaction between queued and running transactions. The basic concept is simple and shown in Fig. 1. In an RDBMS, generally, at any time there are M1M1 transactions waiting in a FIFO transaction admission queue Q to be admitted to the system for execution, while another M2M2 transactions forming a set SrSr are currently running in the system. Such a transaction admission queue Q is commonly used for load control purpose  and .Those transactions in the transaction admission queue Q are the candidates for reordering. That is, the reorderer reorders the transactions waiting in Q so that the expected throughput of the reordered transaction sequence exceeds that of the original transaction sequence. In its reordering decisions, the reorderer exploits properties it deduces about the blocked transactions in Q and the properties it knows about the active transactions in SrSr. The improvement in overall system throughput is a function of (a) the number of factors considered for reordering transactions, and (b) the quality of the original transaction sequence. The more factors considered, the better quality the reordered transaction sequence has. However, the time spent on reordering cannot be unlimited, as we need to ensure that the reordering overhead is smaller than the benefit we gain in throughput. Also, we need to ensure acceptable transaction response time in the sense that no transaction is subject to starvation. There are a wide range of reordering algorithms that could be used. At the extremes, we could: (1) Do no analysis. Run all the transactions in the order that they arrive at the RDBMS. (2) Take a snapshot of the system. Analyze every possible order of the transactions and record the corresponding throughput. Pick the optimal order to run all the transactions. The first extreme may be undesirable if some amount of reordering can improve the throughput. The second extreme is obviously unrealistic due to the exponential analysis overhead. Our goal is to find a good compromise between these two extremes. That is, under the constraint of acceptable transaction response time, we want to maximize the difference between the gain in throughput and the reordering overhead. Reordering transactions requires CPU cycles. However, the increasing disparity between CPU and disk performance renders trading CPU cycles for disk I/Os more attractive as a way of improving DBMS performance . As shown in detail in Section 4 below, some forms of transaction reordering can be regarded as a way to trade CPU cycles for disk I/Os. Also, our experiments in three commercial RDBMSs show that with minor overhead, our transaction reordering method greatly improves the throughput of a targeted class of transactions while it has only a minor impact on the throughput of other classes of transactions. There are many resource allocation factors that can be considered for transaction reordering. In this paper, due to space constraints, we only consider two factors: lock conflicts (with an application to materialized view maintenance ) and buffer pool performance (with an application to exploiting synchronized scans ). Transaction reordering can be implemented in two places: (1) inside the RDBMS, or (2) outside the RDBMS as an add-on module. These two choices are shown in Fig. 2, where the dotted rectangle denotes the RDBMS. The inside-RDBMS choice affords more opportunities for reordering, as the reorderer is tightly integrated with the RDBMS and can use detailed information about the current state of the system. Also, certain reordering policy (such as the one described in Section 4 for exploiting synchronized scans) can only be implemented using the inside-RDBMS choice if it requires support of other modules in the RDBMS. The outside-RDBMS choice has the advantage of not needing to change the database engine and is especially suitable for database application developers. However, putting the reorderer outside the system means that it might have to treat the system as a black box and certain opportunities for reordering will be missed. It also requires an additional parsing of each transaction (once in the reorderer, once in the system). Section 5.3 gives an example of the outside-RDBMS choice.Reordering transactions itself is not a new idea. RDBMS users sometimes order their transactions themselves before submitting those transactions to the DBMS . However, to our knowledge the published literature has not considered a transaction reordering module that attempts to increase concurrency and share resource utilization. In related work, the operating system community has explored the approach of adding a module outside of a system to reorder web server requests based on the knowledge of OS buffer contents  and . The database community has proposed multi-query optimization  and  for resource sharing. The traditional multi-query optimization approach work in a batch fashion, as the optimizer needs to wait for a sufficient number of incoming queries with common sub-expressions to arrive, and then before executing them, changes their query plans to share common sub-expressions. Our transaction reordering method is dynamic and online: there is no need for either changing the query plans or waiting. This paper is an extended version of our previous workshop papers  and , and includes new material on deadlock probability computation in Section 3.3 and on the corresponding performance results in Section 5.1. The rest of this paper is organized as follows. Section 2 presents our general transaction reordering framework. Section 3 describes how to use lock conflict analysis as the reordering criterion. Section 4 discusses how to use buffer pool analysis as the reordering criterion. Section 5 investigates the performance of the transaction reordering method through an evaluation in three commercial RDBMSs. We conclude in Section 6.
نتیجه گیری انگلیسی
This paper proposes the use of transaction reordering to improve the performance of an RDBMS. The general idea underlying transaction reordering is that by combining knowledge about the currently running transactions and the transactions waiting to be run, a system can improve performance by selecting for running those transactions that fit best with those that are already running. In this paper we explored two different techniques, the first based upon reducing lock conflicts, the second upon increasing buffer pool hit rates. Our experiments with three commercial systems are promising, showing that this technique can significantly improve throughput for certain workloads. For continuous data loading, there are several interesting directions that we intend to pursue in future work: (1) We would like to give a fair comparison of the transaction response time in different methods: our reordering method, the naive immediate materialized view maintenance method (without reordering), and the deferred materialized view maintenance method. This is not a trivial task, since the overhead of load transactions in the deferred materialized view maintenance case (no materialized view maintenance is needed) is different from that in the immediate materialized view maintenance case. Also, the throughput of the naive immediate materialized view maintenance method (without reordering) is close to zero. (2) As mentioned at the end of Section 3, reordering transactions may slightly delay the processing of some load transactions while speeding up the processing of other load transactions. It is a challenge to compare the data staleness caused by these slight delays with the data staleness caused by deferred materialized view maintenance. In fact, even giving a precise definition of data staleness that is comparable in both cases will be a challenge. For example, by some metrics the staleness of the reordering method will be zero (the transactions that are delayed will be balanced by those that are run early.) Also, it is difficult to define a standard base on which data staleness can be measured, since without reordering, the throughput of the naive immediate materialized view maintenance method is close to zero. It is an open question whether immediate materialized view maintenance or deferred materialized view maintenance is more desirable. The answer to this question depends on how efficiently immediate materialized view maintenance can be done. Also, it depends on how well the semantic discrepancy between materialized views and base relations can be minimized for deferred materialized view maintenance. We hope that the techniques in this paper can contribute to the discussion in this regard. Developing and exploring ways to define and detect which transactions fit best is another rich area for future work. Such future work can either seek to exploit intrinsic properties of sequences of transactions, or it can seek to exploit performance problems that arise due to idiosyncrasies of specific commercial systems. Both approaches are interesting – as commercial RDBMSs continue to grow in complexity, the difficulty of making major changes to their functionality also grows, to the point where it is interesting in some cases to view them as artifacts to be studied rather than as programs to be modified. Transaction reordering research is one example of this approach.