Understanding the future of the energy-performance trade-off via DVFS in HPC environments
| Article code | Publication year | English article length |
|---|---|---|
| 25376 | 2012 | 12 pages (PDF) |
Publisher: Elsevier - Science Direct
Journal: Journal of Parallel and Distributed Computing, Volume 72, Issue 4, April 2012, Pages 579–590
English Abstract
DVFS is a ubiquitous technique for CPU power management in modern computing systems. Reducing processor frequency/voltage decreases CPU power consumption but increases execution time. In this paper, we analyze which application and platform characteristics are necessary for a successful energy-performance trade-off in large scale parallel applications. We present a model that gives an upper bound on the performance loss due to frequency scaling, based on the application's parallel efficiency. The model was validated with performance measurements of large scale parallel applications. We then track how application sensitivity to frequency scaling has evolved over the last decade across different cluster generations. Finally, we study how cluster power consumption characteristics, together with application sensitivity to frequency scaling, determine the energy effectiveness of the DVFS technique.
English Introduction
Energy efficiency has become one of the most critical issues in modern cluster design because of very high operating costs, reliability issues and environmental concerns. Since CPU power accounts for a large portion of total system power consumption [11], there has been considerable research on CPU power management. The majority of these works are based on DVFS (Dynamic Voltage and Frequency Scaling). Processors that support DVFS can run at a lower frequency/voltage setting, consuming less power. Unfortunately, lower frequency settings generally lead to longer execution times.

DVFS energy saving techniques in HPC (High Performance Computing) systems can be classified into two classes. The first class of approaches accepts a certain penalty in performance for reduced energy consumption [29], [14], [16] and [21]. The other class runs processors at a lower frequency only when they are not on the critical path, avoiding performance loss [18], [22], [28] and [20]. There are two main drawbacks of the second class of approaches: it can be applied only to specific applications (i.e. load imbalanced or communication intensive) and it involves fine grain DVFS use that may present a chip reliability issue. In this paper we target the first class of approaches, discussing the potential of DVFS for the energy-performance trade-off in current and future large scale HPC clusters.

Fig. 1 gives two power/execution time scenarios. The first one represents an application execution at the nominal CPU frequency, whilst the other assumes that the application runs at a reduced frequency f. In the second case the application takes longer, finishing at the moment T2. When running at the nominal frequency, the application execution ends at the moment T1; in this case the system consumes power P(f_max) until the moment T1 and P_idle from T1 until T2. The application running at the reduced frequency dissipates P(f) over the entire observed time interval. The mentioned values are the average system power consumption values over the observed intervals. Hence, the energy E(f_max) consumed in the first case is P(f_max)·T1 + P_idle·(T2 − T1). In the second case, the energy consumption E(f) is equal to P(f)·T2. Since CPU power consumption accounts for a high portion of the total system power (50% of system power under load [11]), the reduction in CPU power due to frequency scaling leads to a significant difference between P(f_max) and P(f) (P(f) < P(f_max)). Therefore, the second scenario, in which the application runs at the reduced frequency, has been considered to be more energy efficient (E(f) < E(f_max)).
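To make the comparison concrete, the following minimal Python sketch evaluates the two scenarios of Fig. 1 over the same observation interval [0, T2]. All numbers are illustrative assumptions, not measurements from the paper.

```python
# Illustrative values only (assumed, not taken from the paper).
P_fmax = 300.0   # average system power at the nominal frequency f_max (W)
P_f    = 240.0   # average system power at the reduced frequency f (W)
P_idle = 150.0   # average idle system power (W)
T1     = 100.0   # execution time at f_max (s)
T2     = 115.0   # execution time at the reduced frequency f (s)

# Scenario 1: run at f_max, then sit idle until the slower run would finish.
E_fmax = P_fmax * T1 + P_idle * (T2 - T1)

# Scenario 2: run at the reduced frequency f over the whole interval.
E_f = P_f * T2

print(f"E(f_max) = {E_fmax:.0f} J, E(f) = {E_f:.0f} J")
print("DVFS saves energy" if E_f < E_fmax else "DVFS does not save energy")
```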
New attitudes, contrary to the conventional wisdom that DVFS generally saves energy in spite of the performance loss, have emerged recently. For instance, Le Sueur et al. found that while DVFS was effective on older platforms, it actually increases the energy usage of sequential applications on the most recent platforms [19]. Running an application at a lower frequency/voltage results in significantly lower CPU power consumption; however, due to the increase in the application execution time, frequency reduction may lead to higher energy consumption. The critical aspects that must be considered when evaluating the energy saving potential of frequency scaling are the following:

• The increase in execution time for a given frequency reduction.
• The portion of total system power saved for a given frequency reduction.
• The ratio of idle to active system power.

A longer execution time at a lower frequency is not only a performance issue; it determines whether the reduction in frequency results in energy savings at all. The increase in execution time is not necessarily proportional to the reduction in frequency: how much frequency scaling affects the application execution time depends on non-CPU activity, i.e. memory accesses and communication latency. It is important to state that this work targets large scale parallel applications, whose performance loss highly depends on the portion of time spent in communication, as shown in Section 2.

Obviously, the amount by which total system power can be reduced is one of the parameters that determine the energy efficiency of the DVFS technique. CPU power reduction is limited by the constantly increasing portion of leakage power and by the voltage-scaling window. Furthermore, the amount by which total system power can be reduced depends on the CPU power fraction in total system power.

When evaluating an energy saving approach it is common to regard only the energy consumed during an application execution, even when different approaches do not have the same execution times. Miyoshi argued that the power consumed while idle must be taken into account if overall saving is the goal [24]. The system cannot simply be turned off when an application finishes; in fact, idle cluster power is still very high, accounting for about half of the power consumed under load [8]. Idle processors can be put into a low power mode, but this is still not the case for other system components. Future cluster design must radically decrease idle power in order to achieve energy-proportional computing [1]. Thus, two energy scenarios must be compared over the same time interval.

Our contributions in this paper are:

• We proposed and evaluated a model of the frequency scaling impact on execution time for large scale MPI applications.
• The frequency scaling impact on performance was measured and analyzed on a modern platform for real world applications with up to 712 processors.
• Application sensitivity to frequency scaling was compared across different cluster generations.
• A parametric analysis of DVFS energy efficiency was performed for large scale parallel applications.

Similarly to the findings of Le Sueur et al. for sequential applications, we find that the potential of the DVFS technique for parallel applications is diminishing as well, in spite of the fact that the communication time does not scale with frequency. Execution times of parallel applications running on newer systems tend to be more sensitive to frequency scaling than they were before. Though energy-proportional computing is still a research challenge, we show how an eventual reduction in idle power consumption will further diminish the opportunities for DVFS energy savings. In spite of the decreasing DVFS energy saving potential, the technique can still be used to reduce power consumption in power constrained systems in order to run more jobs simultaneously [7], resulting in the same or higher energy consumption. Because of increasing main memory power consumption, memory DVFS has been proposed recently [5] and [4]. Applying frequency/voltage scaling to both processors and the memory subsystem might present a solution for future clusters.
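Combining the three aspects listed earlier with the requirement that both scenarios be compared over the same time interval gives a simple normalized break-even test. The sketch below is only an illustration under the assumptions of Fig. 1 (average powers over the interval [0, T2]); the parameter values are assumed, not taken from the paper.

```python
def dvfs_saves_energy(time_dilation, power_ratio, idle_ratio):
    """Return True if running at the reduced frequency consumes less energy.

    time_dilation : T2 / T1, execution-time increase at the reduced frequency (>= 1)
    power_ratio   : P(f) / P(f_max), average system power at reduced vs. nominal frequency
    idle_ratio    : P_idle / P(f_max), idle power relative to power under load

    Dividing E(f) < E(f_max) by P(f_max) * T1 gives the normalized condition:
        power_ratio * time_dilation < 1 + idle_ratio * (time_dilation - 1)
    """
    return power_ratio * time_dilation < 1 + idle_ratio * (time_dilation - 1)

# Assumed example: a 15% slowdown and a 10% reduction in average system power.
# A high idle ratio (about half of the load power, as reported for current
# clusters) makes the trade-off pay off; an energy-proportional cluster
# (idle_ratio close to 0) does not.
for idle_ratio in (0.5, 0.0):
    saves = dvfs_saves_energy(time_dilation=1.15, power_ratio=0.90, idle_ratio=idle_ratio)
    print(f"idle/load power ratio = {idle_ratio}: DVFS saves energy? {saves}")
```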
The rest of the paper is organized as follows. The next section explains how frequency scaling affects the application execution time. Section 3 gives a parametric analysis of the application and platform parameters that determine whether a reduction in CPU frequency leads to a more energy efficient execution. Section 4 presents related work. Our conclusions are given in Section 5.
English Conclusion
In this paper, the potential of DVFS in current and future HPC systems was analyzed. We discussed performance loss due to frequency scaling as one of the crucial aspects of DVFS energy efficiency. Large scale parallel applications are less sensitive to frequency scaling than sequential applications, since frequency scaling does not affect the time spent in communication. We explained how the application's parallel efficiency bounds the performance loss. Accordingly, running large parallel applications at a reduced frequency might seem promising. On the other hand, new architectures tend to show more sensitivity to frequency scaling because of fewer memory stalls and shorter communication times. Furthermore, energy savings via DVFS depend on the cluster power consumption characteristics: the idle power consumption and the fraction of CPU power in the total system power. We showed that achieving energy-proportional computing would seriously limit the use of DVFS for an energy-performance trade-off. Reducing the CPU power fraction to 30% or less would have a similar effect.
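As an illustration of how parallel efficiency can bound the slowdown, the sketch below assumes a simple decomposition in which only the computation part of the run stretches with frequency while communication time stays fixed. This is a hedged approximation in the spirit of the discussion above, not the paper's exact model (that model is given in Section 2 of the paper), and the example values are assumed.

```python
def slowdown_upper_bound(freq_ratio, parallel_efficiency):
    """Illustrative upper bound on the normalized execution time T(f) / T(f_max).

    Assumption (not the paper's exact model): only the computation part of the
    run, approximated here by the parallel efficiency, stretches by f_max / f,
    while communication/wait time is insensitive to the CPU frequency:

        T(f) / T(f_max) <= parallel_efficiency * freq_ratio + (1 - parallel_efficiency)

    freq_ratio          : f_max / f (>= 1)
    parallel_efficiency : used here as a proxy for the frequency-sensitive
                          fraction of the execution time (between 0 and 1)
    """
    return parallel_efficiency * freq_ratio + (1 - parallel_efficiency)

# Scaling a CPU from 2.5 GHz down to 2.0 GHz (freq_ratio = 1.25), assumed values:
print(slowdown_upper_bound(1.25, 0.9))   # well-scaling application: at most ~22.5% slower
print(slowdown_upper_bound(1.25, 0.5))   # communication-heavy application: at most ~12.5% slower
```

With these assumed numbers, the run that spends more time in communication is the one that tolerates frequency scaling better, which is the qualitative point made in the conclusion.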