دانلود مقاله ISI انگلیسی شماره 22020
ترجمه فارسی عنوان مقاله

فهرست زمان بندی چند هدفی از برنامه های جریان کار در زیرساخت های محاسبات توزیع شده

عنوان انگلیسی
Multi-objective list scheduling of workflow applications in distributed computing infrastructures
کد مقاله سال انتشار تعداد صفحات مقاله انگلیسی
22020 2014 14 صفحه PDF
منبع

Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)

Journal : Journal of Parallel and Distributed Computing, Volume 74, Issue 3, March 2014, Pages 2152–2165

ترجمه کلمات کلیدی
برنامه ریزی چند هدفی - گردش علمی - زیرساخت های محاسبات توزیع شده
کلمات کلیدی انگلیسی
Multi-objective scheduling, Scientific workflows, Distributed computing infrastructures
پیش نمایش مقاله
پیش نمایش مقاله  فهرست زمان بندی چند هدفی از برنامه های جریان کار در زیرساخت های محاسبات توزیع شده

چکیده انگلیسی

Executing large-scale applications in distributed computing infrastructures (DCI), for example modern Cloud environments, involves optimization of several conflicting objectives such as makespan, reliability, energy, or economic cost. Despite this trend, scheduling in heterogeneous DCIs has been traditionally approached as a single or bi-criteria optimization problem. In this paper, we propose a generic multi-objective optimization framework supported by a list scheduling heuristic for scientific workflows in heterogeneous DCIs. The algorithm approximates the optimal solution by considering user-specified constraints on objectives in a dual strategy: maximizing the distance to the user’s constraints for dominant solutions and minimizing it otherwise. We instantiate the framework and algorithm for a four-objective case study comprising makespan, economic cost, energy consumption, and reliability as optimization goals. We implemented our method as part of the ASKALON environment (Fahringer et al., 2007) for Grid and Cloud computing and demonstrate through extensive real and synthetic simulation experiments that our algorithm outperforms related bi-criteria heuristics while meeting the user constraints most of the time.

مقدمه انگلیسی

Scientific workflows emerged in the last decade as an attractive paradigm for programming large-scale applications in heterogeneous distributed computing infrastructures (DCI) such as Grids and Clouds. In this context, scheduling heterogeneous tasks including workflows is one of the traditional challenges in parallel and distributed computing. If we focus on the execution time also referred as makespan, the problem has been shown to be NP-complete, hence no polynomial algorithm for solving it exists (assuming View the MathML sourceP≠NP). While this has been for decades the only optimization parameter of interest, modern DCIs are bringing along nowadays other parameters of equal importance such as reliability and economic cost (in Clouds), while recently energy consumption raises ever greater concerns too. Real-world scenarios are therefore confronted with a multi-objective optimization problem where many of these objectives are conflicting. For example, fast processors are typically rented by Cloud providers at higher prices, consume more energy, and may become unreliable due to the large user contention. In these scenarios, there is no single solution that optimizes all objectives, but a set of tradeoff solutions known as the Pareto frontier. Until today, traditional scheduling researches [19] targeted makespan as the only optimization goal, while several isolated efforts addressed the problem by considering at most two objectives [17] and [21]. Although scheduling problems involve today multi-objective optimizations [14], a generic scheduling algorithm and framework for optimizing multiple conflicting objectives is still missing. Due to the NP-hard complexity of the makespan scheduling problem, practical approaches need to resort on heuristics to approximate the optimal solutions also in the multi-objective case. In this paper, we present a polynomial multi-objective algorithm for scientific workflow applications in heterogeneous DCIs that brings a two-fold novelty. First, we propose a general framework based on the multi-objective optimization theory for static scheduling of scientific workflows in DCIs. We analyze and classify different objectives with respect to their impact on the optimization process and present a four-objective case study comprising makespan, economic cost, energy consumption, and reliability. Second, we support our framework through a list scheduling heuristic algorithm capable of dealing with more than two objectives (as restricted by related works). The algorithm uses constraints specified by the user for each objective and approximates the optimal solution by applying a dual strategy: maximizing the distance to the constraint vector for dominant solutions and minimizing it otherwise. The paper is organized as the follows. In Section 2 we review the related work, followed by a short background in multi-objective optimization theory in Section 3. In Section 4, we formalize the abstract application, objective, and platform models underneath our approach. In Section 5, we instantiate this model by a case study comprising makespan, cost, reliability and energy as objectives. Section 6 presents a new multi-objective list scheduling heuristic for workflow applications, illustrated through a small example in Section 7. We extensively evaluate our method in Section 8 for real-world and synthetic workflows and conclude in Section 9.

نتیجه گیری انگلیسی

In this paper, we proposed a generic multi-objective scheduling heuristic for workflow applications in heterogeneous DCIs. We tailored this generic model for a four-objective case study comprising makespan, economic cost, energy consumption and reliability. Considering user-specified constraint for objectives, the algorithm follows a dual strategy based on Pareto dominance relationships: maximize the distance to the constraint vector for dominant solutions and minimize it otherwise. We conducted extensive evaluation for real-world and synthetic workflow applications and observed that, in most experiments, our solutions dominate the user-specified constraints. Compared to the Pareto set estimated by SPEA2* and NSGAII*, our algorithm is considerably faster and delivers better solutions. Extending the MOLS algorithm for dynamic scenarios considering model inaccuracies for both application and DCI is an interesting future direction for this work