دانلود مقاله ISI انگلیسی شماره 6123
ترجمه فارسی عنوان مقاله

جایگذاری ذخیره عمومی برای واکنش به حداقل رساندن زمان در دیسک های موازی

عنوان انگلیسی
General store placement for response time minimization in parallel disks
کد مقاله سال انتشار تعداد صفحات مقاله انگلیسی
6123 2007 15 صفحه PDF
منبع

Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)

Journal : Journal of Parallel and Distributed Computing, Volume 67, Issue 12, December 2007, Pages 1286–1300

ترجمه کلمات کلیدی
- جایگذاری ذخیره داده - زمان پاسخ به حداقل رساندن - دیسک های موازی - مدیریت حافظه - جایگذاری فایل -
کلمات کلیدی انگلیسی
Data store placement,Response time minimization,Parallel disks,Storage management,File placement,
پیش نمایش مقاله
پیش نمایش مقاله  جایگذاری ذخیره عمومی برای واکنش به حداقل رساندن زمان در دیسک های موازی

چکیده انگلیسی

We investigate the placement of N enterprise data-stores (e.g., database tables, application data) across an array of disks with the aim of minimizing the response time averaged over all served requests, while balancing the load evenly across all the disks in the parallel disk array. Incorporating the non-FCFS serving discipline and non-work-conserving nature of disk drives in formulation of the placement problem is difficult and current placement strategies do not take them into account. We present a novel formulation of the placement problem to incorporate these crucial features and identify the runlength of requests accessing a store as the most important criterion for placing the stores. We use these insights to design a fast (running time of NlogNNlogN) placement algorithm that is optimal under the assumption that transfer times are small. Further, we develop polynomial-time extensions of the algorithm that minimize response time even if transfer times are large, while balancing the loads across the disks. Comprehensive experimental studies establish the efficacy of the proposed algorithm under a wide variety of workloads with the proposed algorithm reducing the response time for real storage traces by more than a factor of 2 under heterogeneous workload scenarios.

مقدمه انگلیسی

Parallel I/O systems have received a lot of attention [11,17,22] in the recent past for their ability to provide fast reliable access, while supporting high transfer rates for dedicated supercomputing applications as well as diverse enterprise applications. Disk arrays partition data across multiple disks in a storage pool and provide concurrent access to multiple applications at the same time. Moreover, a single application with huge data requirements may partition its data into stores and place them across multiple disks and use this parallelism to alleviate the I/O bottleneck to a certain extent. In today’s web-service scenario with performance guarantees, throughput is no longer the only performance requirement for applications, and many applications require the average response time of their requests to remain under certain thresholds. Since storageParallel I/O systems have received a lot of attention [11,17,22] in the recent past for their ability to provide fast reliable access, while supporting high transfer rates for dedicated supercomputing applications as well as diverse enterprise applications. Disk arrays partition data across multiple disks in a storage pool and provide concurrent access to multiple applications at the same time. Moreover, a single application with huge data requirements may partition its data into stores and place them across multiple disks and use this parallelism to alleviate the I/O bottleneck to a certain extent. In today’s web-service scenario with performance guarantees, throughput is no longer the only performance requirement for applications, and many applications require the average response time of their requests to remain under certain thresholds. Since storageParallel I/O systems have received a lot of attention [11,17,22] in the recent past for their ability to provide fast reliable access, while supporting high transfer rates for dedicated supercomputing applications as well as diverse enterprise applications. Disk arrays partition data across multiple disks in a storage pool and provide concurrent access to multiple applications at the same time. Moreover, a single application with huge data requirements may partition its data into stores and place them across multiple disks and use this parallelism to alleviate the I/O bottleneck to a certain extent. In today’s web-service scenario with performance guarantees, throughput is no longer the only performance requirement for applications, and many applications require the average response time of their requests to remain under certain thresholds. Since storage Parallel I/O systems have received a lot of attention [11,17,22] in the recent past for their ability to provide fast reliable access, while supporting high transfer rates for dedicated supercomputing applications as well as diverse enterprise applications. Disk arrays partition data across multiple disks in a storage pool and provide concurrent access to multiple applications at the same time. Moreover, a single application with huge data requirements may partition its data into stores and place them across multiple disks and use this parallelism to alleviate the I/O bottleneck to a certain extent. In today’s web-service scenario with performance guarantees, throughput is no longer the only performance requirement for applications, and many applications require the average response time of their requests to remain under certain thresholds. Since storageminimized. We work under the additional constraint that the load is balanced across all the disks. The problem also finds applications in web-services [12,3], where user streams are allocated to different web-servers, and each server may manage its own storage. Content distribution and multimedia hosting servers, where disk latencies dominate average response time, are other settings where the placement of stores determine the performance of the service significantly. The current placement strategies strive to load balance the workload on the array of disks without explicitly minimizing the response time. Since mean response time is directly related to variance, such an approach based only on the load (and oblivious of variance) is not sufficient in a heterogeneous shared storage service scenario.

نتیجه گیری انگلیسی

We presented a novel formulation of the store placement problem that captures the non-FCFS scheduling discipline used to serve disk requests as well as the location-dependence of service time.We identify the Disk-Runlength of a stream (analogously store) as the most important parameter in allocating streams to disk servers. We presented fast algorithms that solve the problem completely under certain realistic assumptions.We have presented theoretical and experimental evidence to show that the proposed algorithms achieve significant performance improvement as compared to existing placement strategies. Our separation of the non-FCFS nature of disk requests opens up the possiblity of optimizing store placement for other objective functions. A direction of future research that we are exploring is to solve the store placement problem when the objective is to minimize the maximum response time over all the serves.