Performance analysis of a selectively compressed memory system
|Article code||Publication year||English article||Persian translation||Word count|
|27605||2002||14-page PDF||Available to order||Not calculated|
Publisher: Elsevier - Science Direct
Journal: Microprocessors and Microsystems, Volume 26, Issue 2, 17 March 2002, Pages 63–76
On-line data compression is a new alternative technique for improving memory system performance, which can increase both the effective memory space and the bandwidth of memory systems. However, the decompression time incurred when accessing compressed data may offset the benefits of compression. In this paper, a selectively compressed memory system (SCMS) based on a combination of selective compression and hiding of decompression overhead is proposed and analyzed. The architecture of an efficient compressed cache and its management policies are presented. Analytical modeling shows that the performance of SCMS is influenced by the compression efficiency, the percentage of references to the compressed data blocks, and the percentage of references found in the decompression buffer. The decompression buffer plays the most important role in improving the performance of the SCMS. If the decompression buffer can filter more than 70% of the references to the compressed blocks, the SCMS can significantly improve performance over conventional memory systems.
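How the three parameters named above interact can be illustrated with a toy expected-latency calculation. This is only a sketch under assumed cycle counts, not the analytic model of the paper: the function `avg_access_cycles`, its parameter names, and every number passed to it are hypothetical, and the lower miss rate and miss penalty stand in for the (unspecified) effect of compression efficiency.

```python
def avg_access_cycles(miss_rate, miss_penalty, hit_cycles=10.0,
                      p_compressed=0.0, p_buffer_hit=0.0,
                      decomp_cycles=20.0):
    """Toy average access time, in cycles, for one cache level.

    p_compressed  : fraction of references that touch compressed blocks
    p_buffer_hit  : fraction of those references served by the
                    decompression buffer (no decompression needed)
    decomp_cycles : assumed extra latency of one decompression
    """
    # Only references to compressed blocks that miss the decompression
    # buffer pay the decompression latency.
    decomp_overhead = p_compressed * (1.0 - p_buffer_hit) * decomp_cycles
    return hit_cycles + decomp_overhead + miss_rate * miss_penalty


# Conventional system: no compressed blocks at all (assumed numbers).
conventional = avg_access_cycles(miss_rate=0.10, miss_penalty=100.0)

# Compressed system: compression is assumed to lower both the miss rate
# (more data fits) and the miss penalty (fewer bytes are transferred).
with_buffer = avg_access_cycles(0.07, 80.0, p_compressed=0.5, p_buffer_hit=0.7)
without_buffer = avg_access_cycles(0.07, 80.0, p_compressed=0.5, p_buffer_hit=0.0)

print(round(conventional, 1), round(with_buffer, 1), round(without_buffer, 1))
```

With these assumed numbers the compressed configuration beats the conventional one only when the decompression buffer absorbs most references to compressed blocks; with no buffer hits, the decompression latency outweighs the savings. The actual break-even point depends on the paper's model and parameters, which are not reproduced here.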
As computing power increases, the memory space requirements of many programs, such as multimedia and 3D graphics applications, have increased dramatically, by as much as 50–100% per year. However, DRAM technology has not kept up with this requirement because its density has increased by just under 60% per year. Furthermore, the performance gap between processor and memory is increasing steadily and has given rise to a phenomenon known as the ‘memory wall’ problem, which means that main memory access time is the primary obstacle in improving the overall performance of computer systems. In addition, a disk access takes 10^5 times as long as a memory access, and thus the time to load data from the disk becomes a significant portion of the overall execution time of a program. To reduce these processor–memory and memory–disk performance gaps, conventional computer systems take advantage of a memory hierarchy. However, a long latency for access to lower level storage systems still occurs due to both relatively slow access and pin-bandwidth limitations.
Data compression, which has already been used at the lower levels of the memory hierarchy such as the disk or network environment, is a new alternative method for reducing processor–memory and memory–disk performance gaps. Two advantages can be obtained by storing compressed data at each level of the memory hierarchy. First, the effective storage space for each level of the hierarchy can be increased, resulting in a reduction of cache misses and page faults. Second, the data transfer time can be reduced. Transferring data in compressed form can improve the effective bandwidth for each level of the memory hierarchy, resulting in a reduced miss penalty and a reduced data load time. On the other hand, the time taken to compress and decompress data incurs significant overhead, and this negative effect is enlarged at the higher levels of the memory hierarchy. This overhead may offset the benefits of compression and worsen overall system performance. Thus, there are two fundamental problems to be solved: redesigning the highest level of the memory hierarchy to store the compressed data, and minimizing or hiding the compression/decompression time.
In this research, a selectively compressed memory system (SCMS) is proposed with its cache architecture and memory management policy. The SCMS employs several techniques to reduce decompression overhead and supports a fixed memory allocation method for storing various sizes of compressed data effectively. On-line compression and decompression are performed in hardware, which operates at the processor cycle rate. The performance of the proposed SCMS is evaluated via an analytic model devised in this work, which gives evaluation results that are realistic and reliable without resorting to complicated simulation. The results show that the performance of SCMS is largely affected by three major parameters, i.e. the compression efficiency, the percentage of references that are in the compressed data blocks, and the percentage of references found in the decompression buffer. The decompression buffer is critical in improving the performance of the SCMS. If the decompression buffer handles at least 70% of references to the compressed blocks, the SCMS significantly improves performance with respect to conventional memory systems (CMS). The improvement is, of course, the greatest when compression efficiency is high and the probability of referencing the compressed data is also high.
Section 2 reviews the on-line data compression method and related work. In Section 3, the characteristics and organization of the proposed SCMS are described, including the cache architecture and management policy. Performance improvement (PI) is evaluated analytically in Section 4. Finally, Section 5 draws our conclusions.
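The idea of selective compression combined with fixed allocation can be sketched in software as follows. This is an illustration under stated assumptions, not the hardware scheme of the paper: `zlib` stands in for the hardware compressor, and the block size, the half-block slot size, and the names `store_block` and `load_block` are hypothetical choices made for the example.

```python
import zlib

BLOCK_SIZE = 128             # bytes per block (assumed)
SLOT_SIZE = BLOCK_SIZE // 2  # fixed slot reserved for a compressed block (assumed)


def store_block(block: bytes):
    """Return (is_compressed, payload) for one block.

    The block is kept in compressed form only if it fits the fixed slot;
    otherwise it is stored as-is, so reading it never needs decompression.
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("block must be exactly BLOCK_SIZE bytes")
    packed = zlib.compress(block)
    if len(packed) <= SLOT_SIZE:
        return True, packed
    return False, block


def load_block(is_compressed: bool, payload: bytes) -> bytes:
    """Recover the original block, decompressing only when flagged."""
    return zlib.decompress(payload) if is_compressed else payload


# A highly regular block fits the slot; a block of distinct bytes does not.
zero_flag, zero_payload = store_block(bytes(BLOCK_SIZE))
raw_flag, raw_payload = store_block(bytes(range(BLOCK_SIZE)))
print(zero_flag, len(zero_payload) <= SLOT_SIZE, raw_flag)
```

Because every compressed block occupies a slot of one known size, placement stays simple, and blocks that compress poorly pay no decompression cost on access.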
Conclusion
In this research, an SCMS is described as a new method to overcome processor–memory and memory–disk performance gaps. The SCMS employs a selective compression technique and a fixed space allocation method, which allows a simple cache architecture and efficient management of the compressed data blocks. In addition, some effective techniques for reducing or hiding decompression overhead were devised. Specifically, the decompression buffer plays an important role in reducing decompression overhead by caching recently accessed compressed data blocks in uncompressed form and providing them without re-decompression for subsequent requests. The performance evaluation of SCMS was performed via an analytic model using selected probability parameters, which makes the evaluation results realistic and reliable without performing complicated simulation. According to the evaluation results, it is expected that if the decompression buffer can filter more than 70% of the accesses to the compressed blocks in the L2 cache, the performance improvement of the SCMS can be significant. The improvement is the greatest for a high-speed implementation of the decompressor hardware and for specific applications with high compression efficiency, a large number of on-chip cache misses, and a high probability of accesses to the compressed blocks. Moreover, because the evaluation results estimated in this research do not reflect any of the advantages due to compression of main memory, further performance improvement can be expected if it is included.
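The role of the decompression buffer described above can be mimicked with a small software cache of decompressed blocks. The sketch below is an assumption-laden illustration, not the paper's hardware design: the class `DecompressionBuffer`, its least-recently-used replacement policy, its capacity, and the use of `zlib` in place of the hardware decompressor are all hypothetical.

```python
import zlib
from collections import OrderedDict


class DecompressionBuffer:
    """Keeps recently decompressed blocks so repeat reads skip decompression."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block address -> decompressed bytes
        self.hits = 0
        self.misses = 0

    def read(self, addr, compressed_store):
        if addr in self.blocks:
            # Buffer hit: serve the uncompressed copy directly.
            self.hits += 1
            self.blocks.move_to_end(addr)
            return self.blocks[addr]
        # Buffer miss: pay the decompression cost once, then keep the result.
        self.misses += 1
        data = zlib.decompress(compressed_store[addr])
        self.blocks[addr] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data


# Hypothetical compressed store holding two blocks.
store = {0: zlib.compress(b"a" * 128), 1: zlib.compress(b"b" * 128)}
buf = DecompressionBuffer(capacity=1)
for addr in (0, 0, 1, 0):
    buf.read(addr, store)
print(buf.hits, buf.misses)
```

The ratio `hits / (hits + misses)` plays the part of the "percentage of references found in the decompression buffer"; in this tiny trace only the second read of block 0 is a hit, because the single-entry buffer evicts it when block 1 arrives.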