Download English ISI article no. 156826
Title
Data access skipping for recursive partitioning methods
Article code : 156826
Publication year : 2018
Length of the English article : 20 pages (PDF)
Source

Publisher : Elsevier - Science Direct

Journal : Computer Languages, Systems & Structures, Volume 53, September 2018, Pages 143-162

Keywords
Memory; Machine learning; Compiler optimization; Parallel programming

Abstract

The memory performance of data mining applications has become crucial due to increasing dataset sizes and multi-level cache hierarchies. Recursive partitioning methods such as decision tree and random forest learning are among the most important algorithms in this field, and numerous researchers have worked on improving the accuracy of model trees as well as the overall performance of the learning process. Most modern applications that employ decision tree learning favor creating multiple models, gaining accuracy at the cost of performance. In this work, we exploit the flexibility inherent in recursive partitioning based applications regarding performance and accuracy tradeoffs, and propose a framework that improves performance with negligible accuracy losses. The framework employs a data access skipping module (DASM), which skips costly cache accesses according to the aggressiveness of the strategy specified by the user, and a heuristic that predicts the skipped data accesses to keep accuracy losses to a minimum. Our experimental evaluation shows that the proposed framework offers significant performance improvements (up to 25%) with comparatively smaller losses in accuracy (up to 8%) over the original case. We demonstrate that our framework is scalable under various accuracy requirements by exploring accuracy changes over time and replacement policies. In addition, we explore NoC/SNUCA systems for similar opportunities for memory performance improvement.
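
The general idea of skipping some data accesses and substituting a predicted value can be illustrated with a small software sketch. This is not the paper's DASM, which works at the level of cache accesses. The class `SkippingReader`, the `aggressiveness` knob, and the last-value prediction heuristic are all assumptions made here for illustration only, since the abstract does not specify the actual heuristic.

```python
from dataclasses import dataclass, field
from typing import Sequence


@dataclass
class SkippingReader:
    """Hypothetical stand-in for a data access skipping module (not the paper's DASM)."""

    data: Sequence[float]
    aggressiveness: float = 0.25  # assumed knob: fraction of accesses to skip, in [0, 1)
    _credit: float = field(default=0.0, init=False)
    _last_seen: float = field(default=0.0, init=False)
    skipped: int = field(default=0, init=False)

    def read(self, index: int) -> float:
        # Accumulate skip credit on every access; once it reaches 1, skip this one.
        self._credit += self.aggressiveness
        if self._credit >= 1.0:
            self._credit -= 1.0
            self.skipped += 1
            # Assumed heuristic: predict the skipped value as the last value actually read.
            return self._last_seen
        self._last_seen = self.data[index]
        return self._last_seen


def count_left(reader: SkippingReader, n: int, threshold: float) -> int:
    """Count samples routed to the left child of the split `value <= threshold`."""
    return sum(1 for i in range(n) if reader.read(i) <= threshold)


values = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6]

# Baseline: no skipping, every value is actually read.
exact = count_left(SkippingReader(values, aggressiveness=0.0), len(values), 0.5)

# Skipping every fourth access trades some split accuracy for fewer reads.
approx_reader = SkippingReader(values, aggressiveness=0.25)
approx = count_left(approx_reader, len(values), 0.5)

print(f"exact={exact} approx={approx} skipped={approx_reader.skipped}")
```

Raising `aggressiveness` in this sketch skips more reads and lets the split statistics drift further from the exact counts, which mirrors the performance and accuracy tradeoff the abstract describes.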