English ISI article No. 118814
Article title
A high performance data parallel tensor contraction framework: Application to coupled electro-mechanics
Article code: 118814
Year of publication: 2017
Length of English article: 33 pages (PDF)
Source

Publisher : Elsevier - Science Direct

Journal : Computer Physics Communications, Volume 216, July 2017, Pages 35-52

Keywords
Tensor contraction; Data parallelism; Domain-aware expression templates; Nonlinear coupled electro-mechanics;

Abstract

The paper presents aspects of implementation of a new high performance tensor contraction framework for the numerical analysis of coupled and multi-physics problems on streaming architectures. In addition to explicit SIMD instructions and smart expression templates, the framework introduces domain specific constructs for the tensor cross product and its associated algebra recently rediscovered by Bonet et al. (2015, 2016) in the context of solid mechanics. The two key ingredients of the presented expression template engine are as follows. First, the capability to mathematically transform complex chains of operations to simpler equivalent expressions, while potentially avoiding routes with higher levels of computational complexity and, second, to perform a compile time depth-first or breadth-first search to find the optimal contraction indices of a large tensor network in order to minimise the number of floating point operations. For optimisations of tensor contraction such as loop transformation, loop fusion and data locality optimisations, the framework relies heavily on compile time technologies rather than source-to-source translation or JIT techniques. Every aspect of the framework is examined through relevant performance benchmarks, including the impact of data parallelism on the performance of isomorphic and nonisomorphic tensor products, the FLOP and memory I/O optimality in the evaluation of tensor networks, the compilation cost and memory footprint of the framework and the performance of tensor cross product kernels. The framework is then applied to finite element analysis of coupled electro-mechanical problems to assess the speed-ups achieved in kernel-based numerical integration of complex electroelastic energy functionals. In this context, domain-aware expression templates combined with SIMD instructions are shown to provide a significant speed-up over the classical low-level style programming techniques.
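The search for contraction indices that minimise floating point operations can be illustrated with a small runtime sketch. This is not the framework's implementation, which the abstract describes as working at compile time; it is a minimal Python illustration under stated assumptions. The function name `best_order`, the string-based index labels and the cost model (one multiply-add per point of the loop nest spanned by the two operands) are all hypothetical choices made for this example.

```python
from itertools import combinations
from math import prod


def best_order(tensors, dims, output):
    """Exhaustive depth-first search over pairwise contraction orders.

    tensors: dict mapping a tensor name to its index labels, e.g. {"A": "ij"}
    dims:    dict mapping each index label to its extent
    output:  index labels that must survive in the final result
    Returns (estimated multiply-add count, list of pairwise steps).
    """
    best = {"cost": float("inf"), "order": None}

    def dfs(operands, cost, order):
        if cost >= best["cost"]:
            return  # prune: this branch cannot beat the best complete order
        if len(operands) == 1:
            best["cost"], best["order"] = cost, order
            return
        for i, j in combinations(range(len(operands)), 2):
            (name_a, idx_a), (name_b, idx_b) = operands[i], operands[j]
            rest = [op for k, op in enumerate(operands) if k not in (i, j)]
            # An index survives if the output or any remaining tensor needs it.
            needed = set(output).union(*(idx for _, idx in rest))
            result = frozenset((idx_a | idx_b) & needed)
            # Cost model (an assumption): one multiply-add per point of the
            # loop nest spanned by all indices of the two operands.
            step = prod(dims[x] for x in idx_a | idx_b)
            name = f"({name_a}*{name_b})"
            dfs(rest + [(name, result)], cost + step, order + [name])

    dfs([(n, frozenset(ix)) for n, ix in tensors.items()], 0, [])
    return best["cost"], best["order"]


# Hypothetical network: out_i = A_ij B_jk v_k, every index of extent 10.
dims = {"i": 10, "j": 10, "k": 10}
cost, order = best_order({"A": "ij", "B": "jk", "v": "k"}, dims, output="i")
print(cost, order)
```

For this three-tensor network the search selects the contraction of `B` with `v` first, followed by `A`, at an estimated 200 multiply-adds. Contracting `A` with `B` first would cost 1100 under the same model. The exhaustive search grows factorially with the number of tensors, so it is only practical for small networks.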