دانلود مقاله ISI انگلیسی شماره 81583
ترجمه فارسی عنوان مقاله

یک برنامه خطی عدد صحیح جمع آوری هزینه برای پیدا کردن موتیف

عنوان انگلیسی
A cost-aggregating integer linear program for motif finding
کد مقاله سال انتشار تعداد صفحات مقاله انگلیسی
81583 2011 9 صفحه PDF
منبع

Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)

Journal : Journal of Discrete Algorithms, Volume 9, Issue 4, December 2011, Pages 326–334

ترجمه کلمات کلیدی
زیست شناسی محاسباتی، پیدا کردن دنباله دنباله، برنامه ریزی خطی عدد صحیح
کلمات کلیدی انگلیسی
Computational biology; Sequence motif finding; Integer linear programming

چکیده انگلیسی

In the motif finding problem one seeks a set of mutually similar substrings within a collection of biological sequences. This is an important and widely-studied problem, as such shared motifs in DNA often correspond to regulatory elements. We study a combinatorial framework where the goal is to find substrings of a given length such that the sum of their pairwise distances is minimized. We describe a novel integer linear program for the problem, which uses the fact that distances between substrings come from a limited set of possibilities allowing for aggregate consideration of sequence position pairs with the same distances. We show how to tighten its linear programming relaxation by adding an exponential set of constraints and give an efficient separation algorithm that can find violated constraints, thereby showing that the tightened linear program can still be solved in polynomial time. We apply our approach to find optimal solutions for the motif finding problem and show that it is effective in practice in uncovering known transcription factor binding sites.