A PHP Error was encountered

Severity: 8192

Message: Non-static method URL_tube::usage() should not be called statically, assuming $this from incompatible context

Filename: url_tube/pi.url_tube.php

Line Number: 13

A Real Linear and Parallel Multiple Longest Common Subsequences (MLCS) Algorithm

KDD Topics

Abstract

Information in various applications is often expressed as character sequences over a ﬁnite alphabet (e.g., DNA or protein sequences). In Big Data era, the lengths and sizes of these sequences are growing explosively, leading to grand challenges for the classical NP-hard problem, namely searching for the Multiple Longest Common Subsequences (MLC-S ) from multiple sequences. In this paper, we ﬁrst unveil the fact that the state-of-the-art MLCS algorithms are unable to be applied to long and large-scale sequences alignments. To overcome their defects and tackle the longer and large-scale or even big sequences alignments, based on the pro-posed novel problem-solving model and various strategies, e.g., parallel topological sorting, optimal calculating, reuse of intermediate results, subsection calculation and serialization, etc., we present a novel parallel MLCS algorithm. Exhaustive experiments on the datasets of both synthetic and real-world biological sequences demonstrate that both the time and space of the proposed algorithm are only linear in the number of dominants from aligned sequences, and the proposed algorithm signiﬁcantly outperforms the state-of-the-art MLCS algorithms, being applicable to longer and large-scale sequences alignments.

Filed under: Mining Rich Data Types

A PHP Error was encountered

KDD Topics

A Real Linear and Parallel Multiple Longest Common Subsequences (MLCS) Algorithm

Abstract

Comments