Progressive Alignment Using Shortest Common Supersequence
Abstract
Comparing sequences is an important task in bioinformatics, and sequence
alignment is the standard way to make that comparison informative.
Aligning more than two sequences is called multiple sequence alignment (MSA).
MSA underlies the solution of many problems in bioinformatics.
Multiple sequence alignment is an NP-hard problem. The complexity of finding the
optimal alignment is O(L^N), where L is the length of the longest sequence and N is the
number of sequences. Hence computing the optimal solution is impractical for most
datasets. Progressive alignment solves MSA at a much lower computational cost but does
not guarantee accurate solutions, because it is a greedy method that can become trapped
in local maxima. There is thus a tradeoff between accuracy and complexity, and most
developers try to create or enhance techniques that give better accuracy with lower time
complexity.
ClustalW is a widely used tool for progressive alignment, and ClustalW2.1 was the latest
version released at the time of writing. The guide tree is a binary tree that determines
the order in which sequences are aligned. It is generated from distance scores between
pairs of sequences, where the distance score is calculated as the alignment score divided
by the length of the shorter sequence. In this work, the Shortest Common Supersequence
(SCS) is used instead to generate the guide tree for progressive alignment, and the
resulting alignments are checked for accuracy against the BAliBASE benchmarks.
According to the SP (sum-of-pairs) and TC (total column) scores, progressive alignment
using the guide tree generated by SCS is better than alignment using the guide tree
generated by alignment score. The original ClustalW2.1 is modified to use SCS, and the
modified ClustalW2.1 gives better results than the original tool.
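
To make the SCS idea concrete, the following is a minimal Python sketch of how an SCS length for two sequences can be computed and turned into a pairwise distance. It relies on the standard identity |SCS(a, b)| = |a| + |b| - |LCS(a, b)|. The abstract does not state how the SCS is converted into a distance score, so the normalization in `scs_distance`, like all the function names, is an assumption made for illustration and not the method implemented in the modified ClustalW2.1.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def scs_length(a: str, b: str) -> int:
    """Length of the shortest common supersequence of two sequences."""
    return len(a) + len(b) - lcs_length(a, b)


def scs_distance(a: str, b: str) -> float:
    """Hypothetical distance (an assumption, not taken from the source).

    Counts the symbols the SCS needs beyond the longer sequence and
    divides by the length of the shorter sequence, giving a value in [0, 1].
    """
    shorter, longer = sorted((len(a), len(b)))
    if shorter == 0:
        return 0.0
    return (scs_length(a, b) - longer) / shorter


def distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric matrix of pairwise distances, the usual input to guide-tree construction."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = scs_distance(seqs[i], seqs[j])
    return dist
```

Under this assumed normalization, identical sequences have distance 0 and sequences sharing no symbols have distance 1. A matrix of this kind could then be passed to a tree-building method such as neighbor joining or UPGMA to obtain a guide tree.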
Description
ME, CSED
