Analysis of Basic Local Alignments in Biomolecular Sequences
Abstract
Necessity is the mother of invention. Automated data collection tools and mature database technology have led to tremendous amounts of data being stored in databases, data warehouses and other information repositories. Every day the world creates 52,000 terabytes of data, yet only 4% of it is used for any purpose. This raised the question of whether something useful could be done with this data, and from that question the field of data mining was born.
Data in biology are very diverse and abundant. They can be catalogued and classified, but often cannot be easily summarized or abstracted using a formula. Moreover, the data for even a single organism are very large: the human genome alone is about three billion base pairs (bp) in length. With the rapid growth in the amount of biomolecular data, it becomes increasingly important to develop new techniques for extracting knowledge from the data. Data mining is a fundamental operation in such a domain.
The gene sequences of related species of plants, animals and microorganisms show complex patterns of similarity to one another, and many molecular biologists are convinced that an understanding of sequence evolution is the first step towards understanding evolution itself. This is one of the most fascinating aspects of the study of evolution. The comparison of gene sequences, or biological sequence analysis, is one of the processes used to understand sequence evolution. Just as the ancient Greeks used comparative anatomy to understand the human body and linguists used the Rosetta Stone to decipher Egyptian hieroglyphs, today we can use comparative sequence analysis to understand genomes.
A variety of tools is available to perform sequence analysis. We studied several of them and selected BLAST for detailed study because of its unmatched popularity, importance, speed and sensitivity. BLAST (Basic Local Alignment Search Tool) is a sophisticated software package for rapid searching of nucleotide and protein databases, developed by Altschul and colleagues. It rapidly identifies statistically significant matches between newly sequenced nucleotide or amino acid segments and databases of known nucleotide or amino acid sequences.
