Software to align DNA sequences

8/18/2023

Numerous sequence alignment analysis software packages are available online. Existing state-of-the-art tools, such as MAFFT, PASTA (multiple sequence alignment of >200,000 sequences), ProbPFP (a PHMM model optimized by particle swarm optimization), and Minimap2 (developed for nanopore sequencing), allow sequence alignments to be run on a multi-threaded workstation and in specific contexts. Additionally, numerous phylogenetic tree construction programs are widely used in comparative genomics, cladistics, bioinformatics, and other fields. MAFFT is a multiple sequence alignment program for Unix-like operating systems. It offers a range of multiple alignment methods, including L-INS-i and FFT-NS-2. Various iterations and optimizations to MAFFT have been made by contributors worldwide, and it is, at present, the most popular software for DNA and protein sequence alignments. However, using MAFFT to align massive numbers of unrelated sequences is highly time consuming. In addition, unrelated sequences require more memory under dynamic programming algorithms, because the space complexity of Needleman–Wunsch is O(n²). The user interface of MAFFT is still terminal based, allowing the user to select and configure the algorithms.

Some software packages use distributed computing frameworks, such as Hadoop and Spark, and they are gradually attracting more attention. SparkBWA, in which a Spark cluster improves computing efficiency by running the BWA approach, enabled multi-node computing to improve performance. For the global alignment of DNA sequences and the construction of phylogenetic trees, many faster approaches using Spark have been proposed, such as HAlign and HAlign-II, which can efficiently build phylogenetic trees from large numbers of biological sequences and provide user-friendly web servers through high-performance and distributed-computing infrastructures. However, these faster approaches also have some limitations.
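To make the quadratic memory cost of Needleman–Wunsch mentioned above concrete, here is a minimal Python sketch of the textbook algorithm. The scoring values (match=1, mismatch=-1, gap=-2) and the function name are illustrative assumptions, not settings taken from MAFFT or any other tool named here.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions with a linear gap penalty.
    """
    n, m = len(a), len(b)
    # The full (n + 1) x (m + 1) table is what makes memory use quadratic.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i - 1] with b[j - 1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

For two sequences of 300 kb each, the table in this sketch would hold roughly 9 × 10^10 cells, which is why a plain dynamic programming approach becomes impractical for ultra-long sequences.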

For mitochondrial genome datasets having limited numbers of sequences, MAFFT performed the required tasks, but it could not handle ultra-large mitochondrial genome datasets because of core dump errors. We implemented a multiple DNA/RNA sequence alignment tool based on the Center Star strategy and used a suffix array algorithm to optimize space and time efficiency. Nowadays, whole-genome research and NGS technology are becoming more popular, and laboratories need to save computational resources. Such software is of great significance in these respects, especially in the study of whole mitochondrial genomes of plants.

During biological information processing, similarities among sequences are used to construct phylogenetic trees for biological analyses. In recent years, with the sharp increase in next-generation sequencing (NGS) output, the data processing scale has grown from megabytes (MB) and gigabytes (GB) to terabytes (TB), petabytes (PB), and even exabytes (EB) and zettabytes (ZB). For instance, in metagenomics studies, millions of sequence reads are analyzed to determine the functional or taxonomic contents of microbial samples from the environment. Thus, the performance levels of multi-sequence analysis and phylogenetic tree construction tools must improve.
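The first step of the Center Star strategy named above is to pick a "center" sequence, the one closest to all the others, against which every other sequence is then aligned. The sketch below shows only that selection step, using plain edit distance as the similarity measure. The distance measure and the function names are assumptions made for illustration; the tool described here may score pairs differently.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, keeping only two rows in memory."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def choose_center(seqs):
    """Return the index of the sequence with the smallest total distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "TTTT", "ACGG"]
    print(choose_center(seqs))
```

Choosing the center this way needs a distance for every pair of sequences, so for large datasets the pairwise comparison itself is the part worth speeding up.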

Multiple DNA/RNA sequence alignment is an important fundamental tool in bioinformatics, especially for phylogenetic tree construction. With improvements in DNA sequencing, the amount of bioinformatics data is constantly increasing, and tools need to be updated continually. Mitochondrial genome analyses of multiple individuals and species require bioinformatics software; therefore, the performance of that software needs to be optimized. To improve the alignment of ultra-large datasets and ultra-long sequences, we optimized a dynamic programming algorithm using longest common substring methods. Tests on ultra-large DNA datasets containing sequences of different lengths, some over 300 kb (kilobases), revealed that the Multiple DNA/RNA Sequence Alignment Tool Based on Suffix Tree (SaAlign) saved time and computational space. It outperformed existing tools, including MAFFT and HAlign-II.
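One way to read "optimizing dynamic programming with longest common substrings" is that long exact matches act as anchors, so dynamic programming only has to run on the short stretches between them. The sketch below illustrates that idea under stated assumptions: the function names and the min_anchor threshold are hypothetical, and the longest common substring is found with a simple quadratic-time search as a stand-in for the faster suffix-based index that a real tool would presumably use.

```python
def longest_common_substring(a, b):
    """Return (start_in_a, start_in_b, length) of the longest common substring."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ends at a[i - 2], b[j - 2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[2]:
                    best = (i - curr[j], j - curr[j], curr[j])
        prev = curr
    return best


def split_on_anchors(a, b, min_anchor=4):
    """Recursively split (a, b) into exact-match anchors and leftover pieces.

    Returns a list of (kind, piece_of_a, piece_of_b) where kind is "anchor"
    (identical pieces), "dp" (still needs dynamic programming), or "gap"
    (one side is empty). min_anchor is an illustrative threshold.
    """
    if not a or not b:
        return [("gap", a, b)] if (a or b) else []
    i, j, k = longest_common_substring(a, b)
    if k < min_anchor:
        return [("dp", a, b)]
    return (
        split_on_anchors(a[:i], b[:j], min_anchor)
        + [("anchor", a[i:i + k], b[j:j + k])]
        + split_on_anchors(a[i + k:], b[j + k:], min_anchor)
    )


def dp_cells(segments):
    """Count the dynamic programming table cells still needed after anchoring."""
    return sum(len(x) * len(y) for kind, x, y in segments if kind == "dp")


if __name__ == "__main__":
    a = "ACGTACGTTTGACCA"
    b = "ACGTACGTAAGACCA"
    segments = split_on_anchors(a, b)
    for segment in segments:
        print(segment)
    print(dp_cells(segments), "cells instead of", len(a) * len(b))
```

For closely related sequences most of the length falls inside anchors, so the remaining dynamic programming work shrinks sharply; for unrelated sequences few anchors are found and the saving largely disappears.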
