MAFFT is a multiple sequence alignment method built on two algorithmic techniques:
(i) Using the fast Fourier transform (FFT) to identify homologous regions rapidly. For this technique, an amino acid sequence is converted to a sequence of volume and polarity values, one pair per residue.
(ii) Using a simplified scoring system to reduce CPU time and increase the accuracy of alignments. MAFFT offers both a progressive method (FFT-NS-2) and an iterative refinement method (FFT-NS-i).
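The idea behind technique (i) can be illustrated with a short sketch. Each residue is mapped to a (volume, polarity) pair, and the correlation between two sequences at every possible offset is computed with an FFT instead of a direct loop over all offsets. A peak in the correlation suggests an offset at which similar regions line up. This is only a minimal illustration under stated assumptions: the TOY_PROPS table holds round placeholder numbers rather than the values MAFFT uses, the function names are hypothetical, and the real program goes on to locate and align the matching segments, which is not shown here.

```python
import cmath

# Hypothetical (volume, polarity) scores for a few residues. These are round
# placeholder numbers chosen for illustration, NOT the table MAFFT uses.
TOY_PROPS = {
    "G": (-2.0, 0.5), "A": (-1.0, 0.0), "S": (-1.0, 0.5), "D": (-0.5, 2.0),
    "K": (1.0, 1.5), "L": (1.0, -1.5), "W": (2.0, -1.0),
}


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (unscaled inverse)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def correlation(seq1, seq2, props=TOY_PROPS):
    """Return c where c[k] = sum over n of score(seq1[n]) . score(seq2[n + k]).

    Negative offsets k are stored at index len(c) + k (circular layout).
    """
    size = 1
    while size < len(seq1) + len(seq2):  # zero-pad so offsets do not wrap
        size *= 2
    total = [0.0] * size
    for comp in (0, 1):  # 0 = volume component, 1 = polarity component
        a = [complex(props[r][comp]) for r in seq1] + [0j] * (size - len(seq1))
        b = [complex(props[r][comp]) for r in seq2] + [0j] * (size - len(seq2))
        fa, fb = fft(a), fft(b)
        prod = [x.conjugate() * y for x, y in zip(fa, fb)]
        c = fft(prod, inverse=True)
        for k in range(size):
            total[k] += c[k].real / size  # apply the 1/N inverse scaling
    return total


def best_offset(seq1, seq2, props=TOY_PROPS):
    """Offset of seq1 relative to seq2 that gives the highest correlation."""
    c = correlation(seq1, seq2, props)
    k = max(range(len(c)), key=lambda i: c[i])
    return k if k < len(seq2) else k - len(c)


if __name__ == "__main__":
    # "GAKLW" occurs in the second sequence starting at position 3.
    print(best_offset("GAKLW", "SSDGAKLW"))
```

Computing the correlation through the FFT takes on the order of N log N operations for padded length N, compared with N squared for the direct sum over all offsets, which is where the speed-up for long sequences comes from.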
A manual for MAFFT is available. [doc] [pdf]
MAFFT home page here.
Input: DNA or protein sequences in multiple FASTA format (MFA).
Test input file (nucleic acid): mafft_input.txt
Test output file1 (clustal format): output.mafft
Test output file2 (aligned fasta format): output_afa.mafft
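For readers unfamiliar with the multiple FASTA input format, the sketch below reads such a file into (header, sequence) pairs. Each record starts with a line beginning with ">" followed by one or more sequence lines. The function name parse_multi_fasta and the example records are hypothetical and are not taken from the test files listed above; this is a minimal reader for checking input, not the parser MAFFT itself uses.

```python
def parse_multi_fasta(text):
    """Parse multiple-FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Made-up example input with two short nucleotide records.
    example = ">seq1 example\nACGTAC\nGTAA\n>seq2\nACGTTCGTAA\n"
    for name, seq in parse_multi_fasta(example):
        print(name, len(seq))
```

An alignment needs at least two sequences, so a quick check that the parsed list has two or more records can catch a malformed upload before submission.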
If you use MAFFT, please cite (as appropriate):
Katoh, K. and Toh, H. (2010) "Parallelization of the MAFFT multiple sequence alignment program." Bioinformatics 26(15):1899-1900.
Katoh, K. and Toh, H. (2007) "PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences." Bioinformatics 23:372-374. (describes the PartTree algorithm) [pdf]
Katoh, K., Kuma, K., Toh, H. and Miyata, T. (2005) "MAFFT version 5: improvement in accuracy of multiple sequence alignment." Nucleic Acids Res. 33:511-518. (describes [ancestral versions of] the G-INS-i, L-INS-i and E-INS-i strategies) [pdf]
Katoh, K., Misawa, K., Kuma, K. and Miyata, T. (2002) "MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform." Nucleic Acids Res. 30:3059-3066. (describes the FFT-NS-1, FFT-NS-2 and FFT-NS-i strategies) [pdf]
Copyright © 2002-2007 Kazutaka Katoh (mafft)
If there is a tool or a feature you need, please let us know.