MAFFT

MAFFT is a multiple sequence alignment method that combines two algorithmic techniques:

(i) Using the fast Fourier transform (FFT) to identify homologous regions rapidly. For this purpose, an amino acid sequence is converted to a sequence of volume and polarity values, one pair per residue, so that the correlation between two sequences can be computed with the FFT.

(ii) Using a simplified scoring system that reduces CPU time and increases the accuracy of alignments.

MAFFT implements a progressive method (FFT-NS-2) and an iterative refinement method (FFT-NS-i).
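The idea behind technique (i) can be illustrated with a minimal Python sketch. It is not MAFFT's implementation: the small `PROPS` table holds placeholder volume and polarity numbers for only eight residues, and the helper names (`fft`, `correlation`, `best_lag`) are hypothetical. Each sequence is mapped to standardized volume and polarity signals, the correlation of the two sequences is computed for every lag with an FFT, and the lag with the highest summed correlation is taken as the offset at which the sequences are most likely to share a similar region.

```python
import cmath

# Placeholder (volume, polarity) values for a few residues.
# Illustrative only; this is NOT the table MAFFT uses.
PROPS = {
    "A": (31.0, 8.1), "G": (3.0, 9.0), "L": (111.0, 4.9), "K": (119.0, 11.3),
    "D": (54.0, 13.0), "F": (132.0, 5.2), "S": (32.0, 9.2), "W": (170.0, 5.4),
}


def _standardize(table, idx):
    """Rescale one property to zero mean and unit variance over the table."""
    vals = [p[idx] for p in table.values()]
    mean = sum(vals) / len(vals)
    sd = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
    return {aa: (p[idx] - mean) / sd for aa, p in table.items()}


VOLUME = _standardize(PROPS, 0)
POLARITY = _standardize(PROPS, 1)


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (unnormalized)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def correlation(a, b):
    """Return c where c[k] = sum_n a[n] * b[n + k]; negative lags wrap to the end."""
    size = 1
    while size < len(a) + len(b):
        size *= 2  # zero padding prevents circular overlap
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    prod = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [v.real / size for v in fft(prod, inverse=True)]


def best_lag(seq1, seq2):
    """Lag k at which seq1[n] best matches seq2[n + k] (volume + polarity)."""
    cv = correlation([VOLUME[c] for c in seq1], [VOLUME[c] for c in seq2])
    cp = correlation([POLARITY[c] for c in seq1], [POLARITY[c] for c in seq2])
    total = [v + p for v, p in zip(cv, cp)]
    k = max(range(len(total)), key=total.__getitem__)
    return k if k < len(seq2) else k - len(total)
```

For example, `best_lag("GALKDFSW", "SSSGALKDFSW")` returns 3, because the second sequence contains the first one shifted right by three residues. A real aligner would go on to examine the segments around such peaks; this sketch stops at locating the lag.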

A manual for MAFFT is available. [doc] [pdf]

MAFFT home page here.

Input: DNA or protein sequences in multi-FASTA format (MFA).

Test input file (nucleic acid): mafft_input.txt

Test output file 1 (Clustal format): output.mafft

Test output file 2 (aligned FASTA format): output_afa.mafft
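For readers who run MAFFT locally rather than through this page, the sketch below shows how the two strategies and the two output formats might map onto command-line options. It assumes a `mafft` executable on the PATH and uses option sets as listed in the MAFFT manual (`--retree`, `--maxiterate`, `--clustalout`); check them against your installed version. The helper `mafft_command` is hypothetical and only builds the argument list without running anything.

```python
import shlex

# Option sets as given in the MAFFT manual; verify against your installation.
STRATEGIES = {
    "FFT-NS-2": ["--retree", "2", "--maxiterate", "0"],  # progressive method
    "FFT-NS-i": ["--retree", "2", "--maxiterate", "2"],  # two refinement cycles
}


def mafft_command(input_path, strategy="FFT-NS-2", clustal=False):
    """Build the argument list for a local MAFFT run (does not execute it)."""
    args = ["mafft"] + STRATEGIES[strategy]
    if clustal:
        args.append("--clustalout")  # Clustal format instead of aligned FASTA
    args.append(input_path)
    return args


if __name__ == "__main__":
    runs = [
        ("FFT-NS-2", True, "output.mafft"),
        ("FFT-NS-i", False, "output_afa.mafft"),
    ]
    for strategy, clustal, out_name in runs:
        cmd = mafft_command("mafft_input.txt", strategy, clustal)
        # MAFFT writes the alignment to standard output, hence the redirect.
        print(shlex.join(cmd), ">", out_name)
```

To actually run a command, the list can be passed to `subprocess.run(cmd, stdout=handle, check=True)` with `handle` opened on the desired output file. The pairing of strategies with output files here is arbitrary and does not describe how the test files above were produced.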

If you use MAFFT, please cite (as appropriate):

Katoh, K. and Toh, H. (2010) "Parallelization of the MAFFT multiple sequence alignment program." Bioinformatics 26(15):1899-1900.

Katoh, K., and Toh, H. (2007) "PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences." Bioinformatics 23:372-374. (describes the PartTree algorithm). [pdf]

Katoh, K., Kuma, K., Toh, H. and Miyata, T. (2005) "MAFFT version 5: improvement in accuracy of multiple sequence alignment." Nucleic Acids Res. 33:511-518. (describes [ancestral versions of] the G-INS-i, L-INS-i and E-INS-i strategies) [pdf]

Katoh, K., Misawa, K., Kuma, K., and Miyata, T. (2002) "MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform." Nucleic Acids Res. 30:3059-3066. (describes the FFT-NS-1, FFT-NS-2 and FFT-NS-i strategies) [pdf]

Copyright © 2002-2007 Kazutaka Katoh (mafft)
