READSEQ reads and writes nucleic acid and protein sequences in various formats. Data files may contain multiple sequences. The Java version employed here is more efficient than the classic compiled C version and runs faster. Bear in mind that the program is designed to extract sequences, and it does not pay strict attention to the metadata surrounding them. In other words, information can be lost during conversion. We have made a couple of fixes to READSEQ so that it no longer truncates taxon names, and we are happy to share those fixes on request. We have made a minimal installation, exposing only the features we think will be used. If you require a feature we haven't exposed, please let us know.
Supported INPUT formats: Fasta, Clustal, Nexus, Phylip and Phylip 3.2, Plain/Raw, GCG, MSF, IG/Stanford, GenBank, NBRF, EMBL, PIR/CODATA, DNAStrider, FlatFeat, GFF, ACEDB, SCF
Supported OUTPUT formats: Fasta, Clustal, Nexus, Phylip and Phylip 3.2, Plain/Raw, GCG, MSF, Pretty, IG/Stanford, GenBank/G, NBRF, EMBL, PIR/CODATA, DNAStrider, FlatFeat, GFF, ACEDB, SCF
A manual for READSEQ is available.
READSEQ home page here.
INPUT = DNA and protein sequences, in various formats.
Test input file (fasta): readseq_in_fasta.txt
Test output file (nexus): readseq_out_nex.txt
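As a rough illustration of what a Fasta to Nexus conversion like the test pair above involves, here is a minimal Python sketch. It is not READSEQ's code, and the output READSEQ actually produces will differ in layout. The function names parse_fasta and to_nexus are hypothetical, and the sketch assumes the input sequences are already aligned to equal length. It also shows how metadata can be lost: only the first word of each Fasta header survives as the taxon name.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from Fasta-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Assumption: keep only the first word of the header as the
            # taxon name; the rest of the description is discarded.
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_nexus(records, datatype="dna"):
    """Write (name, sequence) pairs as a simple, non-interleaved Nexus data block."""
    if not records:
        raise ValueError("no sequences to write")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("a Nexus matrix needs sequences of equal length")
    width = max(len(name) for name, _ in records)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(records)} nchar={length};",
        f"  format datatype={datatype} missing=? gap=-;",
        "  matrix",
    ]
    for name, seq in records:
        lines.append(f"    {name.ljust(width)}  {seq}")
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ">taxonA first sample\nACGT\nAC\n>taxonB\nACGTAC\n"
    print(to_nexus(parse_fasta(example)))
```

Note that the description "first sample" in the example header does not appear anywhere in the Nexus output, which is the kind of loss to expect when converting between formats that carry different metadata.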
If there is a tool or a feature you need, please let us know.