HMMER
User's Guide




Sequence file formats

All the programs accept unaligned sequence files in FASTA, GenBank, EMBL, or SWISS-PROT format, as well as a few other common file formats. The programs automatically detect the file format and whether the sequences are DNA, RNA, or protein.
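To give a feel for what format autodetection involves, here is a minimal Python sketch that guesses a format from the first non-blank line of a file. It is not HMMER's own detection code, which this section does not describe. The function name `guess_format` is hypothetical, and the rules rely only on the well-known leading tokens of each format (`>` for FASTA, `LOCUS` for GenBank, `ID` followed by three spaces for EMBL and SWISS-PROT, `CLUSTAL` for ClustalW).

```python
def guess_format(text):
    """Guess a sequence file format from its first non-blank line.

    Illustrative heuristic only; not the rule HMMER itself applies.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            return "FASTA"
        if line.startswith("LOCUS"):
            return "GenBank"
        if line.startswith("ID   "):
            # EMBL and SWISS-PROT share this line type; telling them
            # apart would require looking further into the entry.
            return "EMBL or SWISS-PROT"
        if line.startswith("CLUSTAL"):
            return "ClustalW"
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    print(guess_format(">seq1 example\nACGTACGT\n"))
```

A real detector would need more checks than this, for example to recognize MSF or SELEX, neither of which has a fixed first line.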

Aligned sequence files can be in ClustalW, GCG MSF, or SELEX format. SELEX is a simple format with one line per sequence: the sequence name comes first, followed by the aligned sequence. ClustalW, MSF, and SELEX alignment files can also be used wherever unaligned files are required; the sequences are read in and their gaps are removed.
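The idea of reading a SELEX-style alignment as unaligned sequences can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the parser the programs use: the function name `read_selex_unaligned` is hypothetical, the characters `.`, `-`, and `_` are assumed to be gaps, lines beginning with `#` are assumed to be comments or annotation, and a name that appears on more than one line is assumed to continue the same sequence.

```python
GAP_CHARS = ".-_"  # assumed gap characters; see the file formats chapter


def read_selex_unaligned(text):
    """Return {name: ungapped sequence} from SELEX-style alignment text."""
    seqs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed comment/annotation lines
        parts = line.split(None, 1)  # name first, then the aligned sequence
        if len(parts) < 2:
            continue  # a name with no sequence on this line
        name, aligned = parts
        residues = "".join(
            c for c in aligned if c not in GAP_CHARS and not c.isspace()
        )
        # Append so that a name repeated in a later block extends the sequence.
        seqs[name] = seqs.get(name, "") + residues
    return seqs


if __name__ == "__main__":
    example = "seq1 ACG..T-A\nseq2 A.GTT_A\n"
    for name, seq in read_selex_unaligned(example).items():
        print(name, seq)
```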

Full specifications of these file formats and the other formats recognized by the HMM package are in the file formats chapter near the end of the guide.

The programs work on RNA, DNA, and protein sequences, and they automatically detect which kind you have given them. The behavior of the programs when a nucleic acid model is used to analyze protein sequences, or vice versa, is undefined. Certain other situations may arise that are nonsensical in context (trying to search the "complementary strand" of a protein database, for example). Be forewarned. If you're lucky, the software will issue a snide warning when you try to do something nonsensical, but usually it will assume you know what you're doing.
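Because mixing nucleic acid and protein data is undefined, it can be worth checking your own input before running a search. The Python sketch below guesses the sequence type from residue composition. It is only an illustration: the rule HMMER itself uses is not described in this section, the function name `guess_alphabet` is hypothetical, and the 0.9 threshold is an arbitrary assumption.

```python
def guess_alphabet(seq, threshold=0.9):
    """Guess 'DNA', 'RNA', 'protein', or 'unknown' from residue composition.

    Illustrative heuristic only; the threshold is an arbitrary assumption.
    """
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        return "unknown"
    # Count letters that are plausible nucleotide codes.
    nucleic = sum(c in "ACGTUN" for c in letters)
    if nucleic / len(letters) < threshold:
        return "protein"
    # Mostly nucleotide codes: U without T suggests RNA, otherwise DNA.
    if "U" in letters and "T" not in letters:
        return "RNA"
    return "DNA"


if __name__ == "__main__":
    for s in ("ACGTACGTTTGA", "ACGUACGUUUGA", "MKVLAAGIVGLLLAQ"):
        print(s, guess_alphabet(s))
```

A composition test like this can be fooled by short or low-complexity sequences, so treat its answer as a hint and not as a guarantee.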




Direct comments and questions to <eddy@genetics.wustl.edu>