HMMER
User's Guide
hmmpfam [options] hmmfile seqfile
hmmpfam reads a single sequence from seqfile and compares it against all the
HMMs in hmmfile, looking for significantly similar sequence matches.
hmmfile is looked for first in the current working directory, then in the
directory named by the environment variable HMMERDB. This lets administrators
install HMM libraries such as Pfam in a common location.
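The lookup order described above can be sketched in Python as follows. This is only an illustration of the documented behaviour, not HMMER's own code: the function name resolve_hmmfile is hypothetical, and HMMERDB is assumed to name a single directory, as the text states.

```python
import os
from typing import Mapping, Optional


def resolve_hmmfile(name: str, cwd: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the first existing path for `name`, or None if it is not found.

    Hypothetical helper: checks the working directory first, then the
    directory named by HMMERDB (assumed here to be a single directory).
    """
    candidates = [os.path.join(cwd, name)]
    hmmerdb = env.get("HMMERDB")
    if hmmerdb:
        candidates.append(os.path.join(hmmerdb, name))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

A call such as resolve_hmmfile("Pfam", os.getcwd(), os.environ) would mirror the search for an HMM library named Pfam; the file name is only an example.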
The output consists of three
sections: a ranked list of the best scoring HMMs, a list of the best scoring
domains in order of their occurrence in the sequence, and alignments for
all the best scoring domains. A sequence score may be higher than a domain
score for the same sequence if there is more than one domain in the sequence;
the sequence score takes into account all the domains. All sequences scoring
above the -E and -T cutoffs are shown in the first list, then every
domain found in this list is shown in the second list of domain hits. If
desired, E-value and bit score thresholds may also be applied to the domain
list using the -domE and -domT options.
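The two-stage thresholding described above can be sketched in Python. This is a simplified model under stated assumptions, not HMMER's implementation: the Hit class and filter_hits function are hypothetical, the defaults are the ones given in the option descriptions (E-value 10.0 and no bit score cutoff for the first list, no cutoffs for the domain list), and treating a hit exactly at a cutoff as passing is an assumption.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hit:
    model: str     # name of the HMM that produced the hit
    score: float   # bit score
    evalue: float  # E-value


def filter_hits(
    seq_hits: List[Hit],
    dom_hits: List[Hit],
    E: float = 10.0,
    T: float = -math.inf,
    domE: float = math.inf,
    domT: float = -math.inf,
) -> Tuple[List[Hit], List[Hit]]:
    """Apply the per-sequence cutoffs, then the per-domain cutoffs.

    Only domains belonging to an entry that survived the first list
    are considered for the second list.
    """
    kept_first = [h for h in seq_hits if h.evalue <= E and h.score >= T]
    kept_names = {h.model for h in kept_first}
    kept_domains = [
        d for d in dom_hits
        if d.model in kept_names and d.evalue <= domE and d.score >= domT
    ]
    return kept_first, kept_domains
```

With the default domain cutoffs, every domain of an entry in the first list also appears in the second, which is what keeps the two lists consistent.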
- [-h ] Print brief help; includes
version number and summary of all options, including expert options.
- [-n ] Specify that models and sequence are nucleic acid, not protein. Other
HMMER programs autodetect this, but because of the order in which hmmpfam
accesses data, it can't reliably determine the correct "alphabet" by itself.
- [-A <n> ] Limits the alignment output to the <n> best scoring domains. -A0 shuts
off the alignment output and can be used to reduce the size of output
files.
- [-E <x> ] Set the E-value cutoff for the per-sequence ranked hit list
to <x>, where <x> is a positive real number. The default is 10.0. Hits with
E-values better than (less than) this threshold will be shown.
- [-T <x> ] Set the bit score cutoff for the per-sequence ranked hit list to <x>,
where <x>
is a real number. The default is negative infinity; by default, the threshold
is controlled by E-value and not by bit score. Hits with bit scores better
than (greater than) this threshold will be shown.
- [-Z <n> ] Calculate the E-value scores as if we had seen a sequence database
of <n> sequences. The default is arbitrarily set to 59021, the size of
Swissprot 34.
- [-cpu <n> ] Sets the maximum number of CPUs that the program will run on. The
default is to use all CPUs in the machine. Overrides the HMMER_NCPU environment
variable. Only affects threaded versions of HMMER (the default on most
systems).
- [-domE <x> ] Set the E-value cutoff for the per-domain ranked hit list
to <x>, where <x> is a positive real number. The default is infinity; by default,
all domains in the sequences that passed the first threshold will be reported
in the second list, so that the number of domains reported in the per-sequence
list is consistent with the number that appear in the per-domain list.
- [-domT <x> ] Set the bit score cutoff for the per-domain ranked hit list to
<x>, where <x> is a real number. The default is negative infinity; by default,
all domains in the sequences that passed the first threshold will be reported
in the second list, so that the number of domains reported in the per-sequence
list is consistent with the number that appear in the per-domain list. Important
note: only one domain in a sequence is absolutely controlled by this parameter,
or by -domE. The second and subsequent domains in a sequence have a de
facto bit score threshold of 0 because of the details of how HMMER works.
HMMER requires at least one pass through the main model per sequence;
to do more than one pass (more than one domain) the multidomain alignment
must have a better score than the single domain alignment, and hence the
extra domains must contribute positive score. See the Users' Guide for more
detail.
- [-forward ] Use the Forward algorithm instead of the Viterbi algorithm
to determine the per-sequence scores. Per-domain scores are still determined
by the Viterbi algorithm. Some have argued that Forward is a more sensitive
algorithm for detecting remote sequence homologues; my experiments with
HMMER have not confirmed this, however.
- [-null2 ] Turn off the post hoc second null model. By default, each alignment
is rescored by a postprocessing
step that takes into account possible biased composition in either the
HMM or the target sequence. This is almost essential in database searches,
especially with local alignment models. There is a very small chance that
this postprocessing might remove real matches, and in these cases -null2
may improve sensitivity at the expense of reducing specificity by letting
biased composition hits through.
- [-pvm ] Run on a Parallel Virtual Machine (PVM). The PVM must already be
running. The client program hmmpfam-pvm must
be installed on all the PVM nodes. The HMM database hmmfile and an associated
GSI index file hmmfile.gsi must also be installed on all the PVM nodes.
(The GSI index is produced by the program hmmindex.) Because the PVM implementation
is I/O bound, it is highly recommended that each node have a local copy
of hmmfile rather than NFS mounting a shared copy. Optional PVM support
must have been compiled into HMMER for -pvm to function.
- [-xnu ] Turn on XNU filtering of target protein sequences. Has no effect on
nucleic acid
sequences. In trial experiments, -xnu appears to perform less well than
the default post hoc null2 model.
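As a rough illustration of how the options above combine on the command line, the following Python sketch assembles an argument list in the synopsis order (options, then hmmfile, then seqfile). The helper build_hmmpfam_argv and its parameter names are hypothetical, and only the single-letter options -n, -A, -E, -T and -Z from the list are covered.

```python
from typing import List, Optional


def build_hmmpfam_argv(
    hmmfile: str,
    seqfile: str,
    evalue: Optional[float] = None,        # -E <x>
    bitscore: Optional[float] = None,      # -T <x>
    max_alignments: Optional[int] = None,  # -A <n>; 0 turns alignments off
    dbsize: Optional[int] = None,          # -Z <n>
    nucleic: bool = False,                 # -n
) -> List[str]:
    """Build an argument list: hmmpfam [options] hmmfile seqfile."""
    argv = ["hmmpfam"]
    if nucleic:
        argv.append("-n")
    if max_alignments is not None:
        argv += ["-A", str(max_alignments)]
    if evalue is not None:
        argv += ["-E", str(evalue)]
    if bitscore is not None:
        argv += ["-T", str(bitscore)]
    if dbsize is not None:
        argv += ["-Z", str(dbsize)]
    # Positional arguments come last, as in the synopsis.
    argv += [hmmfile, seqfile]
    return argv
```

For example, build_hmmpfam_argv("Pfam", "query.fa", evalue=0.1, max_alignments=0) yields a list that could be passed to subprocess.run on a machine where hmmpfam is installed; the file names here are placeholders.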
Direct comments and questions to <eddy@genetics.wustl.edu>