The ModPipe configuration file
The configuration file, specified by the --conf_file command line
argument to many ModPipe programs, provides file locations, such as the
location of the template sequence and profile files, and the location of
ModPipe output files. The configuration file also provides a number of run
parameters, such as whether template sequences will be clustered before
building models. The variables in the configuration file are
described here.
Environment variables can be used in the configuration file using familiar
Unix syntax - e.g. $FOO is replaced with the value of the environment
variable FOO.
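As a rough illustration of this kind of substitution, the following Python sketch expands `$FOO` in a configuration value using the standard library. It is not ModPipe's own code: the `expand` helper and the example value of `FOO` are hypothetical, and how ModPipe treats undefined variables is not stated here (Python's `os.path.expandvars` leaves them unchanged).

```python
import os

# Hypothetical value, set here only so that the example is self-contained.
os.environ["FOO"] = "/scratch/modpipe"


def expand(value):
    """Replace $NAME and ${NAME} with the value of the environment variable.

    Variables that are not defined are left in place unchanged.
    """
    return os.path.expandvars(value)


expanded = expand("$FOO/tmp")
print(expanded)
```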
See also Databases used by ModPipe for information on the databases pointed to by this file, and information on setting up these databases to run ModPipe if you are not in the Sali lab.
See also a sample configuration file.
TMPDIR
    ModPipe will create a new directory here for every sequence it processes and will use this as scratch space for all calculations. This directory should be local to the machine running ModPipe in order to reduce network traffic.
DATDIR
    This is the base directory in which the ModPipe filesystem will be created.
TEMPLATESEQDB
    The name of the file of template sequences to use in searches for matches. Unless you have some special need and know what you are doing, you should use a binary (HDF5) database file; for example, /netapp/sali/ModPipe/database/PDB95/db/pdb_95.hdf5.
XPRF_LIST
    Name of the file containing a list of template profile (.prf) files, one for each template in the TEMPLATESEQDB database file.
XPRF_PSSMDB
    Name of the file containing position-specific scoring matrix data for each template sequence.
PDB_REPOSITORY
    The name of the directory containing PDB files.
NRSEQDB
    The name of the file of non-redundant sequences to use for construction of profiles by Modeller’s Profile.build(). Unless you have some special need and know what you are doing, you should use a binary (HDF5) database file; for example, /netapp/database/uniprot/sequences/uniprot90.hdf5.
NCBISEQDB
    The name of the file of non-redundant sequences to use for construction of profiles by PSI-BLAST.
NRDBTAG
    A short name for the non-redundant sequence database. ModPipe uses it as part of the name of profile files (multiple sequence alignments) constructed using that database. Usually it will be uniprot90.
PRFUPDATE
    If this flag is set to ON, a new profile is calculated for the target sequence even if one already exists. If PRFUPDATE is set to OFF, a new target-sequence profile is calculated only if one does not already exist.
NUMMODELS
    Number of models to calculate for each alignment.
CLUSTERALI
    When multiple fold assignment methods are used, they run independently and so will typically find a number of templates in common. Setting this variable to ON removes redundant and highly similar hits from the alignment of all templates found.
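The PRFUPDATE rule described above amounts to a simple decision, sketched here in Python. This is an illustration only, not ModPipe's implementation: the function name `needs_new_profile` and the profile path are hypothetical, and treating any value other than ON as OFF is an assumption.

```python
import os
import tempfile


def needs_new_profile(prfupdate, profile_path):
    """Return True if a target-sequence profile should be calculated.

    prfupdate    -- value of the PRFUPDATE variable, "ON" or "OFF"
    profile_path -- hypothetical location of the existing profile file
    """
    if prfupdate.upper() == "ON":
        # ON: always recalculate, whether or not a profile exists.
        return True
    # OFF (assumed for any other value): calculate only if none exists.
    return not os.path.exists(profile_path)


# Small demonstration using a temporary file as a stand-in for a profile.
with tempfile.TemporaryDirectory() as scratch:
    existing = os.path.join(scratch, "target-uniprot90.prf")
    with open(existing, "w") as handle:
        handle.write("placeholder\n")
    missing = os.path.join(scratch, "other-uniprot90.prf")

    results = {
        ("ON", "existing"): needs_new_profile("ON", existing),
        ("OFF", "existing"): needs_new_profile("OFF", existing),
        ("OFF", "missing"): needs_new_profile("OFF", missing),
    }

for key, value in results.items():
    print(key, value)
```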