I just wanted to ask a few questions...I am modelling loops on a GPCR.
My current script follows my questions.
1. Will I get better sampling by running 20 different runs (with
different random number seeds) that create 25 models each thus a total
of 500 models?
This should give virtually the same sampling as using a single random seed
and building all 500 models in one run. In difficult modeling cases
sampling becomes the issue, so the more models you build, the more likely
you are to sample the 'true' best model.
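For illustration, one of the 20 runs might look roughly like this. This is only a sketch, not your actual script: the alignment file name and the 'templ'/'targ' codes are placeholders, and it assumes the standard loopmodel class from Modeller's automodel module. (Modeller documents valid rand_seed values as the range -50000 to -2.)

```python
# Sketch only: requires the Modeller package.
# 'alignment.ali', 'templ' and 'targ' are placeholder names.
from modeller import environ
from modeller.automodel import loopmodel, refine

# Give each of the 20 runs its own seed, e.g. -1000, -1001, ...
env = environ(rand_seed=-1000)

a = loopmodel(env, alnfile='alignment.ali',
              knowns='templ', sequence='targ')
a.loop.starting_model = 1
a.loop.ending_model = 25          # 25 loop models per run; 20 runs -> 500 total
a.loop.md_level = refine.very_slow
a.make()
```

Since each run writes its own numbered output files, you would normally also vary the output root (or run each job in its own directory) so the 20 runs don't overwrite each other.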
2. Would setting dynamic_coulomb = True and relative_dielectric = 80
be the best way to simulate an aqueous environment for the loops?
No. The loop modeling potential is a statistical potential, and thus
implicitly includes solvation. It is not parameterized to work well with
electrostatics, to the best of my knowledge.
3. Is it possible to run the loop optimization with explicit
hydrogens? I guess my env.io.hydrogen = True does not actually set up
the calculation to use explicit hydrogens, but only tells Modeller to
read the hydrogens from my PDB file.
Correct. To build models with hydrogens, you'd need to load the
top_allh.lib topology library rather than top_heav.lib (the default).
The loop modeling potential is not parameterized for hydrogens though,
so this probably wouldn't work well anyway. You'd probably be better off
adding hydrogens to your models after optimization is complete.
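For reference, switching to the all-hydrogen topology would look roughly like the sketch below; as noted above, the loop modeling potential is not parameterized for hydrogens, so this is shown only to clarify what the topology switch involves.

```python
# Sketch only: requires the Modeller package.
from modeller import environ

env = environ()
# Read the all-atom topology (including hydrogens) instead of the
# default heavy-atom topology top_heav.lib:
env.libs.topology.read(file='$(LIB)/top_allh.lib')
env.libs.parameters.read(file='$(LIB)/par.lib')
```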
4. For the refinement level, I am using md_level = refine.very_slow.
For what size of system is refine.slow_large used?
slow_large uses a larger timestep (10 fs rather than 4 fs) so, as the name
suggests, it trades accuracy for speed on large systems; you are likely to
see more integrator error if you use it instead of very_slow.
5. Would I get better sampling by setting repeat_optimization greater
than 1?
This just repeats the optimization several times, so it might help to
avoid local minima.
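Setting it is a one-line change; a minimal sketch (again with placeholder file and sequence names, assuming the loopmodel class):

```python
# Sketch only: requires the Modeller package.
from modeller import environ
from modeller.automodel import loopmodel, refine

env = environ()
a = loopmodel(env, alnfile='alignment.ali',   # placeholder names
              knowns='templ', sequence='targ')
a.loop.md_level = refine.very_slow
a.repeat_optimization = 3    # repeat the whole optimization schedule 3 times
a.make()
```

Note that repeating the optimization multiplies the run time accordingly, so for pure sampling it is often cheaper to build more models instead.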