Hi everyone, here is the response I got from Jeff Hoch. Check it out if you
want to be in the loop.
In short, we'll cover introductions and probably go over some slides from
Jeff on Bayesian validation with respect to the PDB. If you won't be
attending, can you confirm so I can prepare a quick intro on your behalf, or
you can send me one?
Thanks!
- Jared
---------- Forwarded message ---------
From: Hoch,Jeffrey <hoch(a)uchc.edu>
Date: Thu, Jun 1, 2023 at 5:54 AM
Subject: Re: First meeting on Bayesian model validation (05/26)
To: Jared Sagendorf <jared.sagendorf(a)rcsb.org>
Cc: Baskaran,Kumaran <baskaran(a)uchc.edu>, Gryk,Michael R. <gryk(a)uchc.edu>,
Eghbalnia,Hamid R. <heghbalnia(a)uchc.edu>, Pustovalova,Yulia <
ypustovalova(a)uchc.edu>, Pozhidaeva,Alexandra <pozhidaeva(a)uchc.edu>,
Courtney,Joseph M. <jcourtney(a)uchc.edu>
Hi Jared –
I love your “few things”! Those are a mouthful. I’ll introduce our team
members, and it would be great if you could summarize the discussions that
you had in your initial meeting. We can offer an NMR perspective on the
topics. As this will be our first all-hands meeting, I might share a small
slide deck that I used to make the case to the wwPDB PIs that there is a
need and an opportunity to make structure validation “more Bayesian”. I
think I shared some of those slides with you all when I visited, but the
deck is short and would be a good way of getting us all on the same
wavelength (although it’s pretty clear we’re very close, if not already
there).
I’m attaching a manuscript that’s currently undergoing final revisions.
Please share it among your group, but not outside. A bit of context – when
I took over as head of BMRB, there was already a validation task force
working on revamping the validation pipeline for NMR structures.
Unfortunately, although the effort is/was very well-intentioned, it retains
much ad hoc and archaic language, e.g. “violations” of NMR “restraints” are
dealt with in a way that simply isn’t consistent or applicable in a broader
sense to any other type of empirical data. Rather than move the goalposts
on the task force, I suggested they complete their work to achieve the
original goal, and we would start a Bayesian initiative afresh. It serves
to highlight some of the issues we will have to deal with – such as how do
you convert hard upper and lower distance bounds into something that can
yield a realistic distribution of errors/structures? This will be necessary
for retrospective analysis of NMR structures in the PDB because distance
bounds are in most cases all that was supplied by depositors. Going
forward, BMRB will need to require peak tables with intensities of NOESY
cross-peaks, or perhaps even raw time-domain data for NOESY experiments.
I’m cc’ing the rest of our team so you can capture their email addresses if
you haven’t already. They are
Kumaran Baskaran – BMRB liaison to wwPDB and BMRB representative of the
NMR VTF
Michael Gryk – associate director of BMRB and our bona fide data scientist
Hamid Eghbalnia – lead of the analytics technology development component of
the NMRbox P41 grant, and our bona fide statistician/Bayesian
Yulia Pustovalova and Sasha Pozhidaeva – NMR spectroscopists par excellence,
who have been utilizing AlphaFold in their workflows and trying out ways to
validate computed structures based on prior knowledge
Joseph Courtney – NMR spectroscopist and developer of the COMPASS package
from Chad Rienstra’s group – COMPASS used MODELLER, chemical shift
prediction, and integrated some other software packages to determine
protein structures from unassigned carbon-carbon correlation spectra
(solid-state NMR). The use of forward-modeling of chemical shifts and
unassigned peak lists to drive/constrain the structure determination was
ahead of its time and very pertinent to Bayesian validation.
Looking forward to seeing you tomorrow.
Yours, Jeff
From: Jared Sagendorf <jared.sagendorf(a)rcsb.org>
Date: Wednesday, May 31, 2023 at 2:01 PM
To: "Hoch,Jeffrey" <hoch(a)uchc.edu>
Subject: Re: First meeting on Bayesian model validation (05/26)
Hi Jeff, just looping back to this! A few things that came up during the
initial meeting with Andrej:
- Development of improved metrics for model quality
- Standardization of priors, forward functions, and likelihoods
- Establishment of a standardized vocabulary
- How to get different communities involved in all of the above decisions
In addition to any of the above, I'd be keen to learn more about what your
group has been working on w.r.t. model validation, Bayesian or otherwise!
Let me know if you have any thoughts!
- Jared