- Bayesian validation

Fwd: First meeting on Bayesian model validation (05/26)
by Jared Sagendorf 01 Jun '23


Hi everyone,

Here is the response I got from Jeff Hoch. Check it out if you want to be in the loop. In short, we'll cover introductions and probably go over some slides from Jeff on Bayesian validation with respect to the PDB. If you won't be attending, can you confirm so I can prepare a quick intro on your behalf, or you can send me one. Thanks!

- Jared

---------- Forwarded message ---------
From: Hoch,Jeffrey <hoch(a)uchc.edu>
Date: Thu, Jun 1, 2023 at 5:54 AM
Subject: Re: First meeting on Bayesian model validation (05/26)
To: Jared Sagendorf <jared.sagendorf(a)rcsb.org>
Cc: Baskaran,Kumaran <baskaran(a)uchc.edu>, Gryk,Michael R. <gryk(a)uchc.edu>, Eghbalnia,Hamid R. <heghbalnia(a)uchc.edu>, Pustovalova,Yulia <ypustovalova(a)uchc.edu>, Pozhidaeva,Alexandra <pozhidaeva(a)uchc.edu>, Courtney,Joseph M. <jcourtney(a)uchc.edu>

Hi Jared,

I love your “few things”! Those are a mouthful. I’ll introduce our team members, and it would be great if you could summarize the discussions that you had in your initial meeting. We can offer an NMR perspective on the topics.

As this will be our first all-hands meeting, I might share a small slide deck that I used to make the case to the wwPDB PIs that there is a need and an opportunity to make structure validation “more Bayesian”. I think I shared some of those slides with you all when I visited, but the deck is short and would be a good way of getting us all on the same wavelength (although it’s pretty clear we’re very close, if not already there).

I’m attaching a manuscript that’s currently undergoing final revisions. Please share it among your group, but not outside.

A bit of context: when I took over as head of BMRB, there was already a validation task force working on revamping the validation pipeline for NMR structures. Unfortunately, although the effort is/was very well-intentioned, it retains much ad hoc and archaic language, e.g. “violations” of NMR “restraints” are dealt with in a way that simply isn’t consistent or applicable in a broader sense to any other type of empirical data. Rather than move the goalposts on the task force, I suggested they complete their work to achieve the original goal, and we would start a Bayesian initiative afresh.

It serves to highlight some of the issues we will have to deal with, such as how do you convert hard upper and lower distance bounds into something that can yield a realistic distribution of errors/structures? This will be necessary for retrospective analysis of NMR structures in the PDB because distance bounds are in most cases all that was supplied by depositors. Going forward, BMRB will need to require peak tables with intensities of NOESY cross-peaks, or perhaps even raw time-domain data for NOESY experiments.

I’m cc’ing the rest of our team so you can capture their email addresses if you haven’t already. They are:

- Kumaran Baskaran: BMRB liaison to wwPDB and BMRB representative of the NMR VTF
- Michael Gryk: associate director of BMRB and our bona fide data scientist
- Hamid Eghbalnia: lead of the analytics technology development component of the NMRbox P41 grant, and our bona fide statistician/Bayesian
- Yulia Pustovalova and Sasha Pozhidaeva: NMR spectroscopists par excellence, who have been utilizing AlphaFold in their workflows and trying out ways to validate computed structures based on prior knowledge
- Joseph Courtney: NMR spectroscopist and developer of the COMPASS package from Chad Rienstra’s group. COMPASS used MODELER, chemical shift prediction, and integrated some other software packages to determine protein structures from unassigned carbon-carbon correlation spectra (solid-state NMR). The use of forward-modeling of chemical shifts and unassigned peak lists to drive/constrain the structure determination was ahead of its time and very pertinent to Bayesian validation.

Looking forward to seeing you tomorrow.

Yours,
Jeff

From: Jared Sagendorf <jared.sagendorf(a)rcsb.org>
Date: Wednesday, May 31, 2023 at 2:01 PM
To: "Hoch,Jeffrey" <hoch(a)uchc.edu>
Subject: Re: First meeting on Bayesian model validation (05/26)

Hi Jeff, just looping back to this! A few things that came up during the initial meeting with Andrej:

- Development of improved metrics for model quality
- Standardization of priors, forward functions, and likelihoods
- Establishment of a standardized vocabulary
- How to get different communities involved in all of the above decisions

In addition to any of the above, I'd be keen to learn more about what your group has been working on w.r.t. model validation, Bayesian or otherwise! Let me know if you have any thoughts!

- Jared


Meeting with Jeff Hoch's group moved to June 2nd
by Jared Sagendorf 22 May '23


Hi all,

There was a schedule conflict on Jeff's end, so we are moving the first meeting with his group to Friday, June 2nd at 10:00 AM PST. This is outside our normal bi-weekly schedule, so apologies if this creates new time conflicts on our end!

I'd still like to keep this week's regularly scheduled meeting, however, and I was thinking an overview of some current (probably non-Bayesian) methods for model or data validation would be useful. Would anyone like to volunteer to present some of their past or current work on model or data validation? Or perhaps review what is currently being done in the PDB/BMRB/EMDB? For any non-Bayesian methods, a discussion could then follow of what would be possible with a Bayesian approach, and any related trade-offs.

Let me know if you can present some material! Otherwise I'll pester people individually.

- Jared


Subgroup Meeting tomorrow
by Jared Sagendorf 11 May '23


Hi everyone,

Just a reminder that we're meeting tomorrow at 10 AM PST. I put together some slides and an IPython notebook with a brief, high-level overview of Bayesian inference, and I'll review the following paper:

https://dx.doi.org/10.1021/acs.jpca.0c05026

I chose it because I found it a nice application of Bayesian methods in a setting that is close to our work, but perhaps one that most people are less familiar with (me especially).

I created a shared Google Drive so we can all share materials from these meetings, but I wasn't able to invite people with non-Gmail addresses, so let me know if you want access and I'll try to figure it out.

In addition, we can discuss any agenda we'd like to set for the first meeting with Jeff Hoch's group, which will be on the 26th of May. Looking forward to the discussion!

- Jared
