- modeller_usage - salilab.org

Re: 0.25 backbone RMSD/2.9 heavy atoms
by Jianhui Wu 20 May '03

Hi Shiyong,

Sure. The RMSDs are from the least-squares (LS) superimposed structures. Using the crystal structure of the same protein as the template (so the alignment is perfect), the backbone RMSD is excellent (under 0.3 angstroms), but when all heavy atoms are selected the RMSD jumps to around 3.0 angstroms. Only 10 models (instead of 50) were generated, so this is an 'initial' result. Still, 3.0 angstroms is really surprising given the perfect template. The only parameter I changed was MD_level; all other parameters were left at their defaults. Perhaps this result is what we should expect?

Best wishes,
Jian Hui

0.25 backbone RMSD/2.9 heavy atoms
by Jianhui Wu 20 May '03

Dear Modeller users,

I am using Modeller6.2. To test the quality of the model, I tried to build a model for a protein (300 aa) using its own crystal structure as the template. With MD_level = refine_1 or refine_3, the backbone RMSD is 0.25-0.3 angstroms, which is great. However, under both refinement conditions, the RMSD over the heavy atoms of the backbone plus sidechains is close to 3.0 angstroms. Visual inspection of the superimposed structures confirms that the sidechains of the model do not overlap well with those of the crystal structure.

My questions: Have you observed a similar result? How do you refine the sidechains? I have in mind an MD simulation of the sidechains with the backbone restrained (in explicit water). I would appreciate your suggestions and experience.

Regards,
Jian Hui Wu
Lady Davis Institute
McGill University
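For readers who want to reproduce the backbone versus heavy-atom comparison described above, here is a minimal Python sketch. It is not part of MODELLER. It assumes the two structures are already superimposed and that each is held as a dictionary mapping (residue number, atom name) to coordinates; the function names and the rule "a name starting with H is a hydrogen" are illustrative assumptions.

```python
import math

# Assumption: standard PDB backbone atom names.
BACKBONE = {"N", "CA", "C", "O"}


def is_backbone(atom_name):
    return atom_name in BACKBONE


def is_heavy(atom_name):
    # Assumption: hydrogen atom names start with "H"; adapt for other naming schemes.
    return not atom_name.startswith("H")


def rmsd(model, reference, atom_filter=None):
    """RMSD over atoms present in both structures (no superposition is done here).

    model, reference: dict mapping (residue_number, atom_name) -> (x, y, z).
    atom_filter: optional predicate on the atom name selecting the subset.
    """
    keys = [
        k for k in model
        if k in reference and (atom_filter is None or atom_filter(k[1]))
    ]
    if not keys:
        raise ValueError("no common atoms selected")
    total = sum(math.dist(model[k], reference[k]) ** 2 for k in keys)
    return math.sqrt(total / len(keys))
```

Calling rmsd(model, crystal, is_backbone) and rmsd(model, crystal, is_heavy) on the same pair gives the two numbers being compared; a large gap between them points at sidechain placement rather than at the fold.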

info log file
by Angelo Favia 20 May '03

I find this message in the log file:

Implied target CA(i)-CA(i+1) distances longer than 8.0 angstroms:

ALN_POS  TMPL  RID1  RID2  NAM1  NAM2     DIST
----------------------------------------------
    424     1   424   425     R     Y   16.817
END OF TABLE

Is it a big problem? Can I ignore it?

Thanks in advance.
Angelo Favia
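To illustrate the kind of check behind this message, here is a small Python sketch that flags consecutive CA atoms lying further apart than a cutoff. It is not MODELLER's own code, and it works on a plain list of coordinates rather than on an alignment; the function name and the input layout are assumptions made for the example.

```python
import math


def long_ca_gaps(ca_coords, cutoff=8.0):
    """Return (res_i, res_j, distance) for consecutive CA atoms further apart than cutoff.

    ca_coords: list of (residue_number, (x, y, z)) in chain order.
    """
    gaps = []
    for (res_i, pos_i), (res_j, pos_j) in zip(ca_coords, ca_coords[1:]):
        distance = math.dist(pos_i, pos_j)
        if distance > cutoff:
            gaps.append((res_i, res_j, distance))
    return gaps
```

Consecutive CA atoms in a continuous chain are normally about 3.8 angstroms apart, so a value such as 16.817 usually indicates that the two residues are not actually adjacent in the structure used at that position.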

D aminoacids
by Angelo Favia 20 May '03

How can I prevent the formation of D-amino acids? I'm trying to build a protein with a low degree of homology (25%) with respect to the template. Is this the reason why I find so many D-amino acids in the final models?

Angelo Favia
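One way to count the affected residues in a finished model is to test the handedness at each CA atom. The Python sketch below is not a MODELLER feature. It uses the sign of the triple product of the CA->N, CA->C and CA->CB vectors, which is positive for L-amino acids (this matches the CORN rule); the function names are made up for the example, and glycine, which has no CB, has to be skipped by the caller.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def ca_chirality(n, ca, c, cb):
    """Return "L" or "D" from the N, CA, C and CB coordinates of one residue.

    The signed volume (N-CA) . ((C-CA) x (CB-CA)) is positive for L-amino acids.
    """
    volume = _dot(_sub(n, ca), _cross(_sub(c, ca), _sub(cb, ca)))
    return "L" if volume > 0 else "D"
```

Running this over every non-glycine residue of each model gives a quick count of D residues per model, which makes it easier to see whether they cluster in poorly aligned regions.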

How model two o more alignment automatic
by JP 15 May '03

I have two alignments. How can I automate the modeling process for two or more alignments stored in a single file? For example, I have these two alignments in one file, "alignment.ali":

C; Alignment1
>P1;1efwA
structureX:1efw:162:A:300:A::::
GFVQVETPFLTKSTPEGARDFLVPYR---HEPG----------------LFYALPQSPQL
FKQMLMVAG-LDRYFQIARCFRDEDLRADRQPDFTQLDLEMSFVE--VE-----------
------DVLELNERLMAHVFREALGVELPLPFPRLSYEEAMERYGSDKPDLRFGLELK*
>P1;whi6464
sequence:whi6464:9::186:::::
GFVEVETPVLLKSTPEGAREFLVPTRTSASAPSVKSSIGGGPRESSSQPLFYALPQSPQQ
PKQLLIASGAVDRYYQIAKCFRDEDGRKDRQPEFTQVDLEMAWVSWGIEPSNCSQGDHNV
WRIGGKEVREIIERLIRKIWSTVEGIELPSSFTVMTYEEAMGRFGSDKPDTRFGLEVR*

C; Alignment2
>P1;1cqjD
structureX:1cqj:132:D:284:D::::
CKI--GIQPGHIHKPGKVGIVSRSGTLTYEAVKQTTDYGFGQSTCVGIGGDPIPGSNFID
ILEMFEKDPQTEAIVMIGEIGGSAEEEAAAYIKEH-VTKPVVGYIAGVTAPKGK---RMG
HAGAIIAGGKGTADEKFAALEAAGVKTVRSLADIGEALK*
>P1;whi6880
sequence:whi6880:12::170:::::
CSMMDNIIASKLYRPGSVGYVSKSGGMSNELNNILSLVTNGTYEGIAIGGDRYPGSTFID
HLLRYENDPECKMLVLLGEVGGIEEYRVIEAVKKGLIKKPIVAWAIGTCAKMFTTEVQFG
HAGSMANSDMETADAKNAAMRKAGFIVPDTFEDLPQVLR*

And what should the configuration of the "TOP" file be?

Thanks
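One possible first step, sketched here in Python outside of MODELLER, is to split the combined file into one block per alignment so that each block can be handled by its own run. The sketch assumes that every alignment starts with a "C;" comment line, as in the file above; the function name and the returned dictionary layout are illustrative only, and the TOP side of the automation is not covered.

```python
def split_alignments(text):
    """Split PIR-format text into blocks, one per "C;" comment line.

    Returns a list of dicts with the block title, the alignment codes
    (taken from the ">P1;" lines) and the raw lines of the block.
    """
    blocks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("C;"):
            # A comment line opens a new alignment block.
            blocks.append({"title": line[2:].strip(), "codes": [], "lines": []})
            continue
        if not blocks:
            # Entries that appear before any "C;" line go into an untitled block.
            blocks.append({"title": "", "codes": [], "lines": []})
        if line.startswith(">P1;"):
            blocks[-1]["codes"].append(line[4:].strip())
        blocks[-1]["lines"].append(line)
    return blocks
```

Each block's lines can then be written to a separate .ali file, and the codes list tells which entry is the template and which is the target for that run.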

bad pdb file error
by Mike Kurtz 12 May '03

Sorry for the bother, but I can't figure this out for the life of me...

I'm attempting to do a 2D alignment of the R2R3 region of AtTT2 with the MYB 1MSF. However, when I try to run the align2d command (script below), it says that I am using a malformed PDB file. I am certain that is not the case, as I just recently got the file directly from the PDB. If anyone could help me out with this, it would be much appreciated.

____
<every ATOM definition in the pdb file, same as below>
iup2crm_279W> IUPAC atom not found in topology library; residue type index: C4 25
              Possible reasons are a non-standard PDB file, or a new residue/atom types.
              Compare the offending residue in the PDB file with its definition in the topology library.
fndatmi_285W> Number of residues <> number of atoms; atom code: 127 105 CA
fndatmi_285W> Number of residues <> number of atoms; atom code: 127 105 CA
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL6v2}/modlib//as1.sim.mat
rdrrwgh_268_> Number of residue types: 20
dispers_247E> Internal error: 126 127
recover____E> ERROR_STATUS >= STOP_ON_ERROR: 1 1

Dynamically allocated memory at finish [B,kB,MB]: 5936909 5797.763 5.662
Starting time : 2003/05/08 17:50:17.028
Closing time : 2003/05/08 17:50:26.897
Total CPU time [seconds] : 0.00
_____

The .top:

SET OUTPUT_CONTROL 1 1 1 1 1
READ_TOPOLOGY FILE = '/home/osu3233/bin/modeller6v2/modlib/top_allh.lib'
READ_MODEL FILE = '1MSF'
SEQUENCE_TO_ALI ALIGN_CODES = '1MSF' ALIGN_CODES = '1MSF'
#GENERATE_TOPOLOGY SEQUENCE = '1MSF'
#SEQUENCE_TO_ALI ALIGN_CODES = '1MSF'
READ_ALIGNMENT FILE = 'test.ali', ALIGN_CODES = 'AtTT2', ADD_SEQUENCE = ON
ALIGN2D
WRITE_ALIGNMENT FILE='AtTT2-1msf.ali', ALIGNMENT_FORMAT='PIR'
WRITE_ALIGNMENT FILE='AtTT2-1msf.pap', ALIGNMENT_FORMAT='PAP'

Thanks,
Mike Kurtz
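One possible reading of the "Number of residues <> number of atoms; atom code: 127 105 CA" warning is that some residues in the ATOM records have no CA atom, for example nucleotides or other non-protein residues. That reading is an assumption, not something the log states. The Python sketch below, which is independent of MODELLER, lists such residues using the fixed-column layout of PDB ATOM records; the function name is made up for the example.

```python
def residues_without_ca(pdb_lines):
    """List residues from ATOM records that contain no atom named CA.

    Uses the fixed PDB columns: atom name 13-16, residue name 18-20,
    chain 22, residue number plus insertion code 23-27.
    Returns (chain, residue_number, residue_name) tuples in file order.
    """
    atoms_by_residue = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        residue = (line[21], line[22:27].strip(), line[17:20].strip())
        atoms_by_residue.setdefault(residue, set()).add(line[12:16].strip())
    return [res for res, atoms in atoms_by_residue.items() if "CA" not in atoms]
```

If the list is not empty, comparing the reported residues with the region actually being aligned shows whether the file needs to be trimmed to the protein chain before the alignment step.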

Re: modeling question
by Modeller Care 12 May '03

forwarded by the list owner
--------------------------------------------------------
Hi,

For question 1, I think it is normal and expected that a model, even if built on a template with 100% sequence identity, will differ somewhat from the experimental structure, although the difference should not go beyond, say, 0.5 Å, and certainly not beyond 1.0 Å RMSD. That is below the "experimental error": if the same protein is solved experimentally in different crystal forms, or at different resolutions, or at high resolution once by X-ray and once by NMR, you will still see a difference of approximately <1 Å RMSD among the structures. So it is nothing special that your model is not exactly identical to the experimental one. For a reference, you can look up figure 6 (and the accompanying text) in chapter 7 (pp. 167-206) of the book Protein Structure (determination, analysis and applications for drug discovery), editor: DI Chasman, 2003, Marcel Dekker.

Question 2: it is a very interesting and useful survey that you did. Unfortunately, it is difficult to generalize, because in each modeling case the set of available templates (their sequence identity to the target and their structural variability with respect to each other) is different. However, your "assay" is nearly exhaustive within your specific experiment, so you are certainly in a position to make a point. Of course, the best approach would be to verify the best "assay" against the actual experimental structures instead of PROCHECK or other programs: e.g., re-model your protein A without the 100% identical template and explore the same question you did for protein B. In this case you can compare your resulting models with the actual X-ray structure.

Andras

On Mon, 2003-05-12 at 15:52, Douglas Kojetin wrote:
> please see the message, originally directed towards dr. sali, below.
>
> if anyone has any comments, please send them!
>
> many thanks,
> doug kojetin
>
> Begin forwarded message:
>
> > Dr. Sali:
> >
> > I am a graduate student in the Department of Molecular and Structural
> > Biochemistry at North Carolina State University. I have a question
> > more about modeling process itself rather than the program MODELLER.
> >
> > I have used your program, MODELLER, to create models of a subfamily of
> > proteins our lab and collaborators are interested in (total ~ 30).
> > There are approximately 10 solved structures to the domain of
> > interest. One of these solved structures (structure A) is in the same
> > subfamily within the same species of proteins we are modeling (model
> > A), whereas the other 29 proteins are of unknown solved structure. My
> > question concerning the use of templates in the modeling process.
> >
> > ##############
> > my main question
> > ##############
> >
> > (if this is confusing, please let me know and i will rephrase) ...
> >
> > Would using a solved structure (structure A) to model a protein of
> > exact sequence (model A) which will be used in a comparison of 29
> > other structures with no known structures (and lower 'homology'
> > compared to that of structure A to model A -- which is 100%) bias
> > model A? Overall, we are interested in comparing all 30 structures.
> > This comes mostly from outside comments that our modeled protein does
> > not look 'exactly' like the solved structure. As one would like it to
> > look as close as possible to the solved structure, it is a model after
> > all, and perhaps we just need to be more descriptive in explaining our
> > results, especially pertaining to this specific model.
> >
> > #####################
> > how i modeled the proteins
> > #####################
> >
> > I performed a 'modeling parameter assay' to find the number of
> > templates to use to model a protein (model B), ranging from 1 to ~8
> > templates. In addition, I 'assayed' the amount of refinement to use.
> >
> > Overall, I had an assay 'shaped' like a matrix with, for example,
> > refinement across the top and # of templates going down. I produced 50
> > models for each and ran a variety of analyses on the models (including
> > Ca RMSD to the most homologous protein, ERRAT, PROCHECK, etc) and
> > computed the average 'value' output from the respective analyses.
> >
> > All in all, using four (4) templates and a refinement value of 1
> > produced the 'stereochemically best' models.
> >
> > I applied the same rationale to another protein of interest (model C),
> > and the same trends were extrapolated.
> >
> > question
> > --> is this rationale 'acceptable'? or how would you do something
> > similar?
> >
> > Many thanks for your input, and I'm sorry for the long-winded email.
> >
> > Douglas Kojetin

--
Andras Fiser, PhD, assistant professor
Dept. of Biochemistry & Seaver Center for Bioinformatics
Albert Einstein College of Medicine
1300 Morris Park Ave, Bronx, NY 10461, USA
phone: (718)430-3233 fax: (718)430-8565
http://www.fiserlab.org, mailto:andras@fiserlab.org

------ End of Forwarded Message

modeling question
by Douglas Kojetin 12 May '03

please see the message, originally directed towards dr. sali, below.

if anyone has any comments, please send them!

many thanks,
doug kojetin

Begin forwarded message:

> Dr. Sali:
>
> I am a graduate student in the Department of Molecular and Structural
> Biochemistry at North Carolina State University. I have a question
> more about modeling process itself rather than the program MODELLER.
>
> I have used your program, MODELLER, to create models of a subfamily of
> proteins our lab and collaborators are interested in (total ~ 30).
> There are approximately 10 solved structures to the domain of
> interest. One of these solved structures (structure A) is in the same
> subfamily within the same species of proteins we are modeling (model
> A), whereas the other 29 proteins are of unknown solved structure. My
> question concerning the use of templates in the modeling process.
>
> ##############
> my main question
> ##############
>
> (if this is confusing, please let me know and i will rephrase) ...
>
> Would using a solved structure (structure A) to model a protein of
> exact sequence (model A) which will be used in a comparison of 29
> other structures with no known structures (and lower 'homology'
> compared to that of structure A to model A -- which is 100%) bias
> model A? Overall, we are interested in comparing all 30 structures.
> This comes mostly from outside comments that our modeled protein does
> not look 'exactly' like the solved structure. As one would like it to
> look as close as possible to the solved structure, it is a model after
> all, and perhaps we just need to be more descriptive in explaining our
> results, especially pertaining to this specific model.
>
> #####################
> how i modeled the proteins
> #####################
>
> I performed a 'modeling parameter assay' to find the number of
> templates to use to model a protein (model B), ranging from 1 to ~8
> templates. In addition, I 'assayed' the amount of refinement to use.
>
> Overall, I had an assay 'shaped' like a matrix with, for example,
> refinement across the top and # of templates going down. I produced 50
> models for each and ran a variety of analyses on the models (including
> Ca RMSD to the most homologous protein, ERRAT, PROCHECK, etc) and
> computed the average 'value' output from the respective analyses.
>
> All in all, using four (4) templates and a refinement value of 1
> produced the 'stereochemically best' models.
>
> I applied the same rationale to another protein of interest (model C),
> and the same trends were extrapolated.
>
> question
> --> is this rationale 'acceptable'? or how would you do something
> similar?
>
> Many thanks for your input, and I'm sorry for the long-winded email.
>
> Douglas Kojetin

malign3D
by Modeller Care 07 May '03

Message forwarded by list-owner (log file deleted because of an encoding error)
---------------------------------------------------
Dear sir,

I would be very grateful if you could help me with one of my Modeller jobs. I am trying to superimpose several structures simultaneously using MALIGN3D. My TOP script file and alignment file are given below; I had attached the log file as well. The structures have ~410 residues each, with a heme and a bound ligand. I always fail to get the structural alignment, and I cannot figure out the reason. Please help me run this program, and also tell me the general steps to follow for structural alignments and where errors are most likely to occur.

###########################TOP SCRIPT##################
SET OUTPUT_CONTROL = 1 1 1 1 2
SET STOP_ON_ERROR = 1
SET MAXRES = 2000
READ_ALIGNMENT FILE = 'str.ali'
SEQUENCE_TO_ALI
SET ALIGN_CODES = '1dz8' '2cpp'
SET ATOM_FILES = ALIGN_CODES
SET ATOM_FILES_DIRECTORY = './'
MALIGN3D
SET ADD_SEQUENCE = on
SET CURRENT_DIRECTORY = on
SET OUTPUT = 'LONG'
SET GAP_PENALTIES_3D = 0.0 1.75
SET FIT_ATOMS = 'CA'
SET WRITE_FIT = on
SET WRITE_WHOLE_PDB = on
WRITE_ALIGNMENT FILE = 'str_STR.pir'
SET ALIGNMENT_FORMAT = 'PIR'
#######################ALIGNMENT FILE######################
>P1;1dz8
structureX:1dz8: 11 :A: 414 :A:undefined:undefined: 9.99: 9.99
-----LAPLPPHVPEHLVFDFDMYNPSNLSAGVQEAWAVLQESNVPDLVWTRCNGGHWIA
TRGQLIREAYEDYRHFSSECPFIPREAGEAYDFIPTSM---DPPEQRQFRALANQVVGMP
VVDKLENRIQELACSLIESLRPQGQCNFTEDYAEPFPIRIFMLLAG--LPEEDIPHL-KY
LTDQMT-----------RPDGS-MTFAEAKEALYDYLIPIIEQRRQKPGTD-----AISI
VANGQVNGR--PITSDEAKRMCGLLLVGGLDTVVNFLSFSMEFLAKSPEHRQELIQRPER
IP------------------AACEELLRRFS-LVADGRILTSDYEFHGVQLKKGDQILLP
QMLSGLDERENACPMHVDFSRQKVS----------HTTFGHGSHLCLGQHLARREIIVTL
KEWLTRIPDFSIAPGAQ--IQHKSGIVSGVQALPLVWDPATTKAV*
>P1;2cpp
structureX:2cpp: 10 : : 414 : :undefined:undefined: 1.63: 9.99
----NLAPLPPHVPEHLVFDFDMYNPSNLSAGVQEAWAVLQESNVPDLVWTRCNGGHWIA
TRGQLIREAYEDYRHFSSECPFIPREAGEAYDFIPTSM---DPPEQRQFRALANQVVGMP
VVDKLENRIQELACSLIESLRPQGQCNFTEDYAEPFPIRIFMLLAG--LPEEDIPHL-KY
LTDQMT-----------RPDGS-MTFAEAKEALYDYLIPIIEQRRQKPGTD-----AISI
VANGQVNGR--PITSDEAKRMCGLLLVGGLDTVVNFLSFSMEFLAKSPEHRQELIERPER
IP------------------AACEELLRRFS-LVADGRILTSDYEFHGVQLKKGDQILLP
QMLSGLDERENACPMHVDFSRQKVS----------HTTFGHGSHLCLGQHLARREIIVTL
KEWLTRIPDFSIAPGAQ--IQHKSGIVSGVQALPLVWDPATTKAV*
##########################################################

Thanking you,
sridhar

------ End of Forwarded Message
