CoreRefine

This script generates a mutations.resfile file that allows better core packing of a protein.

DESCRIPTION:

This script prints a mutations.resfile file that allows better core packing of a protein. The core is identified from the SASA (solvent-accessible surface area) of each amino acid within the protein. Using the resulting resfile with RosettaDesign should give a better-packed protein core.
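As a rough illustration of the SASA-based layer assignment described above, the sketch below reduces a DSSP secondary-structure code to loop/helix/sheet and assigns a layer from a relative SASA value. The cutoff values (0.20 and 0.40) and the function names are hypothetical placeholders, not the values used by the script; see the Koga et al. reference for the actual layer parameters.

```python
# Hypothetical cutoffs on relative SASA (0.0 to 1.0); NOT taken from the script.
CORE_MAX = 0.20
BOUNDARY_MAX = 0.40

HELIX_CODES = {"H", "G", "I"}  # DSSP helix states
SHEET_CODES = {"E", "B"}       # DSSP strand and bridge states


def simplify_ss(dssp_code):
    """Reduce a DSSP code to 'helix', 'sheet' or 'loop' (a common 3-state reduction)."""
    if dssp_code in HELIX_CODES:
        return "helix"
    if dssp_code in SHEET_CODES:
        return "sheet"
    return "loop"


def classify_layer(rel_sasa, core_max=CORE_MAX, boundary_max=BOUNDARY_MAX):
    """Assign 'core', 'boundary' or 'surface' from a relative SASA value."""
    if rel_sasa <= core_max:
        return "core"
    if rel_sasa <= boundary_max:
        return "boundary"
    return "surface"


if __name__ == "__main__":
    # A buried helical residue and an exposed loop residue (made-up values).
    print(simplify_ss("H"), classify_layer(0.05))
    print(simplify_ss("T"), classify_layer(0.85))
```

A single pair of cutoffs is used here for brevity; a layer definition that depends on the secondary structure would take the simplified state as an extra argument.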

Combining the information from references (2, 3, 8, 9), I found that the amino acids chosen below satisfy all of them, so the mutations.resfile code is chosen as follows:

For the core:
- If a core residue is part of a loop, mutate it to one of the following amino acids: AVILPFWM
- If a core residue is part of a helix, mutate it to one of the following amino acids: AVILFWM
- If a core residue is part of a sheet, mutate it to one of the following amino acids: AVILFWM

For the boundary (CoreBoundRefine.py):
- If a boundary residue is part of a loop, mutate it to one of the following amino acids: AVILFYWGNQSTPDEKR
- If a boundary residue is part of a helix, mutate it to one of the following amino acids: AVILWQEKFM
- If a boundary residue is part of a sheet, mutate it to one of the following amino acids: AVILFYWQTM
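The two tables above can be expressed as a lookup keyed by layer and secondary structure. The following is a minimal sketch, not the script's own code: the function names, the NATRO header, and the assumption that the _noA variant simply drops alanine from each set are all illustrative. PIKAA is the standard Rosetta resfile command for restricting a position to a given set of amino acids.

```python
# Allowed amino acids per (layer, secondary structure), copied from the lists above.
ALLOWED = {
    ("core", "loop"): "AVILPFWM",
    ("core", "helix"): "AVILFWM",
    ("core", "sheet"): "AVILFWM",
    ("boundary", "loop"): "AVILFYWGNQSTPDEKR",
    ("boundary", "helix"): "AVILWQEKFM",
    ("boundary", "sheet"): "AVILFYWQTM",
}


def resfile_line(resnum, chain, layer, ss, include_alanine=True):
    """Return a PIKAA line for one residue, or None if the layer is not refined."""
    allowed = ALLOWED.get((layer, ss))
    if allowed is None:
        return None
    if not include_alanine:
        # Assumption: the _noA variant removes 'A' from each set.
        allowed = allowed.replace("A", "")
    return f"{resnum} {chain} PIKAA {allowed}"


def build_resfile(residues, include_alanine=True):
    """Build resfile text from (resnum, chain, layer, ss) tuples.

    The header keeps every unlisted residue untouched (NATRO); this header
    is an assumption and may differ from what the script prints.
    """
    lines = ["NATRO", "START"]
    for resnum, chain, layer, ss in residues:
        line = resfile_line(resnum, chain, layer, ss, include_alanine)
        if line is not None:
            lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [(12, "A", "core", "helix"), (13, "A", "surface", "loop")]
    print(build_resfile(example, include_alanine=False), end="")
```

Surface residues have no entry in the table, so they produce no line and fall back to the header default.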

There are two versions of this script: with alanine (_A) and without alanine (_noA). Refinement without alanine seems to give better results.

This script was written to run on GNU/Linux using Python 3; it has not been tested on Windows or macOS. It is mostly useful for refining a protein's core packing after a Rosetta FFL (Fold From Loop) computation using RosettaDesign, but it can still be used to refine any protein's core. Contact the author at [email protected] with any questions regarding this script.

HOW TO USE:

To use the script, follow these steps:

1. Install Biopython by running the following command in a terminal (python3 -m pip install biopython). pip must be installed; if you do not have it, you can install it on Linux by running the following command in a terminal (sudo apt-get install python3-pip).
2. Install DSSP on Linux by running the following command in a terminal (sudo apt-get install dssp).
3. Install NumPy (python3 -m pip install numpy).
4. Place all files in the same directory as this script.
5. Identify the motif's start residue (MOTIF_START) and the motif's end residue (MOTIF_END). The motif is the part of the protein that you do not want to mutate.
6. Run the script:
- Navigate to the working directory, then type this in the command line:
  python3 CoreBoundRefine_noA.py FILENAME.pdb MOTIF_START MOTIF_END > mutations.resfile
- To select only the core for refinement, use CoreRefine.py. To select the core AND the boundary for refinement, use CoreBoundRefine_A.py (includes alanine) or CoreBoundRefine_noA.py (does not include alanine).
7. Use the mutations.resfile with RosettaDesign.
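The command-line steps above can be pictured with the following sketch, which reads the secondary structure and relative SASA of each residue through Biopython's DSSP wrapper and skips the residues inside the motif. It is an illustration under stated assumptions, not the script itself: the function names are hypothetical, the DSSP executable is assumed to be available as mkdssp, and the motif bounds are assumed to be inclusive.

```python
import sys


def outside_motif(resnum, motif_start, motif_end):
    """True when a residue lies outside the protected motif (bounds assumed inclusive)."""
    return not (motif_start <= resnum <= motif_end)


def read_dssp(pdb_path):
    """Yield (chain, resnum, dssp_code, relative_sasa) for each residue.

    Needs Biopython and a DSSP executable (assumed here to be named mkdssp).
    """
    from Bio.PDB import PDBParser
    from Bio.PDB.DSSP import DSSP

    model = PDBParser(QUIET=True).get_structure("protein", pdb_path)[0]
    dssp = DSSP(model, pdb_path, dssp="mkdssp")
    for chain_id, res_id in dssp.keys():
        record = dssp[(chain_id, res_id)]
        # record[2] is the secondary-structure code, record[3] the relative SASA;
        # res_id[1] is the residue sequence number.
        yield chain_id, res_id[1], record[2], record[3]


def main(argv):
    """Print the residues that are eligible for mutation."""
    pdb_path, motif_start, motif_end = argv[1], int(argv[2]), int(argv[3])
    for chain, resnum, ss, sasa in read_dssp(pdb_path):
        if outside_motif(resnum, motif_start, motif_end):
            print(resnum, chain, ss, sasa)


if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv)
```

The Biopython imports sit inside read_dssp so that the motif check can be used and tested without Biopython or DSSP installed.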

REFERENCES:

1. Refer to the paper by (Koga et al., 2012 - PMID: 23135467), Methods section, Sequence Design Protocol, for more explanation of protein refinement and each layer's SASA parameters.
2. Refer to the paper by (Correia et al., 2014 - PMID: 24499818) for details about the Rosetta Fold From Loop (FFL) protocol.
3. Refer to this webpage for details about the Rosetta LayerDesign protocol and which amino acids should be in which layer.
4. Refer to this webpage for how to use Biopython.
5. Refer to the references (Cock et al., 2009 - PMID: 19304878) and (Hamelryck and Manderick, 2003 - PMID: 14630660) regarding the Biopython applications used here.
6. Refer to the reference (Kabsch W. and Sander C., 1983 - PMID: 6667333) regarding DSSP.
7. Refer to the reference (Tien et al., 2013 - PMID: 24278298) regarding the Wilke SASA parameters.
8. Refer to this webpage for the properties of different amino acids.
9. Refer to this webpage for the Chou-Fasman helix/sheet propensities, i.e. which secondary structure prefers which amino acids. The page's information is taken from the following reference: Prevelige, P. Jr. and Fasman, G.D., "Chou-Fasman Prediction of the Secondary Structure of Proteins," in Prediction of Protein Structure and The Principles of Protein Conformation (Fasman, G.D., ed.) Plenum Press, New York, pp. 391-416 (1989).
