CoreRefine

This script generates a mutations.resfile file that allows better core packing of a protein.

DESCRIPTION:

This script does the following:

  1. Prints a mutations.resfile file intended to improve the core packing of a protein. The core is identified from the SASA (solvent-accessible surface area) of each amino acid in the protein.
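As a rough illustration of how a SASA-based layer assignment might look, the sketch below classifies a residue from its relative SASA and reduces DSSP's secondary structure codes to loop, helix, or sheet. The function names and the cutoff values (0.25 and 0.60) are hypothetical placeholders and are not taken from the script; see reference 1 for each layer's SASA parameters. The residue_layers helper assumes Biopython's Bio.PDB.DSSP wrapper and a DSSP executable on the PATH.

```python
def classify_layer(relative_sasa, core_cutoff=0.25, surface_cutoff=0.60):
    """Assign a residue to a layer from its relative SASA (0.0 to 1.0).

    The default cutoffs are hypothetical placeholders, not the script's values.
    """
    if relative_sasa <= core_cutoff:
        return "core"
    if relative_sasa >= surface_cutoff:
        return "surface"
    return "boundary"


def simplify_secondary_structure(dssp_code):
    """Reduce a DSSP 8-state code to loop, helix, or sheet (one common mapping)."""
    if dssp_code in ("H", "G", "I"):
        return "helix"
    if dssp_code in ("E", "B"):
        return "sheet"
    return "loop"


def residue_layers(pdb_path):
    """Yield (residue number, secondary structure, layer) for each residue.

    Requires Biopython and a DSSP executable; imports are local so the
    pure helpers above can be used without either installed.
    """
    from Bio.PDB import PDBParser
    from Bio.PDB.DSSP import DSSP

    structure = PDBParser(QUIET=True).get_structure("protein", pdb_path)
    model = structure[0]
    # acc_array="Wilke" selects the Tien et al. (2013) maximum SASA values
    dssp = DSSP(model, pdb_path, acc_array="Wilke")
    for key in dssp.keys():
        record = dssp[key]
        relative_sasa = record[3]  # index 3 is the relative accessibility
        if relative_sasa == "NA":
            continue
        residue_number = key[1][1]  # key is (chain_id, (hetflag, resnum, icode))
        secondary = simplify_secondary_structure(record[2])
        yield residue_number, secondary, classify_layer(relative_sasa)
```

The two pure helpers can be tried without a structure file, for example classify_layer(0.1) for a buried residue or simplify_secondary_structure("E") for a strand.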

Combining the information from references 2, 3, 8, and 9, I found that the amino acids chosen below are consistent with all of them, so the mutations.resfile choices are as follows:

For the core:

  • If a core residue is part of a loop, mutate it to one of the following amino acids: AVILPFWM
  • If a core residue is part of a helix, mutate it to one of the following amino acids: AVILFWM
  • If a core residue is part of a sheet, mutate it to one of the following amino acids: AVILFWM

For the boundary (CoreBoundRefine.py):

  • If a boundary residue is part of a loop, mutate it to one of the following amino acids: AVILFYWGNQSTPDEKR
  • If a boundary residue is part of a helix, mutate it to one of the following amino acids: AVILWQEKFM
  • If a boundary residue is part of a sheet, mutate it to one of the following amino acids: AVILFYWQTM

There are two versions of this script: with alanine (_A) and without alanine (_noA). Refinement without alanine seems to be better.
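The amino acid choices above can be expressed as a small lookup table. The sketch below is only an illustration: the names ALLOWED, allowed_residues, and resfile_line are hypothetical, and it assumes that the _noA variant simply drops A from each set and that each residue is written as a PIKAA line (a standard Rosetta resfile command). The script's actual output layout may differ.

```python
# Allowed amino acids per (layer, secondary structure), copied from the lists above.
ALLOWED = {
    ("core", "loop"): "AVILPFWM",
    ("core", "helix"): "AVILFWM",
    ("core", "sheet"): "AVILFWM",
    ("boundary", "loop"): "AVILFYWGNQSTPDEKR",
    ("boundary", "helix"): "AVILWQEKFM",
    ("boundary", "sheet"): "AVILFYWQTM",
}


def allowed_residues(layer, secondary_structure, include_alanine=True):
    """Return the allowed amino acids for a residue's layer and structure.

    Assumption: the _noA variant is modelled by removing A from the set.
    """
    choices = ALLOWED[(layer, secondary_structure)]
    if not include_alanine:
        choices = choices.replace("A", "")
    return choices


def resfile_line(residue_number, chain, layer, secondary_structure,
                 include_alanine=True):
    """Format one resfile line restricting a residue to the allowed set."""
    choices = allowed_residues(layer, secondary_structure, include_alanine)
    return f"{residue_number} {chain} PIKAA {choices}"


if __name__ == "__main__":
    print(resfile_line(12, "A", "boundary", "sheet", include_alanine=False))
```

Keeping the sets in one dictionary makes it easy to compare the core and boundary choices side by side and to switch alanine on or off with a single flag.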

This script was written to run on GNU/Linux using Python 3; it has not been tested on Windows or macOS. It is mostly useful for refining a protein's core packing with RosettaDesign after a Rosetta FFL (Fold From Loop) computation, but it can be used to refine any protein's core. Contact the author at [email protected] with any questions regarding this script.

HOW TO USE:

To use the script, follow these steps:

  1. Install Biopython by running the following command in a terminal (python3 -m pip install biopython). This requires pip; if you do not have it, you can install it on Linux by running (sudo apt-get install python3-pip).
  2. Install DSSP on Linux by running the following command in a terminal (sudo apt-get install dssp).
  3. Install NumPy (python3 -m pip install numpy).
  4. All files must be in the same directory as this script.
  5. Identify the motif's start residue (MOTIF_START) and the motif's end residue (MOTIF_END). The motif is the part of the protein that you do not want to mutate.
  6. Run the script:
    • Navigate to the working directory, then type this in the command line:
      python3 CoreBoundRefine_noA.py FILENAME.pdb MOTIF_START MOTIF_END > mutations.resfile
  7. To select only the core for refinement, use CoreRefine.py. To select the core and the boundary for refinement, use CoreBoundRefine_A.py (includes alanine) or CoreBoundRefine_noA.py (excludes alanine).
  8. Use the mutations.resfile with RosettaDesign.
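Before passing the resfile to RosettaDesign, it can be worth confirming that no residue inside the motif was selected for mutation. The sketch below is one possible sanity check, not part of the script. The helper names are hypothetical, and it assumes the file follows the usual resfile layout: header lines, a line reading start, then one line per residue beginning with the residue number.

```python
def residues_in_resfile(text):
    """Return the residue numbers listed after the 'start' line of a resfile."""
    numbers = []
    in_body = False
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and comment lines
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0].lower() == "start":
            in_body = True
            continue
        # Only plain integer residue numbers are handled in this sketch
        if in_body and fields[0].lstrip("-").isdigit():
            numbers.append(int(fields[0]))
    return numbers


def motif_violations(text, motif_start, motif_end):
    """Return listed residues that fall inside the motif (inclusive range)."""
    return [n for n in residues_in_resfile(text)
            if motif_start <= n <= motif_end]


if __name__ == "__main__":
    # Made-up example content, for illustration only
    example = "NATRO\nstart\n5 A PIKAA AVILFWM\n12 A PIKAA VILFWM\n"
    print(motif_violations(example, 10, 20))
```

An empty list means that no listed residue falls between MOTIF_START and MOTIF_END. To check a real file, read it with open("mutations.resfile").read() and pass the text in.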

REFERENCES:

  1. Refer to the Sequence Design Protocol in the Methods section of the paper by (Koga et al., 2012 - PMID: 23135467) for more explanation of protein refinement and each layer's SASA parameters.
  2. Refer to the paper by (Correia et al., 2014 - PMID: 24499818) for details about the Rosetta Fold From Loop (FFL) protocol.
  3. Refer to this webpage for details about the Rosetta LayerDesign protocol and which amino acids should be in which layer.
  4. Refer to this webpage for how to use Biopython.
  5. Refer to the references (Cock et al., 2009 - PMID: 19304878) and (Hamelryck and Manderick, 2003 - PMID: 14630660) regarding the Biopython applications used here.
  6. Refer to the reference (Kabsch W. and Sander C., 1983 - PMID: 6667333) regarding DSSP.
  7. Refer to the reference (Tien et al., 2013 - PMID: 24278298) regarding the Wilke SASA parameters.
  8. Refer to this webpage for the properties of different amino acids.
  9. Refer to this webpage for the Chou-Fasman helix/sheet propensities, i.e. which secondary structure prefers which amino acids. The page's information is taken from the following reference: Prevelige, P. Jr. and Fasman, G.D., "Chou-Fasman Prediction of the Secondary Structure of Proteins," in Prediction of Protein Structure and The Principles of Protein Conformation (Fasman, G.D., ed.) Plenum Press, New York, pp. 391-416 (1989).
