okkevaneck / prospr

Prospr is a universal toolbox for protein structure prediction within the HP-model. The Python package is based on a C++ core, which gives Prospr its high performance. The C++ core is made available as a separate zip file to facilitate high-performance computing applications. The package comes with many prediction algorithms and datasets to use.

License: GNU Lesser General Public License v3.0

Prospr: The Protein Structure Prediction Toolbox

Creator: Okke van Eck

Prospr is a universal toolbox for protein structure prediction within the HP-model. At the core, Prospr offers an easy-to-use Protein data structure, which can be used to simulate protein folding. It also offers algorithms, datasets and visualization functions. The Protein data structure tracks many properties when folding the protein. This includes tracking the number of conformation changes, which makes it possible to determine the relative hardness of a protein for a specific algorithm. This allows for a fair comparison between different algorithms.
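To illustrate why counting conformation changes gives a usable measure of hardness, here is a minimal sketch of an exhaustive depth-first search on a 2D square lattice that counts every amino acid placement it performs. It does not use Prospr's API: the names exhaustive_search and count_contacts, the scoring of -1 per non-consecutive H-H contact, and the search itself are assumptions made for illustration only.

```python
# Hypothetical sketch, not Prospr's implementation: an exhaustive search
# that reports how many placements it needed to find the best score.
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_contacts(sequence, coords):
    """Count pairs of H residues that are lattice neighbours
    but not consecutive in the chain."""
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Only look in the positive directions so each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


def exhaustive_search(sequence):
    """Return (best_score, placements) for a sequence of 'H' and 'P'."""
    best = 0
    placements = 0
    coords = [(0, 0)]  # the first amino acid is fixed at the origin

    def extend():
        nonlocal best, placements
        if len(coords) == len(sequence):
            best = min(best, -count_contacts(sequence, coords))
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt in coords:
                continue  # position already occupied
            coords.append(nxt)
            placements += 1  # one conformation change
            extend()
            coords.pop()

    extend()
    return best, placements


print(exhaustive_search("HPPH"))  # (-1, 52)
```

A smarter algorithm would reach the same best score with fewer placements, so the placement count can be compared across algorithms independently of hardware or implementation language.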

So far, only n-dimensional square lattices are supported. Amino acids can only be placed on the lattice points (the corners of the squares), and each one has to be exactly one unit distance away from the previously placed amino acid.
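The placement rule can be sketched in a few lines of plain Python. This is a minimal illustration for the 2D case, not Prospr's API: the functions fold and hp_score, the move encoding (1 = +x, -1 = -x, 2 = +y, -2 = -y), and the scoring of -1 per non-consecutive H-H contact are assumptions chosen for the example.

```python
# Hypothetical sketch of HP-model folding on a 2D square lattice.
# Moves are signed axis indices: 1 = +x, -1 = -x, 2 = +y, -2 = -y.

def fold(moves):
    """Return the lattice coordinates of a chain that starts at the origin."""
    pos = (0, 0)
    coords = [pos]
    for move in moves:
        axis = abs(move) - 1
        step = 1 if move > 0 else -1
        nxt = list(pos)
        nxt[axis] += step  # exactly one unit away from the previous amino acid
        pos = tuple(nxt)
        if pos in coords:
            raise ValueError(f"move {move} collides with an occupied position")
        coords.append(pos)
    return coords


def hp_score(sequence, moves):
    """Score a fold: -1 for every pair of H residues that are lattice
    neighbours but not consecutive in the chain."""
    coords = fold(moves)
    if len(coords) != len(sequence):
        raise ValueError("need exactly len(sequence) - 1 moves")
    index_at = {c: i for i, c in enumerate(coords)}
    score = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Only look in the positive directions so each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                score -= 1
    return score


print(hp_score("HPPH", [1, 2, -1]))  # -1
```

Folding HPPH into a unit square brings the two H residues next to each other, which gives one contact; placing the same chain in a straight line gives a score of 0.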

The Python package is based on a C++ core, which gives Prospr its high performance. The C++ core is made available as a separate zip file to facilitate high-performance computing applications. See the C++ core section below for direct links to the core.

Installation and documentation

This package can simply be installed via pip by running:

pip install prospr

A quickstart and reference documentation can be found at prospr.readthedocs.io. A PDF version of the complete documentation is also available.

Archives

All the C++ core files and datasets are also available as compressed archives. See the subsections below for direct links.

C++ core

All the core code that Prospr runs on is available as a compressed archive. The archives folder contains a .zip and a .tar.gz archive.

Datasets

The complete collection of datasets is available as a compressed archive in the archives folder. It is available as a .zip and a .tar.gz archive.

Future work

This toolbox could be used for other protein folding problems within discrete models. It would be a great extension to support different models by creating a modular amino acid.

License

Prospr is licensed under the GNU Lesser General Public License v3.0. A copy can be found in the LICENSE file on GitHub.

prospr's Issues

Enhance visualization of Proteins

The visualization can be made prettier and include extra information. Don't forget to update the documentation figures as well!

  • Note: Y-axis not rendered properly when placing all amino acids in a row.
  • Fix visualization compatibility with new version of prospr_core

Datasets are not provided cleanly

Change the way datasets are provided. Rename folder 'datasets' to 'data', use package_data in setup.py, and find good file structure.

CI for building is too slow

Decide how to speed up the CI. Non-master pushes do not need all wheel builds; maybe only do an Ubuntu 32-bit build, or a 64-bit build on all platforms.

Core is only available as ZIP

Change auto-zipping to support multiple archive types. Create a separate folder for the zips as well, which is referenced in the README. Also change the pre-commit file to be in a .githooks folder instead of the github folder.

No Python tests in CI workflow

The built version of the code needs to be tested. To do so, a tool like pytest needs to be used to run the tests, and the tests need to be added to the CI workflow.

Small hotfixes

For GitHub actions:

  • Set python-version in setup_python for build_source_distribution, build_wheels, and deploy
  • Set right version for pypa/gh-action-pypi-publish@master in deploy

For fixing imports:

  • Add AminoAcid, load_vanEck250 and load_vanEck1000 to __all__ list.
  • Add load_vanEck250 and load_vanEck1000 to import from prospr.data

Small test coverage

Only some parts of the core are tested with pytest. Extend the test suite to cover all other functionality as well.

Folding algorithms extension

  • Algorithms don't need to return the pointer; folding algorithms can have return type void.
  • For deterministic algorithms, build load_tmp functions for loading intermediate results from a hash_fold()

Fix Python 3.11 wheel builds

cibuildwheel gives the error that a [project] section inside the pyproject.toml file is invalid for Python 3.11. Find out what the issue is and fix it to support Python 3.11.

Change core according to documentation + add new features

Change the core as follows:

  • Change to doubly linked list for storing conformations
  • Remove required move for removal of amino acids
  • Add possibility to remove / place multiple amino acids in a given direction (add to docs)
  • Add possibility to rotate the chain at a given point (add to docs)
  • Add an AminoAcid class for storing the values in the space
  • Add copyright to all headers with https link to the project's LICENSE file
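The doubly linked list idea from the first items can be sketched as follows. This is only an illustration in Python of the data structure, not the C++ core's design: the class names AminoNode and Chain and the methods place, remove_last and positions are hypothetical. Because every node knows its predecessor, the last amino acid can be removed without being told which move placed it.

```python
# Hypothetical sketch of a doubly linked chain of amino acids.
# None of these names come from Prospr's core.

class AminoNode:
    """One amino acid with links to its neighbours in the chain."""

    def __init__(self, kind, position, prev=None):
        self.kind = kind          # e.g. "H" or "P"
        self.position = position  # lattice coordinates as a tuple
        self.prev = prev
        self.next = None


class Chain:
    """Doubly linked list of placed amino acids."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.length = 0

    def place(self, kind, position):
        """Append an amino acid at the end of the chain."""
        node = AminoNode(kind, position, prev=self.tail)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.length += 1
        return node

    def remove_last(self):
        """Remove and return the last amino acid; no move is required."""
        node = self.tail
        if node is None:
            raise IndexError("chain is empty")
        self.tail = node.prev
        if self.tail is None:
            self.head = None
        else:
            self.tail.next = None
        node.prev = None
        self.length -= 1
        return node

    def positions(self):
        """Return all positions from head to tail."""
        out = []
        node = self.head
        while node is not None:
            out.append(node.position)
            node = node.next
        return out
```

Placing or removing several amino acids in a given direction would then amount to calling place or remove_last in a loop.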

Import structure not defined

The core code is available as from prospr import prospr_core instead of directly as from prospr import ....
The visualize code needs to be tested for importability, as does the helper code.

As a side task, the visualize code also needs to be tested on functionality.
