
km_prediction_function

Description

This repository contains an easy-to-use Python function for the KM prediction model from our paper "Deep learning allows genome-scale prediction of Michaelis constants from structural features". Please note that the provided model is not identical to the one presented in the paper: the enzyme representations are slightly different. Instead of the UniRep model, we use the ESM-1b model to create the enzyme representations. ESM-1b has been shown to outperform UniRep, as it is built on a more recent natural language processing architecture (a transformer network instead of an LSTM).

Predicting Km values for enzyme-substrate pairs

The KM prediction model was trained only with natural enzyme-substrate pairs. Hence, the model will not be good at detecting non-substrates; it is only suitable for predicting the KM value when the substrate of an enzyme is already known. Moreover, we trained our model only with wild-type enzymes. Therefore, we would not expect the model to be good at predicting the effect of single amino acid mutations.

Using KEGG Compound IDs as substrate representations

If you wish to use KEGG Compound IDs as inputs for the substrates, you first need to unzip the file "mol-files", which is in the folder "data". The unzipped folder "mol-files" has to be stored in the folder "data" as well.
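The unzip step can also be done from Python. The following is a minimal sketch, not part of the repository: the function name unzip_mol_files is hypothetical, and it assumes that the archive is named "mol-files.zip" and that it contains a top-level folder "mol-files". Adjust both if your copy differs.

```python
import zipfile
from pathlib import Path


def unzip_mol_files(data_dir="data", archive_name="mol-files.zip"):
    """Extract the mol-files archive into data_dir and return the extracted folder.

    Assumptions (not confirmed by the repository): the archive is called
    "mol-files.zip" and unpacks into a top-level folder "mol-files".
    """
    data_dir = Path(data_dir)
    archive = data_dir / archive_name

    # Extract next to the archive so the result is data/mol-files.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(data_dir)

    target = data_dir / "mol-files"
    if not target.is_dir():
        raise FileNotFoundError(
            f"Expected folder {target} after extraction; "
            "check the archive name and its layout."
        )
    return target
```

Called from the repository root as unzip_mol_files(), this should leave the folder "data/mol-files" in place.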

Alternatively, you can use InChI strings or SMILES strings as substrate representations.
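To illustrate how the three substrate representations differ, here is a small sketch that guesses the format of an input string. It is not the repository's own logic: the function name substrate_format is hypothetical, and the rules are simple conventions (KEGG Compound IDs are "C" followed by five digits, InChI strings start with "InChI="). Anything else is treated as SMILES without being validated.

```python
import re

# KEGG Compound IDs have the form "C" followed by five digits, e.g. "C00001".
KEGG_COMPOUND_ID = re.compile(r"C\d{5}")


def substrate_format(substrate):
    """Return "KEGG", "InChI" or "SMILES" for a substrate string.

    This is a heuristic only: strings that are neither a KEGG Compound ID
    nor an InChI are assumed to be SMILES and are not checked for validity.
    """
    s = substrate.strip()
    if KEGG_COMPOUND_ID.fullmatch(s):
        return "KEGG"
    if s.startswith("InChI="):
        return "InChI"
    return "SMILES"
```

For example, "C00001", "InChI=1S/H2O/h1H2" and "O" all describe water and would be classified as KEGG, InChI and SMILES, respectively.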

Predicting Km values for BiGG genome-scale metabolic network

We added two Jupyter notebooks in the folder "code" ("01 BiGG - ..." and "02 BiGG - ...") that contain code to calculate KM predictions for genome-scale metabolic networks.

Requirements

  • python 3.7
  • tensorflow 2.3.1
  • jupyter
  • pandas 1.1.3
  • torch 1.7.1
  • numpy
  • rdkit 2020.09.1
  • fair-esm 0.3.1
  • py-xgboost 1.3.1

The listed packages can be installed using pip and conda:

pip install torch
pip install numpy
pip install tensorflow
pip install fair-esm
conda install -c conda-forge py-xgboost=1.3.3
conda install -c rdkit rdkit
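After installation, it can be useful to compare the installed versions with the ones listed above. The following is a rough sketch, not part of the repository: the function name check_versions is hypothetical, and it assumes that each package is registered under the given distribution name, which may not hold for every conda-installed package. On Python 3.7 it needs the importlib_metadata backport.

```python
try:
    from importlib import metadata  # available from Python 3.8
except ImportError:
    import importlib_metadata as metadata  # backport, needed on Python 3.7


def check_versions(required):
    """Map each distribution name to (installed_version, matches_required).

    installed_version is None if the package is not found. A local version
    suffix such as "+cu110" is ignored when comparing.
    """
    report = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    # A subset of the pinned versions from the requirements list.
    pinned = {
        "tensorflow": "2.3.1",
        "pandas": "1.1.3",
        "torch": "1.7.1",
        "fair-esm": "0.3.1",
    }
    for name, (installed, ok) in check_versions(pinned).items():
        print(f"{name}: installed={installed}, matches={ok}")
```

A mismatch does not necessarily mean the function will fail; it only shows where the environment differs from the listed versions.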

Content

The folder "code" contains a Jupyter notebook, "Tutorial KM prediction.ipynb", with an example of how to use the KM prediction function.

Problems/Questions

If you face any issues or problems, please open an issue.
