
nmr-pred's Introduction

nmr-pred

Prediction of NMR Spectra from Structure

Predict chemical shifts from structures given as SMILES or InChI. Use open-source software to convert the predicted peaks into an NMR spectrum that plots peak position against intensity.

Use machine learning models to predict NMR Spectra from Structure

CDK Library and Descriptors - http://cdk.github.io/cdk/1.4/docs/api/org/openscience/cdk/qsar/descriptors/atomic/package-summary.html

Spinach Library - http://spindynamics.org/Spinach.php

Spinach Documentation - http://spindynamics.org/wiki/index.php?title=Main_Page

WEKA Documentation - http://www.cs.waikato.ac.nz/ml/weka/

Code Layout and Flow

NmrPred.java

  • Contains the program's main function, the entry point for everything else.
  • Runs NmrExperiment to perform 10-fold cross-validation on a training set.
  • Can also run prediction tests (RunPrediction) on the 3D SDF structures in the test folder.
  • RunPrediction calls CallMatlab to simulate the NMR spectra from the predicted chemical shifts and the structure.

NmrExperiment.java

  • RunClassifier starts a 10-fold cross-validation on a training set (saved or newly created). RandomForest currently works best.
  • BuildTrainingClassification builds a training set for the classification problem. For every structure in the dataset folder, it calculates 28 atomic descriptors for every atom and reads the chemical shift of every hydrogen. For each hydrogen it then combines the 28 descriptors of that hydrogen with those of its three nearest hydrogen atoms to build a 128-descriptor feature set. There are 100 chemical-shift classes, from 0.1 to 10.0 in steps of 0.1.
  • BuildTrainingRegression builds a training set for regression analysis.
  • BuildTestClassification builds a test set for classification; it is used by RunPrediction in NmrPred.java.
  • GetChemicalShifts uses ReadChemicalShift to read a list of chemical shift text files downloaded from HMDB.
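
The feature-building step described for BuildTrainingClassification can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class name, method name, and parameters are hypothetical, and the descriptor rows are assumed to be plain double arrays indexed by atom. It shows only the concatenation of one hydrogen's descriptors with those of its neighbours, and makes no claim about how the 128-feature total is reached in the project.

```java
import java.util.List;

class FeatureVectorSketch {
    // Hypothetical helper: concatenates the descriptor row of one hydrogen
    // with the descriptor rows of its nearest neighbours, in the given order.
    // descriptors.get(i) is assumed to hold the descriptors of atom i.
    static double[] buildFeatureVector(List<double[]> descriptors,
                                       int hydrogenIndex,
                                       int[] neighbourIndices) {
        int perAtom = descriptors.get(hydrogenIndex).length;
        double[] features = new double[perAtom * (1 + neighbourIndices.length)];

        // The hydrogen's own descriptors come first.
        System.arraycopy(descriptors.get(hydrogenIndex), 0, features, 0, perAtom);

        // Then one block per neighbour.
        int offset = perAtom;
        for (int idx : neighbourIndices) {
            System.arraycopy(descriptors.get(idx), 0, features, offset, perAtom);
            offset += perAtom;
        }
        return features;
    }
}
```

The resulting vector has (1 + number of neighbours) × (descriptors per atom) entries, so the per-atom descriptor count and the neighbour count fully determine its length in this sketch.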

GetCDKDescriptors.java

  • Uses the CDK Java library to calculate atomic and molecular properties, and distances between hydrogen atoms, from a given 3D SDF structure.
  • GetAtomicDescriptor is the main function; it calculates 28 atomic descriptors for every atom in a structure. It returns an ArrayList with one entry per atom in the molecule, and each entry holds 28 doubles (the descriptors).
  • GetNearestAtoms finds the atoms nearest to any given atom in a molecule. It is used to find the three nearest atoms to each hydrogen atom.
  • ComputeDescriptorsAtomic calculates atomic descriptors for an ArrayList of atoms.
  • Molecular descriptors are also available, but they are not currently used for prediction.
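
A nearest-atom lookup of the kind GetNearestAtoms performs might look like the sketch below. It is an assumption-laden illustration: it works on a plain array of 3D coordinates rather than CDK atom objects, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NearestAtomsSketch {
    // Returns the indices of the k atoms closest to atom `target`,
    // ordered from nearest to farthest. coords[i] = {x, y, z}.
    static int[] nearestAtoms(double[][] coords, int target, int k) {
        List<Integer> others = new ArrayList<>();
        for (int i = 0; i < coords.length; i++) {
            if (i != target) {
                others.add(i);
            }
        }
        // Squared distance preserves the ordering and avoids a square root.
        others.sort(Comparator.comparingDouble(
                (Integer idx) -> squaredDistance(coords[idx], coords[target])));

        int n = Math.min(k, others.size());
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = others.get(i);
        }
        return result;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < 3; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Calling nearestAtoms(coords, h, 3) for a hydrogen at index h gives the three neighbours in order of distance; restricting the candidates to hydrogens only would be a one-line filter in the first loop.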

NmrStructure.java

  • A Java class that defines an NMR structure: the chemical shifts of its hydrogen atoms, the descriptors of every atom, hydrogen positions, the nearest atoms to every atom, and HOSE codes.
  • AssignShiftClasses rounds a chemical shift to the nearest 0.1 for classification.
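
The rounding into 0.1 ppm classes can be illustrated with a short sketch. The names below are hypothetical, and clamping out-of-range shifts into the 0.1 to 10.0 range is an assumption of this example, not something the project is known to do.

```java
class ShiftClassSketch {
    // Maps a chemical shift in ppm to a class label 1..100, where label n
    // stands for the shift n * 0.1 ppm.
    // Assumption: values outside 0.1-10.0 ppm are clamped to the nearest class.
    static int shiftClass(double shiftPpm) {
        long label = Math.round(shiftPpm * 10.0);
        return (int) Math.max(1, Math.min(100, label));
    }

    // Converts a class label back to its representative shift in ppm.
    static double classToShift(int label) {
        return label / 10.0;
    }
}
```

For instance, a shift of 7.26 ppm falls into class 73, which stands for 7.3 ppm.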

Layout

  • DataSet folder contains the training set: 3D SDF structures and chemical shifts in text files.
  • Test folder contains the test set of 3D SDF structures used for prediction and simulation, not for training or cross-validation.
  • Models folder contains saved classification or regression training sets and trained models such as RandomForest.
  • Matlab folder contains a single script, create_NMR_1H_plot.m, which simulates the NMR spectra from predicted chemical shifts and known J-coupling constants. It uses Fourier transformation, apodization, and related steps to create an NMR image.
  • Java folder contains all the prediction algorithms, dataset, and code in Java.
  • Spinach folder contains the Spinach library used for NMR simulation.
  • Docs folder contains relevant papers, including my paper for the individual study course.
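
To give a rough idea of how a list of predicted shifts becomes a peak-versus-intensity curve, the sketch below sums one Lorentzian line per shift on an evenly spaced ppm axis. This is a deliberately simpler, swapped-in technique: it is not the Fourier-transform and apodization approach of create_NMR_1H_plot.m, it ignores J-coupling entirely, and every name and parameter in it is hypothetical.

```java
class SpectrumSketch {
    // Evaluates a sum of unit-height Lorentzian lines on an evenly spaced
    // ppm axis running from minPpm to maxPpm (inclusive).
    // shifts: peak centres in ppm; width: full width at half maximum in ppm.
    static double[] lorentzianSpectrum(double[] shifts, double width,
                                       double minPpm, double maxPpm, int points) {
        double[] intensity = new double[points];
        double step = (maxPpm - minPpm) / (points - 1);
        double halfWidthSq = (width / 2.0) * (width / 2.0);

        for (int i = 0; i < points; i++) {
            double ppm = minPpm + i * step;
            for (double shift : shifts) {
                double offset = ppm - shift;
                // Lorentzian: 1 at the centre, 0.5 at half a width away.
                intensity[i] += halfWidthSq / (offset * offset + halfWidthSq);
            }
        }
        return intensity;
    }
}
```

Each isolated peak reaches height 1 at its shift and falls to 0.5 at half the chosen width on either side; overlapping peaks simply add.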

nmr-pred's People

Contributors

tsajed


nmr-pred's Issues

running nmr-pred on nmrbox.org

Hi,

I am not able to run NmrPred on nmrbox.org
It would be great to use it there... could you provide point by point instruction how to run the program
(I am also not able to run NmrPred on my local Ubuntu 22... what Java version should be used?)

Kind regards
Krzysztof
