SPEEDCOM

Authors: Joe Abbott, Ryan Beck, Hang Hu, Yang Liu, Lixin Lu.


Overview

SPEEDCOM is an open-source Python package that aims to predict the fluorescence emission and absorption spectra of small conjugated organic molecules. These features are predicted using a convolutional neural network, implemented with Keras and trained on data from the PhotochemCAD database. The software has a graphical user interface (GUI) where users can enter the SMILES string for a given molecule and receive its predicted spectra and associated characteristic quantities. For further details on the background science and the operation of our program, please see our use cases.


Current Features

  • Accepts valid SMILES strings of small organic molecules as input for prediction
  • Predictions on the maximum absorption/emission peak
  • Wrapper functions for numerical encoding of SMILES and generation of common descriptors
  • GUI on local host
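
As an illustration of what a numerical encoding of SMILES can look like, here is a minimal character-level sketch. The names `build_wordmap` and `encode_smiles`, and the reserved padding and unknown indices, are hypothetical and chosen for this example only; they are not SPEEDCOM's API, and the package's own wrappers and its `smiles_wordmap.json` may work differently.

```python
from typing import Dict, List

PAD_INDEX = 0      # assumption: index reserved for padding
UNKNOWN_INDEX = 1  # assumption: index for characters not in the map


def build_wordmap(smiles_list: List[str]) -> Dict[str, int]:
    """Map every character seen in the SMILES strings to an integer >= 2."""
    chars = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode_smiles(smiles: str, wordmap: Dict[str, int], max_len: int) -> List[int]:
    """Encode one SMILES string as a fixed-length list of integers.

    Strings longer than max_len are truncated; shorter ones are padded.
    """
    codes = [wordmap.get(ch, UNKNOWN_INDEX) for ch in smiles[:max_len]]
    return codes + [PAD_INDEX] * (max_len - len(codes))


wordmap = build_wordmap(["c1ccccc1", "CCO"])
encoded = encode_smiles("CCO", wordmap, max_len=5)
```

Note that this simple scheme splits two-letter atoms such as `Cl` and `Br` into separate characters; a real tokenizer may treat them as single tokens.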

Incoming Features

  • Sanitize SMILES input; add alternative input options
  • Predictions on multiple peaks of absorption/emission spectra
  • Customization of model training with user input data
  • Pipelining predictions to facilitate fluorophore design

GUI

The images below show the predicted spectra and characteristics for an example molecule entered via our GUI.

SPEEDCOM spectra prediction

SPEEDCOM characteristics prediction


Configuration

Prerequisites:

  • Python version 3.6.7 or later
  • conda version 4.6.8 or later
  • Git (to clone the repository from GitHub)

Installation & Usage Instructions

You can execute the following commands from your computer's terminal application:

  1. Either clone the SPEEDCOM repository:

    git clone https://github.com/emissible/SPEEDCOM.git

    or download the zip file:

    curl -L -O https://github.com/emissible/SPEEDCOM/archive/master.zip

  2. cd SPEEDCOM

  3. conda env create -f speedcom_environment.yml

  4. conda activate speedcom

  5. cd speedcom && python runhtml.py

  6. Find the isomeric SMILES string for the molecule whose spectra you want and enter it into the GUI!


Directory Structure

SPEEDCOM (master)  
|---data  
    |--- 
|---doc  
    |--- 
|---speedcom  
    |---frontend
        |---output
        |--jquery.min.js
        |--molecule_ex.png
        |--spectra_ex.png
        |--welcome.png
    |---notebook_scripts
        |--R2_plot.ipynb
        |--encode_smiles.ipynb
        |--rdkit_descriptor.ipynb
        |--rdkit_exploration.ipynb
        |--smiles_clean.ipynb
        |--smiles_cnn.ipynb
        |--to_help_write_tests.ipynb
    |---saved_models
        |--ems_dropna.best1.hdf5
        |--epsilon.best_-1.hdf5
        |--model_lstm_qy.json
        |--model_smiles_cnn.json
        |--model_smiles_ems_dropna.json
        |--model_smiles_epsilon_lstm.json
        |--smiles_wordmap.json
        |--weights.best.hdf5
        |--weight.qy_best.hdf5
    |---tests
        |---DATA_CLEAN_TEST_DIR
        |--__init__.py
        |--context.py
        |--tests_NNModels.py
        |--test_Prediction.py
        |--test_dataUtils.py
        |--test_data_extract.py
        |--test_model_utils.py
        |--test_readData_temp.tsv
        |--test_speedcom.py
        |--test_utilities.py
    |--NNModels.py
    |--Prediction.py
    |--__init__.py
    |--core.py
    |--dataUtils.py
    |--data_extract.py
    |--matplotlibrc
    |--model_utils.py
    |--runhtml.py
    |--speedcom.html
    |--utilities.py
    |--version.py  
|--.coveragerc
|--.gitignore  
|--.travis.yml
|--LICENSE  
|--README.md 
|--download.sh
|--numpyversion.py
|--requirements.txt
|--runtests.sh
|--setup.sh
|--speedcom_environment.yml 

Contributions

Any contributions to the project are warmly welcomed! If you discover any bugs, please report them in the issues section of this repository and we'll work to sort them out as soon as possible. If you have data that you think will be good to train our model on, please contact one of the authors.


References

Garrett B. Goh et al. 2018. SMILES2vec. In Proceedings of ACM SIGKDD Conference, London, UK, Aug, 2018 (KDD 2018), 8 pages


License

SPEEDCOM is licensed under the MIT license.

speedcom's Issues

Project pre-review

This is looking really nice. Good work resolving the pep8 things I mentioned on Thu.

A few things I noticed:

  • Check your .gitignore - I see things like .DS_Store, which is a Mac file that should be in your ignore file.

  • Your installation instructions should probably be pip not conda or python setup.py install. Minor change but... If you have a UI, you can put a screenshot in the README.md to entice visitors.

  • Your build is failing and coverage is unknown. Not a big deal at the moment, but it would be nice if you could fix that up by Friday 9 AM.

  • Please add your use cases, component sketch, tech review presentation and final poster to the doc directory.

  • I notice some .html, .json, .hdf5, .csv and .ipynb files in the python package itself. Maybe the data files should go into a data subdirectory or at the top level as you see fit. Some cleanup here is still necessary.

  • Overall the tests are a very good start. There are some missing comments to say what individual tests are doing and some tests that are commented out and maybe should be removed or fixed.

This is looking really great speedcommies!

visualization

@rbeck4, @yliu92uw, @lixin19 Data visualization function. Will take data from a file labeled "absorption_emission_data.csv" containing a list of arrays with the [abs. wavelength, abs. itn, ems. wavelength, ems. itn] for each molecule. Will return a *.png.

Clear use cases

Your use cases in the current format are difficult to parse for someone who has no background information about your project. I would consider adding a little background before jumping into your use cases, then organizing your use cases hierarchically (which you have begun to do), with descriptions of both inputs and outputs, as described in the Software Design lecture.

And I would also consider putting your use cases in a MarkDown file, as this will help with readability.
