
gauche's People

Contributors

a-r-j, antobi, aryandeshwal, austint, bojana-rankovic, gck25, gkwt, infprobscix, leojklarner, ryan-rhys, sangttruong

gauche's Issues

Write tests for DataLoader classes

Parent:

  • splitting and scaling

Molecular property prediction

  • loading benchmarks
  • validating SMILES
  • featurisation to fingerprints, fragments and fragprints

Protein ligand binding affinity

  • loading benchmarks
  • validating pdb/sdf files
  • featurisation to graph-based features, interaction-based features
  • check whether ligand extraction is correct by comparing extracted ligand IDs to those in PDBbind

Inconsistent / unclear python version requirements?

Currently the Python version requirements of this project are unclear: the README gives no information, the supplied conda env specifies Python 3.7, and the internal setup.py requires python>=3.8. Maybe this should be standardized? It looks like any Python version should do as long as the dependencies support it, no?
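To make the conflict concrete, here is a minimal sketch that compares the two versions mentioned above. The helper name `satisfies_minimum` is hypothetical and only illustrates the comparison; it is not part of the project.

```python
def satisfies_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (tuples of ints)."""
    return tuple(version) >= tuple(minimum)


# Values taken from the issue text: the conda env pins 3.7, setup.py asks for >=3.8.
conda_env_python = (3, 7)
setup_py_minimum = (3, 8)

# An interpreter created from the conda env would not meet the setup.py constraint.
env_is_compatible = satisfies_minimum(conda_env_python, setup_py_minimum)
print(env_is_compatible)
```

Under these assumptions the check fails, which suggests the two files cannot both be correct as written.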

Refactor data loader

  • include additional variable for internal representation which is not overwritten upon featurisation

DataLoader class for PL binding affinity

  • load benchmark sets (such as pdbbind)
  • add download and splitting of arbitrary pdb complexes
  • cluster by ligand and protein similarity
  • add PLEC fingerprints, BINANA+Vina features, RFScore features
  • add explicit hydrogens during bond order augmentation

Investigate Convolutional Kernel Networks

It may be worth investigating whether convolutional kernel networks [1] can be integrated as a GP graph kernel.

[1] Chen, D., Jacob, L. and Mairal, J., Convolutional kernel networks for graph-structured data. ICML, 2020.

Difficulty saving and loading models using `NonTensorialInputs` data

First off, great work, this is a really cool package!

I've been playing with the graph representation inputs, using graphein, to a model building off of SIGP (some examples in your codebase call it GraphGP) and have been getting some really great performance out of it. However, I'm struggling to understand how to correctly save the model and then load it back into memory for inference after training. If I save the state dict and then re-init using that state dict, the model performs as if it had been randomly initialized. I also tried pickling the model (not the ideal solution), but I get the following exception:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[88], line 4
      1 import pickle
      3 with open('model.pkl', 'wb') as file:
----> 4     pickle.dump(model, file)

RuntimeError: Pickling of "rdkit.Chem.rdchem.Atom" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)

I tried setting train_inputs to None before saving. This took care of the exception; however, I'm back to the original issue, where the model seems to be randomly initialized.

I was wondering if you had any guidance here, or if there was something in the docs that I missed. Thanks!
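For illustration, the "drop the unpicklable inputs before saving" workaround described above can be sketched with the standard library alone. Everything here is hypothetical: `ModelWithUnpicklableInputs` and `restore_inputs` are toy names, not gauche or GPyTorch API, and a `threading.Lock` stands in for the RDKit objects only because pickle also refuses to serialise it. The sketch does not explain why the state-dict route behaves as if randomly initialised; it only shows that inputs dropped at save time have to be re-attached after loading.

```python
import pickle
import threading


class ModelWithUnpicklableInputs:
    """Toy stand-in for a model that keeps non-picklable training inputs."""

    def __init__(self, weights, train_inputs):
        self.weights = weights            # learned parameters (picklable)
        self.train_inputs = train_inputs  # may hold objects pickle rejects

    def __getstate__(self):
        # Copy the instance dict and blank out the attribute pickle cannot handle.
        state = self.__dict__.copy()
        state["train_inputs"] = None
        return state

    def restore_inputs(self, train_inputs):
        # Re-attach the training inputs after unpickling.
        self.train_inputs = train_inputs


# Assumption: a lock is used purely as an example of an unpicklable object.
inputs = [threading.Lock()]
model = ModelWithUnpicklableInputs(weights=[0.5, -1.2], train_inputs=inputs)

blob = pickle.dumps(model)        # succeeds because __getstate__ drops the inputs
restored = pickle.loads(blob)     # weights survive, train_inputs is None
restored.restore_inputs(inputs)   # the caller must supply the inputs again
```

After `pickle.loads`, the restored object has its weights but no training inputs until `restore_inputs` is called, so any model whose predictions depend on the training data would need that step before inference.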

How to install gauche as a library?

Hi, thanks for your work! I'm wondering how exactly to install gauche as a library. It seems that the instructions in the README are only for installing dependencies. Meanwhile, when I run pip install git+https://github.com/leojklarner/gauche.git, I get an error:

error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      error in gauche setup command: 'extras_require' must be a dictionary whose values are strings or lists of strings containing valid project/version requirement specifiers.
      [end of output]

Thanks!
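The error message says `extras_require` must be a dictionary whose values are strings or lists of strings. As a rough sketch of that shape rule only, the hypothetical helper below flags entries that break it. It is not setuptools' own validation, it does not check that each string is a valid requirement specifier, and the example dictionaries are invented rather than taken from the project's setup.py.

```python
def find_invalid_extras(extras_require):
    """Return a list of shape problems; an empty list means the shape looks fine."""
    if not isinstance(extras_require, dict):
        return ["extras_require is not a dictionary"]

    problems = []
    for name, reqs in extras_require.items():
        # A single requirement string is an accepted value.
        if isinstance(reqs, str):
            continue
        # A list made up only of strings is also accepted.
        if isinstance(reqs, list) and all(isinstance(r, str) for r in reqs):
            continue
        problems.append(f"extra {name!r} has a value of type {type(reqs).__name__}")
    return problems


# Hypothetical examples: a nested dict breaks the rule, a list of strings does not.
bad_extras = {"docs": {"sphinx": ">=4"}}
good_extras = {"docs": ["sphinx>=4"], "test": "pytest"}

print(find_invalid_extras(bad_extras))
print(find_invalid_extras(good_extras))
```

Running a check like this against the dictionary passed to `setup()` could help locate which extra triggers the build failure.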

It would be nice to have non-pip install suggestions

All dependencies mentioned in the readme are available as conda packages.
It's always nicer to avoid pip when using Anaconda, imho. Also, wouldn't it be much easier to prepare a yaml file?
You already have a requirements file. Then you wouldn't need a setup.py file?
you know, a simple
Just a thought, not an issue per se.

Move benchmark models into Gprotorch codebase

I think moving the contents of the benchmarks directory into the codebase (e.g. gprotorch.benchmarks) will make organisation and docs clearer. It also helps (modestly) to enforce a consistent API across the library.

Also, we should rename from gprotorch to gauche.
