
drug-discovery's Introduction

Main idea

PAMNet: the current state of the art; it makes only limited use of local information, and it is state of the art only on QM9, whose molecules are much smaller than typical drug-like molecules

Our Goal:

  • Build a network that is able to generalize to much larger molecules over time
  • Need to capture longer-range dependencies => transformers with extensive pre-training

The main starting architecture:

  • Initial preprocessing - a GAT to capture local dependencies (the range of this still needs configuring)
  • After that - a transformer with topological relative encoding - the hope is that attention captures more robust global dependencies (see the sketch after this list)
    • prior proof of concept - ElemBERT - its preprocessing was pre-made and cluster-based rather than dynamic
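
A minimal sketch of this starting pipeline, assuming dense (N, d) atom features, a boolean adjacency matrix with self-loops, and precomputed hop distances; the class names and the hop-bias design are illustrative assumptions, not repo code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGATLayer(nn.Module):
    """Single-head graph attention over a dense adjacency mask (local stage)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, x, adj):  # x: (N, d); adj: (N, N) bool with self-loops
        h = self.proj(x)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs)).squeeze(-1)   # (N, N) raw scores
        e = e.masked_fill(~adj, float("-inf"))           # neighbors only
        return F.elu(torch.softmax(e, dim=-1) @ h)

class TopoTransformerLayer(nn.Module):
    """Self-attention biased by a learned embedding of hop distance
    (one plausible reading of "topological relative encoding")."""
    def __init__(self, dim, max_hops=8):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.hop_bias = nn.Embedding(max_hops + 1, 1)
        self.max_hops = max_hops

    def forward(self, x, hops):  # hops: (N, N) integer shortest-path lengths
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.T / x.size(-1) ** 0.5
        scores = scores + self.hop_bias(hops.clamp(max=self.max_hops)).squeeze(-1)
        return torch.softmax(scores, dim=-1) @ v
```

The GAT handles the configurable local range (its receptive field grows with stacked layers), while the hop-distance bias gives the transformer a positioning signal without fixing a global coordinate frame.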

Idea Progression to Test:

  • Current - getting the data cleaning to function - DONE
  • Getting the transformer to function
  • Testing different local region sizes: one with just bond lengths, maybe adding bond angles?
  • Adding in global dependencies - molecular energy, orbital energies, or estimating them ourselves?
  • Splitting based on groups - chunk the graph like a vision transformer and run on individual chunks
    • need to decide on a topological encoding
    • chunks increasing in size over time?
  • Research molecular wavefunctions and orbitals - how could they be estimated more accurately?
    • maybe optimize for the outer parts of the molecule and their efficacies?
  • Trying pretraining tasks? (a sketch follows this list)
    • predicting a masked part of the molecule?
    • predicting which atoms would bond?
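
A hedged sketch of the masked-prediction pretraining idea: hide a random fraction of atoms' input features and train the encoder to recover their atom types, in the spirit of masked-language-model pretraining. `encoder`, `head`, and the 15% mask rate are placeholders, not decided choices:

```python
import torch
import torch.nn.functional as F

def masked_atom_loss(encoder, head, x, atom_types, mask_frac=0.15):
    """encoder: (N, d) -> (N, d); head: (N, d) -> (N, num_atom_types)."""
    mask = torch.rand(x.size(0)) < mask_frac   # atoms whose features are hidden
    if not mask.any():
        mask[0] = True                         # ensure at least one masked atom
    x_in = x.clone()
    x_in[mask] = 0.0                           # zero out the masked atoms
    logits = head(encoder(x_in))
    return F.cross_entropy(logits[mask], atom_types[mask])
```

The bond-prediction variant would swap the head for a pairwise scorer over candidate atom pairs.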

The new idea

  • wavefunction-data-augmented attention
  • Add info to the Q and K matrices (see the sketch after this list)
    • a radial function estimator and a spherical harmonic estimator
    • their outputs are added in along with the globals => gives a better output?
  • Plus local details
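
A sketch of the Q/K augmentation, assuming atom features and 3D coordinates are available; the radial and spherical-harmonic estimators are stood in for by small MLPs on centered distance and unit direction, which is an assumption rather than true harmonics:

```python
import torch
import torch.nn as nn

class WavefunctionAugmentedAttention(nn.Module):
    """Adds radial + directional features into Q and K before scoring."""
    def __init__(self, dim, feat_dim=16):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # stand-ins for the radial function / spherical harmonic estimators
        self.radial = nn.Sequential(nn.Linear(1, feat_dim), nn.SiLU(),
                                    nn.Linear(feat_dim, dim))
        self.spherical = nn.Sequential(nn.Linear(3, feat_dim), nn.SiLU(),
                                       nn.Linear(feat_dim, dim))

    def forward(self, x, pos):  # x: (N, d); pos: (N, 3) coordinates
        pos = pos - pos.mean(dim=0)          # center on the molecule
        r = pos.norm(dim=-1, keepdim=True)   # radial part: distance to center
        u = pos / (r + 1e-9)                 # angular part: unit direction
        wf = self.radial(r) + self.spherical(u)
        q = self.q(x) + wf                   # inject wavefunction info into Q
        k = self.k(x) + wf                   # and into K
        scores = q @ k.T / x.size(-1) ** 0.5
        return torch.softmax(scores, dim=-1) @ self.v(x)
```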

My Main Idea

  • wavefunction estimation for substructures
  • hierarchical attention

Another Idea

  • estimating an LCAO (linear combination of atomic orbitals)
  • generate coefficients based on wavefunction attention

How to make wavefunction-based embeddings

  • first - generate radial and spherical basis sets
  • for the radial and spherical bases - generate a coefficient for each basis function
    • sum them up => gives us the wavefunction approximation
  • for substructures - generate coefficients for an LCAO
  • how to integrate the atomic or molecular wavefunction into attention (a sketch of the embedding construction follows this list)
    • first comparison - basic info like atomic number, mass, hybridization
      • topological - adds a basic positioning system to understand relationships
      • radial wavefunction - adds atomic distance estimations
      • spherical wavefunction - adds angular/orientation information
    • the wavefunction representation will be a list of coefficients
      • when attending to two different atoms, their coefficient lists can be compared directly
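
A sketch of the construction above: a fixed sine-based radial Bessel basis plus a small MLP that predicts one coefficient per basis function, so the coefficient vector is the atom's wavefunction representation. A spherical-harmonic basis would get an analogous head; basis size and cutoff here are arbitrary:

```python
import math
import torch
import torch.nn as nn

def radial_bessel_basis(dist, num_basis=8, cutoff=5.0):
    """0th-order spherical Bessel basis: sqrt(2/c) * sin(n*pi*d/c) / d."""
    n = torch.arange(1, num_basis + 1, dtype=dist.dtype)
    d = dist.clamp(min=1e-6).unsqueeze(-1)       # avoid division by zero
    return math.sqrt(2.0 / cutoff) * torch.sin(n * math.pi * d / cutoff) / d

class CoefficientHead(nn.Module):
    """Predicts one coefficient per basis function from an atom embedding."""
    def __init__(self, dim, num_basis=8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(),
                                 nn.Linear(dim, num_basis))

    def forward(self, h):                        # h: (N, d) atom embeddings
        return self.mlp(h)                       # (N, num_basis) coefficients

# psi(d) ~= sum_n c_n * b_n(d); the coefficient list doubles as the embedding
h = torch.randn(4, 32)                           # 4 atoms, feature dim 32
coeffs = CoefficientHead(32)(h)                  # (4, 8)
basis = radial_bessel_basis(torch.tensor([1.2, 2.3]))  # basis at two radii
psi_atom0 = basis @ coeffs[0]                    # wavefunction estimate, atom 0
```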

The final idea

  • two main aspects: substructures and wavefunction estimations
  • the wavefunction estimation
    • coefficients of radial and spherical bases
    • will be LCAO-ed together for substructures
  • substructure searching
    • one layer - will attend to atom-level relationships
    • other layers - detect repeating substructures, much like JT-VAE
      • create a wavefunction-based embedding - an LCAO estimate built from each atom's coefficient estimates
        • done through local attention => yields the substructure embedding (sketched below)
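
One way to read the "LCAO-ed together" step, assuming each atom already carries a coefficient vector: local attention over a substructure's atoms yields mixing weights, and the substructure coefficients are the weighted sum, mirroring psi_sub = sum_i w_i * phi_i:

```python
import torch
import torch.nn as nn

class LCAOPooling(nn.Module):
    """Attention-weighted sum of atom coefficients within one substructure."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, atom_h, atom_coeffs):  # (n, d), (n, num_basis)
        w = torch.softmax(self.score(atom_h), dim=0)  # local attention weights
        return (w * atom_coeffs).sum(dim=0)           # substructure coefficients
```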

The actual final idea

  • Substructure
    • each substructure is detected
    • local attention is done within the substructure - spherical + radial Bessel function analysis
    • we get a substructure embedding
  • The ACTUAL THING
    • attention is done, but substructure embeddings are inserted in
    • also some feedback loop to update the substructure embeddings based on the attention value?
  • Things
    • substructure detection
    • substructure analysis - message passing + LCAO-inspired aggregation or... a TRANSFORMER!
    • then: global - attach a substructure embedding during attention (see the sketch after this list)
      • used to compute relations
      • add the substructure embedding to each atom when relevant
        • have a vector v_nosub for atoms not in any substructure
      • keep the positional attention embedding
        • need to fine tune this with some sort of positioning map
      • create the relations and compute attention
        • when atoms are in the same substructure - no effect
        • when atoms are in different substructures - add to the substructure embedding slightly
          • this will most likely be a simple linear layer that takes in distance + relative properties
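
A sketch of this global stage under a few assumptions: the substructure embedding is concatenated to each atom before attention, a learned v_nosub fills in for unassigned atoms, and the "add slightly" feedback is the simple linear layer on distance + relative features suggested above, gated so that same-substructure pairs have no effect:

```python
import torch
import torch.nn as nn

class GlobalSubstructureAttention(nn.Module):
    def __init__(self, dim, sub_dim):
        super().__init__()
        self.v_nosub = nn.Parameter(torch.zeros(sub_dim))  # "no substructure"
        self.qkv = nn.Linear(dim + sub_dim, 3 * dim)
        self.cross_update = nn.Linear(1 + dim, sub_dim)    # dist + rel. features

    def forward(self, x, sub_emb, sub_id, dist):
        # x: (N, d); sub_emb: (N, sub_dim); sub_id: (N,) int, -1 = no
        # substructure; dist: (N, N) interatomic distances
        none = (sub_id < 0).unsqueeze(-1)
        sub_emb = torch.where(none, self.v_nosub, sub_emb)  # swap in v_nosub
        q, k, v = self.qkv(torch.cat([x, sub_emb], dim=-1)).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.T / x.size(-1) ** 0.5, dim=-1)
        out = attn @ v
        # feedback only between atoms in *different* substructures
        cross = (sub_id.unsqueeze(0) != sub_id.unsqueeze(1)).float()
        rel = torch.cat([dist.unsqueeze(-1),
                         x.unsqueeze(0) - x.unsqueeze(1)], dim=-1)  # (N, N, 1+d)
        delta = (attn * cross).unsqueeze(-1) * self.cross_update(rel)
        return out, sub_emb + delta.sum(dim=1)              # updated sub embeddings
```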
