License: MIT

gflownet

GFlowNet-related training and environment code on graphs.

Primer

GFlowNet, short for Generative Flow Network, is a generative modeling framework particularly suited to discrete, combinatorial objects. This repository implements it for graph generation.

The idea behind a GFlowNet (GFN) is to estimate flows in a (graph-theoretic) directed acyclic network*. The network represents all possible ways of constructing an object, so knowing the flow gives us a policy that we can follow to construct objects sequentially. Such a sequence of partially constructed objects is called a trajectory. *Perhaps confusingly, the "network" in GFN refers to the state space, not to a neural network architecture.
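To make the link between flows and a policy concrete, here is a minimal sketch on a toy state space. The state names and flow values are hypothetical and chosen only so that inflow equals outflow at every intermediate state; nothing here is taken from this repository's code. The policy at a state picks each child with probability proportional to the flow on the corresponding edge.

```python
import random

# Toy state space: a small DAG whose edges carry made-up flow values.
# All names and numbers are hypothetical, for illustration only.
edge_flow = {
    "s0": {"a": 3.0, "b": 1.0},
    "a": {"ab": 2.0, "stop_a": 1.0},
    "b": {"ab": 1.0},
    "ab": {"stop_ab": 3.0},
}


def forward_policy(state):
    """Return P(child | state), proportional to the flow on each outgoing edge."""
    out = edge_flow[state]
    total = sum(out.values())
    return {child: flow / total for child, flow in out.items()}


def sample_trajectory(start="s0", rng=None):
    """Follow the flow-induced policy until a state with no outgoing edges."""
    rng = rng or random.Random(0)
    trajectory = [start]
    while trajectory[-1] in edge_flow:
        policy = forward_policy(trajectory[-1])
        children = list(policy)
        weights = [policy[c] for c in children]
        trajectory.append(rng.choices(children, weights=weights, k=1)[0])
    return trajectory


print(forward_policy("s0"))
print(sample_trajectory())
```

In this toy example the terminal states receive flows of 1 and 3, so following the policy reaches them with probability proportional to those values.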

Here the objects we construct are themselves graphs (e.g. graphs of atoms), which are constructed node by node. To make policy predictions, we use a graph neural network. This GNN outputs per-node logits (e.g. add an atom to this atom, or add a bond between these two atoms), as well as per-graph logits (e.g. stop/"done constructing this object").
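One way to turn per-node and per-graph logits into a single action distribution is to normalize over all of them jointly. The sketch below assumes exactly that; the function name, the action keys, and the joint softmax are illustrative assumptions, not this repository's API.

```python
import math


def action_distribution(node_logits, graph_logits):
    """Softmax over all per-node and per-graph logits taken together.

    node_logits: one list of logits per node (one logit per node-level action).
    graph_logits: list of logits for graph-level actions (e.g. "stop").
    Returns a dict mapping a hypothetical action key to its probability.
    """
    keyed = {}
    for i, row in enumerate(node_logits):
        for j, logit in enumerate(row):
            keyed[("node", i, j)] = logit
    for j, logit in enumerate(graph_logits):
        keyed[("graph", j)] = logit
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(keyed.values())
    exps = {k: math.exp(v - m) for k, v in keyed.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}


# Two nodes with two node-level actions each, plus one graph-level "stop".
probs = action_distribution([[0.0, 1.0], [0.5, -1.0]], [2.0])
print(max(probs, key=probs.get))
```

Normalizing jointly lets the "stop" action compete directly with every node-level action in the same categorical distribution.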

The GNN model can be trained on a mix of existing data (offline) and self-generated data (online), the latter being obtained by querying the model sequentially to obtain trajectories. For offline data, we can easily generate trajectories since we know the end state.
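As a rough illustration of why offline trajectories are easy to obtain, the sketch below rebuilds a known end graph node by node and records each partial graph. The graph representation (node ids plus edge tuples) and the function name are assumptions made for this example; the repository's own trajectory generation may differ, for instance in how it orders nodes or represents actions.

```python
def trajectory_from_graph(nodes, edges, order=None):
    """Rebuild a known graph node by node and return the list of partial graphs.

    Each state is (set of nodes present, set of edges whose endpoints are both present).
    """
    order = list(order) if order is not None else list(nodes)
    states = [(frozenset(), frozenset())]  # start from the empty graph
    present = set()
    for n in order:
        present.add(n)
        kept = frozenset(e for e in edges if e[0] in present and e[1] in present)
        states.append((frozenset(present), kept))
    return states


# A three-node chain 0 - 1 - 2 as the known end state.
for nodes_so_far, edges_so_far in trajectory_from_graph([0, 1, 2], [(0, 1), (1, 2)]):
    print(sorted(nodes_so_far), sorted(edges_so_far))
```

Any node ordering yields a valid trajectory ending in the same graph, which is why a single data point can correspond to many trajectories.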

Repo overview

  • algo, contains GFlowNet algorithm implementations (only Trajectory Balance for now), as well as some baselines. These implement how to sample trajectories from a model and how to compute the loss from trajectories.
  • data, contains dataset definitions, data loading and data sampling utilities.
  • envs, contains environment classes: a graph-building environment base and a molecular graph context class. The base environment is agnostic to the kind of graph being built, and the context class specifies the mappings from graphs to objects (e.g. molecules) and to torch geometric Data.
  • examples, contains simple example implementations of GFlowNet.
  • models, contains model definitions.
  • tasks, contains training code.
    • qm9, temperature-conditional molecule sampler based on QM9's HOMO-LUMO gap data as a reward.
    • seh_frag, reproducing Bengio et al. 2021, fragment-based molecule design targeting the sEH protein.
    • seh_frag_moo, same as the above, but with multi-objective optimization (incl. QED, SA, and molecular weight objectives).
  • utils, contains utilities (multiprocessing).
  • train.py, defines a general harness for training GFlowNet models.
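For readers unfamiliar with the Trajectory Balance objective mentioned under algo, here is a minimal sketch of the per-trajectory loss in its commonly stated form: the squared difference between log Z plus the summed forward log-probabilities and the log-reward plus the summed backward log-probabilities. The function and argument names are hypothetical and this is not the repository's implementation, which operates on batches of graphs.

```python
import math


def trajectory_balance_loss(log_Z, log_pf_steps, log_pb_steps, log_reward):
    """Squared Trajectory Balance residual for a single trajectory.

    log_Z: estimate of the log of the total flow (log partition function).
    log_pf_steps: forward-policy log-probabilities, one per step.
    log_pb_steps: backward-policy log-probabilities, one per step.
    log_reward: log of the reward of the final object.
    """
    forward = log_Z + sum(log_pf_steps)
    backward = log_reward + sum(log_pb_steps)
    return (forward - backward) ** 2


# Made-up numbers for a two-step trajectory where the balance holds exactly:
# Z = 4, forward probabilities 3/4 and 1/3, backward probabilities 1 and 1, reward 1.
loss = trajectory_balance_loss(
    log_Z=math.log(4.0),
    log_pf_steps=[math.log(0.75), math.log(1.0 / 3.0)],
    log_pb_steps=[0.0, 0.0],
    log_reward=math.log(1.0),
)
print(loss)
```

When the loss is zero on every trajectory, the forward policy samples final objects with probability proportional to their reward.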

Getting started

A good place to get started is with the sEH fragment-based MOO task. The file seh_frag_moo.py is runnable as-is (although you may want to change the default configuration in main()).

Installation

PIP

This package can be installed with pip, but since it depends on some torch-geometric package wheels, the --find-links argument must be specified as well:

pip install -e . --find-links https://data.pyg.org/whl/torch-1.13.1+cu117.html

Or for CPU use:

pip install -e . --find-links https://data.pyg.org/whl/torch-1.13.1+cpu.html

To install or depend on a specific tag, for example here v0.0.10, use the following scheme:

pip install git+https://github.com/recursionpharma/gflownet@v0.0.10 --find-links ...

Developing & Contributing

TODO: Write Contributing.md.
