Giter VIP home page Giter VIP logo

syntheseus's Introduction

Navigating the labyrinth of synthesis planning


CI Python Version code style License

Syntheseus is a package for end-to-end retrosynthetic planning.

  • โš’๏ธ Combines search algorithms and reaction models in a standardized way
  • ๐Ÿงญ Includes implementations of common search algorithms
  • ๐Ÿงช Includes wrappers for state-of-the-art reaction models
  • โš™๏ธ Exposes a simple API to plug in custom models and algorithms
  • ๐Ÿ“ˆ Can be used to benchmark components of a retrosynthesis pipeline

Setup

We support two installation modes:

  • core installation allows you to build and benchmark your own models or search algorithms
  • full installation additionally allows you to perform end-to-end search using some of the supported models

For core installation (minimal dependencies, no ML libraries) run

# Create and activate a new conda environment (or use your own).
conda env create -f environment.yml
conda activate syntheseus

pip install -e .

For full installation (including all supported models and dependencies for visualization/development) run

conda env create -f environment_full.yml
conda activate syntheseus-full

pip install -e ".[all]"

Both sets of instructions above assume you already cloned the repository via

git clone https://github.com/microsoft/syntheseus.git
cd syntheseus

Note that environment_full.yml pins the CUDA version (to 11.3) for reproducibility. If you want to use a different one, make sure to edit the environment file accordingly.

Additionally, we also support GLN, but that requires a specialized environment and is thus not installed via pip. See here for a Docker environment necessary for running GLN.

Reducing the number of dependencies

To keep the environment smaller, you can replace the all option with a comma-separated subset of {chemformer,local-retro,megan,mhn-react,retro-knn,root-aligned,viz,dev} (viz and dev correspond to visualization and development dependencies, respectively). For example, pip install -e ".[local-retro,root-aligned]" installs only LocalRetro and RootAligned. If installing a subset of models, you can also delete the lines in environment_full.yml marked with names of models you do not wish to use.

Syntheseus contains two subpackages: reaction_prediction, which deals with benchmarking single-step reaction models, and search, which can use any single-step model to perform multi-step search. Each is designed to have minimal dependencies, allowing it to run in a wide range of environments. While specific components (single-step models, policies, or value functions) can make use of Deep Learning libraries, the core of syntheseus does not depend on any.

If you only want to use either of the two subpackages, you can limit the dependencies further by installing the dependencies separately and then running

pip install -e .  --no-dependencies

See pyproject.toml for a list of dependencies tied to each subpackage.

Model checkpoints

See table below for links to model checkpoints trained on USPTO-50K alongside with information on how these checkpoints were obtained. Note that all checkpoints were produced in a way that involved external model repositories, hence may be affected by the exact license each model was released with. For more details about a particular model see the top of the corresponding model wrapper file in reaction_prediction/inference/.

Model checkpoint link Source
Chemformer finetuned by us starting from checkpoint released by authors
GLN released by authors
LocalRetro trained by us
MEGAN trained by us
MHNreact trained by us
RetroKNN trained by us
RootAligned released by authors

In reaction_prediction/cli/eval.py a forward model can be used for computing back-translation (round-trip) accuracy. See here for a Chemformer checkpoint finetuned for forward prediction on USPTO-50K. As for the backward direction, pretrained weights released by original authors were used as a starting point.

Development

Syntheseus is currently under active development. If you want to help us develop syntheseus please install and run pre-commit checks before committing code.

We use pytest for testing. Please make sure tests pass on your branch before submitting a PR (and try to maintain high test coverage).

python -m pytest --cov syntheseus/tests

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

syntheseus's People

Contributors

kmaziarz avatar austint avatar jagarridotorres avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.