The paddlehelix from chengxuanying

Bio-Computing Platform featuring Large-Scale Representation Learning and Multi-Task Deep Learning: the "螺旋桨" (PaddleHelix) bio-computing toolkit

License: Apache License 2.0

PaddleHelix's Introduction

PaddleHelix is a machine-learning-based bio-computing framework that aims to facilitate development in the following areas:

Vaccine design

Drug discovery

Precision medicine

Features

High Efficiency: We provide LinearRNA, a highly efficient toolkit for RNA structure prediction and analysis. LinearFold and LinearPartition achieve O(n) time complexity in the sequence length n for RNA-folding prediction, which makes them hundreds of times faster than traditional folding techniques.
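To give a sense of what "traditional folding techniques" cost, the sketch below implements the classic Nussinov base-pair maximization, a textbook O(n^3) dynamic program. It is not the LinearFold or LinearPartition algorithm and is not taken from this repository; the function name, the allowed pair set, and the `min_loop` parameter are illustrative assumptions.

```python
# Illustrative cubic-time baseline (Nussinov), NOT LinearFold.
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in `seq`.

    `min_loop` is the minimum number of unpaired bases enclosed by a pair.
    The three nested loops over (span, i, k) give O(n^3) time.
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = max pairs in the subsequence seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1]
                    best = max(best, left + 1 + inner)
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAACCC"))
```

The table has O(n^2) entries and each entry scans O(n) split points, which is where the cubic cost comes from; linear-time methods avoid filling the whole table.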

Large-scale Representation Learning: Self-supervised learning of molecule representations offers the prospect of a breakthrough in tasks with limited annotation, including drug profiling, drug-target interaction, protein-protein interaction, RNA-RNA interaction, protein folding, RNA folding, and molecule design. PaddleHelix implements various representation learning algorithms and state-of-the-art large-scale pre-trained models so that developers can quickly start "on the shoulders of giants".
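As a rough illustration of how self-supervised learning obtains training signal without annotation, the sketch below builds a masked-token pretext task: some tokens are hidden and a model would be trained to recover them. This is a generic example, not PaddleHelix's pre-training code; the `mask_tokens` helper, the `[MASK]` symbol, the 15% mask rate, and the character-level tokenization of a SMILES string are all assumptions made for brevity.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol for this sketch


def mask_tokens(tokens, mask_rate=0.15, seed=None):
    """Hide a random subset of tokens and return (masked_tokens, labels).

    `labels` maps each masked position to the original token, which is the
    target a model would learn to predict. No human annotation is needed.
    """
    rng = random.Random(seed)
    if not tokens:
        return [], {}
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = masked[pos]
        masked[pos] = MASK
    return masked, labels


if __name__ == "__main__":
    # Character-level split of a SMILES string (a simplification; real
    # tokenizers treat multi-character atoms and ring labels more carefully).
    smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
    masked, labels = mask_tokens(list(smiles), seed=0)
    print("".join(masked))
    print(labels)
```

The same idea carries over to graphs or protein sequences: the input itself supplies the prediction targets, so large unlabeled corpora can be used for pre-training.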

Rich examples and applications: PaddleHelix provides frequently used components such as networks, datasets, and pre-trained models, which users can combine to build their own models and systems. PaddleHelix also provides multiple applications, such as compound property prediction and drug-target interaction.

The installation prerequisites and guide can be found here.
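After following the installation guide, a quick way to confirm the environment is to check that the relevant modules can be located. The sketch below assumes the import names are `paddle` (PaddlePaddle) and `pahelix` (PaddleHelix); treat those names as assumptions and adjust them if your installation differs. The `check_modules` helper is hypothetical and not part of either package.

```python
import importlib.util


def check_modules(names=("paddle", "pahelix")):
    """Return {module_name: True/False} for whether each top-level module
    can be found, without actually importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'found' if found else 'missing'}")
```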

We provide abundant tutorials to help you navigate the repository and start quickly.
PaddleHelix is based on PaddlePaddle, a high-performance Parallelized Deep Learning Platform.