PL-REX

Data set of protein-ligand complexes with reliable experimental structures and affinities.

The data set was introduced in the following paper. Please refer to it for more details and cite it when using the data:
A. Pecina, J. Fanfrlík, M. Lepšík and J. Řezáč; Nature Communications 2024, 15, 1127.

The data set is also archived at Zenodo.

Data Set Description

The data set consists of 10 target proteins, each with a series of ligands (164 in total). Crystal structures of the P-L complex are available for the majority of the ligands (147). The PL-REX set uses a single crystal structure of the protein, carefully selected so that it can accommodate all the ligands, into which the ligand poses have been superimposed. In the resulting structures, the ligand and surrounding protein residues have been optimized, yielding the final geometry presented in the data set.

| Code | Target | Ligands | Crystals | pKi range | Protein (PDB code) |
|---|---|---|---|---|---|
| 001-CA2 | Human carbonic anhydrase II | 10 | 10 | 2.2 | 5NXG |
| 002-HIV-PR | HIV-1 protease | 22 | 12 | 5.1 | 2AQU |
| 003-CK2 | Zea mays casein kinase 2 | 16 | 16 | 1.9 | 3KXN |
| 004-AR | Human aldose reductase | 14 | 14 | 2.8 | 4XZH |
| 005-Cath-D | Human cathepsin D | 10 | 3 | 3.5 | 6QCB |
| 006-BACE1 | Human beta-secretase 1 | 16 | 16 | 3.6 | 5QCZ |
| 007-JAK1 | Human Janus kinase 1 | 12 | 12 | 3.4 | 4IVD |
| 008-Trp | Bovine trypsin | 15 | 15 | 4.4 | 1K1I |
| 009-CDK2 | Human cyclin-dependent kinase 2 | 31 | 31 | 3.6 | 3R9H |
| 010-MMP12 | Human matrix metallopeptidase 12 | 18 | 18 | 3.9 | 3EHY |

Data Set Organization

The top-level numbered directories contain data for the ten protein targets. Individual P-L complexes are identified by their PDB code (the majority of the systems) or labeled as a model (some ligands in 002-HIV-PR and 005-Cath-D, which were modeled from similar structures).

PL-REX Structures

The structures of the optimized P-L complexes are stored in the subdirectory structures_pl-rex of each protein target. Each P-L complex has its own subdirectory containing:

  • ligand.sdf - The geometry of the ligand in the optimized complex.
  • receptor.pdb - Model of the active site used in the complex.

These files represent the refined "PL-REX" geometry of the complexes on which the SQM2.20 score was calculated. Additionally, the optimized active site has been ported back into the structure of the whole protein and is provided as protein.pdb.
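The layout described above can be traversed with a short script. The following is a minimal sketch, assuming the data set has been downloaded to a local directory and that each complex subdirectory sits directly under structures_pl-rex. The function name find_plrex_complexes is hypothetical and not part of the data set.

```python
from pathlib import Path


def find_plrex_complexes(root):
    """Map each target directory name to its complete PL-REX complexes.

    A complex counts as complete when its subdirectory holds both
    ligand.sdf and receptor.pdb (file names taken from the README).
    The directory nesting is an assumption based on the description.
    """
    found = {}
    for target_dir in sorted(Path(root).iterdir()):
        struct_dir = target_dir / "structures_pl-rex"
        if not struct_dir.is_dir():
            continue  # skip files and unrelated directories
        found[target_dir.name] = [
            d.name
            for d in sorted(struct_dir.iterdir())
            if d.is_dir()
            and (d / "ligand.sdf").is_file()
            and (d / "receptor.pdb").is_file()
        ]
    return found
```

Calling find_plrex_complexes on the data set root returns a dictionary keyed by target code (for example 001-CA2), which is convenient for checking that a download is complete before running any calculations.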

Structures from AMBER calculations

Some of the calculations reported in the paper were carried out on structures of the complexes optimized with the AMBER/GAFF2 force field. These are available in the AMBER subdirectory of each P-L complex. There are two variants: one with the ligand optimized in a frozen protein (complex_lig_opt.pdb) and one with a relaxed flexible region around the ligand (analogous to the PL-REX geometries, complex_flex_opt.pdb). In addition to the complex (the active site model with the ligand) in the PDB format, the ligand is also provided as an SDF file (ligand_lig_opt.sdf and ligand_flex_opt.sdf, respectively).
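The file naming of the two AMBER variants follows a regular pattern, which can be captured in a small helper. This is a sketch only: the function amber_paths and the constant AMBER_VARIANTS are hypothetical names, and the code merely builds the expected paths without checking that the files exist.

```python
from pathlib import Path

# Variant suffixes as they appear in the file names quoted in the README.
AMBER_VARIANTS = ("lig_opt", "flex_opt")


def amber_paths(complex_dir, variant):
    """Return (complex PDB path, ligand SDF path) for one AMBER variant."""
    if variant not in AMBER_VARIANTS:
        raise ValueError(f"variant must be one of {AMBER_VARIANTS}")
    amber_dir = Path(complex_dir) / "AMBER"
    return (
        amber_dir / f"complex_{variant}.pdb",
        amber_dir / f"ligand_{variant}.sdf",
    )
```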

Small Model Structures

The smaller model (to which the DFT was applied) is available in the structures_small_model directories. Note that the small model has been optimized independently and therefore the geometries of the P-L complexes differ from those in the larger model.

Crystal structures

The crystal structure of the protein from which all the P-L complexes were derived, protonated and prepared for calculations, is provided as the PDB file structures_crystal/protein_crystal_geo.pdb. In the same directory, the initial geometries of the ligands (obtained by overlapping the crystal structures of their complexes, protonated, but not optimized) are provided as SDF files.

Experimental Affinities and Scores

The experimental affinities have been converted to binding free energies and are available in the experimental_dG.txt files.
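For orientation, an inhibition constant is commonly converted to a binding free energy through the standard relation ΔG = RT ln Ki = -RT ln(10) pKi, with Ki expressed relative to a 1 M standard state. The sketch below illustrates that relation. The temperature of 298.15 K, the kcal/mol units and the function name pki_to_dg are assumptions made here; the exact conditions used to produce experimental_dG.txt are described in the paper, not in this README.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pki_to_dg(pki, temperature=298.15):
    """Convert pKi to a binding free energy in kcal/mol.

    Uses dG = -R * T * ln(10) * pKi; the default temperature is an
    assumption, not a value taken from the data set.
    """
    return -R_KCAL * temperature * math.log(10) * pki
```

Under these assumptions one pKi unit corresponds to about 1.36 kcal/mol, so the pKi range of 2.2 listed for 001-CA2 spans roughly 3.0 kcal/mol.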

The SQM2.20 score computed on the PL-REX geometries is available as score_SQM2.20.txt for each target. The same score, and the DFT score, computed on the small model are available in the structures_small_model subdirectory.

Scores obtained with standard scoring functions on the PL-REX geometries are available in the scores subdirectory.
