sirimullalab / dlscore
DLSCORE: A deep learning based scoring function for predicting protein-ligand binding affinity
License: MIT License
Hi,
Thank you for releasing this code. We have been trying to implement this but found that DLScore is very slow: around 5 seconds per compound, even with the NNscore component disabled. We also have not achieved good scaling when running multiple instances in parallel on the same host. Splitting the job across 10 CPUs (10 concurrent DLScore runs) gives little scale-up (<50%), and performance plateaus at around 20 CPUs.
Is this something you've observed as well, and could you give some pointers on how to improve performance?
Thanks!
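To put numbers on the scaling described above, a small timing harness can help. The following is a minimal sketch, not part of DLScore: `run_batch` and `speedup` are hypothetical helper names, and each command list is a placeholder for whatever DLScore invocation is being benchmarked.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_batch(commands, workers):
    """Run each command as a subprocess, at most `workers` at a time.

    Returns (elapsed_seconds, list_of_return_codes).
    """
    def run_one(cmd):
        # capture_output keeps the child's stdout/stderr off the terminal
        return subprocess.run(cmd, capture_output=True).returncode

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(run_one, commands))
    return time.perf_counter() - start, codes


def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel wall time (ideal value = worker count)."""
    return t_serial / t_parallel


if __name__ == "__main__":
    # Placeholder workload; substitute the real scoring command here.
    cmds = [[sys.executable, "-c", "pass"]] * 8
    t1, _ = run_batch(cmds, workers=1)
    t4, _ = run_batch(cmds, workers=4)
    print(f"speedup at 4 workers: {speedup(t1, t4):.2f}")
```

Comparing the measured speedup with the worker count at several concurrency levels shows where scaling starts to flatten.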
Hello,
I am using DLSCORE for the first time.
The input files come from a docking run performed with Schrodinger Suite tools (GLIDE and the Induced Fit docking protocol). I am getting warnings like the ones below (the full output is in the attached file):
....
WARNING: Duplicate receptor atom detected: "ATOM 237 N BVAL B 29 5.157 -86.693 -37.071 0.29 26.50 -0.337 N". Not loading this duplicate.
WARNING: Duplicate receptor atom detected: "ATOM 239 CA BVAL B 29 4.047 -85.753 -37.075 0.29 27.11 0.190 C". Not loading this duplicate.
WARNING: Duplicate receptor atom detected: "ATOM 241 C BVAL B 29 3.957 -85.097 -38.460 0.29 27.68 0.349 C". Not loading this duplicate.
....
Should I be concerned? Will this affect the scoring?
The protein was downloaded from Protein Data Bank and prepared (fixed) using the Protein Preparation Wizard from the Suite.
Regards,
Camps
GS1_H.txt
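The atoms named in the warnings above appear to carry an alternate-location indicator (the "B" in front of VAL, with occupancy 0.29). A minimal sketch, assuming a fixed-column PDB-style file where column 17 holds that indicator, keeps only one conformer per atom before scoring. The function name `keep_primary_altloc` is hypothetical and not part of DLScore.

```python
def keep_primary_altloc(pdb_lines, keep=("", "A")):
    """Drop ATOM/HETATM records whose altLoc (column 17) is not in `keep`.

    Kept records have the altLoc column blanked; other lines pass through.
    Assumes fixed-column PDB formatting (an assumption about the input).
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 16:
            altloc = line[16].strip()  # 0-based index 16 is column 17
            if altloc not in keep:
                continue  # skip secondary conformers such as "B"
            line = line[:16] + " " + line[17:]
        kept.append(line)
    return kept
```

Applied to the lines of the receptor file before conversion, this would leave one copy of each atom. Whether the discarded conformers matter for the score is a separate question that the sketch does not address.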
Hi
Thanks for providing the dlscore script.
I have been trying to run dlscore on docked files of small molecules saved in .mol2 format.
Sometimes it reads the .mol2 ligand files, and sometimes the script automatically looks for a .pdbqt ligand by replacing the extension, e.g.:
--ligand test.mol2 # from the run command
error report :
Command-line parameters used:
Receptor: /ichec/work/nmlif042b/VS/receptor.pdbqt
Ligand: test.pdbqt
Vina executable: /ichec/work/nmlif042b/dlscore/autodock_vina_1_1_2_linux_x86/bin/vina
Traceback (most recent call last):
File "/ichec/work/nmlif042b/dlscore/dlscore.py", line 2466, in <module>
output = ds.get_output()
File "/ichec/work/nmlif042b/dlscore/dlscore.py", line 2392, in get_output
f = open(lig,'r')
FileNotFoundError: [Errno 2] No such file or directory: 'test.pdbqt'
Thanks
Ajay
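Judging from the traceback above, the script ends up opening a file with the same name but a .pdbqt extension. A minimal pre-flight check, assuming that behaviour, can report the missing file before a run is launched. `expected_pdbqt` and `check_ligand` are hypothetical helper names, not DLScore functions.

```python
from pathlib import Path


def expected_pdbqt(ligand_path):
    """Return the .pdbqt path obtained by swapping the ligand's extension."""
    return Path(ligand_path).with_suffix(".pdbqt")


def check_ligand(ligand_path):
    """Return (pdbqt_path, exists) for the file the script would try to open."""
    target = expected_pdbqt(ligand_path)
    return target, target.is_file()


if __name__ == "__main__":
    path, ok = check_ligand("test.mol2")
    print(f"{path}: {'found' if ok else 'missing, convert the ligand first'}")
```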
Hello,
I have just downloaded and tried DLScore. When running the test script to check that everything is OK, I got the following error:
bash test_run.sh
Using TensorFlow backend.
setting PYTHONHOME environment
setting PYTHONHOME environment
adding gasteiger charges to peptide
Command-line parameters used:
Receptor: samples/10gs/10gs_protein.pdbqt
Ligand: samples/10gs/10gs_ligand.pdbqt
Vina executable: /mnt/sda1/Shared_folder/DLSCORE-master/autodock_vina_1_1_2_linux_x86/bin/vina
Lines 874-881 of file "saving.py" show:
but if I change both if statements to
then the calculation gets to the end giving the output:
bash test_run.sh
Using TensorFlow backend.
setting PYTHONHOME environment
setting PYTHONHOME environment
adding gasteiger charges to peptide
Command-line parameters used:
Receptor: samples/10gs/10gs_protein.pdbqt
Ligand: samples/10gs/10gs_ligand.pdbqt
Vina executable: /mnt/sda1/Shared_folder/DLSCORE-master/autodock_vina_1_1_2_linux_x86/bin/vina
which I guess is OK.
My question: could the changes made to saving.py alter or affect the results of DLScore in any way, or can I safely use the software this way? I understand that the modified lines only query the keras and backend versions, so they may not be important for DLScore execution. Is that right?
Thanks for your comments.
Jordi
Hello, I've been testing out your scoring function and had a few questions. I combed through the paper but could not find where you specify the optimal number of hidden layers and the number of neurons per hidden layer. Does DLSCORE train a number of networks with those parameters varied and then take the 10 best-performing networks for the final score?
Another question: When I convert a PDB file with multiple ligands to a PDBQT, the code seems to only process the first ligand and not the rest, as I only get 1 score reported. Does the code require that I split the multiple ligands into separate files and process the files one at a time?
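If one file per ligand does turn out to be required, the split can be scripted. The sketch below assumes each ligand in the combined PDBQT is wrapped in MODEL/ENDMDL records. That is an assumption about the converted file, not something the question confirms, and `split_models` is a hypothetical helper name.

```python
def split_models(pdbqt_text):
    """Split PDBQT text into one string per MODEL ... ENDMDL block.

    Lines outside any MODEL block are ignored; the MODEL and ENDMDL
    records themselves are not included in the returned blocks.
    """
    models = []
    current = None  # lines of the block being read, or None if outside one
    for line in pdbqt_text.splitlines():
        fields = line.split()
        record = fields[0] if fields else ""
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            if current is not None:
                models.append("\n".join(current) + "\n")
            current = None
        elif current is not None:
            current.append(line)
    return models
```

Each returned block can then be written to its own file and scored separately.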