jenniening / deltavinaxgb
This is a machine-learning based protein-ligand scoring function.
Home Page: https://www.nyu.edu/projects/yzhang/DeltaVina/
License: GNU General Public License v3.0
Dear jenniening,
Thank you for the tool. I installed all the necessary dependencies and tried to run the command provided in the tutorial:
python run_DXGB.py --runfeatures --datadir ../Test_2al5 --pdbid 2al5 --average
I received the following error messages:
(DXGB) [yrui@ibet DXGB]$ python run_DXGB.py --runfeatures --datadir ../Test_2al5 --pdbid 2al5 --average
pdb index: 2al5
file directory: /home/deltaVinaXGB/Test_2al5
feature will be calculated:all
output filename : score.csv
1 molecule converted
Ligand for conformation stability:2al5_ligand.mol2
Ligand for Vina, SASA, BA, ION:2al5_ligand_rename.pdb
Protein without water molecules:2al5_protein.pdb
Protein with water molecules:2al5_protein_all.pdb
Finish Input Preparation
No Consideration of Water
No Optimized Ligand
C
sh: /home/mgltools_x86_64Linux2_1.5.6//bin/python: No such file or directory
sh: /home/mgltools_x86_64Linux2_1.5.6//bin/python: No such file or directory
sh: /home/vina4dv/build/linux/release//vina: No such file or directory
Traceback (most recent call last):
File "run_DXGB.py", line 103, in <module>
main()
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "run_DXGB.py", line 43, in main
run_features(datadir, pdbid, water_type = water, opt_type = opt, rewrite = rewrite, feature_type = featuretype)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 599, in run_features
feature_calculation_ligand(datadir, pdbid, inlig_pdb, inlig_rdkit, inpro_pro, water_type, opt_type, rewrite, feature_type)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 522, in feature_calculation_ligand
run_Vina_features(datadir, i, fn, inpro, inlig)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 364, in run_Vina_features
featureVina(outfile, fn, inpro, inlig, datadir)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/cal_vina58.py", line 58, in featureVina
vinalist = runVina(fn,protpdbqt,ligpdbqt)
File "/home/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/cal_vina58.py", line 15, in runVina
for lines in fileinput.input("score_v1.txt"):
File "/home/.conda/envs/DXGB/lib/python3.7/fileinput.py", line 252, in __next__
line = self._readline()
File "/home/.conda/envs/DXGB/lib/python3.7/fileinput.py", line 364, in _readline
self._file = open(self._filename, self._mode)
FileNotFoundError: [Errno 2] No such file or directory: 'score_v1.txt'
How can I solve this issue?
Thank you in advance,
Rui
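The three "sh: ... No such file or directory" lines in the log above suggest that the MGLTools Python and the custom Vina binary are not present at the paths the tool is calling, so Vina never ran and "score_v1.txt" was never written. A minimal preflight sketch, assuming the two paths are simply copied from that log; `missing_executables` is a hypothetical helper and not part of DXGB:

```python
import os


def missing_executables(paths):
    """Return the paths that are not existing, executable files."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.X_OK))]


if __name__ == "__main__":
    # Paths copied from the error log above; adjust them to the local install.
    candidates = [
        "/home/mgltools_x86_64Linux2_1.5.6/bin/python",
        "/home/vina4dv/build/linux/release/vina",
    ]
    for path in missing_executables(candidates):
        print("missing or not executable:", path)
```

If either path is reported, the external tools are probably installed somewhere else (or not at all), and the locations DXGB is configured with would need to be corrected before rerunning the test.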
Hi Jenniening,
Thanks for sharing this interesting work! As a novice, I first tried to run the example in the package:
python run_DXGB.py --runfeatures --datadir ../Test_2al5 --pdbid 2al5 --average
I ran it on my Linux server, but it failed with the following output:
.............
Protein with water molecules:2al5_protein_all.pdb
Finish Input Preparation
No Consideration of Water
No Optimized Ligand
C
Finish Vina
has no corresponding radius valuenfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry
has no corresponding radius value /nfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry * [0-9]H. 15
has no corresponding radius value /nfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry * F[1-9].* 40
has no corresponding radius value /nfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry * CL[1-9].* 41
has no corresponding radius value /nfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry * BR[1-9].* 42
has no corresponding radius value /nfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry * I[1-9].* 43
has no corresponding radius value /nfs/software_local/delta/deltaVinaXGB/DXGB/atmtypenumbers entry * FE2 25
....................................
1.1
SASA failed
Traceback (most recent call last):
File "run_DXGB.py", line 103, in <module>
main()
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "run_DXGB.py", line 43, in main
run_features(datadir, pdbid, water_type = water, opt_type = opt, rewrite = rewrite, feature_type = featuretype)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 599, in run_features
feature_calculation_ligand(datadir, pdbid, inlig_pdb, inlig_rdkit, inpro_pro, water_type, opt_type, rewrite, feature_type)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 528, in feature_calculation_ligand
run_SASA_features(datadir, i, fn, inpro, inlig)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 383, in run_SASA_features
cal_SASA(out_SASA,fn,inlig,inpro,datadir)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/cal_sasa.py", line 26, in cal_SASA
sasa_features = sasa(datadir,pro,lig)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/featureSASA.py", line 219, in __init__
self.rawdata, self.rawdata_pro, self.rawdata_lig, self.sasa, self.sasa_pro, self.sasa_lig = featureSASA( self.datadir, self.prot, self.lig)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/featureSASA.py", line 149, in featureSASA
df,df_pro,df_lig = runMSMS(inprot, inlig, datadir)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/featureSASA.py", line 103, in runMSMS
tmp1 = np.genfromtxt('p_sa.area', skip_header=1)[:,2]
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/numpy/lib/npyio.py", line 1793, in genfromtxt
fid = np.lib._datasource.open(fname, 'rt', encoding=encoding)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/numpy/lib/_datasource.py", line 193, in open
return ds.open(path, mode, encoding=encoding, newline=newline)
File "/home/hadoop/.conda/envs/DXGB/lib/python3.7/site-packages/numpy/lib/_datasource.py", line 533, in open
raise IOError("%s not found." % path)
OSError: p_sa.area not found.
I have checked the MSMS reference in issue #6, but the solution given there does not seem to work for my case. Could you please give me some help? Thanks a lot!
The command "docker build -t delta-vina-xgb ." fails with the following error lines:
Step 12/28 : RUN tar -xvzf mgltools_x86_64Linux2_1.5.6.tar.gz && cd mgltools_x86_64Linux2_1.5.6/ && /bin/bash -c "source install.sh"
---> Running in df00b30c96cb
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
ERROR conda.cli.main_run:execute(47): conda run /bin/bash -c tar -xvzf mgltools_x86_64Linux2_1.5.6.tar.gz && cd mgltools_x86_64Linux2_1.5.6/ && /bin/bash -c "source install.sh"
failed. (See above for error)
The command 'conda run -n DXGB /bin/bash -c tar -xvzf mgltools_x86_64Linux2_1.5.6.tar.gz && cd mgltools_x86_64Linux2_1.5.6/ && /bin/bash -c "source install.sh"' returned a non-zero code: 2
Thank you very much in advance for your reply.
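The message "gzip: stdin: not in gzip format" means the file named mgltools_x86_64Linux2_1.5.6.tar.gz does not contain gzip data. One possible cause, not confirmed here, is that the download step saved something other than the archive (for example an HTML page). A minimal sketch for checking a downloaded file before extracting it; `looks_like_gzip` is a hypothetical helper:

```python
GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def looks_like_gzip(path):
    """Return True if the file starts with the two-byte gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    import sys

    # Usage: python check_gzip.py mgltools_x86_64Linux2_1.5.6.tar.gz
    for name in sys.argv[1:]:
        print(name, "gzip" if looks_like_gzip(name) else "NOT gzip")
```

If the check fails, looking at the first lines of the file in a text editor usually shows what was downloaded instead.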
Hi,
Ubuntu 16.04 LTS is EOL for public updates. Switching the base image to 18.04 in the Dockerfile is trivial and seems to work fine.
ref: https://wiki.ubuntu.com/Releases
One can also delete the downloaded files once the tarball/zip files have been expanded.
Finally, making a pre-built Docker image available from Docker Hub would also expand your user base!
Cheers,
Tru
Singularity> python $DXGB/run_DXGB.py --runfeatures --datadir /tmp/Test_2al5 --pdbid 2al5 --average
pdb index: 2al5
file directory: /tmp/Test_2al5
feature will be calculated:all
output filename : score.csv
1 molecule converted
Ligand for conformation stability:2al5_ligand.mol2
Ligand for Vina, SASA, BA, ION:2al5_ligand_rename.pdb
Protein without water molecules:2al5_protein.pdb
Protein with water molecules:2al5_protein_all.pdb
Finish Input Preparation
No Consideration of Water
No Optimized Ligand
C
Finish Vina
1.1
SASA failed
Traceback (most recent call last):
File "/app/DXGB/run_DXGB.py", line 103, in <module>
main()
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/app/DXGB/run_DXGB.py", line 43, in main
run_features(datadir, pdbid, water_type = water, opt_type = opt, rewrite = rewrite, feature_type = featuretype)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 599, in run_features
feature_calculation_ligand(datadir, pdbid, inlig_pdb, inlig_rdkit, inpro_pro, water_type, opt_type, rewrite, feature_type)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 528, in feature_calculation_ligand
run_SASA_features(datadir, i, fn, inpro, inlig)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 383, in run_SASA_features
cal_SASA(out_SASA,fn,inlig,inpro,datadir)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/cal_sasa.py", line 26, in cal_SASA
sasa_features = sasa(datadir,pro,lig)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/featureSASA.py", line 219, in __init__
self.rawdata, self.rawdata_pro, self.rawdata_lig, self.sasa, self.sasa_pro, self.sasa_lig = featureSASA( self.datadir, self.prot, self.lig)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/featureSASA.py", line 149, in featureSASA
df,df_pro,df_lig = runMSMS(inprot, inlig, datadir)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/featureSASA.py", line 103, in runMSMS
tmp1 = np.genfromtxt('p_sa.area', skip_header=1)[:,2]
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/numpy/lib/npyio.py", line 1793, in genfromtxt
fid = np.lib._datasource.open(fname, 'rt', encoding=encoding)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/numpy/lib/_datasource.py", line 193, in open
return ds.open(path, mode, encoding=encoding, newline=newline)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/numpy/lib/_datasource.py", line 533, in open
raise IOError("%s not found." % path)
OSError: p_sa.area not found.
docker image available at docker://registry-gitlab.pasteur.fr/tru/deltavinaxgb:Light
pdb index: 2al5
file directory: /app/Test_2al5
feature will be calculated:all
output filename : score.csv
1 molecule converted
Ligand for conformation stability:2al5_ligand.mol2
Ligand for Vina, SASA, BA, ION:2al5_ligand_rename.pdb
Protein without water molecules:2al5_protein.pdb
Protein with water molecules:2al5_protein_all.pdb
Finish Input Preparation
No Consideration of Water
No Optimized Ligand
C
Please verify that both the operating system and the processor support Intel(R) X87, CMOV, MMX, FXSAVE, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2 and POPCNT instructions.
Traceback (most recent call last):
File "run_DXGB.py", line 103, in <module>
main()
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "run_DXGB.py", line 43, in main
run_features(datadir, pdbid, water_type = water, opt_type = opt, rewrite = rewrite, feature_type = featuretype)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 599, in run_features
feature_calculation_ligand(datadir, pdbid, inlig_pdb, inlig_rdkit, inpro_pro, water_type, opt_type, rewrite, feature_type)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 522, in feature_calculation_ligand
run_Vina_features(datadir, i, fn, inpro, inlig)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/run_features.py", line 364, in run_Vina_features
featureVina(outfile, fn, inpro, inlig, datadir)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/cal_vina58.py", line 58, in featureVina
vinalist = runVina(fn,protpdbqt,ligpdbqt)
File "/opt/conda/envs/DXGB/lib/python3.7/site-packages/DXGB-0.1.0-py3.7.egg/DXGB/cal_vina58.py", line 15, in runVina
for lines in fileinput.input("score_v1.txt"):
File "/opt/conda/envs/DXGB/lib/python3.7/fileinput.py", line 252, in __next__
line = self._readline()
File "/opt/conda/envs/DXGB/lib/python3.7/fileinput.py", line 364, in _readline
self._file = open(self._filename, self._mode)
FileNotFoundError: [Errno 2] No such file or directory: 'score_v1.txt'
After taking a look at your code, I realised that you only provide inference functionality with pre-trained models.
Are you planning to release the code for your training procedure, including the pre-processing of the PDBbind database used to construct the training/validation sets?
I am asking because I would like to train your model with additional data.
I'm trying to run the tests on WSL (Ubuntu 18.04) and the custom Vina crashes every time. Any idea what might be causing the problem?
Hi,
After successfully running the Test_2al5 check, I am getting negative values for both the Vina and XGB scores with different protein-ligand complexes, which does not seem to make sense given the deltaVinaXGB scoring formula. If I am wrong, could someone please explain it to me? Thanks a lot in advance.
@jenniening,
First of all, thank you for making your software freely available to the scientific community.
The deltaVinaXGB Docker image works fine for me, but if the --runrf option is added, the run fails with the error lines shown below. Both the get_RF20.R and RF20_rm2016.rda files are present in the working directory.
Thank you very much in advance for your reply.
/opt/conda/envs/DXGB/lib/python3.7/site-packages/pandas/core/frame.py:3641: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Hi Jianing,
Thanks for open-sourcing this interesting method.
I have looked at the source code and am wondering about your modification of VINA: https://github.com/jenniening/deltaVinaXGB/tree/Light/vina_package
As it has been several years, do you know whether VINA now has a command-line argument to output the features needed by deltaVinaXGB?
I am not keen to use the forked version if it is too outdated compared with the latest official version of VINA.
Could you please elaborate on which lines of code we should change to make VINA output the features? The whole VINA source code was added in one commit, a752d0b,
so it is not possible for me to check the diff between commits to see the changes you implemented. That makes it difficult to pinpoint which lines you changed and how to reproduce the change on the latest VINA source code.
Much appreciated,
Min Htoo
Hello Jenniening,
May I know whether I can use Open Babel to convert PDBQT files (already docked with Vina and containing 10 poses) to Mol2 format and rescore all the poses at the same time?
By the way, if I want to dock multiple ligands to a single protein, is there a way to rescore all the docking results?
Regards,
Andy
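Whether DXGB can rescore several poses in one run is not stated in this thread. As a starting point, a Vina output file stores each pose in its own MODEL/ENDMDL block, so the poses can be separated before conversion. A minimal sketch under that assumption; `split_pdbqt_models` is a hypothetical helper and not part of DXGB or Open Babel:

```python
def split_pdbqt_models(text):
    """Split multi-model PDBQT text into one string per MODEL/ENDMDL block."""
    models, current = [], None
    for line in text.splitlines():
        if line.startswith("MODEL"):
            current = []  # start collecting a new pose
        elif line.startswith("ENDMDL"):
            if current is not None:
                models.append("\n".join(current) + "\n")
            current = None
        elif current is not None:
            current.append(line)  # a line belonging to the current pose
    return models


if __name__ == "__main__":
    import sys

    # Usage: python split_poses.py docked.pdbqt  -> docked_pose1.pdbqt, ...
    source = sys.argv[1]
    with open(source) as handle:
        poses = split_pdbqt_models(handle.read())
    stem = source.rsplit(".", 1)[0]
    for index, pose in enumerate(poses, start=1):
        with open(f"{stem}_pose{index}.pdbqt", "w") as out:
            out.write(pose)
```

Each per-pose file could then be converted to Mol2 with Open Babel and scored as a separate ligand. The same loop over files would apply to multiple ligands docked to one protein.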
Hello Jenniening,
I installed all the packages and dependencies according to the procedure mentioned in the "readme.md" file.
However, when I tried to run the first test command, the following error popped up.
Could you please give me a clue what I am missing?
Thank you,
Sajjad
(DXGB) sajads-air:DXGB sajadahrari$ conda activate DXGB
(DXGB) sajads-air:DXGB sajadahrari$ python run_DXGB.py --help
Traceback (most recent call last):
File "run_DXGB.py", line 6, in <module>
from DXGB.convert_file import RF20_main as RF20_main
ModuleNotFoundError: No module named 'DXGB'
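The traceback says that the Python interpreter running run_DXGB.py cannot find a package named DXGB. The other tracebacks on this page show DXGB being loaded from site-packages (DXGB-0.1.0-py3.7.egg), which suggests, though it is not confirmed here, that the package has to be installed into the active conda environment first. A minimal sketch for checking what the current interpreter can see; `is_importable` is a hypothetical helper:

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python is running and whether it can find the DXGB package.
    print("interpreter:", sys.executable)
    print("DXGB importable:", is_importable("DXGB"))
```

If the interpreter shown is not the one inside the DXGB environment, or the package is reported as not importable, the installation step from the README is the first thing to recheck.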