rlabduke / probe
Evaluate and visualize protein interatomic packing
Home Page: http://kinemage.biochem.duke.edu/software/probe.php
Hi,
In the Debian package of probe version 2.16 (git commit 9b198c1) we added a CI test that creates a file 6ins.kin,
which is compared against a checksum. Unfortunately this checksum has changed, so I created a diff between the two result files, which shows several differences:
--- /tmp/king-probe-test.2.16.160404+git20200121.9b198c1-3/data/6ins.dotinfo.table.head 2021-10-06 10:37:24.349423219 +0200
+++ 6ins.dotinfo.table.head 2021-10-06 10:35:28.076993097 +0200
@@ -8,33 +8,33 @@
subgroup: self dots
atoms selected: 1268
-potential dots: 219619
-potential area: 13726.2 A^2
+potential dots: 218753
+potential area: 13672.1 A^2
type # % score score/A^2 x 1000
- C wide_contact 2716 1.2% 24.9 1.81
- C close_contact 1733 0.8% 76.0 5.53
- C small_overlap 576 0.3% -18.3 -1.34
- C bad_overlap 21 0.0% -4.2 -0.31
+ C wide_contact 2748 1.3% 25.0 1.83
+ C close_contact 1755 0.8% 76.7 5.61
+ C small_overlap 605 0.3% -19.6 -1.43
+ C bad_overlap 22 0.0% -4.6 -0.34
C H-bond 281 0.1% 7.9 0.58
- N wide_contact 550 0.3% 5.7 0.41
- N close_contact 503 0.2% 23.0 1.67
- N small_overlap 447 0.2% -20.6 -1.50
- N bad_overlap 67 0.0% -10.6 -0.77
+ N wide_contact 543 0.2% 5.6 0.41
+ N close_contact 499 0.2% 22.7 1.66
+ N small_overlap 444 0.2% -20.5 -1.50
+ N bad_overlap 67 0.0% -10.6 -0.78
N H-bond 61 0.0% 2.1 0.15
- O wide_contact 1806 0.8% 18.0 1.31
- O close_contact 1608 0.7% 72.4 5.28
- O small_overlap 1304 0.6% -61.9 -4.51
- O bad_overlap 240 0.1% -43.1 -3.14
- O H-bond 2964 1.3% 87.2 6.35
- S wide_contact 212 0.1% 2.1 0.15
- S close_contact 82 0.0% 3.3 0.24
- S small_overlap 3 0.0% -0.1 -0.00
+ O wide_contact 1800 0.8% 17.8 1.30
+ O close_contact 1603 0.7% 72.2 5.28
+ O small_overlap 1305 0.6% -62.2 -4.55
+ O bad_overlap 237 0.1% -42.8 -3.13
+ O H-bond 2939 1.3% 86.4 6.32
+ S wide_contact 209 0.1% 2.0 0.15
+ S close_contact 87 0.0% 3.5 0.26
+ S small_overlap 1 0.0% -0.0 -0.00
S H-bond 36 0.0% 1.2 0.09
- tot contact: 9210 4.2% 225.2 16.41
- tot overlap: 2658 1.2% -158.7 -11.56
- tot H-bond: 3342 1.5% 98.4 7.17
+ tot contact: 9244 4.2% 225.6 16.50
+ tot overlap: 2681 1.2% -160.2 -11.72
+ tot H-bond: 3317 1.5% 97.7 7.14
- grand tot: 15210 6.9% 164.9 12.01
+ grand tot: 15242 7.0% 163.1 11.93
-contact surface area: 950.6 A^2
+contact surface area: 952.6 A^2
I wonder whether differences of this kind are expected or whether something is wrong. The same applies to the other test data set that is used later in our test script.
Kind regards, Andreas.
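A checksum fails on any change, however small. As a minimal sketch of an alternative, the Python snippet below compares two lines of the dot-info table number by number within a tolerance. The function names and the tolerance values are assumptions made for illustration; they are not part of Probe or of the Debian test script, and whether any tolerance is acceptable at all is a question for the Probe maintainers.

```python
import math
import re

# Matches integers and decimals, with an optional leading minus sign.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def numbers(line):
    """Return every numeric token in a table line as a float."""
    return [float(tok) for tok in NUMBER.findall(line)]


def template(line):
    """Replace numbers with '#' and normalise whitespace, leaving the labels."""
    return " ".join(NUMBER.sub("#", line).split())


def lines_match(expected, actual, rel_tol=0.02, abs_tol=0.05):
    """True if both lines have the same labels and all numbers agree within tolerance.

    rel_tol and abs_tol are illustrative defaults, not recommended values.
    """
    if template(expected) != template(actual):
        return False
    return all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(numbers(expected), numbers(actual))
    )


# Two pairs of lines taken from the diff above.
dots_close = lines_match("potential dots: 219619", "potential dots: 218753")
overlap_close = lines_match(
    "C small_overlap 576 0.3% -18.3 -1.34",
    "C small_overlap 605 0.3% -19.6 -1.43",
)
```

With these illustrative tolerances the "potential dots" lines agree (the values differ by about 0.4%), while the "C small_overlap" lines do not (576 versus 605 dots is a change of roughly 5%).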
The (legacy) Probe commandline allows selection of types of contact partners. Whatever parses this is converting parts of the selection string to capital letters. This makes it impossible to select case-sensitive chains containing lower-case letters. (I have not tested the Probe2 commandline.)
For example:
phenix.probe -u -q -con -mc -het -once -ONLYBADOUT 'chainb ogt10 not water' 'ogt10' 8b0x_B_and_b_chains.pdb
This command should show only the chain b (lowercase) contacts for the attached file. Instead it shows only the contacts for chain B (uppercase).
Compare to:
phenix.probe -u -q -con -mc -het -once -ONLYBADOUT 'chainB ogt10 not water' 'ogt10' 8b0x_B_and_b_chains.pdb
which selects for the chain B (uppercase) contacts.
Also compare to:
phenix.probe -u -q -con -mc -het -once -ONLYBADOUT 'ogt10 not water' 'ogt10' 8b0x_B_and_b_chains.pdb
which does not specify a chain, and shows the contacts for both B and b. This shows that Probe is successfully case-sensitive internally, and the issue is likely localized to the selection syntax.
We would like to support lowercase characters for chains (and alternates?). On the MolProbity site, this affects the interface contacts tool. In personal work, this affects some of my database construction scripts.
Sample file containing B and b chains (zipped b/c GitHub doesn't permit .pdb files):
8b0x_B_and_b_chains.zip
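The suspected cause can be illustrated with a small Python model. This is a sketch under assumptions, not Probe's actual parser: the function names are hypothetical, and it assumes that a token such as 'chainb' is the keyword "chain" followed by the chain ID and that the keyword itself may be matched case-insensitively.

```python
KEYWORD = "CHAIN"


def parse_chain_uppercasing(token):
    """Model of the reported behaviour: the whole token is upper-cased first."""
    upper = token.upper()
    if upper.startswith(KEYWORD):
        return upper[len(KEYWORD):]  # the chain ID has lost its case
    return None


def parse_chain_preserving(token):
    """Match the keyword case-insensitively but keep the chain ID as typed."""
    if token[:len(KEYWORD)].upper() == KEYWORD:
        return token[len(KEYWORD):]
    return None


def select(atoms, chain_id):
    """Keep the (chain, atom name) pairs whose chain matches exactly."""
    return [atom for atom in atoms if atom[0] == chain_id]


# Hypothetical atoms from a file with both an upper-case and a lower-case chain.
atoms = [("B", "CA"), ("b", "CA"), ("b", "O")]

picked_uppercasing = select(atoms, parse_chain_uppercasing("chainb"))
picked_preserving = select(atoms, parse_chain_preserving("chainb"))
```

In this model the upper-casing parser turns 'chainb' into chain B and selects the wrong atoms, while the case-preserving parser selects the two chain b atoms and still handles 'chainB' as before.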
Hi,
I'm packaging probe for Debian. To do so, it would help to know which code you consider a release intended for end users. It would therefore help if you tagged the releases to mark the distributable code.
Thanks for considering, Andreas.
Probe crashes when processing residues that contain at least two oxygen atoms named "O" if the first residue of the chain does not have any nitrogen atoms named "N". This can be seen, for example, in "1ob6" (chain B, residue 0).
The crash is due to a presumably erroneous check in ProcessResInfo. When creating the list of "O" oxygen atoms in a residue (ambigO), there is a check to see whether the atom being added is in the same residue as the previous atom in a list of "N" nitrogen atoms (ambigN). Checking ambigN does not seem to make much sense here. I think the intended behavior is to check the previous atom in ambigO, which would ensure that all oxygen atoms in the list belong to the same residue, as stated in the comment above the code.
If the check against ambigN were intended, only one oxygen atom would ever be added to the list, except on the N-terminal residue, because the first 4 atoms in ambigN all belong to the N-terminal residue if they are not NULL. The ambigO list, however, is used to determine the C-terminal mainchain oxygen atoms. So if my understanding of the code is correct, the erroneous check could in some cases also lead to undetected C-terminal mainchain oxygen atoms.
Instead of changing the check from ambigN to ambigO, it could probably also be removed entirely. This is because ProcessResInfo is only ever called after a call to NtermCheck if the residue changes, which sets all values in ambigO to NULL. There is a small difference, though: the way residue changes are detected differs between the code calling NtermCheck and the check in ProcessResInfo. If the residue name changes between two alternate locations of the same atom, the atom gets a different "r" pointer, so the check in ProcessResInfo considers it a different residue and does not add the atom to the list, even though the residues have the same ID. The code that calls NtermCheck only looks at the residue ID and chain, so it would consider both atoms to be part of the same residue. The latter seems to be the intended behavior, so removing the check entirely would likely fix another small bug.
The same argument could be made about ambigN[4-7], but not ambigN[0-3]. For ambigN[0-3], the check in ProcessResInfo would need to be adjusted to look at the residue ID + chain instead of the "r" pointer.
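The failure mode described above can be modelled in a few lines of Python. This is a simplified sketch, not the ProcessResInfo code: the `Atom` class, the function names, the four-slot lists and the exact indexing are assumptions made for illustration, with `None` standing in for NULL and an `AttributeError` standing in for the NULL dereference.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Atom:
    name: str
    residue: str  # stands in for the atom's "r" residue pointer


Slots = List[Optional[Atom]]


def first_free(slots: Slots) -> Optional[int]:
    """Index of the first empty slot, or None if the list is full."""
    return next((i for i, s in enumerate(slots) if s is None), None)


def add_oxygen_reported(ambig_n: Slots, ambig_o: Slots, atom: Atom) -> bool:
    """Model of the reported check: compare against the nitrogen list."""
    slot = first_free(ambig_o)
    if slot is None:
        return False
    if slot > 0:
        prev = ambig_n[slot - 1]  # None if the residue has no "N" atom
        if prev.residue != atom.residue:  # fails on None, like a NULL dereference
            return False
    ambig_o[slot] = atom
    return True


def add_oxygen_proposed(ambig_o: Slots, atom: Atom) -> bool:
    """Model of the suggested change: compare against the previous oxygen."""
    slot = first_free(ambig_o)
    if slot is None:
        return False
    if slot > 0 and ambig_o[slot - 1].residue != atom.residue:
        return False
    ambig_o[slot] = atom
    return True
```

In this model, a residue with two "O" atoms and no "N" atom raises an error on the second oxygen with the reported check, and a residue other than the N-terminal one silently loses its second oxygen. The proposed check accepts both oxygens in either case.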
Hi again,
As I reported in issue #9, we again have a change in the checksum of the results, this time between probe version 2.21 (the last one we packaged) and the latest version 2.23, which I intend to package for Debian.
Is this change expected again? Would you be able to craft a test suite on your side that we can simply run to verify the expected behaviour?
Kind regards
Andreas.