Comments (5)
Hello @fslee62,
thanks a lot for evaluating HH-suite3.0 and sorry for this late answer.
To your observations:
(1) You mean that the hits from later (>2) iterations have a lower score? This could be explained by profile divergence: later iterations pick up a lot of diversity, which can dilute the profile.
(2) This is possible. The scoring scheme changed.
(3) There is a length limitation of 20,000 residues within hh-suite (without adjusting the -maxres parameter). So you probably found some very long target sequence. This is a problem: sequences longer than 20,000 residues can lead to corrupted memory. In the successor databases to the Uniprot20, the Uniclust, there are no sequences longer than 14,000 residues (http://gwdu111.gwdg.de/~compbiol/uniclust/2017_04/).
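[Editor's note] The 20,000-residue limit above can be checked for in advance. Below is a minimal sketch (not part of hh-suite) that scans FASTA text for sequences exceeding the default limit; the function name and FASTA layout are illustrative assumptions.

```python
# Sketch: pre-screen FASTA text for sequences longer than the default
# hhblits -maxres limit (20,000 residues), which the thread reports can
# corrupt memory. Hypothetical helper, not an hh-suite API.

MAXRES_DEFAULT = 20000

def long_sequences(fasta_text, limit=MAXRES_DEFAULT):
    """Return (header, length) for every sequence exceeding `limit`."""
    too_long = []
    header, length = None, 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if header is not None and length > limit:
                too_long.append((header, length))
            header, length = line[1:].strip(), 0
        else:
            length += len(line.strip())
    if header is not None and length > limit:
        too_long.append((header, length))
    return too_long
```

Any sequence this flags would either need to be removed or require raising -maxres (at a memory cost) on the hhblits command line.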
To your questions:
(1) I don't think it's a usage error of the method. HH-suite3.0 and 2.0.16 simply perform differently because of the scoring scheme change.
(2) What do you mean by lower n? In our benchmarks, 3 iterations of HHblits3.0 (and 2.0.16) result in the best performance. HH-suite3.0 should create better models.
(3) I think hh-suite2.0.16 just did not print this error message, but the limitation still exists.
Cheers
Martin
from hh-suite.
hello martin,
sorry for my late reply also. thanks very much for the informative answers.
related to my observations:
(1) yes, i meant that when n (# of hhblits search iterations) went up (>1), the hhblits scores went down. this was true for both v2 and v3. i can understand that it was related to profile divergence.
(2) got it. the scoring schemes are different between v2 and v3.
(3) so if i use uniclust30 (instead of uniprot20), i can avoid the warnings and memory corruption??
related to my questions:
(2) yes, i wasn't clear there. i used the Antibody Modeling Assessment II (AMA-II) as my benchmark. i used in-house software to predict models for the 11 AMA-II cases. then i used RMSDs (predictions vs x-ray structures) as the main gauge of performance.
our in-house software uses hh-suite v2 OR v3 for this benchmark; all else (code-wise) was identical. the HHMs for the 11 targets were generated using n=3 (hhblits) for both the v2 and v3 cases. then we used hhsearch to find the templates (i.e., alignments). after that, we built the 11 models and calculated 11 RMSDs (for both the v2 and v3 cases). the average backbone RMSD for the v2 case was around 1.25A, whereas the one for the v3 case was around 1.65A; that is, 0.4A worse using v3.
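[Editor's note] The per-case backbone RMSD used above is conventionally computed after an optimal superposition. A minimal sketch using the Kabsch algorithm follows; it assumes the model and reference backbone atoms are already paired into two N x 3 arrays (extraction from PDB files is out of scope here, and this is not Fred's in-house code).

```python
import numpy as np

def backbone_rmsd(P, Q):
    """RMSD (Angstrom) between two N x 3 coordinate arrays after optimal
    superposition via the Kabsch algorithm. P is rotated onto Q."""
    P = P - P.mean(axis=0)          # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                     # covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = P @ R.T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Averaging the 11 per-case values (e.g. `sum(rmsds) / len(rmsds)`) then gives the <b-RMSD> figure discussed in the thread.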
since i homed in on #iterations (n of hhblits), i have since performed the following experiments:
(1) all else the same except n=1, n=2, or n=3 during the HHM construction stage.
(2) calculate final average backbone RMSD (cf <b-RMSD>) as described above.
here are the results: as n went up (1 -> 3),
(A) <b-RMSD> went UP for v2: performance got worse with lower n consistent with your expectation.
(B) <b-RMSD> went DOWN for v3: performance got worse with higher n different from your expectation.
i did not use n>3.
in summary, v2 produced the best final models using n=3 during HHM constructions.
v3 produced the best final models using n=1 during HHM constructions.
overall, v3 still performed less well relative to v2 (all else the same). in particular, best v2 <b-RMSD> was around 1.25A (n=3). best v3 <b-RMSD> was around 1.37A (n=1).
IS IT POSSIBLE that hh-suite is not meant for HIGH homology cases like the variable domains of Fab, in general?? since v3 is even more "sensitive" than v2, would the negative impact be even greater for such cases??
fred
Hi Fred,
the model quality can be influenced a lot by various details of how you build your models, which might make your results meaningless.
(1) Your models could use either global or local alignments from HHblits for creating the distance restraints for Modeller (or some other homology modeling software).
(2) Your model can either contain all residues of the query sequence or it might only contain those residues that are part of the local query-template alignment.
To get useful results, you need to use global alignments (e.g. using option -mact 0.05) and include all residues in your model, because adding more residues to your model can only hurt your RMSD. The greediness (length) of the alignments (e.g. of hhblits v2 versus v3) could otherwise have a large influence on the RMSD of your model. Long, greedy alignments will give higher RMSDs than short, conservative alignments in which unreliably aligned parts are left out of the query-template alignment.
Second, you have a trade-off between specificity and sensitivity. Adding more iterations only helps if you need more sensitivity, i.e. when your template is very distantly related. As a simple rule of thumb: if your template is more closely related to your query than the most distantly related sequences in your query MSA are to the query itself, then you have added too much diversity to your query MSA and you should rather reduce the number of iterations.
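[Editor's note] The rule of thumb above can be sketched as a simple comparison of sequence identities. This is an illustrative toy, assuming gapless, equal-length sequences and a naive column-identity measure; real MSAs need gap handling and proper pairwise alignment.

```python
# Sketch of the rule of thumb: if the template is closer to the query
# than the most diverged sequence already in the query MSA, the MSA has
# picked up more diversity than this target needs. Hypothetical helpers.

def identity(a, b):
    """Fraction of identical characters between two equal-length,
    gapless sequences (naive; ignores gaps and alignment)."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def msa_too_diverse(query, msa_seqs, template):
    """True if the template is more similar to the query than the most
    distantly related MSA member is, suggesting fewer iterations."""
    min_msa_id = min(identity(query, s) for s in msa_seqs)
    return identity(query, template) > min_msa_id
```

When this returns True, the advice in the thread would be to rebuild the query MSA with fewer hhblits iterations (lower -n).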
For a more detailed discussion and a method to find the optimal balance of sensitivity and specificity in building your query MSA, see our paper http://onlinelibrary.wiley.com/doi/10.1002/prot.22499/full
RMSD has a very bad reputation as a measure of protein structure model quality in the (CASP) community. Better to use TM-score, GDT-HA, or any other score used in recent CASP competitions.
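[Editor's note] For orientation, the core of the TM-score formula is small enough to sketch. The snippet below assumes the per-residue model/native distances of an aligned, already-superposed pair are given; the real TM-score program additionally searches over superpositions, so this is an illustration of the scoring term only.

```python
import math

def tm_score(dists, L_target):
    """TM-score term for per-residue distances (Angstrom) of an aligned,
    superposed model/native pair, normalized by target length. Uses the
    standard length-dependent scale d0 = 1.24*(L-15)^(1/3) - 1.8."""
    d0 = 1.24 * (L_target - 15) ** (1.0 / 3.0) - 1.8 if L_target > 15 else 0.5
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in dists) / L_target
```

Unlike RMSD, each residue's contribution is bounded, so a few badly modeled loops cannot dominate the score.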
I hope this helps.
Best wishes,
Johannes
hi johannes,
sorry for the slow reply. i appreciate very much your expert guidance. i will study the effect of local vs global alignment (i.e., the -mact values) on the overall quality of the final HM models in our benchmark.
however, i guess where i got stuck was trying to understand why using hhsuite v2 vs hhsuite v3 gave me noticeably different overall benchmark performance when ALL else was the same. i was not trying to get a better overall result using either v2 or v3. indeed, i am/was most interested in finding the conditions or parameters that would give me comparable performance between v2 and v3. i can see that v2 and v3 may have been "tuned" differently, thus needing slightly different parameter values.
finally, the backbone average RMSDs (referred to above as <b-RMSD>) were calculated according to the specs of AMA-II (i.e., using C=O instead of Ca) so that i may compare my results to those published. yes, i also calculated other gauges concurrently, such as GDT-TS, TM-score, MP-score, and our own internal "health" score. although i don't have the numbers in front of me, i take it that i would see a similar relative performance between v2 and v3 if i used, say, <TM-score> instead of <b-RMSD>.
most respectfully,
fred
POSTSCRIPT: i made a careless mistake describing my results a week ago: here is the correction:
>>>>>
here are the results: as n went up (1 -> 3),
(A) <b-RMSD> went DOWN for v2: performance got worse with lower n.
(B) <b-RMSD> went UP for v3: performance got worse with higher n.
<<<<<
all else being the same, using -mact=0.05 and n=1 (vs -mact=0.35 and n=1) did not have a huge impact on the overall model accuracy using AMA-II as a benchmark. the Fv mean backbone RMSD (average of 11 cases) went from 1.36A (mact=0.35) to 1.34A (mact=0.05). template alignments were done using hhsuite-v3.
using the same benchmark, pretty much all default hhsuite-v2 parameters (mact=0.35, n=3), and the same model building and model selection engines, the corresponding Fv mean backbone RMSD was 1.26A.
i have not yet determined a set of hhsuite-v3 parameters that will give me equal or better overall results relative to those obtained using hhsuite-v2. the one hhsuite-v3 parameter that generated the largest performance variation was n (the # of hhblits iterations used in making the query HHMs).
with that, i will conclude my question and close this ticket. thank you for all your help and suggestions.