Comments (3)
What do you think @torognes? The idea of using --top_hits_only and --maxaccepts 0 is to be sure to find all best matches: if 250 target sequences are 100% identical to my query sequence, I want all of them in the output.
Or did I misunderstand the process? If the target sequences are pre-sorted by k-mer profile similarity first, can I safely set --maxrejects 32 to implement the speed-up I described above?
from vsearch.
In this scenario I think it would make sense to set a reasonable --maxrejects value (e.g. 32 or even 1000). I think this is a good heuristic that will capture all the top hits in almost all cases.
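For readers who want to try this combination, here is a minimal sketch of how the command line could be assembled. The helper name build_search_command, the file names, the 0.97 identity threshold and the choice of --blast6out as output format are assumptions made for illustration; only the --top_hits_only, --maxaccepts and --maxrejects settings come from this thread.

```python
import shlex


def build_search_command(query_fasta, db_fasta, out_path,
                         min_id=0.97, maxrejects=32):
    """Assemble a vsearch global-search command as an argument list.

    Hypothetical helper: file names, the identity threshold and the
    output format are illustrative assumptions, not values from the thread.
    """
    return [
        "vsearch",
        "--usearch_global", query_fasta,
        "--db", db_fasta,
        "--id", str(min_id),
        "--top_hits_only",                # report only the best-scoring hits
        "--maxaccepts", "0",              # 0 = no limit on accepted hits
        "--maxrejects", str(maxrejects),  # stop after this many rejected targets
        "--blast6out", out_path,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_search_command("queries.fasta",
                                          "targets.fasta",
                                          "hits.tsv")))
```

Raising maxrejects (for example to 1000) trades speed for a smaller chance of stopping before all top hits have been examined.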
Yes, I agree with you.
As the initial similarity sorting is done on k-mer profiles, which is a proxy for the actual pairwise similarity, there is always a risk of missing the optimal hit. This is especially true when none of the target sequences is close to the query: the behavior of the k-mer profile pre-sorting gets less predictable for low similarities.
Nonetheless, it is probably safe to use --maxrejects 32 as you suggested.
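To make the trade-off concrete, here is a toy model of the heuristic discussed above. It is a sketch under simplifying assumptions, not vsearch's actual implementation: targets are ranked by the number of distinct k-mers shared with the query, identity is a naive position-by-position comparison rather than a real alignment, and the function names, the k value and the 0.97 threshold are all invented for illustration.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def identity(a, b):
    """Toy identity: matching positions over the longer length (no alignment)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def search(query, targets, min_id=0.97, maxaccepts=0, maxrejects=32, k=4):
    """Toy search: k-mer pre-sorting, then accept/reject with early stopping.

    targets is a list of (name, sequence) pairs. A value of 0 for
    maxaccepts or maxrejects means "no limit". Only the hits tied for
    the best identity are returned (the top-hits-only behaviour).
    """
    q = kmers(query, k)
    # Rank by shared k-mer count, a proxy for true pairwise similarity.
    ranked = sorted(targets, key=lambda t: len(q & kmers(t[1], k)),
                    reverse=True)
    hits, rejects = [], 0
    for name, seq in ranked:
        ident = identity(query, seq)
        if ident >= min_id:
            hits.append((name, ident))
            if maxaccepts and len(hits) >= maxaccepts:
                break
        else:
            rejects += 1
            if maxrejects and rejects >= maxrejects:
                break
    if not hits:
        return []
    best = max(ident for _, ident in hits)
    return [(name, ident) for name, ident in hits if ident == best]


query = "ACGTACGTACGT"
targets = [
    ("near", "ACGTACGTACGA"),  # shares as many k-mers as an exact copy
    ("far1", "TTTTTTTTTTTT"),
    ("t1", query),
    ("t2", query),
    ("far2", "GGGGGGGGGGGG"),
    ("t3", query),
]

all_top = search(query, targets, maxrejects=2)   # finds t1, t2 and t3
too_strict = search(query, targets, maxrejects=1)  # stops before any hit
```

In this toy data set the "near" sequence ties with the exact copies in the k-mer ranking, so a rejection budget of 1 is exhausted before any identical target is examined, while a budget of 2 recovers all three. This is the risk described above in miniature: the pre-sorting is only a proxy, and a larger --maxrejects value gives it more room for error.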