wdecoster / chopper
License: MIT License
Hello,
Is it possible to use the contamination flag to filter the reads marked as contaminants by chopper?
Thank you
Wanted behavior: Run chopper directly on input files
I noticed that I can run chopper in a pipe like this:
cat file.fastq | chopper --minlength 1350 --maxlength 1600 --quality 10 > test.fastq
But it seems I cannot run it directly on the file like this:
chopper --minlength 1350 --maxlength 1600 --quality 10 file.fastq > test.fastq
Resulting in:
error: Found argument 'file.fastq' which wasn't expected, or isn't valid in this context
Maybe you could consider adding this functionality to make the use of this tool more straightforward?
Hello,
It would be nice if chopper could output a sequencing_summary.txt file similar to guppy's. Most QC tools require this file, so I am unable to use some of them after running chopper.
In the example you have the following:
gunzip -c reads.fastq.gz | chopper -q 10 -l 500 | gzip > filtered_reads.fastq.gz
Is this what you would recommend for the default settings? I'm adding this to my VEBA software suite (https://github.com/jolespin/veba) so I want to include some decent defaults that could be used in most cases.
Thanks!
Hi @wdecoster, I installed the binary of version 0.4.0 from the chopper-linux.zip of that release.
When I try to run it, it complains that there are some missing libraries required by chopper. Is that correct?
$ chopper
chopper: /lib64/libc.so.6: version `GLIBC_2.28' not found (required by chopper)
chopper: /lib64/libc.so.6: version `GLIBC_2.25' not found (required by chopper)
chopper: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by chopper)
chopper: /lib64/libc.so.6: version `GLIBC_2.33' not found (required by chopper)
chopper: /lib64/libc.so.6: version `GLIBC_2.32' not found (required by chopper)
chopper: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by chopper)
chopper: /lib64/libm.so.6: version `GLIBC_2.29' not found (required by chopper)
chopper: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by chopper)
Can you give me all the required libraries/dependencies to use chopper?
Kind regards,
Mohammad Ali Amir
Hi there, I'm trying to run the release binary of chopper on an aarch64 architecture.
It works perfectly fine on my amd64 machine.
Is there any way to make it work on aarch64?
Thanks for your time
Arthur Cousson
Dear Wouter De Coster:
I installed the ready-to-use chopper binary from chopper-musl.zip, as you suggested in previous issues, but I still have the same issue (see screenshot).
I am running CentOS 7.
Hello! Thank you for creating chopper for us. However, I noticed when I was trying to remove DCS reads from my fastq files that a good portion of contaminating reads still remain. This is an example of one read blasted against the DCS sequence.
(query) bad read: 3,800 bp (90% =3,420bp)
(target) DCS: 3,560 bp
Chopper left these reads, so I decided to manually run minimap2 -ax map-ont DCS.fasta read.fq
to see the PAF results. My "match_len" was 3,510bp. Please correct me if I'm misinterpreting the filter function, but I assume that because 3,510bp > 3,420bp the read should be classified as a contaminant.
Alternatively, if I run minimap2 -x map-ont DCS.fasta read.fq
my "match_len" was 3,268bp. Because it is not greater than 3,420bp the read would be retained. Could chopper be inaccurately reporting the lengths because the Aligner setup in lines: 178-184 is missing ".with_cigar()"? lh3/minimap2#158
fn setup_contamination_filter(contam_fasta: &str) -> Aligner {
    Aligner::builder()
        .with_threads(8)
        .map_ont()
        .with_index(contam_fasta, None)
        .expect("Unable to build index")
}
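The comparison assumed in this report can be written down as a small sketch. It only restates the arithmetic above (a read counts as a contaminant when its match length exceeds 90% of the read length). The function name, its signature, and the 0.9 cutoff are assumptions made for illustration, not chopper's actual implementation.

```rust
/// Returns true when the aligned length exceeds `fraction` of the read length.
/// Hypothetical helper: the name and the cutoff come from the numbers in this
/// report, not from chopper's source code.
fn exceeds_match_fraction(read_len: usize, match_len: usize, fraction: f64) -> bool {
    (match_len as f64) > fraction * (read_len as f64)
}
```

With the numbers above, `exceeds_match_fraction(3800, 3510, 0.9)` is true (3,510 > 3,420) and `exceeds_match_fraction(3800, 3268, 0.9)` is false, so the two minimap2 invocations would lead to opposite decisions for the same read.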
I remember quite a while ago there was a discussion that there were discrepancies in q score interpretation/filtering by albacore and nanofilt depending whether the reads or the sequencing summary was used.
Discussed here:
https://gigabaseorgigabyte.wordpress.com/2017/07/15/nanofilt-using-albacore-sequencing_summary-for-quality-filtering/
Is that still the case? If I use chopper to filter the fastqs based on a certain q score, will I obtain the same reads as when using guppy_basecaller with the same q score filter?
Thanks!
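One possible source of such differences is how a per-read mean quality is computed from the per-base scores. The sketch below contrasts two approaches: averaging the error probabilities and converting back to the Phred scale, versus taking the arithmetic mean of the Phred scores directly. Which method a given tool uses is not stated in this thread, so treat this as an illustration only. All function names are hypothetical, and the input is assumed to be a non-empty Phred+33 quality string.

```rust
/// Decode a FASTQ quality string (assumed Phred+33 ASCII) into Phred scores.
fn decode_phred33(qual: &str) -> Vec<u8> {
    qual.bytes().map(|b| b.saturating_sub(33)).collect()
}

/// Mean quality via error probabilities: convert each score to an error
/// probability, average those, then convert the average back to Phred.
/// Assumes `phred` is non-empty (an empty slice yields NaN).
fn mean_quality_from_error_probs(phred: &[u8]) -> f64 {
    let sum: f64 = phred
        .iter()
        .map(|&q| 10f64.powf(-(q as f64) / 10.0))
        .sum();
    -10.0 * (sum / phred.len() as f64).log10()
}

/// Naive arithmetic mean of the Phred scores themselves.
/// Assumes `phred` is non-empty (an empty slice yields NaN).
fn mean_quality_naive(phred: &[u8]) -> f64 {
    phred.iter().map(|&q| q as f64).sum::<f64>() / phred.len() as f64
}
```

For a two-base read with scores 10 and 30, the naive mean is 20, while the probability-based mean is about 13, so the same read can pass a Q15 cutoff under one method and fail it under the other.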
I would love to include this tool in a course I am running, but the students will be new to bioinformatics and would struggle to install it back home, outside the training environment, without guidance. Would it be possible to include installation instructions in the README? Many thanks
Is the -c option the same as -r for read removal in NanoLyse? I used it to remove host reads and the resulting fastq.gz file was smaller, but no verbose messages indicated that any reads had been removed, as there are in NanoLyse. Can chopper generate a log file of how many reads were removed due to low quality, length, and host contamination? How would I add that in? I am not familiar with how to do this when the output is piped immediately to gzip.
Thanks for the help and for the sequencing tools!
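Independently of what chopper itself reports, the number of removed reads can be derived by counting records in the input and in the filtered output and taking the difference. A minimal sketch, assuming the common four-lines-per-record FASTQ layout on already decompressed data; the function name is hypothetical and this is not part of chopper.

```rust
use std::io::{self, BufRead};

/// Count FASTQ records in `reader`, assuming exactly four lines per record.
/// Hypothetical helper for illustration; multi-line FASTQ is not handled.
fn count_fastq_records<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut lines = 0usize;
    for line in reader.lines() {
        line?; // propagate read errors instead of silently skipping them
        lines += 1;
    }
    Ok(lines / 4)
}
```

Calling this once on the unfiltered reads and once on the filtered reads gives the total removed. It cannot say why a read was removed, since quality, length, and contamination filtering are indistinguishable in the output.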
Hello.
I have been trying to install chopper on my MacBook Pro M1. I tried installing via Conda as suggested, and it keeps telling me it cannot find chopper. I also tried downloading the released files, and that did not work either, as my MacBook thinks they are malware. Could I have more precise installation instructions? I do not write Rust myself, so I don't know where to find the solutions.
Thank you so much.
Hi,
It would be useful to add the ability to filter reads by timestamp. I believe this function would be possible only for ONT. A similar tool has already been developed in Rust (1).
One last thing: it would be interesting to add the possibility to select the number of bases to keep (from the start of the read), much like the CROP option of Trimmomatic (2).
1- https://github.com/angelovangel/nanotimes
2- https://github.com/usadellab/Trimmomatic
Thank you for sharing your work.
Joel
Hi,
It would be nice if you compiled against an older glibc to make the binary more compatible. I could load the whole toolchain and compile within my container, or use conda, but sometimes the joy of Rust is just running a wget and using the binary.
chopper: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by chopper)
chopper: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by chopper)
chopper: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by chopper)
I get these errors on Ubuntu 20.04 for example.
Thanks,
Colin
Hello,
I noticed that minimap2 is only given 8 threads here. It would be nice if the threads flag were passed on to minimap2.
Installing via conda using command
conda install -c bioconda chopper
Error
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
Hello,
I was wondering if there are any plans to add Nanopore adapter and/or barcode detection.
Thanks
Running chopper -v reports chopper 0.2.0, but Mamba says 0.3.0 when installing. Am I missing something?