usadellab / trimmomatic

License: Other

Java 100.00%

trimmomatic's Issues

Trimmomatic parallel de/compression

From what I could see, the next trimmomatic version will support parallel decompression and compression. If so, any idea when the new 0.40 version will be released?

There is a snakemake-wrappers/multiqc issue (snakemake/snakemake-wrappers#961) that would be greatly simplified if trimmomatic would perform de/compression in parallel.

thanks,

Negative Value of "Input Read Pairs" and "Both Surviving" in log file

Hi,

We are using Trimmomatic as a benchmark tool in our pipeline for the trimming of Illumina reads.

We recently found that the "Input Read Pairs" and "Both Surviving" metrics reported in the Trimmomatic log file are negative for one of our flowcells (the others are fine).

Please check the screenshot below (line 9):

The reads are human whole-genome sequencing reads. Each read file is around 160 GB.

Based on our previous work, these two metrics should always be positive. I have several questions:
(1) Why do we get negative values for these two metrics?
(2) If this is not a bug, how should we interpret these negative values?
(3) How can we get the real numbers for "Input Read Pairs" and "Both Surviving"? Should we simply flip the sign from negative to positive?

Thanks so much for your help.
Best regards
Xin
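
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if these counters are held in 32-bit signed Java `int` values, any count above 2,147,483,647 wraps around to a negative number. The sketch below uses hypothetical class and method names (it is not Trimmomatic code) to show the wrap-around, and why a count that overflowed exactly once would be recovered by adding 2^32 rather than by flipping the sign.

```java
public class CounterOverflowSketch {
    // Simulates storing a true count in a 32-bit signed counter.
    // Assumption: the reported metric is kept in an int.
    static int asIntCounter(long trueCount) {
        return (int) trueCount; // narrowing keeps only the low 32 bits
    }

    // Recovers the true count, valid only if the counter wrapped exactly once.
    static long recoverIfWrappedOnce(int reported) {
        return reported < 0 ? reported + (1L << 32) : reported;
    }

    public static void main(String[] args) {
        long trueCount = 3_000_000_000L; // hypothetical number of read pairs
        int reported = asIntCounter(trueCount);
        System.out.println("reported:  " + reported);
        System.out.println("recovered: " + recoverIfWrappedOnce(reported));
    }
}
```

If this assumption holds, the percentages in the same log line would also be unreliable, so counting the records in the output files independently is the safer check.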

Input reads is obviously less than my data

Trimmomatic java -jar trimmomatic-0.39.jar PE -threads 16 -phred33 /Users/lubo/RBH/RBH_9_1.filtlowGC.R1.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R2.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R1_paired.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R1_unpaired.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R2_paired.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R2_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:36

TrimmomaticPE: Started with arguments:
-threads 16 -phred33 /Users/lubo/RBH/RBH_9_1.filtlowGC.R1.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R2.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R1_paired.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R1_unpaired.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R2_paired.fq.gz /Users/lubo/RBH/RBH_9_1.filtlowGC.R2_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:5 TRAILING:5 SLIDINGWINDOW:4:15 MINLEN:36
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Exception in thread "Thread-0" java.lang.RuntimeException: Sequence and quality length don't match: 'ACCATCATAGCAGCAGATCGCACACTGATGACC1101:211TGATGTACTCATAGAAT' vs 'FFFFFFFFFFFFFF'
at org.usadellab.trimmomatic.fastq.FastqRecord.<init>(FastqRecord.java:25)
at org.usadellab.trimmomatic.fastq.FastqParser.parseOne(FastqParser.java:89)
at org.usadellab.trimmomatic.fastq.FastqParser.next(FastqParser.java:179)
at org.usadellab.trimmomatic.threading.ParserWorker.run(ParserWorker.java:42)
at java.base/java.lang.Thread.run(Thread.java:832)
Exception in thread "Thread-1" java.lang.RuntimeException: Sequence and quality length don't match: 'CCTCAGGCTTTGGCGGCTCAGGCTCCTCCTTCTCCTCTTCCTTCTTCTCCTCCGGCGGAGGCGGTATCGGCGACAAGAGCTCCACCTTGCGGCCGGTCTTCTTCTGGACGCGCTCCACCACCTA' vs 'FCGGGAGGCGCTTCTCGGCCTTGGGCG2FFFFFFFFFF:10936:1485T,FFF,FFFFFFFFFFFFFFFFFFFFFFCTTCTGATTTCAAATTTTGCATTGGTCG:AGTCATGGAC9CACATAAGCAGTGGCAC'
at org.usadellab.trimmomatic.fastq.FastqRecord.<init>(FastqRecord.java:25)
at org.usadellab.trimmomatic.fastq.FastqParser.parseOne(FastqParser.java:89)
at org.usadellab.trimmomatic.fastq.FastqParser.next(FastqParser.java:179)
at org.usadellab.trimmomatic.threading.ParserWorker.run(ParserWorker.java:42)
at java.base/java.lang.Thread.run(Thread.java:832)
Input Read Pairs: 5000 Both Surviving: 4996 (99.92%) Forward Only Surviving: 4 (0.08%) Reverse Only Surviving: 0 (0.00%) Dropped: 0 (0.00%)
TrimmomaticPE: Completed successfully

My RNA-seq data is about 1.3 Gb, yet only 5000 input read pairs are reported. Why did this happen? I ran it in the Terminal on my Mac; could it simply be caused by the Mac having too little RAM?
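
The two exceptions in the log say that a record's sequence and quality strings differ in length, which usually points at a damaged or truncated FASTQ rather than at memory. As a hedged sketch (hypothetical class name, strict four-line records assumed), the input could be scanned for the first malformed record before trimming:

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Iterator;

public class FastqCheckSketch {
    // Returns the 1-based index of the first malformed record, or -1 if every
    // record looks well-formed. Assumes strict four-line FASTQ records.
    static long firstBadRecord(BufferedReader in) {
        Iterator<String> lines = in.lines().iterator();
        long record = 0;
        while (lines.hasNext()) {
            String header = lines.next();
            if (header.isEmpty()) {
                continue; // tolerate blank lines between records
            }
            record++;
            String seq = lines.hasNext() ? lines.next() : null;
            String plus = lines.hasNext() ? lines.next() : null;
            String qual = lines.hasNext() ? lines.next() : null;
            boolean ok = header.startsWith("@")
                    && seq != null && plus != null && qual != null
                    && plus.startsWith("+")
                    && seq.length() == qual.length();
            if (!ok) {
                return record;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Second record has 6 bases but only 3 quality characters.
        String demo = "@r1\nACGT\n+\nFFFF\n@r2\nACGTAC\n+\nFFF\n";
        System.out.println(firstBadRecord(new BufferedReader(new StringReader(demo))));
    }
}
```

For a gzipped file, the reader can be built from `new InputStreamReader(new GZIPInputStream(new FileInputStream(path)))` instead of the `StringReader` used in the demo.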

Multiple Threads

Hi guys,

Hope you are well.

I was trying to run it with multiple threads, but I noticed that it doesn't matter whether I use 1 or more: the program always runs at the same speed.
Can you think of any possible reason?

Here is the command I run:
sudo java -jar /opt/Trimmomatic-0.39/trimmomatic-0.39.jar SE -threads 12 -trimlog trimlog_fileX.txt fileX filtered_fileX ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:15

Am I doing something wrong? Is it on the Java VM side?

Thanks in advance for any insight.

-Jesse

ILLUMINACLIP: does trimming occur from the 5' or the 3' end?

Could you please let me know from which end (5' or 3') the trimming happens, so that I can set up my adapter sequence file correctly?

I am asking about palindrome mode.

Thanks
Marwa

phred33/64 quality score autodetection

Has the autodetection for phred quality score been implemented already?

trimmomatic V0.39, keepBothReads: flag or boolean?

Hi,

following your example commands in the Quick start guide at https://github.com/usadellab/Trimmomatic and http://www.usadellab.org/cms/?page=trimmomatic it seems that keepBothReads is a flag that is passed to trimmomatic V0.39 and not a boolean value as in the manual for earlier versions, http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf.

Running trimmomatic V0.39 with ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:keepBothReads does however not produce the same results as ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:TRUE. With TRUE there are more surviving read pairs while setting keepBothReads (or a random word) retains fewer.

To me it looks like TRUE is the correct way of setting this parameter, but in your posted examples it looks like a flag. Please advise.
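
The behaviour reported here is what one would see if the last ILLUMINACLIP field were parsed as a boolean string rather than treated as a flag. The sketch below assumes, without having verified it against the 0.39 source, parsing along the lines of Java's `Boolean.parseBoolean`, which returns true only for "true" in any letter case. The class and method names are hypothetical.

```java
public class KeepBothReadsSketch {
    // Hypothetical helper: interprets the optional sixth field of the text
    // after "ILLUMINACLIP:" the way Boolean.parseBoolean would.
    // This is an assumption about Trimmomatic, not its actual code.
    static boolean keepBothReads(String illuminaClipArgs) {
        String[] fields = illuminaClipArgs.split(":");
        return fields.length > 5 && Boolean.parseBoolean(fields[5]);
    }

    public static void main(String[] args) {
        String base = "TruSeq3-PE.fa:2:30:10:2:";
        for (String value : new String[] {"TRUE", "True", "true", "keepBothReads"}) {
            System.out.println(value + " -> " + keepBothReads(base + value));
        }
    }
}
```

Under that assumption, `TRUE`, `True` and `true` are equivalent, while `keepBothReads` or any other word silently evaluates to false.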

encoding error on WSL

I am using WSL and am running into an error like this:

(base) ez@NYM:~/Trimmomatic$ ./Trimmomatic-0.39/trimmomatic-0.39.jar
./Trimmomatic-0.39/trimmomatic-0.39.jar: line 1: $'PK\003\004': command not found
./Trimmomatic-0.39/trimmomatic-0.39.jar: line 2: $'\bN[\210N': command not found
./Trimmomatic-0.39/trimmomatic-0.39.jar: line 3M[�Ng�6��META-INF/MANIFEST.MFM��: No such file or directory
./Trimmomatic-0.39/trimmomatic-0.39.jar: line 16: syntax error near unexpected token `)'
./Trimmomatic-0.39/trimmomatic-0.39.jar: line 16:'BY<��Y��AUTHORS.jbzip2�M,)�H-W��Sp+J�K�,��PK��'

No survival & Incorrect input reads

Tried to perform a trim on DRR099963. (renamed to fimi)

Input Code:

trimmomatic PE -phred33 fimi_R1.fastq fimi_R2.fastq
fimi_R1_paired.fastq.gz fimi_R1_unpaired.fastq.gz
fimi_R2_paired.fastq.gz fimi_R2_unpaired.fastq.gz
LEADING:10 TRAILING:10 SLIDINGWINDOW:5:20 MINLEN:100

Output:

TrimmomaticPE: Started with arguments:
-phred33 fimi_R1.fastq fimi_R2.fastq fimi_R1_paired.fastq.gz fimi_R1_unpaired.fastq.gz fimi_R2_paired.fastq.gz fimi_R2_unpaired.fastq.gz LEADING:10 TRAILING:10 SLIDINGWINDOW:5:20 MINLEN:100
Multiple cores found: Using 4 threads
Input Read Pairs: 195 Both Surviving: 0 (0.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 195 (100.00%) Dropped: 0 (0.00%)

I am unsure why the number of input read pairs is incorrect. A different person ran the same command and was able to get a complete trim.

Is there anything that I could check? Or is the code incorrect?

Inconsistency between design and implementation caused by misconception about ListIterator's previous() method

Inconsistency between design and implementation is found in the code for the calculateMaximumRange() method in the IlluminaClippingSeq abstract class.

Trimmomatic/src/org/usadellab/trimmomatic/trim/IlluminaClippingTrimmer.java

Lines 589 to 608 in 156c1a8

if (val < 0 && mergeIter.hasPrevious() && mergeIter.hasNext())
    {
    float prev = mergeIter.previous();
    mergeIter.next();
    float next = mergeIter.next();
    if ((prev > -val) && (next > -val))
        {
        mergeIter.remove();
        mergeIter.previous();
        mergeIter.remove();
        mergeIter.previous();
        mergeIter.set(prev + val + next);
        scanAgain = true;
        }
    else
        mergeIter.previous();
    }
}

prev gets the same value as val in line 591, because ListIterator's previous() method returns the element that the preceding next() call just returned and then moves the cursor backwards.

Line 589 requires val < 0. If prev equals val, the condition prev > -val in line 595 becomes val > -val, which holds only when val > 0. Since val < 0 and val > 0 cannot both hold, the if block in lines 596-604 will never be executed, and hence merges will never be changed. I think this is not what the author intended. Is this really a bug?

Zhen
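
The `ListIterator` behaviour described in this report can be reproduced in isolation. The sketch below assumes that `val` was obtained from a preceding `mergeIter.next()` call, which is not shown in the excerpt; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

public class ListIteratorPreviousSketch {
    // Advances until next() has returned the element at index i, then calls
    // previous() once: the same element comes back, not its predecessor.
    static float previousRightAfterNext(List<Float> values, int i) {
        ListIterator<Float> it = values.listIterator();
        for (int k = 0; k <= i; k++) {
            it.next();
        }
        return it.previous();
    }

    // Stepping back twice is what reaches the true predecessor.
    static float truePredecessor(List<Float> values, int i) {
        ListIterator<Float> it = values.listIterator();
        for (int k = 0; k <= i; k++) {
            it.next();
        }
        it.previous();        // returns values.get(i) again
        return it.previous(); // returns values.get(i - 1)
    }

    public static void main(String[] args) {
        List<Float> merges = Arrays.asList(5f, -2f, 7f);
        System.out.println(previousRightAfterNext(merges, 1)); // prints -2.0
        System.out.println(truePredecessor(merges, 1));        // prints 5.0
    }
}
```

This only demonstrates the iterator semantics; whether and how the merge loop should be changed is a question for the maintainers.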

Quality score autodetection fails for Element Biosciences AVITI data

Trimmomatic fails for sequencing data generated by the Element Biosciences AVITI instrument if Phred33/64 encoding is not provided (quality score is autodetected for Illumina data).

"Error: Unable to detect quality encoding"

Adding the "-phred33" argument solves the issue.

Run time error exception occur

java -jar /usr/share/java/trimmomatic-0.39.jar SE /home/centyle/Documents/metagenome/20607400637-Water-Sample-Molsys-Pvt-Ltd-NCGM-958_S26_L001_R1_001.fastq.gz /home/centyle/Documents/metagenome/20607400637-Water-Sample-Molsys-Pvt-Ltd-NCGM-958_S26_L001_R1_001.fastq_output.gz ILLUMINACLIP:/usr/share/trimmomatic/TruSeq3-SE.fa ILUMINACLIP:LEADING:3 TRAILING:1 SLIDINGWINDOW:1:5 MINLEN:10
TrimmomaticSE: Started with arguments:
/home/centyle/Documents/metagenome/20607400637-Water-Sample-Molsys-Pvt-Ltd-NCGM-958_S26_L001_R1_001.fastq.gz /home/centyle/Documents/metagenome/20607400637-Water-Sample-Molsys-Pvt-Ltd-NCGM-958_S26_L001_R1_001.fastq_output.gz ILLUMINACLIP:/usr/share/trimmomatic/TruSeq3-SE.fa ILUMINACLIP:LEADING:3 TRAILING:1 SLIDINGWINDOW:1:5 MINLEN:10
Automatically using 1 threads
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:67)
at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:32)
at org.usadellab.trimmomatic.Trimmomatic.createTrimmers(Trimmomatic.java:59)
at org.usadellab.trimmomatic.TrimmomaticSE.run(TrimmomaticSE.java:318)
at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:85)
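
The trace ends in the ILLUMINACLIP argument handling, and the command passes only a FASTA path to that step, with no thresholds. Assuming the step's argument string is split on `:` and the second field is read unconditionally (an assumption, not checked against the source), a single field would give exactly "Index 1 out of bounds for length 1". A hypothetical pre-flight check, based on the four fields the manual documents, might look like this:

```java
public class IlluminaClipArgsSketch {
    // Returns null if the text after "ILLUMINACLIP:" has at least the four
    // documented fields, otherwise a short description of the problem.
    // Hypothetical helper, not part of Trimmomatic.
    static String check(String args) {
        String[] fields = args.split(":");
        if (fields.length < 4) {
            return "expected <fasta>:<seedMismatches>:<palindromeClipThreshold>:"
                    + "<simpleClipThreshold>, got " + fields.length + " field(s)";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("/usr/share/trimmomatic/TruSeq3-SE.fa"));
        System.out.println(check("/usr/share/trimmomatic/TruSeq3-SE.fa:2:30:10"));
    }
}
```

The command above also contains a second, misspelled step (`ILUMINACLIP:LEADING:3`), which would need correcting separately.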

Support for auto-detecting the adapter sequence

@TonyBolger I am creating a pipeline with Trimmomatic as our tool of choice for adapter trimming. One issue in practice is that it's hard for analysts to figure out the correct adapter sequence to use. For that reason, many turn to Trim Galore instead, where the adapter can be auto-detected if not specified.

Is there a way I can still use Trimmomatic without having to specify the adapter file, either because Trimmomatic can detect the adapters itself or, failing that, through some auxiliary code you could point me to for generating the adapter file to pass to Trimmomatic? Thank you!

keepBothReads - should it be "true" or "True"

Hi,

For the keepBothReads parameter, should True have a capital letter or not? I have seen different options in the manual/GitHub page.

Best wishes,
Lucy

Suggestion of updates to the README.md

Hej!

I have been struggling to get Trimmomatic to process a custom set of adapters. The reason was due to the naming of the fasta sequences in my adapter file. The section at the end of the README.md could gain with some clarifications on what is required for the naming. My adapter file looked like:

>NAME 3p
some sequence
>NAME 5p
some sequence

Trimmomatic was consistently using only a single adapter sequence (the last one) for trimming. It took me a while to understand that it must index the sequences by their name and that it must disregard anything following the white space character. It did not throw any warning either that an adapter sequence was being set multiple times / overwritten. Removing the space solved the issue. Having that information in the README will hopefully save others some time :-)

Thanks in advance!
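
Until the README covers this, an adapter file can be checked for such collisions before a run. The sketch below takes the behaviour described above as its assumption (names are cut at the first whitespace and later entries overwrite earlier ones) and reports names that would collide. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AdapterNameCheckSketch {
    // Returns the names that occur more than once when everything after the
    // first whitespace in each '>' header line is ignored.
    static List<String> collidingNames(List<String> fastaLines) {
        Set<String> seen = new HashSet<>();
        List<String> collisions = new ArrayList<>();
        for (String line : fastaLines) {
            if (!line.startsWith(">")) {
                continue; // sequence line
            }
            String name = line.substring(1).trim().split("\\s+")[0];
            if (!seen.add(name) && !collisions.contains(name)) {
                collisions.add(name);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(">NAME 3p", "ACGT", ">NAME 5p", "TGCA");
        System.out.println(collidingNames(lines)); // prints [NAME]
    }
}
```

Renaming the entries to something like `NAME_3p` and `NAME_5p` makes the list come back empty.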

paired read assignation

Hi everyone,

I realized that after running Trimmomatic, I have reads in my R1_p (forward paired) file that aren't in my R2_p (reverse paired) file. The number of reads in both files is the same, so what could be happening?

In this case, the first read is found in R1_p but not in the R2_p file. Then the next ids are the same...

head of R1_p file:
@M03132:284:000000000-KM9YY:1:1101:14323:2916 1:N:0:TCGACTTG+AGCCTTAA
GGTACTGGTTGGACAGTGTATCCTCCTTTATCAGGTATTCAATCACATTCGGGAGGTTCTGTAGATTTAGCTATTTTCAGTTTACATTTAGCAGGAACTTCTTCTTTATTGGGTTCT
ATAAATTTTATTACAACTATTTTTAATATGAGAGTTCCTGGAATGGAAATGCATAAAATACCCTTATTTGTATGAGCGATTTTAATAACTGCGTTTTTATTGTTATTATCACTACCT
GTTTTAGCTG
+
AAABBFFFFFBBGGGGGGDGGGHFHHHHHHHHHHHGGHGHHHHHGHFHHHHHGGGGGHHFGGHHGHHHHGHHHHHHHHHBGFFHHFHHHHHHHHFECGHFHFFHBHHHHHHHEGGHG
HHHHFHHHHHHHHHHHHHHHHHGHHHHHGHHFFHHHHHHHHFGHBG2FGHBDDGGGHHGFHFHHHBHHHBGDHEB>G?CHHGHHEHHHBGFFDGGGGEEHHHHHB0G=GBDDG0CBD
DFGHHB/:C;
@M03132:284:000000000-KM9YY:1:1101:18065:2943 1:N:0:ACGACTTG+AGCCGTAA
GGTACTAGTTGGACAGTGTATCCCCCTTTATCAGGAAGTTATTCTCATTCGGGTCCTGCAGTAGATATTGCTATTTTTTCTTTACATTTAGCTGGTATATCTTCAATAGCTGGAGCA
ATAAATTTTATAGTTACAATATTTAATATGCGTTGTAGAGGTTTATCTTTAGATATACTTCCTTTATTTTGTTGAGCTGTTTTATATACAGCATTTTTATTATTATTATCTTTACCT
GTTTTAGCTGGTGCTAT
+
AAAAAFFFFFFFGGGGGGGGGGHHGGGHHHHHHHHHHHHHHHHHHHHHHHHHGHGHHIHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHHHGHHHHHGHHHHHHHHHHHHHHH
HHHHHHHHHHHHHHHGHHGHHHHHHHHHHHHGEEEFEHHHHBGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHGHHHFFHH
HHHHHHBFHFFGBGHFH

and R2_p:

@M03132:284:000000000-KM9YY:1:1101:18065:2943 2:N:0:ACGACTTG+AGCCGTAA
TAGACTTCGGGATGGCCAAAGAATCAAAATAAATGTTGATACAATATAGGGTCTCCACCACCTTCAGGTTCAAAAAATGTTGTACTAACATTTCTATCAGTTAATAACATAGTAATA
GCACCAGCTAAAAC
+
AAAB5DFBB@DEGCFGGGGCFFGHHGHCFGCFHGGHB5FFGGBGGGBD32FFFGGFCGCCAGFGEFFBFEE5FB1EHBFGBF5DF553@GHHFH4GGB4BBB4FDB@3B33FB4F4
4@4B?10?GB433B
@M03132:284:000000000-KM9YY:1:1101:18951:2977 2:N:0:ACGACTTG+AGCCGTAA
AGACCTCGGGGTGGCCAAAAAATCAGAATAAGTGTTGGTATAATACAGGGTCACCTCCACCTGCTGGGTCAAAG
+
A>AACFFBBBBGGGFGGGGC22FFH3FBHFBBGFHG22FHFFBHGFD531FFGGEHFGBFCHE1G331FG1355

Many thanks in advance.

if the adapter.fa is needed and when to set keepbothreads TRUE

Hi.

I used Trimmomatic to trim my bulk RNA sequencing data (paired-end). The adapters had already been removed from some of my data by the sequencing company.
I have two doubts about the trimming: 1) whether the adapter.fa file is needed, and 2) when I should set keepBothReads to TRUE.

Q1:
For those Sample_RNA_seq.fq.gz files from which the company has already removed the adapters, should I use the “ILLUMINACLIP” parameter like:
Trimmomatic PE -phred33 Sample_RNA_seq_1.fq.gz Sample_RNA_seq_2.fq.gz Sample_RNA_seq _trimmo_paired_1.fq.gz Sample_RNA_seq _trimmo_unpaired_1.fq.gz Sample_RNA_seq _trimmo_paired_2.fq.gz Sample_RNA_seq _trimmo_unpaired_2.fq.gz ILLUMINACLIP: TruSeq2-PE.fa:2:30:10 SLIDINGWINDOW:4:15 LEADING:3 TRAILING:3 MINLEN:50

OR should I drop the parameter like:
Trimmomatic PE -phred33 Sample_RNA_seq_1.fq.gz Sample_RNA_seq_2.fq.gz Sample_RNA_seq _trimmo_paired_1.fq.gz Sample_RNA_seq _trimmo_unpaired_1.fq.gz Sample_RNA_seq _trimmo_paired_2.fq.gz Sample_RNA_seq _trimmo_unpaired_2.fq.gz SLIDINGWINDOW:4:15 LEADING:3 TRAILING:3 MINLEN:50

Q2:
I don't understand when to set keepBothReads to TRUE. With the default, False, my previous understanding was that if the two reads (the forward read in Sample_RNA_seq_1.fq.gz and the reverse read in Sample_RNA_seq_2.fq.gz) are reverse complements of each other, the corresponding reverse read would be dropped from Sample_RNA_seq _trimmo_paired_2.fq.gz, leaving some reads in the paired output without a mate. However, when I trimmed the data with keepBothReads set to False, the total number of reads was the same in Sample_RNA_seq _trimmo_paired_1.fq.gz and Sample_RNA_seq _trimmo_paired_2.fq.gz, so I think I have misunderstood.

Thanks for your time and concern!

STDOUT appears to go to STDERR destination

We are using Trimmomatic 0.39 on an HPC cluster. This is the latest version available via EasyBuild (its module name is Trimmomatic/0.39-Java-11). We have noticed that the output that we would expect to go to STDOUT in fact goes to STDERR. Any output we would expect to go to STDERR also goes there (correctly).

For example, running against a dummy 100-line fastq file using the following command:
java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.39.jar SE -threads 6 W3.0_rep1.fastq.short W3.0_rep1_trimmed.fastq.short ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 MINLEN:50 1>stdout 2>stderr

...produces nothing to STDOUT, but the following to STDERR:

TrimmomaticSE: Started with arguments:
 -threads 6 W3.0_rep1.fastq.short W3.0_rep1_trimmed.fastq.short ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 MINLEN:50
Using Long Clipping Sequence: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA'
Using Long Clipping Sequence: 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
ILLUMINACLIP: Using 0 prefix pairs, 2 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Using Long Clipping Sequence: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA'
Using Long Clipping Sequence: 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
ILLUMINACLIP: Using 0 prefix pairs, 2 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Input Reads: 25 Surviving: 25 (100.00%) Dropped: 0 (0.00%)
TrimmomaticSE: Completed successfully

Are we calling this correctly, or is there some muddle in the code regarding output destinations?

Feature Request - report adapter names detected/removed

I use trimmo for adapter clipping and wanted to ask if you'd consider adding a more verbose reporting on what's removed. If it's not done due to the extra compute time, could it be an option to turn on? I don't know of another read filtering package that reports the adapter names removed, but I think it would be valuable to log for routine work as a confirmation the

1. correct adapters were removed and
2. off-target adapters were not removed.

Current Output:

Input Read Pairs: 1773329 Both Surviving: 1284520 (72.44%) Forward Only Surviving: 228498 (12.89%) Reverse Only Surviving: 73383 (4.14%) Dropped: 186928 (10.54%)
TrimmomaticPE: Completed successfully

Example of more verbose output:

Input Read Pairs: 1773329 Both Surviving: 1284520 (72.44%) Forward Only Surviving: 228498 (12.89%) Reverse Only Surviving: 73383 (4.14%) Dropped: 186928 (10.54%)
Adapters Removed: 184542
  TruSeq_Adapter_Index_25: 184532
  I7_Primer_Nextera_XT_Index_Kit_v2_N720: 10
TrimmomaticPE: Completed successfully

how to compile the source code?

How do I compile the source code? It would be nice if you provided the jar file.

Universal adapter

Hello.
How do I specify that my sequences include universal Illumina adapters?

question about ILLUMINACLIP 2:30:10

Hello,

I have difficulty understanding the meaning of "2:30:10" in the adapter removal command.
The manual of Trimmomatic describes:

ILLUMINACLIP:<fastaWithAdaptersEtc>:<seed mismatches>:<palindrome clip threshold>:<simple clip threshold> 

seedMismatches: specifies the maximum mismatch count which will still allow a full match to be performed

palindromeClipThreshold: specifies how accurate the match between the two 'adapter ligated' reads must be for PE palindrome read alignment.

simpleClipThreshold: specifies how accurate the match between any adapter etc. sequence must be against a read

What exactly is the "ClipThreshold" and how are the values (30 and 10) calculated?

Many thanks in advance,

silviac

still have N after trimming

I am confused about why there are still "N" bases with quality score "#" in the reads. I tried different quality cutoffs (3, 10 and 20) for LEADING and TRAILING and the result is still the same. Is there something wrong with the other options I used?

java -jar trimmomatic.jar -version
0.39

time java -jar trimmomatic.jar PE -phred33 -threads 20
../SRR503008_1.fastq.gz ../SRR503008_2.fastq.gz -baseout test2.fastq.gz
ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:2:True LEADING:10 TRAILING:10 MINLEN:36

zcat test2_1P.fastq.gz | head -n 1000 | grep "N" -B1 -A3
@SRR503008.365 HWI-1KL117:157:D0LYPACXX:8:1101:4180:2142 length=75
TGGTAAATGAATAGAGGTATATGATGCGGTTAAGTATAATTAAGANGAANNTTGTTTGT
+SRR503008.365 HWI-1KL117:157:D0LYPACXX:8:1101:4180:2142 length=75
CCCDFFFFHHHHHIIJJEHIJJJJJJJJJHIIIJGIJIJJJJJIJ#0?F##7?CHGGI

libzip incompatibility issue

Hi! Noticed this and didn't find other answers (only questions) on the web similar to it, and none connected to Trimmomatic. Currently getting this error with Trimmomatic 0.39 on Ubuntu 18.04.5 LTS, two different computers.

➜ trimmomatic --help  # or any other command
Error occurred during initialization of VM
Corrupted ZIP library: /opt/install/conda/lib/libzip.so

On these computers, here is the libzip:

➜ ls -l /opt/install/conda/lib/libzip.so
lrwxrwxrwx 1 user user 13 Jul  2 19:04 /opt/install/conda/lib/libzip.so -> libzip.so.5.4
➜ ls -l /opt/install/conda/lib/libzip.so.5.4
-rwxrwxr-x 2 user user 141384 Jun 18 21:15 /opt/install/conda/lib/libzip.so.5.4

And trimmomatic still works for me with this version of libzip:

-rw-rw-r-- 2 user user 43312 May 10  2019 /opt/install/conda/lib/libzip.so

Running conda update libzip fixed this, but it upgraded, downgraded, removed, added, and superseded over 100 packages to do so.

2nd paired fastq is empty

Hi! I ran Trimmomatic on paired end reads which, as I understand, is supposed to yield 1_paired, 1_unpaired, 2_paired, and 2_unpaired fastq files. However, after the process was completed, my 2_paired file seemed empty, as the font of the other files were colored red and when I ran a FASTQC on these files, I was only able to get reports for the other three files, but not the 2_paired file. Does this happen sometimes? Should I just proceed with mapping/alignment with just the 1_paired fastq file? Thank you!

trimmomatic does not find adapters folder in conda env

I installed trimmomatic 0.39 in a conda env and used the following command to copy adapters, but trimmomatic does not find adapters

cp /anaconda2/envs/env_name/share/trimmomatic-0.39/adapters/TruSeq3-PE-2.fa .
cp: no such file or directory

How can I solve this?

Making a custom adapter file - confused about the sequences to include

Hi,

My issue is similar to that in #14, but I am still a bit confused about this.

According to NEB, the sequences that I need to trim off are:
Adaptor Read1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
Adaptor Read2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

https://international.neb.com/faqs/2021/01/15/what-sequences-need-to-be-trimmed-for-nebnext-libraries-that-are-sequenced-on-an-illumina-instrument

However, based on the discussion in #14, it looks like these would not correspond to PrefixPE/1 and PrefixPE/2 as I thought. From my understanding, the sequences provided by NEB are those that are likely to contaminate Read 1 and Read 2, respectively, due to read through. How and why do these sequences need to be modified for use with Trimmomatic? Please could you supply the correct sequences to use.

Also, I see that in the TruSeq3-PE-2.fa file, you supply some additional sequences e.g. PE1 and PE1_rc - why are these added?

>PrefixPE/1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PrefixPE/2
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
>PE1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PE1_rc
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
>PE2
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
>PE2_rc
AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC

Many thanks,
Lucy

Error: Exception in thread "main" java.io.IOException: Stale file handle

Dear all,

I'm trying to run Trimmomatic, but for a few samples I keep finding this in the error file. Do you have any suggestion as to what might be wrong?

Exception in thread "main" java.io.IOException: Stale file handle
at java.base/java.io.FileOutputStream.writeBytes(Native Method)
at java.base/java.io.FileOutputStream.write(FileOutputStream.java:355)
at java.base/java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:254)
at java.base/java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:212)
at java.base/java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:146)
at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:211)
at java.base/java.io.BufferedWriter.flushBuffer(BufferedWriter.java:120)
at java.base/java.io.BufferedWriter.write(BufferedWriter.java:233)
at java.base/java.io.Writer.write(Writer.java:162)
at org.usadellab.trimmomatic.fastq.FastqSerializer.writeRecord(FastqSerializer.java:63)
at org.usadellab.trimmomatic.TrimmomaticPE.processSingleThreaded(TrimmomaticPE.java:89)
at org.usadellab.trimmomatic.TrimmomaticPE.process(TrimmomaticPE.java:316)
at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:555)
at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)

Kind regards,
Felipe

trimmomatic does not analyse all fastqs

I automated my Trimmomatic runs and it worked like a charm for 16S metagenomes.

However, in my last run, with transposase-targeted metagenomes, I hit a problem that I could not solve.

I had 24 paired-end samples (24 x 2 fastqs) and Trimmomatic processed only 13 of them. I ran MultiQC on the Trimmomatic results and it detected 13 logs.

My command to run trimmomatic was:

$cat mob_fqlist.txt | parallel -j 4 "trimmomatic PE -threads 5 mob-rawreads/{}_L001_R1_001.fastq mob-rawreads/{}_L001_R2_001.fastq -baseout mob-trimmomatic/{}.fq.gz ILLUMINACLIP:NexteraPE-PE.fa:2:30:10:2:KeepBothReads LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36 2> mob-trimmomatic/{}trimming.log"

This is the same basic command I had previously used for 16S a couple of times.
In the results folder there are 24 logs, but only 13 sequences were processed as 1P, 1U, 2P, 2U
trimmomatic_results.txt.

Before running trimmomatic I ran fastQC/multiQC and all 48 fastqs were analyzed, in which Nextera transposase adapters were recognized.

Any suggestions are welcome!

Trimmomatic for MGIseq

Dear developers, could you please clarify whether this program is suitable for MGI data (2*150)? As far as I understand, the ILLUMINACLIP parameter does not make sense for it.
java -jar trimmomatic-0.36.jar PE 1.fq.gz 2.fq.gz pairedPE_1.fq.gz singlePE_1.fq pairedPE_2.fq.gz singlePE_2.fq SLIDINGWINDOW:4:20 LEADING:3 TRAILING:3 HEADCROP:10 MINLEN:50 -threads 8

The HEADCROP:10 option is there to remove adapters.

installation 0.40

On GitHub it only says that the binary file should be installed somewhere convenient...
So, where is the binary file?
And what does "convenient" mean here?

kind regards,

I really want to use this. TY

Test(s) or Example(s)

Would it be possible to include a set of test(s) or example(s) along with the sample datasets to run either?

I am looking for a simple test/example that includes the expected output so that I can test the deployment of Trimmomatic on Bridges2.

Thanks.

adaptor choice

I contacted Illumina and asked the following:
"Recently I got my data back from the MiSeq sequencer (flowcell: v3 600c 2x300). Nextera XT v2, sets A, B and D, was used as the DNA library preparation kit for 16S rRNA.
I am trying to trim adaptors, barcodes, indices, etc. so that only the insert/fragment remains for downstream analysis.
Could you please let me know which one should be used for the adaptors?"

Their response was as follows:
"The Adapter sequence to use for Nextera XT adapter trimming is (CTGTCTCTTATACACATCT)
You can find that information in this below link as well:
https://support.illumina.com/bulletins/2016/12/what-sequences-do-i-use-for-adapter-trimming.html
This sequence can be used for both Read 1 and Read 2 if you are using paired-end sequencing run"

My question is: what exactly is the difference between what Trimmomatic recommends and what I got from Illumina?
Am I missing something?

So, something like the following:

>PrefixPE/1
CTGTCTCTTATACACATCT
>PrefixPE/2
CTGTCTCTTATACACATCT

So is this not OK to use?

>PrefixNX/1
AGATGTGTATAAGAGACAG
>PrefixNX/2
AGATGTGTATAAGAGACAG
>Trans1
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
>Trans1_rc
CTGTCTCTTATACACATCTGACGCTGCCGACGA
>Trans2
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG
>Trans2_rc
CTGTCTCTTATACACATCTCCGAGCCCACGAGAC

Thanks in advance
Marwa
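
One relationship between the two listings can be checked mechanically: the sequence quoted from Illumina, CTGTCTCTTATACACATCT, is the reverse complement of the PrefixNX sequences, and each `_rc` entry is the reverse complement of its partner. The sketch below (hypothetical class name) computes a reverse complement so the entries can be compared. It does not by itself settle which orientation a given ILLUMINACLIP mode expects.

```java
public class ReverseComplementSketch {
    // Reverse complement of a DNA string; any character other than
    // A, C, G or T is emitted as N.
    static String reverseComplement(String seq) {
        StringBuilder out = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': out.append('T'); break;
                case 'C': out.append('G'); break;
                case 'G': out.append('C'); break;
                case 'T': out.append('A'); break;
                default:  out.append('N'); break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // PrefixNX sequence from the listing above.
        System.out.println(reverseComplement("AGATGTGTATAAGAGACAG")); // prints CTGTCTCTTATACACATCT
    }
}
```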

Overrepresented sequences remain after adapter trimming

Hello -

I am working on RNA data and I am trying to remove the adapter sequences from my reads. My raw data looks something like this:

When I run the recommended settings for adapters [ILLUMINACLIP:/$EBROOTTRIMMOMATIC/adapters/TruSeq3-PE.fa:2:30:10:2:True],
the "Adapter Content" tab on the fastqc report no longer gives a warning but all the overrepresented sequences are still there.

I tried to adjust the settings of the adapter trimming step, and got some better results, but I still have adapter content in the overrepresented sequences.

I ran trimmomatic like this

java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.39.jar PE RawReads/GMCF-1049-DMD-1_S1_L001_R1_001.fastq.gz RawReads/GMCF-1049-DMD-1_S1_L001_R2_001.fastq.gz -trimlog DMD1-logfile.log -baseout trimmedReads_v2/DMD_1.fq ILLUMINACLIP:/$EBROOTTRIMMOMATIC/adapters/TruSeq3-PE.fa:2:40:15:1:True LEADING:3 TRAILING:3 MINLEN:36 HEADCROP:10

and my overrepresented sequences still look like this:

Now I know that with RNA-seq data you're supposed to get overrepresented sequences, because those come from the overexpressed genes. However, my concern is that the overrepresented sequences are still being identified as adapters. Is this a problem? Should I change the settings of the adapter trimming step again to allow for a higher threshold, or do I run the risk of cutting sequences that I want to keep?

Any advice would be helpful. Thanks.

Forward only surviving reads but no reverse only surviving and no dropped reads

I ran Trimmomatic v0.39 on my paired-end reads and got this output:
TrimmomaticPE: Started with arguments:
-threads 12 -phred33 SRR22799322_1.fastq.gz SRR22799322_2.fastq.gz /home/projects/ossabaw/samples/SR413590_yaolei/trimmed/SRR22799322_TRIM_1P.fastq.gz /home/projects/ossabaw/samples/SR413590_yaolei/trimmed/SRR22799322_TRIM_1U.fastq.gz /home/projects/ossabaw/samples/SR413590_yaolei/trimmed/SRR22799322_TRIM_2P.fastq.gz /home/projects/ossabaw/samples/SR413590_yaolei/trimmed/SRR22799322_TRIM_2U.fastq.gz ILLUMINACLIP:/home/projects/ossabaw/samples/SR413590_yaolei/adaptersPE.fa:2:30:10 MINLEN:50
Using PrefixPair: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA' and 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 68170396 Both Surviving: 68088438 (99.88%) Forward Only Surviving: 81958 (0.12%) Reverse Only Surviving: 0 (0.00%) Dropped: 0 (0.00%)
TrimmomaticPE: Completed successfully

And I do not understand why I have forward only surviving reads. My initial files (before running trimmomatic) had the same number of sequences when checked with fastqc.

Trimmomatic not exiting after job finishes

Hi guys,

When I run this program, it never exits when the job is done.
It keeps running indefinitely until I stop it. The data generated is correct, but the job appears to freeze even though the instance is still running.
As soon as I force it to exit (Ctrl+C), instead of simply closing, the program gives a "broken pipe" error.

Is there a way to have the program terminate when the job finishes?

Thank you,

Jesse

In what scenarios would you recommend using the SLIDINGWINDOW vs. MAXINFO modes?

Hi,

I have generally been using the SLIDINGWINDOW approach, but I was wondering in what scenarios you would recommend using the MAXINFO approach? Are these approaches always mutually exclusive?

Best wishes,
Lucy
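For readers comparing the two modes, it may help to see the basic sliding-window idea in code. The sketch below is a simplified illustration, not Trimmomatic's actual implementation: it scans from the 5' end and cuts the read at the start of the first window whose mean quality falls below the threshold. The class name, the method name, and the choice to leave reads shorter than the window untouched are assumptions of this sketch.

```java
// Simplified illustration of a SLIDINGWINDOW:<window>:<minMean> style trim.
class SlidingWindowSketch {
    // Returns the number of leading bases to keep.
    static int keepLength(int[] quals, int window, double minMean) {
        if (quals.length < window) {
            return quals.length; // sketch choice: too short to evaluate, keep unchanged
        }
        long sum = 0;
        for (int i = 0; i < window; i++) {
            sum += quals[i];
        }
        for (int start = 0; ; start++) {
            if ((double) sum / window < minMean) {
                return start; // cut where the first failing window begins
            }
            int next = start + window;
            if (next >= quals.length) {
                return quals.length; // every window passed
            }
            sum += quals[next] - quals[start]; // slide the window by one base
        }
    }

    public static void main(String[] args) {
        int[] quals = {30, 30, 30, 30, 10, 10, 10, 10};
        System.out.println("Bases kept: " + keepLength(quals, 4, 20));
    }
}
```

A fixed window and threshold treat every position in the read the same way. As the manual describes it, MAXINFO instead balances read length against error rate, so the two steps are alternatives for the same job rather than complementary steps.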

installation manual needed

Hi,
My question is something most people would be ashamed to ask: I do not know where to find the installation zip files mentioned in the README. Could you please provide some links? I can handle the rest of the process ;)
Maciek

Exception ignored

Hi,

Trimmomatic (ILLUMINACLIP) exits with return code 0 even though an exception was raised while loading adapter sequences. This makes it hard to catch the error when running from scripts. I'd expect it to error and exit with a non-zero return code. Should the exception be re-raised in this section?

Trimmomatic/src/org/usadellab/trimmomatic/trim/IlluminaClippingTrimmer.java

Lines 69 to 76 in d89f8b7

 try
 {
     trimmer.loadSequences(seqs.getCanonicalPath());
 }
 catch (IOException ex)
 {
     logger.handleException(ex);
 }

...
...
WINDOW:7:20 TRAILING:3 MINLEN:10
java.io.FileNotFoundException: /usr/local/share/trimmomatic-0.39/adapters/NexteraPE-PE.fr (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at org.usadellab.trimmomatic.fasta.FastaParser.parse(FastaParser.java:54)
        at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.loadSequences(IlluminaClippingTrimmer.java:110)
        at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:71)
        at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:32)
        at org.usadellab.trimmomatic.Trimmomatic.createTrimmers(Trimmomatic.java:59)
        at org.usadellab.trimmomatic.TrimmomaticPE.run(TrimmomaticPE.java:552)
        at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:80)
Quality encoding detected as phred33
Input Read Pairs: 25000 Both Surviving: 25000 (100.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 0 (0.00%)
TrimmomaticPE: Completed successfully

-- Arjan
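To illustrate the behaviour being asked for, here is a minimal, self-contained Java sketch in which a failed adapter load is re-thrown instead of only being logged. It is not a patch against the Trimmomatic source: `AdapterLoadSketch` and its `loadSequences` method are stand-ins invented for this example. The point is only that an exception that escapes `main` makes the JVM exit with a non-zero status, which scripts can detect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Stand-in for the adapter-loading step; names are hypothetical.
class AdapterLoadSketch {
    static List<String> loadSequences(Path fasta) {
        try {
            return Files.readAllLines(fasta);
        } catch (IOException ex) {
            // Re-throw so the caller (and ultimately the exit status) reflects the failure.
            throw new UncheckedIOException("Cannot read adapter file: " + fasta, ex);
        }
    }

    public static void main(String[] args) {
        Path fasta = Paths.get(args.length > 0 ? args[0] : "adapters.fa");
        List<String> lines = loadSequences(fasta);
        System.out.println("Loaded " + lines.size() + " lines from " + fasta);
    }
}
```

Run with a path that does not exist, this prints a stack trace and exits with a non-zero status instead of continuing without adapters.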

missing .jar file?

Good afternoon,
I'm probably blind, but I just cannot find the .jar file in the most recent trimmomatic release. Sorry, but do I need to do something to generate it?
I used git clone https://github.com/usadellab/Trimmomatic.git

Trimmomatic drops 50% of reads

Hello,
I am trying to trim adapters of my PacBio data with trimmomatic.
This is my command:

module load miniconda/miniconda3-4.7.12
module load idba

java -jar /dorotheeh/trimmomatic/trimmomatic-0.39.jar SE -phred33 PBrnaQ20.ccs.fastq.gz PBrnatrimmed.fq.gz ILLUMINACLIP:pacbio_vectors_db.fasta:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:400

And it trimmed more than half of my reads:
Using Long Clipping Sequence: 'TTCGTCACCATAGTTGCGTCTCATG'
ILLUMINACLIP: Using 0 prefix pairs, 3 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Reads: 4826329 Surviving: 2282550 (47.29%) Dropped: 2543779 (52.71%)
TrimmomaticSE: Completed successfully

Could someone explain what I should fix in the command?

Thank you!

Build issue with JAVA Version

I think the source and target version for compile target should be bumped to at least 1.6 (the best choice would be 1.8 corresponding to openjdk-8) for compatibility with openjdk-8.

I had also set an additional variable bootclasspath="/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar" in a Ubuntu 20.04 machine with openjdk-8 installed to solve the bootstrap class path not set in conjunction with -source x warning, but that's not required for compiling.

The current java version specified in the build XML file is quite old and unsupported. Newer compilers do not support it and simply running ant doesn't work.

Trimmomatic/build.xml

Line 34 in 93c7b12

 <javac srcdir="${src}" destdir="${dist_build}" debug="true" source="1.5" target="1.5" includeantruntime="false"> 

Build Errors

I have tried with openjdk-17, which gives me the following error for the compile target

compile:
    [javac] Compiling 65 source files to /home/bevan/Trimmomatic/dist/build
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 5
    [javac] error: Source option 5 is no longer supported. Use 7 or later.
    [javac] error: Target option 5 is no longer supported. Use 7 or later.

BUILD FAILED
/home/bevan/Trimmomatic/build.xml:34: Compile failed; see the compiler error output for details.

With openjdk-8 and openjdk-11 the error is

compile:
    [javac] Compiling 65 source files to /home/bevan/Trimmomatic/dist/build
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 5
    [javac] error: Source option 5 is no longer supported. Use 6 or later.
    [javac] error: Target option 1.5 is no longer supported. Use 1.6 or later.

BUILD FAILED
/home/bevan/Trimmomatic/build.xml:34: Compile failed; see the compiler error output for details.

Version 1.8

Setting source and target to 1.8 in the following line, gives

Trimmomatic/build.xml

Line 34 in 93c7b12

 <javac srcdir="${src}" destdir="${dist_build}" debug="true" source="1.5" target="1.5" includeantruntime="false"> 

openjdk-17

compile:
    [javac] Compiling 65 source files to /home/bevan/Trimmomatic/dist/build
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 8
    [javac] /home/bevan/Trimmomatic/src/org/usadellab/trimmomatic/trim/BaseCountTrimmer.java:30: warning: [removal] Integer(String) in Integer has been deprecated and marked for removal
    [javac] 				maxCount=new Integer(split[2]);
    [javac] 				         ^
    [javac] 2 warnings

dist:
    [unjar] Expanding: /home/bevan/Trimmomatic/dist/lib/jbzip2-0.9.1.jar into /home/bevan/projects/git/Trimmomatic/dist/unpack
   [delete] Deleting directory /home/bevan/Trimmomatic/dist/unpack/META-INF
   [delete] Deleting directory /home/bevan/Trimmomatic/dist/unpack/demo
     [move] Moving 1 file to /home/bevan/Trimmomatic/dist/unpack
     [move] Moving 1 file to /home/bevan/Trimmomatic/dist/unpack
     [copy] Copying 80 files to /home/bevan/Trimmomatic/dist/unpack
      [jar] Building jar: /home/bevan/Trimmomatic/dist/jar/trimmomatic-0.40-rc1.jar
      [zip] Building zip: /home/bevan/Trimmomatic/dist/Trimmomatic-0.40-rc1.zip
      [zip] Building zip: /home/bevan/Trimmomatic/dist/Trimmomatic-Src-0.40-rc1.zip

BUILD SUCCESSFUL
Total time: 6 seconds

openjdk-11

compile:
    [javac] Compiling 65 source files to /home/bevan/Trimmomatic/dist/build
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 8
    [javac] Note: /home/bevan/Trimmomatic/src/org/usadellab/trimmomatic/trim/BaseCountTrimmer.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 1 warning

dist:
    [unjar] Expanding: /home/bevan/Trimmomatic/dist/lib/jbzip2-0.9.1.jar into /home/bevan/Applications/Trimmomatic/dist/unpack
   [delete] Deleting directory /home/bevan/Trimmomatic/dist/unpack/META-INF
   [delete] Deleting directory /home/bevan/Trimmomatic/dist/unpack/demo
     [move] Moving 1 file to /home/bevan/Trimmomatic/dist/unpack
     [move] Moving 1 file to /home/bevan/Trimmomatic/dist/unpack
     [copy] Copying 80 files to /home/bevan/Trimmomatic/dist/unpack
      [jar] Building jar: /home/bevan/Trimmomatic/dist/jar/trimmomatic-0.40-rc1.jar
      [zip] Building zip: /home/bevan/Trimmomatic/dist/Trimmomatic-0.40-rc1.zip
      [zip] Building zip: /home/bevan/Trimmomatic/dist/Trimmomatic-Src-0.40-rc1.zip

BUILD SUCCESSFUL
Total time: 0 seconds

openjdk-8

compile:
    [javac] Compiling 65 source files to /home/bevan/Trimmomatic/dist/build
    [javac] Note: /home/bevan/Trimmomatic/src/org/usadellab/trimmomatic/trim/BaseCountTrimmer.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.

dist:
    [unjar] Expanding: /home/bevan/Trimmomatic/dist/lib/jbzip2-0.9.1.jar into /home/bevan/Applications/Trimmomatic/dist/unpack
   [delete] Deleting directory /home/bevan/Trimmomatic/dist/unpack/META-INF
   [delete] Deleting directory /home/bevan/Trimmomatic/dist/unpack/demo
     [move] Moving 1 file to /home/bevan/Trimmomatic/dist/unpack
     [move] Moving 1 file to /home/bevan/Trimmomatic/dist/unpack
     [copy] Copying 78 files to /home/bevan/Trimmomatic/dist/unpack
      [jar] Building jar: /home/bevan/Trimmomatic/dist/jar/trimmomatic-0.40-rc1.jar
      [zip] Building zip: /home/bevan/Trimmomatic/dist/Trimmomatic-0.40-rc1.zip
      [zip] Building zip: /home/bevan/Trimmomatic/dist/Trimmomatic-Src-0.40-rc1.zip

BUILD SUCCESSFUL
Total time: 0 seconds

Question about dealing with 10x data on trimmomatic

I have a 10x experiment where R1 contains the cell barcode and UMI, and R2 contains the cDNA sequence. I want to trim R2 with a sliding window approach where a 10 bp window that has an average score of below 28 is my threshold: SLIDINGWINDOW:10:28

My initial approach was just to run trimmomatic as such on R2, and then use cutadapt to filter any reads of length 0 before mapping:

`
java -jar trimmomatic-0.39.jar SE -threads 16 -phred33 ${FASTQ_base}/FASTQ/L458_898_S2_L001_R2_001.fastq.gz ${FASTQ_base}/FASTQ_qa_28_tm/trimmed/trimmed_L458_898_S2_L001_R2_001.fastq.gz SLIDINGWINDOW:10:28

cutadapt -j 0 --minimum-length :1 -o ${FASTQ_base}/FASTQ_qa_28_tm/L458_898_S2_L001_R1_001.fastq.gz -p ${FASTQ_base}/FASTQ_qa_28_tm/L458_898_S2_L001_R2_001.fastq.gz ${FASTQ_base}/FASTQ/L458_898_S2_L001_R1_001.fastq.gz ${FASTQ_base}/FASTQ_qa_28_tm/trimmed/trimmed_L458_898_S2_L001_R2_001.fastq.gz
`

However, I notice that trimmomatic does remove a small subset of reads (Input Reads: 153435707 Surviving: 150653678 (98.19%) Dropped: 2782029 (1.81%)), which then makes quickly filtering using cutadapt impossible.

Should I be using a paired end approach? Or do you have another suggestion for how to tackle this? I don't want to do any read trimming on R1 as it just contains the barcode and UMI. Thanks so much,

Matteo
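One possible way to keep the single-end approach is to re-synchronise R1 against the trimmed R2 afterwards, keeping only the R1 records whose mate survived. The Java sketch below does this in a single streaming pass. It assumes that both files list reads in the same original order and that mates share the read ID before the first space. `ResyncSketch` and its methods are hypothetical names, and the in-memory strings in `main` are toy data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper: keeps only R1 records whose mate is present in the trimmed R2.
class ResyncSketch {
    // Read ID = header without '@', up to the first whitespace, without a /1 or /2 suffix.
    static String readId(String header) {
        String id = header.substring(1).split("\\s+")[0];
        if (id.endsWith("/1") || id.endsWith("/2")) {
            id = id.substring(0, id.length() - 2);
        }
        return id;
    }

    // Reads one 4-line FASTQ record, or returns null at end of input.
    static String[] nextRecord(BufferedReader in) {
        try {
            String header = in.readLine();
            if (header == null) {
                return null;
            }
            String[] rec = {header, in.readLine(), in.readLine(), in.readLine()};
            if (rec[3] == null) {
                throw new IllegalStateException("Truncated FASTQ record: " + header);
            }
            return rec;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Writes the R1 records matching the surviving R2 reads; returns how many were kept.
    static int resync(Reader r1All, Reader r2Trimmed, Writer r1Out) {
        BufferedReader r1 = new BufferedReader(r1All);
        BufferedReader r2 = new BufferedReader(r2Trimmed);
        int kept = 0;
        try {
            String[] mate;
            while ((mate = nextRecord(r2)) != null) {
                String wanted = readId(mate[0]);
                String[] rec = nextRecord(r1);
                while (rec != null && !readId(rec[0]).equals(wanted)) {
                    rec = nextRecord(r1); // this R1 read's mate was dropped; skip it
                }
                if (rec == null) {
                    throw new IllegalStateException("No R1 mate found for " + wanted);
                }
                r1Out.write(String.join("\n", rec));
                r1Out.write("\n");
                kept++;
            }
            r1Out.flush();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return kept;
    }

    public static void main(String[] args) {
        String r1 = "@a 1:N:0\nAAAA\n+\nIIII\n@b 1:N:0\nCCCC\n+\nIIII\n";
        String r2 = "@b 2:N:0\nGG\n+\nII\n";
        StringWriter out = new StringWriter();
        int kept = resync(new StringReader(r1), new StringReader(r2), out);
        System.out.println(kept + " R1 record(s) kept:\n" + out);
    }
}
```

For gzipped files, the readers and writer could be wrapped in `java.util.zip.GZIPInputStream` and `GZIPOutputStream`. Whether this is preferable to running in PE mode depends on the pipeline; it simply avoids touching the barcode and UMI bases in R1.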

Specifying Adapter

Hello! I am using Trimmomatic for the first time. I aim to use it on paired-end reads. The adapters I wish to trim are Illumina TruSeq single index adapters with the sequences below:

Read 1
AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

Read 2
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

I notice these sequences are different than the ones provided in Trimmomatic (TruSeq2 and TruSeq3). How can I create a file to feed these adapter sequences to Trimmomatic?

I've read the nicely detailed TrimmomaticManual_V0.32.pdf, section "Making custom clipping files" but I'm still confused as to how to specifically format this file since I see the TruSeq2 and 3 files that Trimmomatic uses have more than just the sequences to read 1 and read 2.

Apologies for the very basic question. I greatly appreciate your help!
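As a rough illustration of the file format, the Java sketch below writes the two sequences from this post as a plain FASTA file of "simple" adapters. As I read the manual's "Making custom clipping files" section, names that start with `Prefix` and end in `/1` and `/2` define a palindrome pair, a `/1` or `/2` suffix on any other name restricts the sequence to forward or reverse reads, and names without a suffix are checked against both reads. The entry names `TruSeq_Read1` and `TruSeq_Read2` and the class itself are made up for this example. The reverse-complement helper is included only because it is handy when comparing your sequences with those in the bundled files, which may list an adapter in the opposite orientation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for building a custom adapter FASTA.
class CustomAdapterFasta {
    // Reverse complement of a DNA sequence; anything other than A/C/G/T becomes N.
    static String reverseComplement(String seq) {
        StringBuilder sb = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            switch (seq.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:  sb.append('N');
            }
        }
        return sb.toString();
    }

    // Renders name -> sequence pairs as FASTA text, one record per adapter.
    static String toFasta(Map<String, String> adapters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : adapters.entrySet()) {
            sb.append('>').append(e.getKey()).append('\n')
              .append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> adapters = new LinkedHashMap<>();
        // No /1 or /2 suffix: treated as simple sequences checked against both reads.
        adapters.put("TruSeq_Read1", "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA");
        adapters.put("TruSeq_Read2", "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT");
        System.out.print(toFasta(adapters));
        System.out.println("Read 2 adapter, reverse complement: "
            + reverseComplement(adapters.get("TruSeq_Read2")));
    }
}
```

Saving the printed records to a file and passing its path as the first ILLUMINACLIP argument would use them in simple mode. Adding a `Prefix.../1` and `Prefix.../2` pair for palindrome mode is a separate decision that this sketch does not make.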

Building on Ubuntu 20

System Specs:

Ubuntu20 (bionic beaver) Virtual Machine
VirtualBox 6.1
Windows 10 host

When trying to build the latest from source with ant

sudo snap install ant --classic
cd /source/dir
ant

The build reports the following:

compile:
    [javac] Compiling 41 source files to /home/chabab/Downloads/Trimmomatic-0.39/dist/build
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 6
    [javac] warning: [options] source value 6 is obsolete and will be removed in a future release
    [javac] warning: [options] target value 1.6 is obsolete and will be removed in a future release
    [javac] warning: [options] To suppress warnings about obsolete options, use -Xlint:-options.
    [javac] Note: /home/chabab/Downloads/Trimmomatic-0.39/src/org/usadellab/trimmomatic/trim/BaseCountTrimmer.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 4 warnings

To fix this, edit build.xml and replace 1.5 with 1.6 on line 34:
vim build.xml
Save and exit the file.

Running ant again results in a successful build:

chabab@chabab-VirtualBox:~/Downloads/Trimmomatic-0.39$ ant
Buildfile: /home/chabab/Downloads/Trimmomatic-0.39/build.xml

init:

import:

compile:
    [javac] Compiling 41 source files to /home/chabab/Downloads/Trimmomatic-0.39/dist/build
    [javac] warning: [options] bootstrap class path not set in conjunction with -source 6
    [javac] warning: [options] source value 6 is obsolete and will be removed in a future release
    [javac] warning: [options] target value 1.6 is obsolete and will be removed in a future release
    [javac] warning: [options] To suppress warnings about obsolete options, use -Xlint:-options.
    [javac] Note: /home/chabab/Downloads/Trimmomatic-0.39/src/org/usadellab/trimmomatic/trim/BaseCountTrimmer.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 4 warnings

dist:
    [unjar] Expanding: /home/chabab/Downloads/Trimmomatic-0.39/dist/lib/jbzip2-0.9.jar into /home/chabab/Downloads/Trimmomatic-0.39/dist/unpack
   [delete] Deleting directory /home/chabab/Downloads/Trimmomatic-0.39/dist/unpack/META-INF
   [delete] Deleting directory /home/chabab/Downloads/Trimmomatic-0.39/dist/unpack/demo
     [move] Moving 1 file to /home/chabab/Downloads/Trimmomatic-0.39/dist/unpack
     [move] Moving 1 file to /home/chabab/Downloads/Trimmomatic-0.39/dist/unpack
     [copy] Copying 50 files to /home/chabab/Downloads/Trimmomatic-0.39/dist/unpack
      [jar] Building jar: /home/chabab/Downloads/Trimmomatic-0.39/dist/jar/trimmomatic-0.39.jar
      [zip] Building zip: /home/chabab/Downloads/Trimmomatic-0.39/dist/Trimmomatic-0.39.zip
      [zip] Building zip: /home/chabab/Downloads/Trimmomatic-0.39/dist/Trimmomatic-Src-0.39.zip

BUILD SUCCESSFUL
Total time: 2 seconds

>90% forward surviving read pairs

Hello! I always use Trimmomatic to trim Illumina sequencing data, and it works well. But this time the results show far too many forward only surviving reads, more than 90%. I have modified many parameters, but nothing helped. Could anyone help me?
original command:
trimmomatic PE -phred33 "$i"_R1.fq.gz "$i"_R2.fq.gz "$i"_R1_paired.fq.gz "$i"_R1_unpaired.fq.gz "$i"_R2_paired.fq.gz "$i"_R2_unpaired.fq.gz ILLUMINACLIP:/disk2/users/che/miniconda2/pkgs/trimmomatic-0.39-1/share/trimmomatic-0.39-1/adapters/NexteraPE-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

result:
TrimmomaticPE: Started with arguments:
-phred33 147_R1.fq.gz 147_R2.fq.gz 147_R1_0paired.fq.gz 147_R1_0unpaired.fq.gz 147_R2_0paired.fq.gz 147_R2_0unpaired.fq.gz ILLUMINACLIP:/disk2/users/che/miniconda2/pkgs/trimmomatic-0.39-1/share/trimmomatic-0.39-1/adapters/NexteraPE-PE.fa:2:2:2 MINLEN:2
Using PrefixPair: 'AGATGTGTATAAGAGACAG' and 'AGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC'
Using Long Clipping Sequence: 'CTGTCTCTTATACACATCTGACGCTGCCGACGA'
ILLUMINACLIP: Using 1 prefix pairs, 4 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 4180392 Both Surviving: 199973 (4.78%) Forward Only Surviving: 3978732 (95.18%) Reverse Only Surviving: 136 (0.00%) Dropped: 1551 (0.04%)
TrimmomaticPE: Completed successfully

Trimmomatic does not seem to trim adapters for short reads (Illumina GAIIx dataset)

Hi,
I'm trying to remove adapter sequences from FASTQ files generated on the Illumina GAIIx platform (small RNA-Seq dataset).

I used adapter sequence "CTGTAGGCACCATCAATCGTATGCCGTCTTCTGCTTG" and shorter version "CTGTAGGCACCATCAA".
Below is my commandline for trimmomatic (ver : 0.39) :
"java -jar trimmomatic-0.39.jar SE -threads 20 -phred33 ./{input}.fastq.gz ./{input}_trim.fastq.gz ILLUMINACLIP:./user_specified_adapter.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 AVGQUAL:30"
, and then do size-selection with cutadapt (18~26-nt reads)

Results from Trimmomatic process are below:
Input Reads: 22605850 Surviving: 19716093 (87.22%) Dropped: 2889757 (12.78%) (When using longer adapter sequence)
Input Reads: 22605850 Surviving: 21460433 (94.93%) Dropped: 1145417 (5.07%) (When using short adapter sequence)

Most of the surviving reads were 36-nt reads containing partial or full adapter sequences, which were then removed by the cutadapt length option (-M 26). Data processed with the short version of the adapter showed similar results (not shown).
Total reads processed: 19,716,093
Reads with adapters: 0 (0.0%)
Reads that were too short: 1,374,994 (7.0%)
Reads that were too long: 17,387,916 (88.2%) -> mostly ~36-nt reads containing adapters.
Reads written (passing filters): 953,183 (4.8%)

When using cutadapt for adapter trimming plus size-selection (18~26nt) :
Total reads processed: 22,605,850
Reads with adapters: 22,187,182 (98.1%)
Reads that were too short: 3,215,136 (14.2%)
Reads that were too long: 908,954 (4.0%)
Reads written (passing filters): 18,481,760 (81.8%)

When I used Trimmomatic for Illumina 51-cycle single-end reads, there were no big differences between the results from Trimmomatic and Cutadapt:

Input : 11,135,969
Trimmomatic (adapter-trimming + AVGQUAL:30) + Cutadapt (only size selection) : 8,456,739
Cutadapt (adapter-trimming + size selection) + Trimmomatic (AVGQUAL:30 only) : 8,605,374

How can I optimize the Trimmomatic options for adapter trimming of 36-cycle Illumina single-end reads?
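A short calculation may explain part of this. The manual describes the ILLUMINACLIP alignment score as gaining just over 0.6 (log10 of 4) per matching base, with mismatches subtracting Q/10. Under that description, the Java sketch below computes the best score an adapter fragment of a given length can reach and the minimum number of perfectly matching bases needed to meet a simple clip threshold. The class and method names are hypothetical.

```java
// Arithmetic sketch for the ILLUMINACLIP simple clip threshold; names are hypothetical.
class ClipThresholdSketch {
    // Score contributed by one matching base, per the manual's description (log10 of 4).
    static final double SCORE_PER_MATCH = Math.log10(4.0);

    // Best possible score when 'bases' adapter bases match perfectly.
    static double maxScore(int bases) {
        return bases * SCORE_PER_MATCH;
    }

    // Smallest number of perfectly matching bases that reaches the threshold.
    static int minPerfectMatchLength(double threshold) {
        return (int) Math.ceil(threshold / SCORE_PER_MATCH);
    }

    public static void main(String[] args) {
        System.out.printf("Max score for a 16-nt adapter: %.2f%n", maxScore(16));
        System.out.println("Bases needed for threshold 10: " + minPerfectMatchLength(10));
        // A 36-nt read with a 22-nt insert contains only 14 adapter bases.
        System.out.printf("Max score for 14 adapter bases: %.2f%n", maxScore(14));
    }
}
```

With a simple clip threshold of 10, at least 17 matching adapter bases are needed. The 16-nt adapter can never score that high, and a 36-nt read carrying an 18 to 26-nt insert contains only 10 to 18 adapter bases. If this reasoning applies to the data here, a lower third ILLUMINACLIP value may be worth testing.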

Feature Request: Reads out of order + Seq , qual line length mismatch

We have a legacy script that is meant to handle some edge cases with corrupt data. I can't say how common these edge cases are, but it would be nice if Trimmomatic would handle these cases.

When reads are out of order, Trimmomatic does not associate mate pairs correctly. example below

The input files have two reads: 1628 and 1643

Normally read 1628 passes the filter requirements:

java -jar Trimmomatic-0.39/trimmomatic.jar PE ordered_R1_001.fastq ordered_R2_001.fastq ordered.R1.fastq ordered.R1.new.unp.fastq ordered.R2.fastq ordered.R2.new.unp.fastq ILLUMINACLIP:trimmomatic_0.38_adapters_ALL-PE.fa:1:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:35

Input Read Pairs: 2 Both Surviving: 1 (50.00%) Forward Only Surviving: 0 (0.00%) Reverse Only Surviving: 0 (0.00%) Dropped: 1 (50.00%)

But if the reads are out of order, the retained reads end up in the unpaired files:

java -jar Trimmomatic-0.39/trimmomatic.jar PE unordered_R1_001.fastq unordered_R2_001.fastq unordered.R1.fastq unordered.R1.new.unp.fastq unordered.R2.fastq unordered.R2.new.unp.fastq ILLUMINACLIP:trimmomatic_0.38_adapters_ALL-PE.fa:1:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:35

Input Read Pairs: 2 Both Surviving: 0 (0.00%) Forward Only Surviving: 1 (50.00%) Reverse Only Surviving: 1 (50.00%) Dropped: 0 (0.00%)

When the sequence line and quality line differ in length, Trimmomatic errors out. It would be nicer if it just removed the read.

java -jar Trimmomatic-0.39/trimmomatic.jar PE error_R1_001.fastq ordered_R2_001.fastq error.R1.fastq error.R1.new.unp.fastq error.R2.fastq error.R2.new.unp.fastq ILLUMINACLIP:trimmomatic_0.38_adapters_ALL-PE.fa:1:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:35

Exception in thread "main" java.lang.RuntimeException: Sequence and quality length don't match: 'TTTAGCAGCCATTTTAGCTTTCTGCCGGATTTTTGCAACGATACCTTTCATGTGAACATTGTGAACATTAAACTGGCGAGCCAGTTTTTTACCCATATGACCAGTGGCCCGTGAGGCCAACACCATGGTGGTGGTTACATCTCGTATGCCG' vs '1>>11111BA1DGGG31BGGGEFADB0A0EEGHH0F2BB?CG/AEGHHFFGEGD2DF1GFAFB2AF1FHF1FFG10AE//>>CHHHHHGGFFG0FGFHBFGHGAFFBCFEE///<GEF/C<CF</GB0<</?//BFDCBHFHD0FCC01<'

error_R1_001.fastq.gz
ordered_R1_001.fastq.gz
ordered_R2_001.fastq.gz
unordered_R1_001.fastq.gz
unordered_R2_001.fastq.gz
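Until something like this exists in Trimmomatic itself, one workaround is a pre-filter that removes records whose sequence and quality lines differ in length. The Java sketch below shows the idea for a single uncompressed FASTQ stream. `FastqLengthFilter` is a hypothetical name, and strict four-line records are assumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical pre-filter: drops FASTQ records with mismatched sequence/quality lengths.
class FastqLengthFilter {
    // Copies valid records to 'out' and returns the number of records removed.
    static int filter(Reader in, Writer out) {
        BufferedReader reader = new BufferedReader(in);
        int removed = 0;
        try {
            String header;
            while ((header = reader.readLine()) != null) {
                String seq = reader.readLine();
                String plus = reader.readLine();
                String qual = reader.readLine();
                if (seq == null || plus == null || qual == null) {
                    throw new IllegalStateException("Truncated FASTQ record: " + header);
                }
                if (seq.length() != qual.length()) {
                    removed++; // skip the corrupt record instead of failing
                    continue;
                }
                out.write(header + "\n" + seq + "\n" + plus + "\n" + qual + "\n");
            }
            out.flush();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return removed;
    }

    public static void main(String[] args) {
        String fastq = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII\n";
        StringWriter out = new StringWriter();
        System.out.println("Removed " + filter(new StringReader(fastq), out) + " record(s)");
        System.out.print(out);
    }
}
```

For paired-end data, removing a record from one file alone would shift the pairing, so the mate would have to be removed from the other file as well before running PE mode.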

Feature request: exit if adapter file is missing

Hi,

Thank you for your hard work on this tool! I was recently debugging an issue where trimmomatic seemed unable to cut standard Illumina adapters that I wanted to bring to your attention. After some extensive investigation, I realized I was selecting a non-existent adapter file as the source of sequences to scan against the read data. No error was reported by Trimmomatic, so I'm guessing that adapter trimming was just skipped as a result.

Would it be possible to exit with an error code if the adapter file is missing? I know this is only the case with custom adapters, but may be a useful condition to prevent for other users.

Trimmomatic errors

Hello! I am having some issues with trimmomatic and biostars hasn't seemed to help much. Below are a couple different error messages I got. I am interested in any help you could give regarding what the error message is telling me!

Input:
java -jar /home/es180966/anaconda/share/trimmomatic-0.39-2/trimmomatic.jar PE -threads 12 -phred33 -basein SRR16646612_1.fastq -baseout SRR16646612_1.fastq ILLUMINACLIP:/home/es180966/anaconda/envs/transect_env/sra_files/Many.truSeq.PE.fa.2.20.10.4 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

Output:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1

Input:
java -jar /home/es180966/anaconda/share/trimmomatic-0.39-2/trimmomatic.jar PE -threads 12 -phred33 -basein SRR16646612_1.fastq SRR16646612_2.fastq -baseout SRR16646612_1.fastq SRR16646612_2.fastq SRR16646612_1_paired.fastq SRR166464612_1_unpaired.fastq SRR16646612_2_paired.fastq ILLUMINACLIP:/home/es180966/anaconda/envs/transect_env/sra_files/Many.truSeq.PE.fa.2.20.10.4 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

Output:
Using templated Input files: SRR16646612_1.fastq SRR16646612_2.fastq
Using templated Output files: SRR16646612_1_1P.fastq SRR16646612_1_1U.fastq SRR16646612_1_2P.fastq SRR16646612_1_2U.fastq
Exception in thread "main" java.lang.RuntimeException: Unknown trimmer: SRR16646612_2.fastq
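Both messages look like argument-parsing problems. In the first command the adapter file and its parameters are joined with dots (`Many.truSeq.PE.fa.2.20.10.4`) rather than colons, so the ILLUMINACLIP step carries a single argument, which is a plausible cause of the `ArrayIndexOutOfBoundsException: 1`. In the second, the extra file names after `-basein` and `-baseout` are read as trimming steps, hence `Unknown trimmer: SRR16646612_2.fastq`. The Java sketch below is a pre-flight check for an ILLUMINACLIP step string that could catch the first kind of mistake before launching a run. `ClipStepCheck` is a hypothetical helper, not part of Trimmomatic, and it assumes a path without colons.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical pre-flight check for an ILLUMINACLIP step string.
class ClipStepCheck {
    // Returns "ok" or a short description of the first problem found.
    static String check(String step) {
        String[] parts = step.split(":");
        if (!parts[0].equals("ILLUMINACLIP")) {
            return "not an ILLUMINACLIP step";
        }
        // Expect at least: name, fasta path, seed mismatches, palindrome and simple thresholds.
        if (parts.length < 5) {
            return "expected ILLUMINACLIP:<fasta>:<seed mismatches>:"
                + "<palindrome threshold>:<simple threshold>, got "
                + (parts.length - 1) + " argument(s)";
        }
        if (!Files.isRegularFile(Paths.get(parts[1]))) {
            return "adapter file not found: " + parts[1];
        }
        return "ok";
    }

    public static void main(String[] args) {
        String step = args.length > 0
            ? args[0]
            : "ILLUMINACLIP:/path/to/Many.truSeq.PE.fa.2.20.10.4";
        System.out.println(check(step));
    }
}
```

Splitting on `:` would mis-handle Windows paths with a drive letter, so this check is only meant for Unix-style paths like the ones in this post.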

	if (val < 0 && mergeIter.hasPrevious() && mergeIter.hasNext())
	{
	float prev = mergeIter.previous();
	mergeIter.next();
	float next = mergeIter.next();

	if ((prev > -val) && (next > -val))
	{
	mergeIter.remove();
	mergeIter.previous();
	mergeIter.remove();
	mergeIter.previous();
	mergeIter.set(prev + val + next);

	scanAgain = true;
	}
	else
	mergeIter.previous();
	}
	}
