Re-open issue: invalid fastq files produced using --un #8 #88

elenichri · 2018-11-27T08:06:51Z

Hello,
I am re-opening this issue. I am mapping paired-end reads using bowtie2 with the --un option (--un-conc in the command below), so I get two output fastq files, one for each mate. I then use the STAR aligner to map these fastq files to the human genome. STAR starts running, but I get the error ReadAlignChunk_processChunks.cpp:115:processChunks EXITING because of FATAL ERROR in input reads: unknown file format: the read ID should start with @ or >

I ran the fastQValidator program (https://genome.sph.umich.edu/wiki/FastQValidator) to check whether the fastq files that bowtie2 returns are valid:
./fastQValidator --file xxx.trimmed.2.fastq
Here is the output:

ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
Finished processing xxx.trimmed.2.fastq with 90418286 lines containing 22604486 sequences.
There were a total of 12073 errors.
Returning: 1 :
FASTQ_INVALID
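
For anyone trying to narrow this down: 'J' is a common Phred+33 quality character, so a 'J' reported in a base sequence may mean that a quality line is being read where a sequence line is expected, i.e. the 4-line record framing has slipped somewhere earlier in the file. The reported line count (90418286) is also not a multiple of 4. Below is a minimal Python sketch, not part of bowtie2 or fastQValidator, that reports the first record breaking the 4-line layout. The function name `first_bad_record` and the allowed base alphabet are assumptions made for illustration.

```python
import itertools

# Assumption: plain DNA reads; extend this set if IUPAC codes are expected.
VALID_BASES = set("ACGTNacgtn.")


def first_bad_record(lines):
    """Return (line_number, reason) for the first malformed record, or None.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Line numbers are 1-based.
    """
    it = iter(lines)
    line_no = 0
    while True:
        # Read the next 4-line FASTQ record (fewer lines at a truncated end).
        record = [line.rstrip("\r\n") for line in itertools.islice(it, 4)]
        if not record:
            return None  # reached end of input without finding a problem
        start = line_no + 1
        line_no += len(record)
        if len(record) < 4:
            return (start, "truncated record")
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return (start, "header does not start with '@'")
        if not set(seq) <= VALID_BASES:
            return (start + 1, "invalid character in sequence")
        if not plus.startswith("+"):
            return (start + 2, "separator does not start with '+'")
        if len(qual) != len(seq):
            return (start + 3, "quality length differs from sequence length")
```

Calling it on an open file, for example `with open("xxx.trimmed.2.fastq") as fh: print(first_bad_record(fh))`, gives the first line where the layout breaks. Only the first problem is reported on purpose, because once the framing is off every later record tends to look invalid as well.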

So it seems that bowtie2 generates invalid fastq files in my case. Do you have any idea how I can fix this problem? My inputs (var2 and var3) are trimmed fastq files, and I would rather not fall back to the non-trimmed fastq files.
I use 8 cores to run bowtie2 on 12 samples. My run command is
bowtie2 --dovetail --no-discordant -I 20 -p 8 -x _my reference sequence_ --un-conc "$var1" -1 "$var2" -2 "$var3" -S "$var4"
where each var is read from a parameters file.

Thank you very much in advance!
Eleni


mschilli87 · 2018-11-27T10:05:33Z

original issue: #8

ch4rr0 · 2018-11-29T15:55:37Z

How often does this happen? Every run, or sporadically? I am asking because I am trying to figure out whether this is a multithreading-related issue or whether the wrapper script is just not processing "trimmed" input correctly.

elenichri · 2018-12-01T03:26:30Z

Dear ch4rr0, thank you for your reply. It happens for all the fastq files of one dataset with 12 samples. All 12 fastq files are invalid. I run my bowtie2 command multithreaded (12 threads), but I don't think that this is the issue; the exact same command, using threads, works perfectly fine for another dataset. I am certain that the 'trimmed.fastq' input files are correct because I have also mapped them with STAR and had no problem at all.

ch4rr0 · 2019-06-01T01:13:54Z

I am looking into this one. I will update the thread if and when I am able to recreate the issue.
