Re-open issue: invalid fastq files produced using --un #8 #88

Open
elenichri opened this issue Nov 27, 2018 · 4 comments
Comments

@elenichri

Hello,
I am re-opening this issue. I am mapping paired-end reads with bowtie2 and the --un option, so I get two output fastq files, one for each mate. I then use the STAR aligner to map these fastq files to the human genome. STAR starts running, but I get the error: ReadAlignChunk_processChunks.cpp:115:processChunks EXITING because of FATAL ERROR in input reads: unknown file format: the read ID should start with @ or >

I ran the fastQValidator program (https://genome.sph.umich.edu/wiki/FastQValidator) to check whether the fastq files that bowtie2 returns are valid:
./fastQValidator --file xxx.trimmed.2.fastq
Here is the output:

ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
ERROR on Line 10414: Invalid character ('J') in base sequence.
Finished processing xxx.trimmed.2.fastq with 90418286 lines containing 22604486 sequences.
There were a total of 12073 errors.
Returning: 1 :
FASTQ_INVALID

So it seems that bowtie2 generates invalid fastq files in my case. Do you have any idea how I can fix this problem? My inputs (var2 and var3) are trimmed fastq files, and I would rather not fall back to the non-trimmed fastq files.
I use 8 cores for running bowtie2 on 12 samples. My run command is
bowtie2 --dovetail --no-discordant -I 20 -p 8 -x _my reference sequence_ --un-conc "$var1" -1 "$var2" -2 "$var3" -S "$var4"
where var.i is taken from a parameters file

Thank you very much in advance!
Eleni
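To locate the records that the validator complains about, a small structural check can help. The following is a minimal sketch, not fastQValidator's actual logic: the function name `check_fastq` and the allowed base alphabet are assumptions made for illustration. It assumes plain four-line FASTQ records and reports the line number of each problem. Since 'J' is a common quality character in Phred+33 encoding, one possibility worth checking is that a quality line has ended up where a sequence line is expected, for example because records are shifted out of frame.

```python
import re

# Assumed alphabet for sequence lines; adjust if your data uses other symbols.
VALID_BASES = re.compile(r"[ACGTNacgtn.]*")


def check_fastq(lines, max_problems=20):
    """Return a list of (line_number, message) for structural FASTQ problems.

    `lines` is any iterable of text lines (for example an open file handle).
    Records are assumed to span exactly four lines each.
    """
    problems = []
    record = []
    for number, raw in enumerate(lines, start=1):
        record.append((number, raw.rstrip("\r\n")))
        if len(record) < 4:
            continue
        (hn, header), (sn, seq), (pn, plus), (qn, qual) = record
        record = []
        if not header.startswith("@"):
            problems.append((hn, "header does not start with '@'"))
        if not VALID_BASES.fullmatch(seq):
            problems.append((sn, "invalid character in base sequence"))
        if not plus.startswith("+"):
            problems.append((pn, "separator line does not start with '+'"))
        if len(qual) != len(seq):
            problems.append((qn, "quality length differs from sequence length"))
        if len(problems) >= max_problems:
            break
    else:
        # Only reached when the whole input was read without hitting the limit.
        if record:
            problems.append((record[0][0], "truncated record at end of file"))
    return problems[:max_problems]
```

It could be used as `with open("xxx.trimmed.2.fastq") as fh: print(check_fastq(fh))`. Looking at the few lines before the first reported line number often shows whether a record was truncated or two records were interleaved.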

@mschilli87

original issue: #8

@ch4rr0
Collaborator

ch4rr0 commented Nov 29, 2018

How often does this happen? Every run, or sporadically? I am asking because I am trying to figure out whether this is a multithreading-related issue or whether the wrapper script is just not processing "trimmed" input correctly.
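One way to tell those two hypotheses apart, sketched here under assumptions: run the same sample once with `-p 1` and once with `-p 8`, then compare the two `--un-conc` outputs as unordered collections of lines. The helper name `compare_line_multisets` is hypothetical, and the approach keeps every distinct line in memory, so it is better suited to a subsampled input than to a full 90-million-line file.

```python
from collections import Counter


def compare_line_multisets(lines_a, lines_b):
    """Compare two line iterables while ignoring order.

    Returns (only_in_a, only_in_b): how many lines, counting repeats,
    appear in one input but not in the other.
    """
    count_a = Counter(line.rstrip("\r\n") for line in lines_a)
    count_b = Counter(line.rstrip("\r\n") for line in lines_b)
    only_in_a = sum((count_a - count_b).values())
    only_in_b = sum((count_b - count_a).values())
    return only_in_a, only_in_b
```

If the result is `(0, 0)` but only the multithreaded file fails validation, the same lines are present in a different order, which would be consistent with interleaved writes from several threads. A non-zero result suggests the content itself differs, which points elsewhere.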

@elenichri
Author

Dear ch4rr0, thank you for your reply. It happens for all the fastq files of one dataset with 12 samples: all 12 fastq files are invalid. I run my bowtie2 command with multiple threads (12 threads), but I don't think that this is the issue, because the exact same multithreaded command works perfectly fine for another dataset. I am certain that the 'trimmed.fastq' input files are correct because I have also mapped them with STAR and had no problem at all.

@ch4rr0
Collaborator

ch4rr0 commented Jun 1, 2019

I am looking into this one. I will update the thread if and when I am able to recreate the issue.
