One specific sample kept crashing in the SNP_calling step, see DRMAA log below:
Error in rule SNP_calling:
jobid: 0
output: data/scaffolds_filtered/4_S4_scaffolds_ge500nt.fasta.fai, data/scaffolds_filtered/4_S4_unfiltered.vcf, data/scaffolds_filtered/4_S4_filtered.vcf, data/scaffolds_filtered/4_S4_filtered.vcf.gz, data/scaffolds_filtered/4_S4_filtered.vcf.gz.tbi
log: logs/SNP_calling_4_S4.log
conda-env: /mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965
RuleException:
CalledProcessError in line 366 of /mnt/scratch_dir/plaatvdr/Jovian/Snakefile:
Command 'source /mnt/miniconda/bin/activate '/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965'; set -euo pipefail; samtools faidx -o data/scaffolds_filtered/4_S4_scaffolds_ge500nt.fasta.fai data/scaffolds_filtered/4_S4_scaffolds_ge500nt.fasta > logs/SNP_calling_4_S4.log 2>&1
lofreq call-parallel -d 20000 --no-default-filter --pp-threads 12 -f data/scaffolds_filtered/4_S4_scaffolds_ge500nt.fasta -o data/scaffolds_filtered/4_S4_unfiltered.vcf data/scaffolds_filtered/4_S4_sorted.bam >> logs/SNP_calling_4_S4.log 2>&1
lofreq filter -a 0.05 -i data/scaffolds_filtered/4_S4_unfiltered.vcf -o data/scaffolds_filtered/4_S4_filtered.vcf >> logs/SNP_calling_4_S4.log 2>&1
bgzip -c data/scaffolds_filtered/4_S4_filtered.vcf 2>> logs/SNP_calling_4_S4.log 1> data/scaffolds_filtered/4_S4_filtered.vcf.gz
tabix -p vcf data/scaffolds_filtered/4_S4_filtered.vcf.gz >> logs/SNP_calling_4_S4.log 2>&1' returned non-zero exit status 1.
File "/mnt/scratch_dir/plaatvdr/Jovian/Snakefile", line 366, in __rule_SNP_calling
File "/home/plaatvdr/envs/Jovian_master/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job SNP_calling since they might be corrupted:
data/scaffolds_filtered/4_S4_scaffolds_ge500nt.fasta.fai
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
See the log file below:
INFO [2019-10-25 14:46:08,446]: Using 12 threads with following basic args: lofreq call -d 20000 --no-default-filter -f data/scaffolds_filtered/4_S4_scaffolds_ge500nt.fasta data/scaffolds_filtered/4_S4_sorted.bam
INFO [2019-10-25 14:46:10,903]: Adding 157086 commands to mp-pool
Traceback (most recent call last):
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/bin/lofreq2_call_pparallel.py", line 746, in <module>
main()
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/bin/lofreq2_call_pparallel.py", line 669, in main
"##source=%s" % ' '.join(sys.argv))
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/bin/lofreq2_call_pparallel.py", line 174, in concat_vcf_files
subprocess.check_call(cmd)
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/lib/python3.6/subprocess.py", line 286, in check_call
retcode = call(*popenargs, **kwargs)
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/lib/python3.6/subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/lib/python3.6/subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "/mnt/scratch_dir/plaatvdr/Jovian/.snakemake/conda/e0281965/lib/python3.6/subprocess.py", line 1344, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: 'lofreq'
Searching for this error on LoFreq's issues page turned up issue CSB5/lofreq#79. Apparently, LoFreq has a hardcoded limit and accepts at most 137072 contigs per sample. When I checked the number of trimmed scaffolds in this sample, it was 157086 contigs, which matches the 157086 commands added to the mp-pool in the log above. So that is the cause of the problem.
The solution would be to write a checker that splits up files with more than 137072 contigs and later merges the results back together. But it seems like such a corner case that I'm giving it a low priority.
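As a rough illustration of what such a checker could look like, here is a minimal Python sketch that reads FASTA records and groups them into chunks that each stay under the limit. The function names (`read_fasta`, `chunk_records`) are hypothetical and not part of Jovian or LoFreq, and the limit constant is simply the figure quoted above, so treat it as an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Figure quoted in this thread from CSB5/lofreq#79; an assumption, not verified here.
LOFREQ_CONTIG_LIMIT = 137072

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif line:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def chunk_records(records: Iterable[Record],
                  max_contigs: int = LOFREQ_CONTIG_LIMIT) -> Iterator[List[Record]]:
    """Group records into lists holding at most max_contigs records each."""
    it = iter(records)
    while True:
        chunk = list(islice(it, max_contigs))
        if not chunk:
            return
        yield chunk
```

Each chunk would still need to be written to its own FASTA, paired with the matching subset of the alignment, run through LoFreq, and have its VCF merged back with the others. Those steps are not shown here.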
A "work-around" would be to remove such samples from your analysis; at least then the rest of the Jovian analysis will finish. Another "work-around" would be to tweak the filtering parameters such that the number of contigs drops below the LoFreq limit, e.g. by increasing the minlen parameter (and thus filtering away more scaffolds).
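To get a feel for how far minlen would need to be raised, one could count scaffolds per length threshold from the `.fai` index that `samtools faidx` writes (tab-separated, with the sequence name in the first column and its length in the second). The sketch below is only an illustration: the function names are hypothetical, the starting value of 500 is inferred from the `ge500nt` file names in the log, and the limit is the figure quoted above.

```python
from typing import Iterable, List


def lengths_from_fai(lines: Iterable[str]) -> List[int]:
    """Extract sequence lengths (second column) from .fai index lines."""
    return [int(line.split("\t")[1]) for line in lines if line.strip()]


def smallest_minlen(lengths: List[int],
                    limit: int = 137072,  # figure quoted in this thread; an assumption
                    start: int = 500,     # inferred from the 'ge500nt' file names
                    step: int = 50) -> int:
    """Return the first minlen (start, start+step, ...) that keeps at most `limit` scaffolds."""
    minlen = start
    # The count can only shrink as minlen grows, so this loop terminates.
    while sum(1 for n in lengths if n >= minlen) > limit:
        minlen += step
    return minlen
```

For example, reading the lengths with `lengths_from_fai(open("sample.fasta.fai"))` (the path is a placeholder) and passing them to `smallest_minlen` would suggest a threshold to try. Whether losing those shorter scaffolds is acceptable depends on the sample.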
Please, if you also encounter this error, mention it in this thread so I can reevaluate the priority.
Other samples in @RozemarijnVanDerPlaats's run have the same problem. It seems to happen in environmental samples (e.g. surface water), where it makes sense that a great many organisms are so diluted that they do not generate enough reads to assemble into bigger scaffolds.
This has never been a problem in the hundreds of clinical samples processed thus far, nor do I expect it to be in the future. Still, it's sloppy and hinders broader usage.
I've asked for the data so I can test a solution when I've got the time for it.
This issue was emailed to me by @RozemarijnVanDerPlaats.