Adjust input length on nextclade runs
Only run nextclade on sequences at least 1400 nt long, the approximate length
of the dengue E gene. This avoids misclassifying short sequences.
j23414 committed Dec 27, 2023
1 parent e255f69 commit ceae8c5
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions ingest/workflow/snakemake_rules/nextclade.smk
@@ -8,13 +8,16 @@ rule nextclade_all:
     output:
         "data/nextclade_results/nextclade_all.tsv",
     threads: 4
+    params:
+        min_length=1400, # approximately E gene length
     shell:
         """
         nextclade run \
           --input-dataset {input.dataset} \
           -j {threads} \
           --output-tsv {output} \
           --min-match-rate 0.01 \
+          --min-length {params.min_length} \
           --silent \
           {input.sequences}
         """