feat(ci): Use cached ncbi dataset package to isolate tests
Resolves #46
corneliusroemer committed Jul 17, 2024
1 parent 158021a commit 567bccf
Showing 3 changed files with 13 additions and 4 deletions.
4 changes: 2 additions & 2 deletions ingest/build-configs/ci/config.yaml
@@ -1,5 +1,5 @@
-# TODO: If the ingest workflow ever runs too long, we should figure out a way
-# to subset the ingest data. Currently, the CI just runs the default ingest workflow.
+# Use cached ncbi datasets package to speed up tests and isolate from ncbi servers
+mock_fetch: true

# Snakemake requires at least one top level key in a config file, so including
# a bogus key here that should not be used anywhere in the Snakemake workflow
13 changes: 11 additions & 2 deletions ingest/rules/fetch_from_ncbi.smk
@@ -45,9 +45,18 @@ rule fetch_ncbi_dataset_package:
         """


+def get_ncbi_dataset_package_path():
+    """
+    Use cached data package in ci to isolate from ncbi server
+    """
+    if config.get("mock_fetch", False):
+        return "test_data/ncbi_dataset.zip"
+    return "data/ncbi_dataset.zip"
+
+
 rule extract_ncbi_dataset_sequences:
     input:
-        dataset_package="data/ncbi_dataset.zip",
+        dataset_package=get_ncbi_dataset_package_path(),
     output:
         ncbi_dataset_sequences=temp("data/ncbi_dataset_sequences.fasta"),
     benchmark:
@@ -61,7 +70,7 @@ rule extract_ncbi_dataset_sequences:

 rule format_ncbi_dataset_report:
     input:
-        dataset_package="data/ncbi_dataset.zip",
+        dataset_package=get_ncbi_dataset_package_path(),
     output:
         ncbi_dataset_tsv=temp("data/ncbi_dataset_report.tsv"),
     params:
Binary file added ingest/test_data/ncbi_dataset.zip
Binary file not shown.
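
The path selection introduced in this commit can be exercised outside Snakemake with a small standalone sketch. It assumes a plain dict stands in for Snakemake's global `config` object, and it passes that dict as an explicit argument, which the committed function does not do. The names `ci_config` and `default_config` are hypothetical.

```python
from typing import Any, Mapping


def get_ncbi_dataset_package_path(config: Mapping[str, Any]) -> str:
    """Pick the dataset package path based on the mock_fetch config key.

    Unlike the committed helper, this sketch takes the config as an argument
    (an assumption made so the example runs without Snakemake).
    """
    if config.get("mock_fetch", False):
        # Cached package checked into the repository for CI runs
        return "test_data/ncbi_dataset.zip"
    # Package downloaded by the fetch_ncbi_dataset_package rule
    return "data/ncbi_dataset.zip"


# Hypothetical configs: the CI config sets mock_fetch, the default omits it.
ci_config = {"mock_fetch": True}
default_config = {}

print(get_ncbi_dataset_package_path(ci_config))
print(get_ncbi_dataset_package_path(default_config))
```

Because the key defaults to `False` when absent, workflows that never set `mock_fetch` keep using the fetched package under `data/`.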