[FEATURE REQUEST] Environment variables for KEGG, Pfam, and other data directories so sys admins can set a global --xxxx-data-dir value #2302

Open
ivagljiva opened this issue Jul 9, 2024 · 0 comments

Comments

@ivagljiva
Contributor

A small project to improve anvi'o, based upon feedback/ideas @FlorianTrigodet and I heard from our colleagues at the QIB in Norwich.

The need

If you install function databases in a non-default location, as is often done on HPCs, it could be nice to set an environment variable for your entire working group so that everyone can use these function databases without having to always specify their location on the command line with parameters like --kegg-data-dir. For example, we already have one of these for --cog-data-dir, which is accessed via the environment variable $ANVIO_COG_DATA_DIR.

Here is how we use that variable in cogs.py:

elif 'ANVIO_COG_DATA_DIR' in os.environ:
    self.COG_base_dir = os.environ['ANVIO_COG_DATA_DIR']

The existence of this variable is not documented on the anvi-setup-ncbi-cogs help page (or elsewhere on COGs-related help pages), so one issue is that very few people know about it.

But adding these sorts of environment variables for the other programs that use data directories (i.e., --kegg-data-dir, --pfam-data-dir, --scgs-taxonomy-data-dir, etc.) could be valuable :) As long as they are properly documented, of course.

The solution

Add documentation for $ANVIO_COG_DATA_DIR on the related help pages. Create new variables to be accessed via os.environ for the other data directories (and document them).
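
One way this could look is a single shared lookup helper, roughly sketched below. Everything here except $ANVIO_COG_DATA_DIR is an assumption made for illustration: the function name resolve_data_dir, the ENV_VARS mapping, and the variable names ANVIO_KEGG_DATA_DIR and ANVIO_PFAM_DATA_DIR are hypothetical and do not exist in anvi'o. The precedence (command-line value first, then the environment variable, then the default location) is also assumed rather than taken from the current code.

```python
import os

# Only ANVIO_COG_DATA_DIR exists today; the other names are hypothetical
# placeholders for the variables proposed in this issue.
ENV_VARS = {
    "cog": "ANVIO_COG_DATA_DIR",
    "kegg": "ANVIO_KEGG_DATA_DIR",  # hypothetical
    "pfam": "ANVIO_PFAM_DATA_DIR",  # hypothetical
}


def resolve_data_dir(kind, cli_value=None, default=None, environ=None):
    """Pick a data directory for `kind` (hypothetical helper).

    Assumed precedence: explicit command-line value, then the
    environment variable, then the default location.
    """
    # Allow a plain dict to be passed in so the lookup is easy to test.
    environ = os.environ if environ is None else environ

    if cli_value:
        return cli_value

    env_value = environ.get(ENV_VARS[kind])
    if env_value:
        return env_value

    return default
```

With something like this, a sys admin would export the variable once for the whole group, and an explicit --kegg-data-dir on the command line would still take priority for anyone who needs a different location.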

Beneficiaries

Anyone who has to install databases that anvi'o depends on in non-default locations (especially HPC users).
