Merge branch 'automation' of https://github.com/alliance-genome/agr_blastdb_manager into automation

* 'automation' of https://github.com/alliance-genome/agr_blastdb_manager:
  README added
  changes to automation
  Add GitHub Actions workflow for BLAST DB update
nuin committed Aug 27, 2024
2 parents 3fbe514 + 8ed6af2 commit e59d54a
Showing 2 changed files with 167 additions and 0 deletions.
106 changes: 106 additions & 0 deletions .github/workflows/README.md
@@ -0,0 +1,106 @@
# BLAST DB Manager

## Overview
BLAST DB Manager automates the creation and updating of BLAST (Basic Local Alignment Search Tool) databases. It uses GitHub Actions for workflow automation, with jobs executed by a self-hosted runner on an EC2 instance for greater control over the environment.

## Features
- Automated BLAST database creation and updates
- Integration with GitHub Actions for CI/CD
- Custom runner setup on EC2 for specialized environment control
- Slack notifications for job status updates
- S3 synchronization for database storage and distribution

## Prerequisites
- An AWS account with EC2 and S3 access
- A GitHub repository
- Python 3.x installed on the EC2 instance
- Poetry for Python dependency management
- NCBI BLAST+ toolkit

## Setup

### 1. EC2 Instance Setup
1. Launch an EC2 instance with Amazon Linux 2023.
2. Install required software:
```bash
sudo yum update -y
sudo yum install -y python3 python3-pip ncbi-blast+
pip3 install poetry
```
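
Once the packages are installed, it can help to confirm that the BLAST+ executables are actually on `PATH` before the runner picks up a job. The sketch below is one way to do that from Python. The helper name `missing_tools` and the tool list are assumptions made for illustration (`blastn` is the binary the workflow checks for, and `makeblastdb` is assumed to be the one used to build databases); they are not part of this repository.

```python
import shutil

# Executables assumed to be required; adjust to whatever your script calls.
REQUIRED_TOOLS = ("blastn", "makeblastdb")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]
```

Calling `missing_tools()` on the EC2 instance returns an empty list when every tool is found, and the names of the missing ones otherwise.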

### 2. GitHub Actions Runner Setup
1. On your GitHub repository, go to Settings > Actions > Runners.
2. Click "New self-hosted runner" and follow the installation instructions for Linux.
3. Start the runner on your EC2 instance:
```bash
cd actions-runner
./run.sh
```

### 3. Repository Setup
1. Clone your repository on the EC2 instance.
2. Create a `.github/workflows` directory in your repository.
3. Add the `update_blast_db.yml` workflow file to this directory.

### 4. Configuration
1. Create a `config/blast_config.yaml` file in your repository with the necessary configuration for your BLAST DB creation.
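2. Set up the following secrets in your GitHub repository (GitHub does not accept secret names that begin with `GITHUB_`, so the first one may need to be stored under a different name and mapped to `GITHUB_WEBHOOK_SECRET` in the workflow's `env` block):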
- `GITHUB_WEBHOOK_SECRET`
- `SLACK_TOKEN`
- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
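
The workflow exposes these secrets to the update script as environment variables. A secret that has not been set typically arrives as an empty string, so a quick check at startup can fail fast with a clear message. This is a minimal sketch, assuming a hypothetical helper `missing_secrets` that is not part of the repository; only the variable names come from the list above.

```python
import os

# Environment variable names taken from the secrets list above.
REQUIRED_SECRETS = (
    "GITHUB_WEBHOOK_SECRET",
    "SLACK_TOKEN",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_secrets(env=None, names=REQUIRED_SECRETS):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

A script could call `missing_secrets()` first and exit with the returned names if the list is not empty.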

## Usage

### Running the Workflow
The BLAST DB update workflow can be triggered in two ways:
1. Automatically on push to the `main` branch.
2. Manually from the Actions tab in your GitHub repository.
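
The trigger also determines the target environment. The workflow passes `--environment prod` when the Git ref is `refs/heads/main` and `--environment dev` otherwise, which in practice means a manual run started from another branch. The function below mirrors that expression in Python for illustration; the name `environment_for_ref` is an assumption and does not exist in the repository.

```python
def environment_for_ref(ref: str) -> str:
    """Mirror the workflow expression: 'prod' for the main branch, 'dev' otherwise."""
    return "prod" if ref == "refs/heads/main" else "dev"
```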

### Workflow Steps
1. Check out the repository
2. Print the working directory and its contents
3. Upgrade pip and install Poetry
4. Install dependencies with Poetry
5. Ensure BLAST is installed
6. Run the BLAST DB update script
7. Upload the log file as an artifact

## Project Structure
```
your-repo/
├── .github/
│   └── workflows/
│       └── update_blast_db.yml
├── src/
│   └── create_blast_db.py
├── config/
│   └── blast_config.yaml
├── README.md
└── pyproject.toml
```

## Customization
- Modify `src/create_blast_db.py` to adjust the BLAST DB creation process.
- Update `config/blast_config.yaml` to change BLAST DB configurations.
- Edit `.github/workflows/update_blast_db.yml` to alter the CI/CD workflow.
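
When changing `src/create_blast_db.py`, keep its command-line interface in step with the workflow, which invokes it with `--config_yaml`, `--environment`, `--update-slack`, and `--sync-s3`. The sketch below shows a parser that accepts those flags. It assumes `argparse` and restricts `--environment` to `dev` and `prod` because those are the only values the workflow passes; the real script may use a different library, defaults, or additional options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags the workflow passes (illustrative only)."""
    parser = argparse.ArgumentParser(description="Create or update BLAST databases.")
    parser.add_argument("--config_yaml", required=True,
                        help="Path to the YAML configuration file.")
    # Assumption: only the two values used by the workflow are allowed.
    parser.add_argument("--environment", choices=("dev", "prod"), default="dev",
                        help="Target environment.")
    parser.add_argument("--update-slack", action="store_true",
                        help="Send status updates to Slack.")
    parser.add_argument("--sync-s3", action="store_true",
                        help="Synchronize the resulting databases to S3.")
    return parser
```

`argparse` converts the dashes in `--update-slack` and `--sync-s3` to underscores, so the parsed values are read as `args.update_slack` and `args.sync_s3`.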

## Troubleshooting
- Check the GitHub Actions logs for detailed error messages.
- Ensure the EC2 instance has the necessary permissions to access required AWS services.
- Verify that all required secrets are correctly set in the GitHub repository settings.

## Contributing
Contributions to improve BLAST DB Manager are welcome. Please follow these steps:
1. Fork the repository
2. Create a new branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## License
[Specify your license here]

## Contact
[Your Name or Organization] - [Your Email]

Project Link: [https://github.com/your_username/repo_name](https://github.com/your_username/repo_name)
61 changes: 61 additions & 0 deletions .github/workflows/update_blast_db.yml
@@ -0,0 +1,61 @@
name: Update BLAST DB on EC2

on:
  push:
    branches: [ main ]
  workflow_dispatch:

jobs:
  update-blast-db:
    runs-on: [self-hosted, Linux, X64, blast]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Check working directory
        run: |
          echo "Current working directory: $(pwd)"
          ls -la
      - name: Set up Python
        run: |
          python3 -m pip install --upgrade pip
          pip3 install poetry
      - name: Install Poetry dependencies
        run: |
          poetry config virtualenvs.create false
          poetry install --no-interaction --no-root
      - name: Ensure BLAST is installed
        run: |
          if ! command -v blastn &> /dev/null
          then
            echo "BLAST is not installed. Installing..."
            sudo yum update -y
            sudo yum install -y ncbi-blast+
          else
            echo "BLAST is already installed"
          fi
      - name: Run BLAST DB update script
        env:
          GITHUB_WEBHOOK_SECRET: ${{ secrets.GITHUB_WEBHOOK_SECRET }}
          SLACK_TOKEN: ${{ secrets.SLACK_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          python3 src/create_blast_db.py \
            --config_yaml config/blast_config.yaml \
            --environment ${{ github.ref == 'refs/heads/main' && 'prod' || 'dev' }} \
            --update-slack \
            --sync-s3
      - name: Upload logs
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: blast-db-logs
          path: logs/blast_db_creation.log
