CBW 2024 Advanced Module 1: Introduction to metagenomics and read‐based profiling
benfish404 edited this page Apr 25, 2024 · 33 revisions
This page will contain a tutorial
Bioinformatic Tool Citations
- FastQC
- Kneaddata
- Bowtie2
- Kraken2
- Bracken
- Kraken-biom
- MetaPhlAn 3.1
First, make your desired output directory (if it doesn't already exist). Then, run FastQC as follows:
fastqc -t 4 raw_data/*fastq.gz -o fastqc_out
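The directory step mentioned above can be sketched as follows. This is a minimal sketch that assumes the `fastqc_out` name used in the command; the guard around `fastqc` is an illustrative addition so the snippet does nothing harmful where FastQC or the input files are absent.

```shell
# Create the output directory; -p makes this a no-op if it already exists.
mkdir -p fastqc_out

# Run FastQC only if it is on PATH and at least one input file matches.
if command -v fastqc >/dev/null 2>&1 && ls raw_data/*fastq.gz >/dev/null 2>&1; then
    fastqc -t 4 raw_data/*fastq.gz -o fastqc_out
else
    echo "fastqc or raw_data/*fastq.gz not found; skipping" >&2
fi
```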
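Run KneadData to screen the reads against the human (GRCh38) and PhiX reference database. The --bypass-trim option skips the trimming step.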
parallel -j 1 --eta --link 'kneaddata -i1 {1} -i2 {2} -o kneaddata_out --db cbwdata/CourseData/MIC_data/tools/bowtie2_db/GRCh38_PhiX --bypass-trim' ::: raw_data/*R1_subsampled.fastq.gz ::: raw_data/*R2_subsampled.fastq.gz
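With `--link`, GNU parallel pairs the first R1 file with the first R2 file, the second with the second, and so on, in glob order, so a missing mate could misalign later pairs. A quick pre-check can confirm that each R1 file has its R2 counterpart. The helper `check_pairs` below is hypothetical (it is not part of KneadData or GNU parallel) and assumes the `R1_subsampled.fastq.gz` / `R2_subsampled.fastq.gz` suffixes used in the command above.

```shell
# Report, for every R1 file in a directory, whether its R2 mate exists.
check_pairs() {
    dir=$1
    for r1 in "$dir"/*R1_subsampled.fastq.gz; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$r1" ] || continue
        # Derive the expected R2 name by swapping the suffix.
        r2=$(printf '%s\n' "$r1" | sed 's/R1_subsampled\.fastq\.gz$/R2_subsampled.fastq.gz/')
        if [ -e "$r2" ]; then
            echo "paired: $r1 $r2"
        else
            echo "MISSING mate for $r1"
        fi
    done
}

check_pairs raw_data
```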
Concatenate the forward and reverse reads of each sample into a single file.
perl ../tools/concat_paired_end.pl -p 4 --no_R_match -o cat_reads kneaddata_out/*_paired_contam*.fastq
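One way to sanity-check the concatenation is to count the reads in each output file. The sketch below assumes standard four-line FASTQ records (no wrapped sequence lines); `count_reads` is a hypothetical helper, not part of the course tools.

```shell
# Count reads in an uncompressed FASTQ file: one record is four lines.
count_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Print a file name and read count for every concatenated file.
for f in cat_reads/*.fastq; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$f" ] || continue
    printf '%s\t%s\n' "$f" "$(count_reads "$f")"
done
```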
If the above does not work, you may need to install Perl:
conda install conda-forge::perl
If it still does not work, or you already have Perl installed, you may get an error saying that Parallel::ForkManager is required. Fix this by running the following inside your conda environment:
cpan Parallel::ForkManager
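To check whether the module is already visible to Perl before (or after) installing it, a small test such as the following may help. `have_perl_module` is a hypothetical helper name; it relies only on Perl's standard `-M` switch, which loads a module at startup and fails if the module cannot be found.

```shell
# Succeed only if Perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_perl_module Parallel::ForkManager; then
    echo "Parallel::ForkManager is available"
else
    echo "Parallel::ForkManager is missing; install it as described above"
fi
```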
Run Kraken2 to classify the concatenated reads.
parallel -j 2 --eta 'kraken2 --db cbwdata/CourseData/MIC_data/tools/kraken2_standard_08gb --output kraken2_outraw/{/.}.kraken --report kraken2_kreport/{/.}.kreport' {} ::: cat_reads/*.fastq
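The `{/.}` in the command above is GNU parallel's replacement string for the input's basename with its extension removed, which is how each sample gets its own `.kraken` and `.kreport` file. The sketch below mimics that expansion in plain shell with a hypothetical helper, `strip_dir_ext`, and also creates the two output directories up front, on the assumption that Kraken2 will not create them itself.

```shell
# Create both output directories before running Kraken2 (assumed necessary).
mkdir -p kraken2_outraw kraken2_kreport

# Mimic parallel's {/.}: drop the directory part, then the last extension.
strip_dir_ext() {
    name=${1##*/}
    printf '%s\n' "${name%.*}"
}

# sampleA.fastq is an illustrative file name, not one from the course data.
strip_dir_ext cat_reads/sampleA.fastq   # prints sampleA
```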
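Run Bracken to re-estimate species-level abundances (-l S) from the Kraken2 reports, using a read length of 100 (-r 100).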
parallel -j 2 --eta 'bracken -d cbwdata/CourseData/MIC_data/tools/kraken2_standard_08gb -i {} -o bracken_out/{/.}.species.bracken -r 100 -l S -t 1' ::: kraken2_kreport/*.kreport
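Bracken's `-r 100` refers to a read length, and the database needs a k-mer distribution file built for that length. Bracken databases typically name this file `database100mers.kmer_distrib` for a read length of 100; treat that naming, and the hypothetical helper `check_bracken_db`, as assumptions to verify against your own installation.

```shell
# Check that a database directory holds the distribution file for a read length.
# The file-name pattern is an assumption about how Bracken databases are laid out.
check_bracken_db() {
    db=$1
    read_len=$2
    f="$db/database${read_len}mers.kmer_distrib"
    if [ -f "$f" ]; then
        echo "found $f"
    else
        echo "missing $f"
        return 1
    fi
}

# "|| true" keeps the snippet from aborting a script when the file is absent.
check_bracken_db cbwdata/CourseData/MIC_data/tools/kraken2_standard_08gb 100 || true
```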
Run kraken-biom:
python ../tools/kraken-biom.py --fmt json -o mgs.biom -m mgs_metadata.tsv $(ls kraken2_kreport/*bracken*.kreport)
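If kraken-biom reports that it has no input files, it can help to confirm where the Bracken-adjusted reports actually landed. The sketch below assumes those reports have `bracken` in their file name and end in `.kreport` (the pattern the command above relies on); `list_bracken_reports` is a hypothetical helper.

```shell
# List Bracken-adjusted report files under a directory (default: current one).
list_bracken_reports() {
    find "${1:-.}" -type f -name '*bracken*.kreport' | sort
}

list_bracken_reports .
```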
- Please feel free to post a question on the Microbiome Helper Google group if you have any issues.
- General comments or inquiries about Microbiome Helper can be sent to [email protected].