BAFExtract generates B-allele frequency (BAF) shifts from mapped reads alone, without the need for a heterozygous variant call set. BAFExtract is part of CaSpER (https://github.com/akdess/CaSpER).
There are two steps:
- Generation of the pileup.
- Generation of the B-allele frequency profile.
For questions, please open a new issue on GitHub or contact the authors ([email protected] or [email protected]).
Type make to build BAFExtract. The executable is built under the bin directory. The code has been tested on various Unix-based systems.
Extract BAF values from RNA-Seq bam files
samtools view <bam_file> | ./BAFExtract -generate_compressed_pileup_per_SAM stdin <genome_list> <sample_dir> 50 0; ./BAFExtract -get_SNVs_per_pileup <genome_list> <sample_dir> <genome_fasta_pileup_dir> 20 4 0.1 <output_baf_file>
<sample_dir>: the name of the sample directory
<output_baf_file>: the final output file
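As a convenience, the two commands above can be wrapped in a small shell function that checks its inputs before running. This is only a sketch: the function name run_bafextract and its argument order are hypothetical, the BAFExtract executable is assumed to be in the current directory, and the numeric parameters (50 0 and 20 4 0.1) are copied unchanged from the commands above.

```shell
# Hypothetical wrapper (not part of BAFExtract) around the two steps above.
# Arguments: <bam_file> <genome_list> <sample_dir> <genome_fasta_pileup_dir> <output_baf_file>
run_bafextract() {
    bam=$1; genome_list=$2; sample_dir=$3; pileup_dir=$4; out=$5

    # Fail early with a clear message if an input is missing.
    [ -f "$bam" ] || { echo "missing BAM file: $bam" >&2; return 1; }
    [ -f "$genome_list" ] || { echo "missing genome list: $genome_list" >&2; return 1; }
    [ -d "$pileup_dir" ] || { echo "missing pileup directory: $pileup_dir" >&2; return 1; }

    # The sample directory is created if needed before the pileup is written.
    mkdir -p "$sample_dir" || return 1

    # Step 1: generate the pileup (parameters as given in the README).
    samtools view "$bam" \
        | ./BAFExtract -generate_compressed_pileup_per_SAM stdin "$genome_list" "$sample_dir" 50 0 \
        || return 1

    # Step 2: generate the B-allele frequency profile.
    ./BAFExtract -get_SNVs_per_pileup "$genome_list" "$sample_dir" "$pileup_dir" 20 4 0.1 "$out"
}

# Usage (hypothetical file names):
#   run_bafextract sample.bam hg38.list sample_dir ./hg38/ sample.snp
```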
You can download and unzip genome_fasta_pileup_dir files from :
Or you can create genome_fasta_pileup_dir files for other genomes using the following commands:
BAFExtract -preprocess_FASTA [FASTA file path] [Output directory]
for example:
wget -c http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz
tar -xvzf chromFa.tar.gz
mkdir ../mm10
# Preprocess each extracted chromosome FASTA file into ../mm10
for f in ./*.fa
do
    echo "Processing $f file..."
    BAFExtract -preprocess_FASTA "$f" ../mm10
done
You can download genome_list files from :
hg38.list, generated using the following command:
fetchChromSizes hg38 > hg38.list
hg19.list, generated using the following command:
fetchChromSizes hg19 > hg19.list
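For genomes that fetchChromSizes does not cover, a list in the same two-column layout (sequence name, then length, separated by a tab) can be derived from a FASTA index. The sketch below assumes that BAFExtract accepts any list in that layout and that the sequence names match those in the BAM file and in the preprocessed FASTA directory. The function name fai_to_genome_list is hypothetical.

```shell
# Print "<sequence name><TAB><length>" for every sequence in a FASTA index.
# Columns 1 and 2 of a .fai file hold the sequence name and its length.
fai_to_genome_list() {
    cut -f1,2 "$1"
}

# Usage (hypothetical file names):
#   samtools faidx genome.fa                        # writes genome.fa.fai
#   fai_to_genome_list genome.fa.fai > genome.list
```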
mkdir test; samtools view SRR1295366.sorted.bam | ./bin/BAFExtract -generate_compressed_pileup_per_SAM stdin hg38.list test 50 0; ./bin/BAFExtract -get_SNVs_per_pileup hg38.list test ./hg38/ 20 4 0.1 test.snp
Note: In the example above, ./BAFExtract -generate_compressed_pileup_per_SAM uses a minimum mapping quality threshold of 50. MAPQ values differ considerably between aligners, so depending on the aligner you used, setting [Minimum mapping quality] to 50 may mean that no reads survive the filtering, purely because of how that aligner assigns MAPQ. (Acknowledgements to Tobias Tekath.)
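One way to pick a suitable threshold is to look at the MAPQ values actually present in the BAM file before running BAFExtract. The sketch below tabulates column 5 of the SAM records (the MAPQ field) read from standard input. The function name mapq_histogram is hypothetical and not part of BAFExtract.

```shell
# Count reads per MAPQ value from SAM records on stdin.
# MAPQ is column 5 of a SAM record; header lines start with '@' and are skipped.
# Output: "<MAPQ><TAB><read count>", sorted by MAPQ in ascending order.
mapq_histogram() {
    awk -F'\t' '!/^@/ { count[$5]++ }
        END { for (q in count) print q "\t" count[q] }' | sort -n
}

# Usage (hypothetical file name):
#   samtools view sample.bam | mapq_histogram
```

If the highest MAPQ value reported is below 50, the pileup step as written above would discard every read, and a lower threshold is needed.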