Step 2: Prepare files and run PRS‐CS

Toni Boltz edited this page Oct 24, 2024 · 1 revision

Before running PRS-CS, prepare the plink files with phenotype and sex information, starting from the files created in the QC notebook. The fmt_files.sh script provides plink2 commands to do this and to remove any samples or SNPs with >5% missingness (the plink commands for these filters are much faster than Hail). The plink files are then split into a separate fileset per chromosome.
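As a rough illustration of the steps above (not the contents of fmt_files.sh itself), the sketch below prints the kind of plink2 commands involved instead of running them. The file names `qc_cohort`, `pheno.txt`, `sex.txt`, and `cohort_fmt` are hypothetical placeholders, and the restriction to chromosomes 1-22 is an assumption.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints plink2 commands rather than executing them.
# All file names passed in are hypothetical placeholders.

print_plink_prep_cmds() {
  local in_prefix="$1" pheno="$2" sex="$3" out_prefix="$4"

  # Attach phenotype and sex, and drop samples (--mind) and SNPs (--geno)
  # with more than 5% missingness.
  echo "plink2 --bfile ${in_prefix} --pheno ${pheno} --update-sex ${sex}" \
       "--mind 0.05 --geno 0.05 --make-bed --out ${out_prefix}"

  # Write one fileset per chromosome (assumed here: autosomes 1-22).
  local chr
  for chr in $(seq 1 22); do
    echo "plink2 --bfile ${out_prefix} --chr ${chr} --make-bed" \
         "--out ${out_prefix}_chr${chr}"
  done
}

print_plink_prep_cmds qc_cohort pheno.txt sex.txt cohort_fmt
```

Printing the commands first makes it easy to check paths and thresholds before anything is written to disk; piping the output to `bash` would then run them.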

The same script (fmt_files.sh) also explains where to download the GWAS summary statistics, in this case the PGC3 Schizophrenia GWAS, and how to format them per the PRS-CS guide.
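For orientation, PRS-CS accepts summary statistics as a tab-delimited file with the header `SNP A1 A2 BETA SE`. The sketch below shows one way such a file could be produced with awk. It assumes the downloaded file is tab-delimited, may start with `##` comment lines, and has columns named `ID`, `A1`, `A2`, `BETA`, and `SE`, with `A1` as the effect allele; these names are assumptions to check against the actual download, and fmt_files.sh remains the reference.

```shell
#!/usr/bin/env bash
# Sketch: reduce a tab-delimited summary statistics table (stdin) to the
# SNP/A1/A2/BETA/SE layout that PRS-CS reads (stdout).
# Input column names (ID, A1, A2, BETA, SE) are assumptions.

format_sumstats() {
  awk -F'\t' -v OFS='\t' '
    /^##/ { next }                      # skip leading comment lines
    !header_seen {
      # Map each column name to its position.
      for (i = 1; i <= NF; i++) col[$i] = i
      n = split("ID A1 A2 BETA SE", need, " ")
      for (j = 1; j <= n; j++) {
        if (!(need[j] in col)) {
          print "missing column: " need[j] > "/dev/stderr"
          exit 1
        }
      }
      header_seen = 1
      print "SNP", "A1", "A2", "BETA", "SE"
      next
    }
    { print $(col["ID"]), $(col["A1"]), $(col["A2"]), $(col["BETA"]), $(col["SE"]) }
  '
}

# Example use (file names are hypothetical):
#   format_sumstats < downloaded_sumstats.tsv > scz_prscs_format.txt
```

Selecting columns by header name rather than by position keeps the sketch from silently picking the wrong field if the column order differs from what is assumed.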

Next, we move on to the run_prscs_forWiki.sh script, which runs the PRS-CS Python code. It can be run on any cluster, and running the chromosomes in parallel rather than sequentially is recommended for a faster runtime. The total runtime for this example, run in parallel, was 1 hr on a machine with up to 6GB of memory and 6GB of storage. Note that for bigger sample sizes (i.e., more than one cohort), the time, storage, and memory requirements will be higher.
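A minimal sketch of the per-chromosome invocation is shown below; it prints one `PRScs.py` command per chromosome and is not a copy of run_prscs_forWiki.sh. The paths in `REF_DIR`, `BIM_PREFIX`, `SST_FILE`, and `OUT_DIR` are hypothetical placeholders, the per-chromosome `_chrN` suffix is an assumed naming scheme, and the GWAS sample size must be supplied by the user.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one PRS-CS command per chromosome.
# All paths below are hypothetical placeholders.
REF_DIR="${REF_DIR:-path/to/ld_reference}"
BIM_PREFIX="${BIM_PREFIX:-cohort_fmt}"
SST_FILE="${SST_FILE:-scz_prscs_format.txt}"
OUT_DIR="${OUT_DIR:-prscs_out/scz}"

# Print the PRS-CS command for one chromosome.
#   $1 = chromosome number, $2 = GWAS sample size (user-supplied)
print_prscs_cmd() {
  local chr="$1" n_gwas="$2"
  echo "python PRScs.py --ref_dir=${REF_DIR}" \
       "--bim_prefix=${BIM_PREFIX}_chr${chr}" \
       "--sst_file=${SST_FILE} --n_gwas=${n_gwas}" \
       "--chrom=${chr} --out_dir=${OUT_DIR}"
}

# Print commands for chromosomes 1-22; each line is an independent job.
print_all_prscs_cmds() {
  local n_gwas="$1" chr
  for chr in $(seq 1 22); do
    print_prscs_cmd "${chr}" "${n_gwas}"
  done
}

# Example use: write the 22 commands to a file, then submit each line as
# a separate cluster job (or array task) so chromosomes run in parallel.
#   print_all_prscs_cmds "<GWAS sample size>" > prscs_jobs.txt
```

Because each chromosome is handled by its own command, the lines can be dispatched independently by whatever scheduler the cluster uses.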