README.Rmd

[![Travis-CI Build Status](https://travis-ci.org/evocellnet/ksea.svg?branch=master)](https://travis-ci.org/evocellnet/ksea)

### Introduction

This package contains an implementation of the Kinase Set Enrichment Analysis (KSEA) used to predict kinase activities based on quantitative phosphoproteomic studies. Using a quantitative phosphoproteomic profile and a list of known targets, the algorithms predicts the activity of the based on the enrichment on top regulated sites within the known targets of the kinase.

![kinase](./kinase_GSEA.png)

###Installation

The `ksea` package can be installed directly from github (if public) or locally using the devtools package.

```{r install_devtools, eval=FALSE}
install.packages('devtools')
```

#####Github installation

```{r install_ksea_github, eval=FALSE}
require(devtools)
install_github("evocellnet/ksea")

```

#####Local installation

You can clone this project and install it locally in your computer.

```{r install_ksea_locally, eval=FALSE}
require(devtools)
install("./ksea")

```

###Usage

First load the `ksea` package.

```{r load_ksea}
library("ksea")
```

Next create some fake data. In one hand, we create a list named `regulons` containing a vector per kinase with the names of the known substrates. Secondly, we create a vector sites with quantifications for the sites going from A to Z. Finally, we sort the quantifications.

```{r data}
regulons <- list(kinaseA=sample(LETTERS, 5))
regulons

sites <- rnorm(length(LETTERS))
names(sites) <- LETTERS
sites <- sites[order(sites, decreasing=TRUE)]
sites

```

The function `ksea` will run the enrichment analysis for the provided quantifications and known kinase targets.

```{r ksea, dpi=100}
ksea_result <- ksea(names(sites), sites, regulons[["kinaseA"]], trial=1000, significance = TRUE)
ksea_result

```

The function `ksea_batchKinases` calculates the KSEA p-value for a list of kinases. To improve the performance of the function, it uses as many cores as possible using the `parallell` package.

```{r ksea_batchKinases}

regulons[["kinaseB"]] <- sample(LETTERS, 3)
regulons[["kinaseC"]] <- sample(LETTERS, 7)
regulons

kinases_ksea <- ksea_batchKinases(names(sites), sites, regulons, trial=1000)
kinases_ksea

```

##### KSEA in parallel #####

Some functions such as `ksea_batchkinases` are optimized to run in parallel in multicore processors. By default they run in 2 cores (if available) but this number can be increased by changing the "mc.cores" variable.

```{r ksea_mccores, eval=FALSE}
options("mc.cores" = 4)

```

If you are working with an LSF cluster, you can also take the number of cores allocated in the `bsub` command from the `LSB_MCPU_HOSTS` environment variable.

```{r ksea_lsf, eval=FALSE}
## Setup multicore forking for mclapply based in bsub available cores
hosts <- Sys.getenv("LSB_MCPU_HOSTS")
if(length(hosts) >0){
ncores <- unlist(strsplit(hosts," "))[2]
}else{
ncores <- 1
}
options("mc.cores" = ncores)

```


### Developers

The package is documented using [roxygen2](http://cran.r-project.org/web/packages/roxygen2/index.html). After changing the code located in the `R/` folder remember to run 'make' on the main directory to create the documentation following the roxygen2 rules.

To work on the code you will need the `devtools` package installed (installation guide above).