-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
executable file
·106 lines (66 loc) · 3.26 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
[![Travis-CI Build Status](https://travis-ci.org/evocellnet/ksea.svg?branch=master)](https://travis-ci.org/evocellnet/ksea)
### Introduction
This package contains an implementation of the Kinase Set Enrichment Analysis (KSEA) used to predict kinase activities based on quantitative phosphoproteomic studies. Using a quantitative phosphoproteomic profile and a list of known targets, the algorithms predicts the activity of the based on the enrichment on top regulated sites within the known targets of the kinase.
![kinase](./kinase_GSEA.png)
###Installation
The `ksea` package can be installed directly from github (if public) or locally using the devtools package.
```{r install_devtools, eval=FALSE}
install.packages('devtools')
```
#####Github installation
```{r install_ksea_github, eval=FALSE}
require(devtools)
install_github("evocellnet/ksea")
```
#####Local installation
You can clone this project and install it locally in your computer.
```{r install_ksea_locally, eval=FALSE}
require(devtools)
install("./ksea")
```
###Usage
First load the `ksea` package.
```{r load_ksea}
library("ksea")
```
Next create some fake data. In one hand, we create a list named `regulons` containing a vector per kinase with the names of the known substrates. Secondly, we create a vector sites with quantifications for the sites going from A to Z. Finally, we sort the quantifications.
```{r data}
regulons <- list(kinaseA=sample(LETTERS, 5))
regulons
sites <- rnorm(length(LETTERS))
names(sites) <- LETTERS
sites <- sites[order(sites, decreasing=TRUE)]
sites
```
The function `ksea` will run the enrichment analysis for the provided quantifications and known kinase targets.
```{r ksea, dpi=100}
ksea_result <- ksea(names(sites), sites, regulons[["kinaseA"]], trial=1000, significance = TRUE)
ksea_result
```
The function `ksea_batchKinases` calculates the KSEA p-value for a list of kinases. To improve the performance of the function, it uses as many cores as possible using the `parallell` package.
```{r ksea_batchKinases}
regulons[["kinaseB"]] <- sample(LETTERS, 3)
regulons[["kinaseC"]] <- sample(LETTERS, 7)
regulons
kinases_ksea <- ksea_batchKinases(names(sites), sites, regulons, trial=1000)
kinases_ksea
```
##### KSEA in parallel #####
Some functions such as `ksea_batchkinases` are optimized to run in parallel in multicore processors. By default they run in 2 cores (if available) but this number can be increased by changing the "mc.cores" variable.
```{r ksea_mccores, eval=FALSE}
options("mc.cores" = 4)
```
If you are working with an LSF cluster, you can also take the number of cores allocated in the `bsub` command from the `LSB_MCPU_HOSTS` environment variable.
```{r ksea_lsf, eval=FALSE}
## Setup multicore forking for mclapply based in bsub available cores
hosts <- Sys.getenv("LSB_MCPU_HOSTS")
if(length(hosts) >0){
ncores <- unlist(strsplit(hosts," "))[2]
}else{
ncores <- 1
}
options("mc.cores" = ncores)
```
### Developers
The package is documented using [roxygen2](http://cran.r-project.org/web/packages/roxygen2/index.html). After changing the code located in the `R/` folder remember to run 'make' on the main directory to create the documentation following the roxygen2 rules.
To work on the code you will need the `devtools` package installed (installation guide above).