update rocco.py docstring
nolan-h-hamilton committed Oct 19, 2024
1 parent a2b3d17 commit 9710031
Showing 3 changed files with 37 additions and 25 deletions.
27 changes: 15 additions & 12 deletions README.md
@@ -10,21 +10,24 @@ ROCCO is an efficient algorithm for detection of "consensus peaks" in large data

### Example Behavior

In the image below, ROCCO is run on a set of heterogeneous ATAC-seq samples (lymphoblast) from independent donors (ENCODE). The samples' read density tracks are colored gray.
#### Input

* ROCCO consensus peaks (default parameters) are shown in blue
* MACS2 (pooled, `q=.01`) consensus peaks are shown in red.
* ENCODE cCREs are included as a rough reference of potentially active regions, but note that these regions are not specific to the data samples used in this analysis, nor are they derived from the same cell type or assay.
* ENCODE lymphoblastoid data (BEST5, WORST5)
* 10 real ATAC-seq alignment tracks of varying quality (TSS enrichment)

* Synthetic noisy data (NOISY5)
* 5 random alignments

<p align="center">
<img width="600" height="450" alt="example" src="docs/example_behavior.png">
</p>
#### Output

#### Additional Examples
* ROCCO consensus peaks (blue)
* Effectively separates true signal from noise across multiple samples
* Robust to noisy samples (e.g., NOISY5)
* High precision separation of enriched regions

* ROCCO offers several alternative features for preprocessing, scoring, and optimization
* [A visual characterization of different settings and their effects](docs/rocco_options.png) is available
* See documentation at <https://nolan-h-hamilton.github.io/ROCCO/> for additional details
<p align="center">
<img width="700" height="450" alt="example" src="docs/example_behavior.png">
</p>

## How

@@ -35,7 +38,7 @@ ROCCO models consensus peak calling as a constrained optimization problem with a
ROCCO offers several attractive features:

1. **Consideration of enrichment and spatial characteristics** of open chromatin signals
2. **Scaling to large sample sizes** with an asymptotic time complexity independent of sample size
2. **Scaling to large sample sizes (100+)** with an asymptotic time complexity independent of sample size
3. **No required training data** or a heuristically determined set of initial candidate peak regions
4. **No rigid thresholds** on the minimum number/width of supporting samples/replicates
5. **Mathematically tractable model** permitting worst-case analysis of runtime and performance
Binary file modified docs/example_behavior.png
35 changes: 22 additions & 13 deletions rocco/rocco.py
@@ -15,17 +15,32 @@
ROCCO's repository is hosted on `GitHub <https://github.com/nolan-h-hamilton/ROCCO>`_
Example Behavior
~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^
**Input**
- ENCODE lymphoblastoid data (BEST5, WORST5)
- 10 real ATAC-seq alignment tracks of varying quality (TSS enrichment)
- Synthetic noisy data (NOISY5)
In the image below, ROCCO is run on a set of ten heterogeneous ATAC-seq samples (lymphoblast) from independent donors (ENCODE). The samples' tracks are colored gray.
- 5 random alignments
* ROCCO consensus peaks (default parameters) are shown in blue
* MACS2 (pooled, `q=.01`) consensus peaks are shown in red.
* ENCODE cCREs are included as a rough reference of potentially active regions, but note that these regions are not specific to the data samples used in this analysis, nor are they derived from the same cell type or assay.
**Output**
- ROCCO consensus peaks (blue)
- Effectively separates true signal from noise across multiple samples
- Robust to noisy samples (e.g., NOISY5)
- High resolution separation of enriched regions
.. image:: example_behavior.png
:width: 600px
:width: 700px
:height: 450px
:alt: example
:align: center
How
@@ -175,13 +190,7 @@
rocco -i sample1.bam sample2.bam [...] sampleM.bam -g hg38 --rescale_parsig
* Other relevant options `--transform_logpc`, `--scale_gamma`, etc.
See below for a visualization of the effects of several of ROCCO's fundamental options for preprocessing, scoring, optimization, etc.
.. image:: rocco_options.png
:width: 600px
:align: center
* See `here <https://github.com/nolan-h-hamilton/ROCCO/blob/main/docs/rocco_options.png>`_ for a visualization of the effects of several of ROCCO's fundamental options for preprocessing, scoring, optimization, etc.
"""
