I have been analysing a lot of
ChIP-seq data lately. One of the steps in the analysis process is identifying Super-Enhancers from
H3K27ac ChIP-seq data.
Super-enhancers are long genomic regions that contain clusters of enhancers. They are known to have much higher enhancer activity than the sum of their individual enhancers. This post is not about the definition or biological importance of super-enhancers; it is about showing the identified super-enhancers in an aesthetic, publication-quality plot.
Here are some references to understand super-enhancers in detail
To identify super-enhancers, most published articles use the
ROSE (Rank Ordering of Super-Enhancers) algorithm from the
Young lab. You can grab the code from here or here. Instructions on
how to run this tool are on their
bitbucket page (Instructions are also hardcoded within
The super-enhancer code provided above also generates a plot by default. However, it uses the same color for super-enhancers and typical enhancers, and it would be helpful to tell them apart with different colors. Distinct colors for super-enhancers also come in handy for separating different conditions (e.g. different sub-groups of breast cancer).
I wrote a
naive R function to generate a super-enhancer plot from the results of the
ROSE algorithm. It was so naive that I had to change the code whenever I wanted a different color on the plot.
A few days back, I came across an impressive thought from David Robinson:
When you’ve written the same code 3 times, write a function
When you’ve given the same in-person advice 3 times, write a blog post
David Robinson (@drob), November 9, 2017
That tweet convinced me to sit down for a while and write a generalized function that produces a super-enhancer plot, with the flexibility to choose colors, labels, etc., and returns a
ggplot2 object that can be customized further as needed. Here, I am sharing the function and instructions on how to use it to generate a super-enhancer plot.
The ROSE algorithm generates a couple of output files. The one with the
*_AllEnhancers.table.txt suffix is used for the plots.
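Before plotting, it helps to look at how such a file can be loaded in R. The snippet below is only a minimal sketch: it assumes the table is tab-delimited, that any description lines at the top start with `#`, and that the table carries `enhancerRank` and `isSuper` columns. The toy table, its values and its sample column names are invented for illustration, so check them against your own ROSE output.

```r
# Toy stand-in for a ROSE *_AllEnhancers.table.txt file (all values invented).
toy <- c(
  "# toy description line, assumed to mimic the '#' lines at the top of the file",
  "REGION_ID\tCHROM\tSTART\tSTOP\tNUM_LOCI\tCONSTITUENT_SIZE\tsample.bam\tcontrol.bam\tenhancerRank\tisSuper",
  "r1\tchr1\t1000\t30000\t5\t9000\t52000\t2000\t1\t1",
  "r2\tchr2\t5000\t9000\t2\t3000\t8000\t1500\t2\t0",
  "r3\tchr3\t7000\t8000\t1\t1000\t1200\t900\t3\t0"
)
se_path <- tempfile(fileext = "_AllEnhancers.table.txt")
writeLines(toy, se_path)

# comment.char = "#" skips the description lines; check.names = FALSE keeps
# the sample column names exactly as they are written in the file.
se <- read.delim(se_path, comment.char = "#", check.names = FALSE,
                 stringsAsFactors = FALSE)

# Count super-enhancers (isSuper == 1) and typical enhancers (isSuper == 0).
n_super <- sum(se$isSuper == 1)
n_typical <- sum(se$isSuper == 0)
```

On a real file, replace `se_path` with the path to your own `*_AllEnhancers.table.txt`.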
For convenience, I pushed the code to
GitHub, so one can directly
source it from
Arguments and Example Usage
seFile - path to ROSE result (*_AllEnhancers.table.txt)
seCol - color for super-enhancers (Default - red)
teCol - color for typical-enhancers (Default - black)
bg - (TRUE|FALSE) - whether background is used during ROSE run (Default - TRUE)
mark - Which ChIP-seq data is used to generate super-enhancer data (Default - H3K27ac)
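To make the arguments concrete, here is a rough sketch of how a function with this interface could be put together. It is not the function from the GitHub repository: the name `plot_se_sketch`, the assumed column positions (column 7 for the ChIP signal, column 8 for the control signal) and the plot layout are my own assumptions, and the toy input is invented.

```r
library(ggplot2)

# Hypothetical sketch, NOT the function from the post. Assumptions: column 7
# holds the ChIP signal, column 8 the control signal, and isSuper marks
# super-enhancers with 1.
plot_se_sketch <- function(seFile, seCol = "red", teCol = "black",
                           bg = TRUE, mark = "H3K27ac") {
  se <- read.delim(seFile, comment.char = "#", check.names = FALSE,
                   stringsAsFactors = FALSE)
  signal <- se[[7]]
  if (bg) signal <- signal - se[[8]]  # subtract control only if one was used
  signal <- pmax(signal, 0)           # clamp negative values to zero

  df <- data.frame(
    signal = signal,
    x = rank(signal, ties.method = "first"),  # weakest = 1, strongest = n
    type = ifelse(se$isSuper == 1, "Super-enhancer", "Typical enhancer")
  )

  ggplot(df, aes(x = x, y = signal, colour = type)) +
    geom_point() +
    scale_colour_manual(values = c("Super-enhancer" = seCol,
                                   "Typical enhancer" = teCol),
                        name = NULL) +
    labs(x = "Enhancers ranked by signal", y = paste(mark, "signal")) +
    theme_classic()
}

# Toy input (invented values) so the sketch can be run end to end.
toy <- c(
  "# toy description line",
  "REGION_ID\tCHROM\tSTART\tSTOP\tNUM_LOCI\tCONSTITUENT_SIZE\tsample.bam\tcontrol.bam\tenhancerRank\tisSuper",
  "r1\tchr1\t1000\t30000\t5\t9000\t52000\t2000\t1\t1",
  "r2\tchr2\t5000\t9000\t2\t3000\t8000\t1500\t2\t0",
  "r3\tchr3\t7000\t8000\t1\t1000\t1200\t900\t3\t0"
)
se_path <- tempfile(fileext = "_AllEnhancers.table.txt")
writeLines(toy, se_path)

# Override the default colors, then customize the returned ggplot2 object.
p <- plot_se_sketch(se_path, seCol = "firebrick", teCol = "grey40")
p_titled <- p + ggtitle("Toy super-enhancer plot")
```

Because the function returns a ggplot2 object rather than drawing directly, titles, themes and extra layers can be added afterwards with `+`, as in the last line.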