# pHDee: Pipeline for processing HiChIP data from end-to-end
pHDee is an easy-to-use pipeline for end-to-end pre-processing of HiChIP (or HiC) data using existing tools.
It processes the raw data in the background so that you can focus on the biological question at hand.
The pipeline uses
- HiC-Pro for pre-processing the raw data and generating raw and ICE-normalized contact matrices
- HiCPlotter for static visualization of contact matrices with/without additional annotations
- FitHiC for calling significant interactions
- Juicebox/Juicer for dynamic visualization of contact matrices with/without additional annotations
- HiCcompare for a differential comparison of two contact matrices
Output from the pipeline can be directly used for further downstream analyses.
## Requirements
1. HiC-Pro (v2.9.0; https://github.com/nservant/HiC-Pro)
2. Juicer (tested with v1.8.8 on Unix; https://github.com/theaidenlab/juicebox/wiki/Download)
3. Fit-HiC R package (tested with v1.2.0; higher versions should work; https://doi.org/doi:10.18129/B9.bioc.FitHiC)
4. HiCPlotter (v0.7.1; https://github.com/kcakdemir/HiCPlotter)
5. HiCcompare (v1.1.3; https://doi.org/doi:10.18129/B9.bioc.HiCcompare)
_Note: because HiCPlotter requires Python 2.7.\*, this pipeline currently works with Python 2.7.\*._
The generated `.hic` files can be viewed using Juicebox (desktop or the web version).
More specifically,
- To install HiC-Pro v2.9.0, follow the instructions at https://github.com/nservant/HiC-Pro. Then configure it as described there (e.g., set the paths to the required software in HiC-Pro's `config-hicpro.txt`), and copy the configured `config-hicpro.txt` into this folder.
- To install FitHiC (latest version) and HiCcompare from Bioconductor:
```R
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite(c("FitHiC", "HiCcompare"))
```
## Usage
```bash
$ python run_pipeline.py -h
usage: run_pipeline.py -c CONFIG_FILE -st STAGE [STAGE ...] [-ne] [-h]
Pipeline for processing HiChIP data from end-to-end.
Usage example: $ python run_pipeline.py -c umbrella-config-file.txt -st do_all
Required arguments:
-c CONFIG_FILE, --config_file CONFIG_FILE
Specify the umbrella config file.
-st STAGE [STAGE ...], --stage STAGE [STAGE ...]
Specify what to run. Possible values: [do_hicpro,
do_hicplotter, do_fithic, do_juicebox,
do_differential, suggest_resolution, do_all]. Non-
relevant details from the config file will be ignored.
When *do_all*, we perform sequentially do_hicpro,
do_fithic, and do_juicebox. For others, run the
pipeline individually.
Optional arguments:
-ne, --no_execute Specify this if you do not want to execute commands
right away; just print them in the commands logfile.
-h, --help Show this message and exit.
```
```bash
$ python run_pipeline.py -c umbrella-config-file.txt -st do_hicpro
```
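The help text above describes a small command-line interface: a required config file, one or more stages, and an optional dry-run flag. The following is a minimal sketch of how such an interface could be expressed with `argparse`; it is not taken from `run_pipeline.py`. The function names `build_parser` and `expand_stages` are hypothetical, and the sketch is written for Python 3, whereas the pipeline itself targets Python 2.7.

```python
import argparse

# Stage names copied from the help text above.
STAGES = ["do_hicpro", "do_hicplotter", "do_fithic", "do_juicebox",
          "do_differential", "suggest_resolution", "do_all"]

# Per the help text, do_all runs these three stages in this order.
DO_ALL_SEQUENCE = ["do_hicpro", "do_fithic", "do_juicebox"]


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        prog="run_pipeline.py",
        description="Pipeline for processing HiChIP data from end-to-end.")
    parser.add_argument("-c", "--config_file", required=True,
                        help="Specify the umbrella config file.")
    parser.add_argument("-st", "--stage", nargs="+", choices=STAGES,
                        required=True, help="Specify what to run.")
    parser.add_argument("-ne", "--no_execute", action="store_true",
                        help="Only log the commands; do not execute them.")
    return parser


def expand_stages(stages):
    """Replace do_all with the sequence it stands for; keep other stages as given."""
    expanded = []
    for stage in stages:
        expanded.extend(DO_ALL_SEQUENCE if stage == "do_all" else [stage])
    return expanded
```

With this sketch, `build_parser().parse_args(["-c", "umbrella-config-file.txt", "-st", "do_all"])` yields a namespace whose `stage` list can be passed to `expand_stages` to obtain the three stages that `do_all` covers.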
Further important points:
- The `umbrella-config-file.txt` file holds all parameters for the pipeline, including those required by the individual tools.
- Each pipeline run prints informational messages to the console and also logs them, together with all commands, to a file. The file name consists of the keyword 'pHDee' followed by a 4-digit number identifying the run. Consult this file for the exact commands that were run if you need to debug something.
- The `-ne` (`--no_execute`) option lets you perform a dry run of the pipeline. In this mode, the pipeline builds and logs all commands and messages for the given stage but does not execute them.
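
The logging and dry-run behaviour described in the points above can be illustrated with a short sketch. This is an assumption-laden illustration, not the pipeline's actual code: the helper names `log_name` and `run_commands` are hypothetical, the `.log` extension is assumed (the text only states the 'pHDee' keyword and the 4-digit number), and the sketch targets Python 3.

```python
import subprocess


def log_name(run_id):
    """Build a log-file name: the keyword 'pHDee' followed by a 4-digit run number."""
    return "pHDee{:04d}.log".format(run_id)  # assumption: the '.log' extension


def run_commands(commands, log_write, no_execute=False):
    """Log every command, and execute it only when no_execute is False.

    `log_write` is any callable that accepts one string, e.g. a file's
    write method or a list's append method.
    Returns the list of commands that were actually executed.
    """
    executed = []
    for cmd in commands:
        log_write(cmd)  # the command is always recorded, even in a dry run
        if no_execute:
            continue  # dry run: skip execution
        subprocess.check_call(cmd, shell=True)  # raises if the command fails
        executed.append(cmd)
    return executed
```

Passing `no_execute=True` records each command through `log_write` and returns an empty list, which mirrors what the `-ne` option is described as doing: the commands can be inspected in the log before anything is run.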
In case of any questions or feedback, please write to snikumbh@mpi-inf.mpg.de