# pHDee: Pipeline for processing HiChIP data from end-to-end

This is an easy-to-use pipeline for end-to-end pre-processing of HiChIP (or HiC) data using various existing tools.
The pipeline processes the raw data seamlessly in the background so that you can focus on the biological question to be answered.

The pipeline uses

- HiC-Pro for pre-processing the raw data and generating raw and ICE-normalized contact matrices
- HiCPlotter for static visualization of contact matrices with/without additional annotations
- FitHiC for calling significant interactions
- Juicebox/Juicer for dynamic visualization of contact matrices with/without additional annotations
- HiCcompare for differential comparison of two contact matrices

Output from the pipeline can be used directly for further downstream analyses.

## Requirements

1. HiC-Pro (v2.9.0; https://github.com/nservant/HiC-Pro)
2. Juicer (tested with v1.8.8 on Unix; https://github.com/theaidenlab/juicebox/wiki/Download)
3. FitHiC R package (tested with v1.2.0; higher versions should work; https://doi.org/doi:10.18129/B9.bioc.FitHiC)
4. HiCPlotter (v0.7.1; https://github.com/kcakdemir/HiCPlotter)
5. HiCcompare (v1.1.3; https://doi.org/doi:10.18129/B9.bioc.HiCcompare)

_Note: because HiCPlotter requires Python 2.7.\*, this pipeline currently works with Python 2.7.\*._

The generated `.hic` files can be viewed using Juicebox (desktop or web version).

More specifically,

- To install HiC-Pro v2.9.0, follow the instructions at https://github.com/nservant/HiC-Pro. Once done, configure it as described there (e.g., set the paths to the required software in HiC-Pro's `config-hicpro.txt`), and copy the configured `config-hicpro.txt` into this folder.
- To install FitHiC (latest version) and HiCcompare, run the following in R:

```R
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite(c("FitHiC", "HiCcompare"))
## On R >= 3.5 (Bioconductor >= 3.8), biocLite has been replaced by BiocManager:
# install.packages("BiocManager")
# BiocManager::install(c("FitHiC", "HiCcompare"))
```

## Usage

```bash
$ python run_pipeline.py -h
usage: run_pipeline.py -c CONFIG_FILE -st STAGE [STAGE ...] [-ne] [-h]

Pipeline for processing HiChIP data from end-to-end.
Usage example: $python run_pipeline.py -c umbrella-config-file.txt -s do_all

Required arguments:
  -c CONFIG_FILE, --config_file CONFIG_FILE
                        Specify the umbrella config file.
  -st STAGE [STAGE ...], --stage STAGE [STAGE ...]
                        Specify what to run. Possible values: [do_hicpro,
                        do_hicplotter, do_fithic, do_juicebox,
                        do_differential, suggest_resolution, do_all]. Non-
                        relevant details from the config file will be ignored.
                        When *do_all*, we perform sequentially do_hicpro,
                        do_fithic, and do_juicebox. For others, run the
                        pipeline individually.

Optional arguments:
  -ne, --no_execute     Specify this if you do not want to execute commands
                        right away; just print them in the commands logfile.
  -h, --help            Show this message and exit.
```

```bash
$ python run_pipeline.py -c umbrella-config-file.txt -st do_hicpro
```
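
Since `do_all` covers only `do_hicpro`, `do_fithic`, and `do_juicebox`, the remaining stages have to be started separately. The following is a minimal sketch of how such calls could be assembled from a small wrapper script. The helper `build_command` is hypothetical and not part of pHDee; it only uses the `-c`, `-st`, and `-ne` flags listed in the help text, and it assumes that `python` resolves to a Python 2.7 interpreter.

```python
import subprocess

# Stage names as listed in the help text of run_pipeline.py.
KNOWN_STAGES = {
    "do_hicpro", "do_hicplotter", "do_fithic", "do_juicebox",
    "do_differential", "suggest_resolution", "do_all",
}


def build_command(config_file, stages, no_execute=False, script="run_pipeline.py"):
    """Return the argv list for one run_pipeline.py call (hypothetical helper)."""
    unknown = [s for s in stages if s not in KNOWN_STAGES]
    if not stages or unknown:
        raise ValueError("invalid stage(s): %r" % (unknown or stages,))
    cmd = ["python", script, "-c", config_file, "-st"] + list(stages)
    if no_execute:
        cmd.append("-ne")  # dry run: commands are logged, not executed
    return cmd


if __name__ == "__main__":
    # do_all runs do_hicpro, do_fithic and do_juicebox; the other stages
    # are started individually afterwards.
    for stages in (["do_all"], ["do_hicplotter"], ["do_differential"]):
        cmd = build_command("umbrella-config-file.txt", stages, no_execute=True)
        print(" ".join(cmd))
        # subprocess.check_call(cmd)  # uncomment to actually launch the pipeline
```

As written, the script only prints the three command lines. Each one carries `-ne`, so even with the `subprocess` call enabled the pipeline would log its commands without executing them.
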

Further important points:

- The `umbrella-config-file.txt` file sets all necessary parameters for the pipeline, including those required by the individual tools.
- Each pipeline run prints informational messages to the console and also logs them, together with all commands, to a file. This file is named with the keyword 'pHDee' suffixed with a 4-digit number identifying the particular run. Consult this file for the exact commands that were run if you need to debug something.
- The `-ne` option (`--no_execute`) lets you perform a dry run of the pipeline. In this mode, the pipeline forms and logs all commands and messages for a particular stage but does not execute them.

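
When debugging, it can help to locate the log of the most recent run programmatically. The sketch below is an illustration only: it assumes that log file names start with `pHDee` and contain a 4-digit run number, while the separator and file extension are guesses (hence the permissive pattern). The helper `find_run_logs` is hypothetical and not part of pHDee.

```python
import os
import re

# Assumption: log names start with "pHDee" and carry a 4-digit run number;
# the separator and extension are unknown, so the pattern is permissive.
LOG_PATTERN = re.compile(r"^pHDee\D*(\d{4})(?!\d)")


def find_run_logs(directory="."):
    """Return (run_id, path) pairs for pHDee log files, newest first."""
    hits = []
    for name in os.listdir(directory):
        match = LOG_PATTERN.match(name)
        path = os.path.join(directory, name)
        if match and os.path.isfile(path):
            hits.append((os.path.getmtime(path), match.group(1), path))
    hits.sort(reverse=True)  # newest modification time first
    return [(run_id, path) for _, run_id, path in hits]


if __name__ == "__main__":
    logs = find_run_logs(".")
    if logs:
        run_id, path = logs[0]
        print("Most recent run %s: %s" % (run_id, path))
        with open(path) as handle:
            for line in handle:
                print(line.rstrip())
    else:
        print("No pHDee log file found in the current directory.")
```

Run from the directory in which the pipeline was started, this prints the log of the latest run, which after a `-ne` dry run contains the commands that would have been executed.
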
In case of any questions or feedback, please write to snikumbh@mpi-inf.mpg.de |