pHDee: Pipeline for processing HiChIP data from end-to-end

This is an easy-to-use pipeline for end-to-end pre-processing of HiChIP (or HiC) data using various existing tools. By processing the raw data seamlessly in the background, the pipeline lets you focus on the biological question to be answered.

The pipeline uses

  • HiC-Pro for pre-processing the raw data and generating raw and ICE-normalized contact matrices
  • HiCPlotter for static visualization of contact matrices with/without additional annotations
  • FitHiC for calling significant interactions
  • Juicebox/Juicer for dynamic visualization of contact matrices with/without additional annotations
  • HiCcompare for a differential comparison of two contact matrices

Output from the pipeline can be directly used for further downstream analyses.

Requirements

  1. HiC-Pro (v2.9.0; https://github.com/nservant/HiC-Pro)
  2. Juicer (tested with v1.8.8 on Unix; https://github.com/theaidenlab/juicebox/wiki/Download)
  3. FitHiC R package (tested with v1.2.0; higher versions should work; https://doi.org/doi:10.18129/B9.bioc.FitHiC)
  4. HiCPlotter (v0.7.1; https://github.com/kcakdemir/HiCPlotter)
  5. HiCcompare (v1.1.3; https://doi.org/doi:10.18129/B9.bioc.HiCcompare)

Note that, because HiCPlotter requires Python 2.7.*, this pipeline currently runs under Python 2.7.*.

The generated .hic files can be viewed using Juicebox (desktop or the web version).

More specifically,

  • To install HiC-Pro v2.9.0, follow the instructions at [https://github.com/nservant/HiC-Pro]. Once done, configure it as per the corresponding instructions (e.g., set the paths to the required software in HiC-Pro's config-hicpro.txt), and copy the configured config-hicpro.txt into this folder.
  • To install FitHiC (latest version) and HiCcompare, run the following in R:
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite(c("FitHiC", "HiCcompare"))
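
Before a first run, it can help to confirm that the external tools are reachable. The following is a minimal sketch of such a check and is not part of pHDee itself. The executable names in ASSUMED_TOOLS are assumptions about how the tools are typically installed; adjust them to your setup.

```python
import os


def find_on_path(name, search_path=None):
    """Return the full path of executable `name`, or None if it is not found."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, search_path=None):
    """Return the subset of `names` that cannot be found on the search path."""
    return [name for name in names if find_on_path(name, search_path) is None]


# Assumption: typical executable names; your installation may differ.
ASSUMED_TOOLS = ["HiC-Pro", "HiCPlotter.py", "Rscript", "java"]

if __name__ == "__main__":
    missing = missing_tools(ASSUMED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All assumed tools were found on PATH.")
```

The check only looks for executables; it does not verify tool versions or that the FitHiC and HiCcompare R packages are installed.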

Usage

$ python run_pipeline.py -h
usage: run_pipeline.py -c CONFIG_FILE -st STAGE [STAGE ...] [-ne] [-h]

Pipeline for processing HiChIP data from end-to-end.
Usage example: $ python run_pipeline.py -c umbrella-config-file.txt -st do_all
				

Required arguments:
  -c CONFIG_FILE, --config_file CONFIG_FILE
                        Specify the umbrella config file.
  -st STAGE [STAGE ...], --stage STAGE [STAGE ...]
                        Specify what to run. Possible values: [do_hicpro,
                        do_hicplotter, do_fithic, do_juicebox,
                        do_differential, suggest_resolution, do_all]. Non-
                        relevant details from the config file will be ignored.
                        When *do_all*, we perform sequentially do_hicpro,
                        do_fithic, and do_juicebox. For others, run the
                        pipeline individually.

Optional arguments:
  -ne, --no_execute     Specify this if you do not want to execute commands
                        right away; just print them in the commands logfile.
  -h, --help            Show this message and exit.
$ python run_pipeline.py -c umbrella-config-file.txt -st do_hicpro
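
For readers who want to see how such an interface fits together, here is a sketch of an argparse parser that mirrors the options listed in the help text above. It is reconstructed from that help text only and is not the actual code in run_pipeline.py. The helper expand_stages and the constant names are hypothetical; the do_all sequence follows the help text.

```python
import argparse

# Stage names as listed in the help text.
STAGES = ["do_hicpro", "do_hicplotter", "do_fithic", "do_juicebox",
          "do_differential", "suggest_resolution", "do_all"]

# Per the help text, do_all runs these three stages sequentially.
DO_ALL_SEQUENCE = ["do_hicpro", "do_fithic", "do_juicebox"]


def build_parser():
    """Build a parser with the documented options (sketch, not the original)."""
    parser = argparse.ArgumentParser(
        prog="run_pipeline.py",
        description="Pipeline for processing HiChIP data from end-to-end.",
        add_help=False)
    required = parser.add_argument_group("Required arguments")
    required.add_argument("-c", "--config_file", required=True,
                          help="Specify the umbrella config file.")
    required.add_argument("-st", "--stage", nargs="+", choices=STAGES,
                          required=True, help="Specify what to run.")
    optional = parser.add_argument_group("Optional arguments")
    optional.add_argument("-ne", "--no_execute", action="store_true",
                          help="Only log the commands; do not execute them.")
    optional.add_argument("-h", "--help", action="help",
                          help="Show this message and exit.")
    return parser


def expand_stages(stages):
    """Replace do_all with the sequence of stages it stands for."""
    expanded = []
    for stage in stages:
        expanded.extend(DO_ALL_SEQUENCE if stage == "do_all" else [stage])
    return expanded


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-c", "umbrella-config-file.txt", "-st", "do_all", "-ne"])
    print(expand_stages(args.stage))
```

Because -st accepts one or more values, several stages can be given in a single call, for example -st do_hicpro do_fithic.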

Further important points:

  • The umbrella-config-file.txt is used to set all the necessary parameters for the pipeline, including those required by the individual tools.
  • Each pipeline run prints information messages on the console and also logs them, together with all the commands, to a file. This file is named with the keyword 'pHDee' suffixed with a 4-digit number identifying the particular run. See this file for the precise commands that were run in case you have to debug something.
  • The -ne option (ne:no_execute) lets you perform a dry run of the pipeline. In this mode, the pipeline forms and logs all commands and messages for a particular stage but doesn't execute them.
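
The log-then-execute behaviour described in the points above can be sketched as follows. This illustrates the general dry-run pattern and is not pHDee's implementation. The function names are hypothetical, and the way the 4-digit run number is chosen (at random here) and the exact log file name format are assumptions.

```python
import random
import subprocess


def make_logfile_name(prefix="pHDee"):
    """Return a log file name: the prefix followed by a 4-digit number.

    Assumption: the run number is drawn at random; the real pipeline may
    choose it differently and may add a file extension.
    """
    return "{}{:04d}".format(prefix, random.randint(0, 9999))


def run_or_log(command, logfile, no_execute=False):
    """Append `command` to `logfile`, then run it unless `no_execute` is set.

    Returns the command's exit code, or None for a dry run.
    """
    with open(logfile, "a") as handle:
        handle.write(command + "\n")
    if no_execute:
        return None
    return subprocess.call(command, shell=True)


if __name__ == "__main__":
    logfile = make_logfile_name()
    # Dry run: the command is written to the log but never executed.
    run_or_log("echo processing sample", logfile, no_execute=True)
    print("Commands logged to " + logfile)
```

Writing the command to the log before executing it means the log still shows the last command attempted if a tool crashes midway.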

In case of any questions or feedback, please write to snikumbh@mpi-inf.mpg.de
