From a1778b5a93d0724cf0a65ca2aea5d77ada7735d0 Mon Sep 17 00:00:00 2001
From: Peter Ebert
Date: Mon, 11 Sep 2017 19:29:49 +0200
Subject: [PATCH] ENH: update to CHPv5, still not finalized

---
 docs/quantification/chip-seq/CHPv5.xml | 225 ++++++++++++++++---------
 1 file changed, 150 insertions(+), 75 deletions(-)

diff --git a/docs/quantification/chip-seq/CHPv5.xml b/docs/quantification/chip-seq/CHPv5.xml
index 9a828a6..3b97b0e 100644
--- a/docs/quantification/chip-seq/CHPv5.xml
+++ b/docs/quantification/chip-seq/CHPv5.xml
@@ -68,25 +68,82 @@
- DEEPID.PROC.DATE.ext
- EXT
+ DEEPID.PROC.DATE.raw.bamcov
+ bigwig
 collection
- Process output not yet defined
+ Signal coverage track generated from raw BAM files
+
+ DEEPID.PROC.DATE.filt.bamcov
+ bigwig
+ collection
+
+ Signal coverage track generated from filtered BAM files. -F 3844 / q >= 5 / blacklist removed
+
+
+
+ DEEPID.PROC.DATE.ses-fc
+ bigwig
+ collection
+ SES normalized fold-change signal
+
+
+ DEEPID.PROC.DATE.cnt-fc
+ bigwig
+ collection
+ Read-count normalized fold-change signal
+
+
+ DEEPID.PROC.DATE.gcfreq
+ svg
+ collection
+ GC bias plot based on raw BAM files
+
+
+ DEEPID.PROC.DATE.gcfreq
+ txt
+ collection
+ Obs./exp. GC read frequencies based on raw BAM files
+
+
+ DEEPID.PROC.DATE.hhmm.emfit
+ PDF
+ collection
+ histoneHMM output visualizing the EM fit.
Check this before using the histoneHMM output
+
+
+ DEEPID.PROC.DATE.hhmm.out
+ zip
+ collection
+ Zip archive containing other histoneHMM output files (raw data files not needed by most users)
+
+
+ DEEPID.PROC.DATE.fgpr
+ SVG
+ single
+ Fingerprint plots based on raw BAM files
+
+
+ DEEPID.PROC.DATE.qm-fgpr
+ txt
+ single
+ Fingerprint quality metrics based on raw BAM files
+
+
+
- plotFingerprint
+ bamCoverage
 2.5.3
- no looping
- Compute fingerprint on raw BAM files
+ GALvX_Histone, GALvX_Input
+ Generate read coverage signal normalized to 1x depth for raw BAM files
 computeGCBias
@@ -94,7 +151,7 @@
@@ -118,16 +175,17 @@
- bamCoverage
+ plotFingerprint
 2.5.3
- GALvX_Histone, GALvX_Input
- Generate read coverage signal normalized to 1x depth for raw BAM files
+ no looping
+ Compute fingerprint on raw BAM files
@@ -135,7 +193,7 @@
 0.6.6 = 5" {GALvX_*} ]]>
@@ -145,31 +203,21 @@
 Apply IHEC ChIP QC standard filtering to all BAM files (equivalent to bitflag 3844). The resulting BAM files are temporary and discarded after the analysis.
-
-
- MACS2
- 2.1.1.20160309
-
-
-
- GALvX_Histone
- MACS2 peak calling on filtered BAM files. Parameter "--broad" for libraries H3K4me1/H3K27me3/H3K36me/H3K9me3
-
- histoneHMM
- 1.7
-
+ sambamba
+ 0.6.6
+ DEEPID.mapped.readcount ]]>
- GALvX_Histone
- HistoneHMM peak calling on filtered BAM files for broad marks: H3K4me1/H3K27me3/H3K9me3/H3K36me3
+ DEEPID.tmp.filt.bam
+
+ Due to the previous filtering step, counting simply all reads in the filtered BAM
+ file is equivalent to counting only mapped reads. The number of mapped reads is needed
+ to compute the FRiP score in a later stage.
+
 bamCoverage
@@ -177,66 +225,66 @@
 DEEPID.tmp.filt.bam
- Generate read coverage signal normalized to 1x depth for filtered BAM files
+
+ Generate read coverage signal normalized to 1x depth for filtered BAM files.
+ Remove blacklist regions on-the-fly and consider only autosomes for normalization step.
+
-
- multiBamSummary
+ plotFingerprint
 2.5.3
 no looping
- Create data matrix for correlation plot on filtered BAM files; remove blacklist regions on the fly
+ Compute fingerprint on filtered BAM files to compute IHEC QC measures
- plotCorrelation
- 2.5.3
+ MACS2
+ 2.1.1.20160309
- no looping
- Create heatmap correlation plot using Spearman and Pearson correlation
+ GALvX_Histone
+ MACS2 peak calling on filtered BAM files. Parameter "--broad" for libraries H3K27me3/H3K36me/H3K9me3
-
- plotFingerprint
- 2.5.3
+
+
+ histoneHMM
+ 1.7
- no looping
- Compute fingerprint on filtered BAM files; remove blacklist regions on the fly
+ GALvX_Histone
+ HistoneHMM peak calling on filtered BAM files for broad marks: H3K4me1/H3K27me3/H3K9me3/H3K36me3
- sambamba
- 0.6.6
+ cut, sort, mv
+ 8.13 DEEPID.hmm.bed &&
+ mv DEEPID-zinba-emfit.pdf DEEPID.PROC.DATE.hhmm.emfit.pdf ]]>
- DEEPID.tmp.filt.bam
-
- Get flagstat output for filtered BAM files, specifically number of mapped reads in these files.
- This is done to compute the IHEC QC metrics as part of this process.
-
+ DEEPID-regions.gff
+ Make histoneHMM output BED-like for blacklist intersection and standardize name of EM fit PDF.
+
 sambamba
 0.6.6
@@ -263,19 +311,46 @@
 Values input from the two previous steps.
+
- bedtools
- 2.26.0
+ sambamba
+ 0.6.6 {DEEPID.PROC.DATE.peaks}
- ]]>
-
- All peak files
+ sambamba view --format=bam --nthreads={sambamba_parallel} --output-filename DEEPID.tmp.auto.bam
+ --regions={autosome_regions} DEEPID.tmp.filt.bam
+ ]]>
+
+ DEEPID.tmp.filt.bam
- Discard peaks overlapping with a known blacklist region after computing FRiP score.
- There does not seem to be an IHEC standard concerning this, correct?
+ Restrict filtered BAM files to autosomal regions. These BAM files will be used to plot the correlation heatmaps.
+
+ multiBamSummary
+ 2.5.3
+
+
+
+ no looping
+ Create data matrix for correlation plot on filtered BAM files; remove blacklist regions on the fly
+
+
+ plotCorrelation
+ 2.5.3
+
+
+
+ no looping
+ Create heatmap correlation plot using Spearman and Pearson correlation
+
+