<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://deep.mpi-inf.mpg.de/DAC/files/style/deep_process_style.css"?>
<process>
<name>EXP</name>
<version>1</version>
<author>
<name>Matthias Barann</name>
<email>m.barann@ikmb.uni-kiel.de</email>
</author>
<description>
* bam2wig.py: Conversion of BAM file to BigWig coverage tracks. One track per strand will be generated.
* htseq-count: Generates read counts on the gene level.
* cufflinks: Generates FPKM values for genes and transcript isoforms.
</description>
<inputs>
<filetype>
<identifier>.bam</identifier>
<format></format>
<quantity>single</quantity>
<comment>Unfiltered aligned reads</comment>
</filetype>
<filetype>
<identifier>.bai</identifier>
<format></format>
<quantity>single</quantity>
<comment>Index file for the BAM file</comment>
</filetype>
</inputs>
<references>
<filetype>
<identifier>chromInfo.txt</identifier>
<format>text file</format>
<quantity>single</quantity>
<comment>Tab delimited file containing the name and length of the reference sequences: [name][tab][length].</comment>
</filetype>
<filetype>
<identifier>gencode.v19.annotation.gtf</identifier>
<format>GTF</format>
<quantity>single</quantity>
<comment>Gencode gene annotation file in gene transfer format.</comment>
</filetype>
<filetype>
<identifier>reference.fa</identifier>
<format>multi fasta</format>
<quantity>single</quantity>
<comment>The reference genome file; see aspera.dkfz.de > download > results > references > genomes > human > WholeGenome</comment>
</filetype>
</references>
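<!--
The [name][tab][length] layout of chromInfo.txt can be illustrated with a small sketch. It assumes the chromosome sizes are derived from a FASTA index (.fai) of reference.fa, whose first two tab-separated columns are the sequence name and length; this process description does not state how chromInfo.txt was actually produced, and the function name fai_to_chrom_info is hypothetical.

```python
def fai_to_chrom_info(fai_text):
    """Convert the text of a FASTA index (.fai) into chromInfo lines.

    Assumption: the input follows the .fai layout, where column 1 is the
    sequence name and column 2 is its length. Remaining columns are ignored.
    """
    lines = []
    for row in fai_text.splitlines():
        if not row.strip():
            continue  # skip blank lines
        fields = row.split("\t")
        name, length = fields[0], int(fields[1])
        lines.append("%s\t%d" % (name, length))
    return "\n".join(lines) + "\n"
```

Writing the returned string to chromInfo.txt would give one [name][tab][length] line per reference sequence.
-->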
<outputs>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].bamcov.Forward.wig</identifier>
<format>wiggle</format>
<quantity>single</quantity>
<comment>Forward strand wiggle file. Usually it is not necessary to keep this file.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].bamcov.Reverse.wig</identifier>
<format>wiggle</format>
<quantity>single</quantity>
<comment>Reverse strand wiggle file. Usually it is not necessary to keep this file.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].bamcov.Forward.bw</identifier>
<format>BigWig</format>
<quantity>single</quantity>
<comment>Forward strand BigWig file. This file will only be generated if the UCSC program wigToBigWig can be found in $PATH.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].bamcov.Reverse.bw</identifier>
<format>BigWig</format>
<quantity>single</quantity>
<comment>Reverse strand BigWig file. This file will only be generated if the UCSC program wigToBigWig can be found in $PATH.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].readcounts.txt</identifier>
<format>text file</format>
<quantity>single</quantity>
<comment>This file contains the read counts on the gene level.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].genes.fpkm.tracking</identifier>
<format>text file</format>
<quantity>single</quantity>
<comment>Output file containing the FPKM values on the gene level.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].isoforms.fpkm.tracking</identifier>
<format>text file</format>
<quantity>single</quantity>
<comment>Output file containing the FPKM values on the isoform level.</comment>
</filetype>
<filetype>
<identifier>[sampleID].EXPv1.[DATE].transcripts.gtf</identifier>
<format>GTF</format>
<quantity>single</quantity>
<comment>This file contains assembled transcripts.</comment>
</filetype>
</outputs>
<software>
<tool>
<name>Python</name>
<version>2.7</version>
<command_line><![CDATA[ CMDLINE ]]></command_line>
<loop>no looping</loop>
<comment></comment>
</tool>
<tool>
<name>Samtools</name>
<version>0.1.19-44428cd</version>
<command_line><![CDATA[ CMDLINE ]]></command_line>
<loop>no looping</loop>
<comment></comment>
</tool>
<tool>
<name>bam2wig.py</name>
<version>2.3.9</version>
<command_line><![CDATA[ python bam2wig.py -i ${_sample}.bam -s chromInfo.txt -o ${_sample} -d "1+-,1-+,2++,2--" ]]></command_line>
<loop>no looping</loop>
<comment>The Python script is part of the RSeQC software. It converts a BAM file into two wig files (one for each strand). \
If the Python script can locate the UCSC program wigToBigWig, the generated wig files are automatically converted to BigWig. \
Please note that for some samples the wigToBigWig command might exit with errors. In this case, manually invoking the wigToBigWig \
command on each generated wig file can solve the problem: \
wigToBigWig ${_sample}.Forward.wig chromInfo.txt ${_sample}.Forward.bw</comment>
</tool>
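<!--
The manual wigToBigWig fallback described for bam2wig.py could be scripted roughly as follows. This is a minimal sketch, not part of the recorded process: the function names are hypothetical, and it assumes the wig files are named [prefix].Forward.wig and [prefix].Reverse.wig and that wigToBigWig takes its arguments positionally (input wig, chromosome sizes, output BigWig).

```python
import shutil
import subprocess


def wig_to_bigwig_commands(prefix, chrom_sizes="chromInfo.txt"):
    """Build one wigToBigWig command per strand for the wig files of a sample."""
    commands = []
    for strand in ("Forward", "Reverse"):
        wig = "%s.%s.wig" % (prefix, strand)      # assumed wig file name
        bigwig = "%s.%s.bw" % (prefix, strand)    # BigWig file to create
        # Positional arguments: input wig, chromosome sizes, output BigWig.
        commands.append(["wigToBigWig", wig, chrom_sizes, bigwig])
    return commands


def convert(prefix, chrom_sizes="chromInfo.txt"):
    """Run both conversions; fail early if wigToBigWig is not in PATH."""
    if shutil.which("wigToBigWig") is None:
        raise RuntimeError("wigToBigWig not found in PATH")
    for command in wig_to_bigwig_commands(prefix, chrom_sizes):
        subprocess.check_call(command)
```

Calling convert with the sample prefix would run one conversion per strand and raise an error as soon as either conversion fails.
-->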
<tool>
<name>htseq-count</name>
<version>0.5.4p3</version>
<command_line><![CDATA[ samtools sort -n -@ 8 -m 4G ${_sample}.bam ${_sample}_sorted
samtools view -F 256 ${_sample}_sorted.bam > ${_sample}.sam
htseq-count -s reverse -m intersection-strict -a 20 ${_sample}.sam gencode.v19.annotation.gtf > ${_sample}_htseq.txt ]]>
</command_line>
<loop>no looping</loop>
<comment>htseq-count requires BAM files sorted by read name (step 1). After sorting, all non-primary alignments (SAM flag 256) are removed during the BAM to SAM conversion (step 2). \
Invoking htseq-count (step 3) counts the number of reads per gene. The mode 'intersection-strict' results in a rather conservative read count. \
Please see http://www-huber.embl.de/users/anders/HTSeq/doc/count.html#count for further information.</comment>
</tool>
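<!--
The effect of the -F 256 filter in the samtools view step can be illustrated in a few lines. This is a sketch for explanation only: the process itself uses samtools, and the function name drop_secondary is hypothetical. Bit 256 (0x100) of the SAM FLAG field marks an alignment as not primary, so a record is kept only when that bit is unset.

```python
SECONDARY = 0x100  # SAM FLAG bit 256: "not primary alignment"


def drop_secondary(sam_lines):
    """Yield header lines and alignment records whose FLAG lacks bit 256."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines carry no FLAG field and pass through
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & SECONDARY:
            yield line
```

A record with FLAG 355 (256 + 99) would be dropped, while records with FLAG 99 or 16 would be kept.
-->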
<tool>
<name>cufflinks</name>
<version>v2.0.2</version>
<command_line><![CDATA[ cufflinks -p 8 --frag-bias-correct reference.fa --multi-read-correct --library-type fr-firststrand --compatible-hits-norm -G gencode.v19.annotation_transcripts_only.gtf ${_sample}.bam ]]>
</command_line>
<loop>no looping</loop>
<comment>Please see http://cufflinks.cbcb.umd.edu/manual.html for further information.</comment>
</tool>
</software>
</process>