How to set up the TOuCAN configuration file

renewiegandt edited this page Jul 25, 2018 · 23 revisions

Table of Contents

Environment variables
Commandline Parameter
Full Example

General information about Nextflow configuration files

Before you start writing the TOuCAN configuration file, read the Nextflow documentation on configuration files.

Environment variables

  • path_bowtie2

    • Path to the directory where bowtie2 is installed. For example: /mnt/software/x86_64/packages/bowtie2/2.3.3.1/. Do not add the executable to the path.
  • path_python

    • Path to the directory where python is installed. For example: /mnt/software/x86_64/packages/python/2.7.8/bin.
      Do not add the executable to the path.
  • path_bwa

    • Path to the directory where bwa is installed. For example: /mnt/software/x86_64/packages/bwa/0.7.12/bin.
      Do not add the executable to the path.
  • path_samtools

    • Path to the directory where SAMtools is installed. For example: /mnt/software/x86_64/packages/samtools/1.3.1/bin.
      Do not add the executable to the path.
  • path_bedtools

    • Path to the directory where BEDtools is installed. For example: /mnt/software/x86_64/packages/bedtools/2.27.1/bin.
      Do not add the executable to the path.
  • path_R

    • Path to the directory where R is installed. For example: /usr/R/bin.
      Do not add the executable to the path.
  • path_bin

    • Path to the bin directory of TOuCAN. For example: ./TOuCAN/bin.
  • path_genome

    • Path to the full genome in FASTA format together with its bwa index. The name of the FASTA file has to be the basename of the bwa index files. For example:
      ./index_bwa/
          ./GRCm38.p5.genome_whitelist.fa
          ./GRCm38.p5.genome_whitelist.fa.amb
          ./GRCm38.p5.genome_whitelist.fa.ann
          ./GRCm38.p5.genome_whitelist.fa.bwt
          ./GRCm38.p5.genome_whitelist.fa.pac
          ./GRCm38.p5.genome_whitelist.fa.rbwt
          ./GRCm38.p5.genome_whitelist.fa.rpac
          ./GRCm38.p5.genome_whitelist.fa.rsa
          ./GRCm38.p5.genome_whitelist.fa.sa
  • path_gtf

    • Path to the GENCODE GTF file.
  • path_T2C_restriction_maps

    • Path to the restriction maps. If the string is empty, TOuCAN will generate the restriction maps.
  • uropa_feature, uropa_anchor, uropa_dist_1, uropa_dist_2, uropa_strand, uropa_direction, uropa_filter_attr, uropa_attr_value, uropa_show_attr

    • Parameters for the uropa configuration file. It is recommended to read the uropa documentation, especially the part about the uropa configuration file. If a uropa parameter contains a list, separate the entries with commas.
      For example: uropa_show_attr = "gene_id,gene_type,gene_name"
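
The variables above could be collected in one block of the configuration file, as in the following minimal sketch. Placing them in the Nextflow env scope is an assumption based on the section title. The tool paths and the uropa_show_attr value are the examples given above; the GTF path is a hypothetical placeholder, and path_genome is assumed to point at the FASTA file itself rather than at its directory.

```groovy
// Sketch only: the env scope and some values are assumptions, see the text above.
env {
    // Directories that contain the executables (never the executable itself)
    path_bowtie2  = '/mnt/software/x86_64/packages/bowtie2/2.3.3.1/'
    path_python   = '/mnt/software/x86_64/packages/python/2.7.8/bin'
    path_bwa      = '/mnt/software/x86_64/packages/bwa/0.7.12/bin'
    path_samtools = '/mnt/software/x86_64/packages/samtools/1.3.1/bin'
    path_bedtools = '/mnt/software/x86_64/packages/bedtools/2.27.1/bin'
    path_R        = '/usr/R/bin'
    path_bin      = './TOuCAN/bin'

    // Assumption: points at the FASTA file; the bwa index files share its basename
    path_genome   = './index_bwa/GRCm38.p5.genome_whitelist.fa'

    // Hypothetical placeholder path to the GENCODE annotation
    path_gtf      = './annotation/gencode.annotation.gtf'

    // Empty string: TOuCAN generates the restriction maps itself
    path_T2C_restriction_maps = ''

    // List-valued uropa parameters are written as one comma-separated string
    uropa_show_attr = 'gene_id,gene_type,gene_name'
}
```

The remaining uropa_* variables would be set in the same way, with values taken from the uropa documentation.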

Params / Commandline parameter

  • sample_extension
    • Regex for the sample extension (e.g. "_R[12]_001" or "_R[12]"). The regex has to match both cases, the forward and the reverse fastq file, such as "_R1" and "_R2" or "_A" and "_B". This parameter is only required if the input files are in the fastq format.
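
As a minimal sketch, the parameter could be set as follows. The params scope is an assumption based on the section title, and the file names in the comments are hypothetical.

```groovy
// Sketch only: the params scope and the file names are assumptions.
params {
    // Intended to cover both mates of a pair, for example
    //   sampleA_R1_001.fastq.gz  (forward)
    //   sampleA_R2_001.fastq.gz  (reverse)
    sample_extension = "_R[12]_001"
}
```

Values in the Nextflow params scope can generally be overridden on the command line with a double-dash option of the same name, for example --sample_extension "_R[12]".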

Enzyme Information

  • enzyme_a_name
    • Name of the first restriction enzyme. (e.g. HindIII)
  • enzyme_a_sequence
    • Sequence of the cutting site of the first restriction enzyme. (e.g. AAGCTT)
  • enzyme_b_name
    • Name of the second restriction enzyme. (T2C only)
  • enzyme_b_sequence
    • Sequence of the cutting site of the second restriction enzyme. (T2C only)
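
The enzyme settings could look like the sketch below. The first enzyme is the example given above. The second enzyme (DpnII with recognition site GATC) is only an illustrative choice and not taken from the TOuCAN documentation, and the params scope is again an assumption.

```groovy
// Sketch only: enzyme_b values are illustrative, not prescribed by TOuCAN.
params {
    enzyme_a_name     = "HindIII"
    enzyme_a_sequence = "AAGCTT"

    // Only needed for T2C analyses
    enzyme_b_name     = "DpnII"
    enzyme_b_sequence = "GATC"
}
```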

Minor fixed parameters for BWA and SAMtools

  • bwa_T2C_options
    • bwa options for T2C Analysis. The alignment is done with bwa aln. For further information follow this link.
  • sort_options
    • Parameters for SAMtools sort. To set more than one parameter, separate them with whitespace. For example: --threads 4 or --threads 4 --par1 val1.
      Usually you do not have to change these.
  • library_label = "capture"
  • platform_label = "ILLUMINA"
  • center_label = "ECB"
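
A minimal sketch of this block, assuming the parameters live in the params scope. The three labels and the sort_options value are taken from the text above. The bwa_T2C_options value is only an illustration that uses the thread option of bwa aln.

```groovy
// Sketch only: scope and the bwa option string are assumptions.
params {
    // Passed to "bwa aln"; -t sets the number of threads
    bwa_T2C_options = "-t 4"

    // Passed to "samtools sort"; several options are separated by whitespace
    sort_options    = "--threads 4"

    // Defaults listed above
    library_label   = "capture"
    platform_label  = "ILLUMINA"
    center_label    = "ECB"
}
```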

Parameter for normalization and plotting T2C

  • plot_options_T2C
    • Parameter for plotting the interaction matrix. [Insert documentation here!]
  • norm_method
    • Select which normalisation is used for the T2C Matrix. Select one of: "FPM", "log", "fpm", "array" and "none".

Parameter for uropa

  • uropa_threads
    • Number of threads for the uropa run. Keep in mind that for each sample uropa will be executed twice.

Parameter for HiC matrix

  • hicBuildMatrix_options
    • Parameters for hicBuildMatrix from the HiCExplorer tool. To set more than one parameter, separate them with whitespace. For example: --threads 4 --inputBufferSize 100000.
      For further information, follow this link.
  • bwa_HiC_options
    • bwa options for HiC Analysis. The alignment is done with bwa mem. Multiple parameters should be separated via whitespace. For further information follow this link.
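
The HiC settings could be written as in the following sketch. The hicBuildMatrix_options value is the example given above. The bwa_HiC_options value is an illustration that uses the thread option of bwa mem, and the params scope is an assumption.

```groovy
// Sketch only: scope and the bwa option string are assumptions.
params {
    // Several options in one whitespace-separated string
    hicBuildMatrix_options = "--threads 4 --inputBufferSize 100000"

    // Passed to "bwa mem"; -t sets the number of threads
    bwa_HiC_options        = "-t 4"
}
```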

Full Example

For a full example of a TOuCAN configuration file, follow this link.