How to set up the TOuCAN configuration file

renewiegandt edited this page Oct 15, 2018 · 23 revisions

Table of Contents

Environment variables
Commandline Parameter
Full Example

General information about Nextflow configuration files

Before you start writing the TOuCAN configuration file, it is recommended to read the documentation on Nextflow configuration files.

Environment variables

  • path_python

    • Path to the directory where python is installed. For example: /mnt/software/x86_64/packages/python/2.7.8/bin.
      Do not add the executable to the path.
  • path_R

    • Path to the directory where R is installed. For example: /usr/R/bin.
      Do not add the executable to the path.
  • path_bin

    • Path to bin directory of TOuCAN. For example: ./TOuCAN/.
  • path_working

    • Path to working directory, containing the Nextflow script.
  • path_genome

    • Path to the full genome in fasta format + bwa index. The name of the fasta file has to be the basename of the bwa index. For example:
      ./index_bwa/
          ./GRCm38.p5.genome_whitelist.fa
          ./GRCm38.p5.genome_whitelist.fa.amb
          ./GRCm38.p5.genome_whitelist.fa.ann
          ./GRCm38.p5.genome_whitelist.fa.bwt
          ./GRCm38.p5.genome_whitelist.fa.pac
          ./GRCm38.p5.genome_whitelist.fa.rbwt
          ./GRCm38.p5.genome_whitelist.fa.rpac
          ./GRCm38.p5.genome_whitelist.fa.rsa
          ./GRCm38.p5.genome_whitelist.fa.sa
  • path_gtf

    • Path to gencode gtf file.
  • path_T2C_restriction_maps

    • Path to the restriction maps. If the string is "none", TOuCAN will generate the restriction maps. The restriction maps have to be generated by TOuCAN, so for your first run you can leave the path empty.
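
The layout required for path_genome can be checked before starting a run. The following is a minimal Python sketch and not part of TOuCAN: the function name missing_bwa_index_files is hypothetical, and the suffix list is simply copied from the example listing above, so it may need adjusting for other bwa versions.

```python
from pathlib import Path

# Index suffixes copied from the example listing above; other bwa versions
# may produce a different set, so adjust this tuple as needed (assumption).
BWA_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".rbwt", ".rpac", ".rsa", ".sa")


def missing_bwa_index_files(fasta_path, suffixes=BWA_SUFFIXES):
    """Return the expected files that do not exist next to the fasta file.

    The fasta file itself is included in the check, because its name has to
    be the basename of the bwa index.
    """
    fasta = Path(fasta_path)
    expected = [fasta] + [Path(str(fasta) + suffix) for suffix in suffixes]
    return [str(path) for path in expected if not path.is_file()]
```

Calling missing_bwa_index_files("./index_bwa/GRCm38.p5.genome_whitelist.fa") returns an empty list when the fasta file and all listed index files are present.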

Params / Commandline parameter

  • sample_extension
    • Regex for the sample extension (e.g. "_R[12]_001" or "_R[12]"). The regex has to match both cases, the forward and the reverse fastq file, for example "_R1" and "_R2" or "_A" and "_B". This parameter is only required if the input files are in fastq format.
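
To check that a sample_extension regex matches both mates, it can be tried on a pair of file names first. This is a sketch only: the helper split_mate_suffix and the file names are hypothetical, and it assumes the regex is searched for anywhere in the file name, which may differ from how TOuCAN applies it.

```python
import re


def split_mate_suffix(filename, sample_extension):
    """Split a fastq file name into (sample name, matched extension).

    Returns None when the regex does not match the file name.
    """
    match = re.search(sample_extension, filename)
    if match is None:
        return None
    return filename[:match.start()], match.group(0)


# Hypothetical file names, used only for illustration.
forward = split_mate_suffix("sampleA_R1_001.fastq.gz", "_R[12]_001")
reverse = split_mate_suffix("sampleA_R2_001.fastq.gz", "_R[12]_001")

# Both mates should yield the same sample name but different extensions.
assert forward == ("sampleA", "_R1_001")
assert reverse == ("sampleA", "_R2_001")
```

If either call returns None, the regex does not cover that mate and should be adjusted.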

Enzyme Information

  • enzyme_a_name
    • Name of first restriction enzyme. (e.g. Hindiii)
  • enzyme_a_sequence
    • Sequence of first restriction enzyme cutting site. (e.g. AAGCTT)
  • enzyme_b_name
    • Name of second restriction enzyme. (T2C only)
  • enzyme_b_sequence
    • Sequence of second restriction enzyme cutting site. (T2C only)
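
As an illustration of what the enzyme sequence is used for, the sketch below locates every occurrence of a cutting site in a DNA string. It is not TOuCAN's restriction map code: the function find_sites and the example sequence are hypothetical, and real restriction maps are built from the genome given in path_genome.

```python
def find_sites(genome_sequence, enzyme_sequence):
    """Return the 0-based start positions of every occurrence of the site."""
    sequence = genome_sequence.upper()
    site = enzyme_sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping sites are found as well.
        start = sequence.find(site, start + 1)
    return positions


# Hypothetical sequence containing two AAGCTT sites.
print(find_sites("ccAAGCTTggAAGCTT", "AAGCTT"))
```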

Minor fixed parameters for BWA and SAMtools

  • bwa_T2C_options
    • bwa options for T2C Analysis. The alignment is done with bwa aln. For further information follow this link.
  • sort_options
    • Parameters for SAMtools sort. If more than one parameter is set, separate them with whitespace. For example: --threads 4 or --threads 4 --par1 val1.
  • Usually you do not have to change these:
    • library_label = "capture"
    • platform_label = "ILLUMINA"
    • center_label = "ECB"
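
The whitespace-separated form of sort_options can be pictured as follows. This is a sketch under assumptions, not the way TOuCAN builds its command: the function build_sort_command and the file names are hypothetical, and it only shows how such a string splits into individual arguments for samtools sort.

```python
import shlex


def build_sort_command(sort_options, input_bam, output_bam):
    """Build an argument list for samtools sort from a whitespace-separated option string."""
    return ["samtools", "sort", *shlex.split(sort_options), "-o", output_bam, input_bam]


# Hypothetical file names, used only for illustration.
print(build_sort_command("--threads 4", "sample.bam", "sample.sorted.bam"))
```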

Parameter for normalization and plotting T2C

  • plot_options_T2C
    • Parameter for plotting the interaction matrix. [Insert documentation here!]
  • norm_method
    • Select which normalisation is used for the T2C Matrix. Select one of: "FPM", "log", "fpm", "array" and "none".

Parameter for uropa

  • uropa_threads
    • Number of threads for the uropa run. Keep in mind that for each sample uropa will be executed twice.

Parameter for HiC matrix

  • hicBuildMatrix_threads
    • Threads used for hicBuildMatrix from the tool HiCExplorer.
  • inputBufferSize
    • Set the buffer size for hicBuildMatrix from the tool HiCExplorer. For further information, follow this link.
  • bwa_HiC_options
    • bwa options for HiC analysis. The alignment is done with bwa mem. Multiple parameters should be separated by whitespace. For further information, follow this link.

Parameter for the UROPA configuration file

  • uropa_feature, uropa_anchor, uropa_dist_1, uropa_dist_2, uropa_strand, uropa_direction, uropa_filter_attr, uropa_attr_value, uropa_show_attr
    • Parameters for the uropa configuration file. It is recommended to read the uropa documentation, especially the part about the uropa configuration file. If a uropa parameter contains a list, separate the entries with commas.
      For example: uropa_show_attr = "gene_id,gene_type,gene_name"
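
The comma-separated list format can be pictured with a short Python sketch. The helper parse_uropa_list is hypothetical and only illustrates how such a value breaks down into entries; whether TOuCAN tolerates spaces around the commas is not documented here, so the safest form is the one shown in the example above, without spaces.

```python
def parse_uropa_list(value):
    """Split a comma-separated parameter value into a list of entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


print(parse_uropa_list("gene_id,gene_type,gene_name"))
```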

Full Example

For a full example of a TOuCAN configuration file, follow this link.