# Configuring LSTrAP
## config.ini
Before running LSTrAP, make sure **all paths in config.ini match your system's setup**. For example, the paths to the
Trimmomatic jar and adapter files will need to be adapted.
### qsub parameters
Additional parameters can be added to the qsub commands in the qsub
parameters section of config.ini. This allows users to submit jobs to
specific queues, with specific options, and so on. While the template is
designed for Oracle/Sun Grid Engine, it can be set up to work with other
job management systems such as PBS and Torque. The commented-out settings
in the example show how to set the number of nodes/cores on PBS/Torque
and how to add a walltime (if required).
**Match the number of cores** to the number of cores the job needs. When
starting TopHat with **-p 3**, the job will require 4 cores (3 worker
threads and a background thread are active when a job is started this
way).
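The core-count rule above can be sketched as a small helper. This is only an illustration and not part of LSTrAP: the function names are hypothetical, and the parallel environment name `cores` is taken from the example config below and may be different on your cluster.

```python
def cores_for_threads(worker_threads: int, background_threads: int = 1) -> int:
    """Total cores to request for a job started with -p <worker_threads>."""
    if worker_threads < 1:
        raise ValueError("worker_threads must be at least 1")
    return worker_threads + background_threads


def sge_cores_option(worker_threads: int, pe_name: str = "cores") -> str:
    """Build a Grid Engine qsub option requesting enough cores for the job."""
    return f"-pe {pe_name} {cores_for_threads(worker_threads)}"


# TopHat started with -p 3 needs 3 worker threads plus 1 background thread.
print(sge_cores_option(3))  # -pe cores 4
```

The resulting string is what would go into `qsub_tophat` when the TopHat command uses `-p 3`.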
### Environment modules
If your system uses environment modules, the module name for each tool needs to be specified in config.ini. To see
which modules are available on your system, type `module avail`.
If environment modules are not used, all software needs to be installed on the cluster and its nodes, and every
module in config.ini must be set to **None**.
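A quick way to review the module settings before a run is to list them with Python's `configparser`. The sketch below is not part of LSTrAP; the excerpt, the path, and the function name are hypothetical. It only reports what each `*_module` key contains, since how LSTrAP treats an empty value is not described here.

```python
import configparser

# Hypothetical excerpt of a config.ini; module names and paths are examples only.
EXAMPLE_CONFIG = """
[TOOLS]
trimmomatic_path=/path/to/trimmomatic.jar
bowtie_module=biotools/bowtie2-2.2.6
hisat2_module=
mcl_module=None
"""


def classify_modules(config_text: str) -> dict:
    """Map each *_module key in [TOOLS] to 'module', 'none' or 'empty'."""
    # Interpolation is disabled so values are read exactly as written.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    result = {}
    for key, value in parser["TOOLS"].items():
        if not key.endswith("_module"):
            continue  # skip paths and command definitions
        value = value.strip()
        if value == "":
            result[key] = "empty"
        elif value == "None":
            result[key] = "none"
        else:
            result[key] = "module"
    return result


for key, status in classify_modules(EXAMPLE_CONFIG).items():
    print(f"{key}: {status}")
```

On a system without environment modules, every key should come back as `none`.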
### Tweaking parameters of individual tools (expert feature)
Parameters passed to individual tools can be tweaked in the command definitions in config.ini. Note, however, that each
tool will run with the same settings for every file. Modifying parameters that would **change the output name or format
will cause the pipeline to break**. Arguments with a name like *${var}* should **not** be changed, as this is how the
pipeline defines the input and output for each tool.
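To check that a tweaked command still contains every *${var}* placeholder, the two versions can be compared before the pipeline is started. The sketch below is an assumption-laden illustration: it uses Python's `string.Template` to fill in the placeholders, which may not be how LSTrAP substitutes them internally, and the helper names and file paths are hypothetical.

```python
import re
from string import Template

# Command as shipped in the example config, plus two edited versions.
ORIGINAL = "hisat2 -p 3 -x ${genome} -U ${fq} -S ${out} 2> ${stats}"
TWEAKED_OK = "hisat2 -p 4 -x ${genome} -U ${fq} -S ${out} 2> ${stats}"
TWEAKED_BAD = "hisat2 -p 4 -x ${genome} -U ${fq} -S ${out}"  # ${stats} was dropped


def placeholders(command: str) -> set:
    """Return the names of all ${name} placeholders in a command template."""
    return set(re.findall(r"\$\{(\w+)\}", command))


def keeps_placeholders(original: str, tweaked: str) -> bool:
    """True if the tweaked command uses exactly the same placeholders."""
    return placeholders(original) == placeholders(tweaked)


print(keeps_placeholders(ORIGINAL, TWEAKED_OK))   # True
print(keeps_placeholders(ORIGINAL, TWEAKED_BAD))  # False

# Illustrative substitution with made-up file names.
rendered = Template(TWEAKED_OK).substitute(
    genome="index/zma", fq="reads.fq", out="reads.sam", stats="reads.stats"
)
print(rendered)
```

Changing `-p` as in this example also changes the number of cores the job needs, so the matching qsub parameter should be updated too.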
Example config.ini:
```ini
[TOOLS]
; Tool Configuration
;
; Some tools require additional files or might require a hard coded path to the script.
; Please make sure these are set up correctly.
; Trimmomatic Path
; ADJUST THIS
trimmomatic_path=/home/sepro/tools/Trimmomatic-0.36/trimmomatic-0.36.jar
; COMMANDS to run tools
;
; Here the commands used to start different steps are defined, ${name} are variables that will be set by LSTrAP for
; each job.
; Note that in some cases hard coded paths were required, adjust these to match the location of these files on
; your system
bowtie_cmd=bowtie2-build ${in} ${out}
hisat2_build_cmd=hisat2-build ${in} ${out}
; ADJUST PATHS TO ADAPTERS
trimmomatic_se_command=java -jar ${jar} SE -threads 1 ${in} ${out} ILLUMINACLIP:/home/sepro/tools/Trimmomatic-0.36/adapters/TruSeq3-SE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
trimmomatic_pe_command=java -jar ${jar} PE -threads 1 ${ina} ${inb} ${outap} ${outau} ${outbp} ${outbu} ILLUMINACLIP:/home/sepro/tools/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
tophat_se_cmd=tophat -p 3 -o ${out} ${genome} ${fq}
tophat_pe_cmd=tophat -p 3 -o ${out} ${genome} ${forward},${reverse}
hisat2_se_cmd=hisat2 -p 3 -x ${genome} -U ${fq} -S ${out} 2> ${stats}
hisat2_pe_cmd=hisat2 -p 3 -x ${genome} -1 ${forward} -2 ${reverse} -S ${out} 2> ${stats}
htseq_count_cmd=htseq-count -s no -f ${itype} -t ${feature} -i ${field} ${bam} ${gff} > ${out}
interproscan_cmd=interproscan.sh -i ${in_dir}/${in_prefix}${SGE_TASK_ID} -o ${out_dir}/${out_prefix}${SGE_TASK_ID} -f tsv -dp -iprlookup -goterms --tempdir /tmp
pcc_cmd=python3 ./scripts/pcc.py ${in} ${out} ${mcl_out}
mcl_cmd=mcl ${in} --abc -o ${out} -te 4
; ADJUST THIS
mcxdeblast_cmd=perl /apps/biotools/mcl-14.137/bin/mcxdeblast --m9 --line-mode=abc ${blast_in} > ${abc_out}
; ADJUST THIS
orthofinder_cmd=python /home/sepro/OrthoFinder-0.4/orthofinder.py -f ${fasta_dir} -t 8
; qsub parameters (OGE)
qsub_indexing=''
qsub_trimmomatic=''
qsub_tophat='-pe cores 4'
qsub_htseq_count=''
qsub_interproscan='-pe cores 5'
qsub_pcc=''
qsub_mcl='-pe cores 4'
qsub_orthofinder='-pe cores 8'
qsub_mcxdeblast=''
; qsub parameters (PBS/Torque)
; qsub_indexing=''
; qsub_trimmomatic=''
; qsub_tophat='-l nodes=1:ppn=4'
; qsub_htseq_count=''
; qsub_interproscan='-l nodes=1:ppn=5'
; qsub_pcc=''
; qsub_mcl='-l nodes=1:ppn=4'
; qsub_orthofinder='-l nodes=1:ppn=8'
; qsub_mcxdeblast=''
; qsub parameters (PBS/Torque with walltimes)
; qsub_indexing='-l walltime=00:10:00'
; qsub_trimmomatic='-l walltime=00:10:00'
; qsub_tophat='-l nodes=1:ppn=4 -l walltime=00:10:00'
; qsub_htseq_count='-l walltime=00:02:00'
; qsub_interproscan='-l nodes=1:ppn=5 -l walltime=00:10:00'
; qsub_pcc='-l walltime=00:10:00'
; qsub_mcl='-l nodes=1:ppn=4 -l walltime=00:10:00'
; qsub_orthofinder='-l nodes=1:ppn=8 -l walltime=01:00:00'
; qsub_mcxdeblast='-l walltime=00:10:00'
; Module names
; These need to be configured if the required tools are provided through environment modules.
; You can find the modules installed on your system using
;
; module avail
;
; If the system does not use environment modules, set each module name to None
bowtie_module=biotools/bowtie2-2.2.6
samtools_module=biotools/samtools-1.3
sratoolkit_module=biotools/sratoolkit-2.5.7
tophat_module=biotools/tophat-2.1.0
hisat2_module=
interproscan_module=biotools/interproscan-5.16-55.0
blast_module=biotools/ncbi-blast-2.3.0+
mcl_module=biotools/mcl-14.137
python_module=devel/Python-2.7.10
python3_module=devel/Python-3.5.1
```
## data.ini
The location of your data needs to be defined in your data.ini file.
Example data.ini file:
```ini
[GLOBAL]
; add all genomes, use semicolons to separate multiple genomes, e.g. zma;ath
genomes=zma
; enter email to receive status updates from the cluster
; setting the email to None will disable this
email=None
; orthofinder settings (runs on all species)
orthofinder_output=./output/orthofinder
[zma]
cds_fasta=
protein_fasta=
genome_fasta=
gff_file=
gff_feature=CDS
gff_id=Parent
fastq_dir=./data/zma/fastq
tophat_cutoff=65
htseq_cutoff=40
indexing_output=./output/bowtie-build/zma
trimmomatic_output=./output/trimmed_fastq/zma
alignment_output=./tmp/tophat/zma
htseq_output=./output/htseq/zma
exp_matrix_output=./output/zma/exp_matrix.txt
exp_matrix_tpm_output=./output/zma/exp_matrix.tpm.txt
exp_matrix_rpkm_output=./output/zma/exp_matrix.rpkm.txt
interpro_output=./output/interpro/zma
pcc_output=./output/zma/pcc.std.txt
pcc_mcl_output=./output/zma/pcc.mcl.txt
mcl_cluster_output=./output/zma/mcl.clusters.txt
```
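Each genome listed under `genomes` in `[GLOBAL]` has its own section, such as `[zma]` above. The following sketch checks that every listed genome has one. It is not part of LSTrAP; the shortened data.ini and the function name are hypothetical, and it assumes section names match the genome codes exactly.

```python
import configparser

# Hypothetical, shortened data.ini: 'ath' is listed but has no section.
EXAMPLE_DATA_INI = """
[GLOBAL]
genomes=zma;ath
email=None
orthofinder_output=./output/orthofinder

[zma]
fastq_dir=./data/zma/fastq
gff_feature=CDS
"""


def missing_genome_sections(data_ini_text: str) -> list:
    """Return genomes named in [GLOBAL] that lack a matching section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(data_ini_text)
    # Genomes are separated by semicolons, e.g. zma;ath
    genomes = [g.strip() for g in parser["GLOBAL"]["genomes"].split(";") if g.strip()]
    return [g for g in genomes if not parser.has_section(g)]


print(missing_genome_sections(EXAMPLE_DATA_INI))  # ['ath']
```

An empty list means every genome in `genomes` has a section to read its paths and cutoffs from.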