Version 1.0 #94

Open
wants to merge 51 commits into
base: master

renewiegandt
Collaborator

No description provided.

…es; if compareBed.sh is not executable chmod +x is called
@msbentsen
Member

Have you tried to run the test data? :-)

nextflow run pipeline.nf --bigwig ./demo/buenrostro50k_chr1_fp.bw --bed ./demo/buenrostro50k_chr1_peaks.bed --genome_fasta ./demo/hg38_chr1.fa --motif_db ./demo/jaspar_vertebrates.meme --out ./demo/buenrostro50k_chr1_out/ --organism hg38
gives me:

N E X T F L O W  ~  version 19.01.0
Launching `pipeline.nf` [adoring_cuvier] - revision: 47924cb4dd

        Usage: nextflow run pipeline.nf --bigwig [BigWig-file] --bed [BED-file] --genome_fasta [FASTA-file] --motif_db [MEME-file] --config [UROPA-config-file]

        Required arguments:
                --bigwig                 Path to BigWig-file
                --bed                    Path to BED-file
                --genome_fasta           Path to genome in FASTA-format
                --motif_db               Path to motif-database in MEME-format
                --config                 Path to UROPA configuration file
                --gtf_annotation        Path to gtf annotation file
                --organism               Input organism [hg38 | hg19 | mm9 | mm10]
                --out                    Output Directory (Default: './out/')

        Optional arguments:

                --help [0|1]            1 to show this help message. (Default: 0)
                --gtf_merged            Path to gtf-file. If path is set the process which creates a gtf-file is skipped.
                --tfbs_path             Path to directory with tfbsscan output. If given tfbsscan will be skipped.

                Footprint extraction:
                --window_length INT     This parameter sets the length of a sliding window. (Default: 200)
                --step INT              This parameter sets the number of positions to slide the window forward. (Default: 100)
                --percentage INT        Threshold in percent (Default: 0)
                --min_gap INT           If footprints are less than X bases apart the footprints will be merged (Default: 6)

                Filter motifs:
                --min_size_fp INT       Minimum sequence length threshold. Smaller sequences are discarded. (Default: 10)
                --max_size_fp INT       Maximum sequence length threshold. Discards all sequences longer than this value. (Default: 200)
                --tfbsscan_method [moods|fimo] Method used by tfbsscan. (Default: moods)

                Cluster:
                Sequence preparation/ reduction:
                --kmer INT              K-mer length (Default: 10)
                --aprox_motif_len INT   Motif length (Default: 10)
                --motif_occurrence FLOAT        Percentage of motifs over all sequences. Use 1 (Default) to assume every sequence contains a motif.
                --min_seq_length Interations    Remove all sequences below this value. (Default: 10)
                Clustering:
                --global INT            Global (=1) or local (=0) alignment. (Default: 0)
                --identity FLOAT        Identity threshold. (Default: 0.8)
                --sequence_coverage INT Minimum aligned nucleotides on both sequences. (Default: 8)
                --memory INT            Memory limit in MB. 0 for unlimited. (Default: 800)
                --throw_away_seq INT    Remove all sequences equal or below this length before clustering. (Default: 9)
                --strand INT            Align +/+ & +/- (= 1). Or align only +/+ (= 0). (Default: 0)

                Motif estimation:
                --min_seq INT           Sets the minimum number of sequences required for the FASTA-files given to GLAM2. (Default: 100)
                --motif_min_key INT     Minimum number of key positions (aligned columns) in the alignment done by GLAM2. (Default: 8)
                --motif_max_key INT     Maximum number of key positions (aligned columns) in the alignment done by GLAM2. (Default: 20)
                --iteration INT         Number of iterations done by GLAM2. More Iterations: better results, higher runtime. (Default: 10000)
                --tomtom_treshold FLOAT Threshold for similarity score. (Default: 0.01)
                --best_motif INT        Get the best X motifs per cluster. (Default: 3)
                --gap_penalty INT       Set penalty for gaps in GLAM2 (Default: 1000)
                --seed Set seed for GLAM2 (Default: 123456789)
                Moitf clustering:
                --cluster_motif Boolean If 1 pipeline clusters motifs. If its 0 it does not. (Defaul: 0)
                --edge_weight INT       Minimum weight of edges in motif-cluster-graph (Default: 5)
                --motif_similarity_thresh FLOAT Threshold for motif similarity score (Default: 0.00001)

                Creating GTF:
                --tissues List/String   List of one or more keywords for tissue-/category-activity, categories must be specified as in JSON
                                        config
                Evaluation:
                --max_uropa_runs INT     Maximum number UROPA runs running parallelized (Default: 10)
        All arguments can be set in the configuration files

Nextflow log contains:

Apr-07 10:53:03.160 [main] DEBUG nextflow.cli.Launcher - $> nextflow run pipeline.nf --bigwig ./demo/buenrostro50k_chr1_fp.bw --bed ./demo/buenrostro50k_chr1_peaks.bed --genome_fasta ./demo/hg38_chr1.fa --motif_db ./demo/jaspar_vertebrates.meme --out ./demo/buenrostro50k_chr1_out/ --organism hg38
Apr-07 10:53:03.397 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 19.01.0
Apr-07 10:53:03.438 [main] INFO  nextflow.cli.CmdRun - Launching `pipeline.nf` [adoring_cuvier] - revision: 47924cb4dd
Apr-07 10:53:03.470 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /mnt/agnerds/mette.bentsen/masterJLU2018/nextflow.config
Apr-07 10:53:03.471 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /mnt/agnerds/mette.bentsen/masterJLU2018/nextflow.config
Apr-07 10:53:03.559 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard`
Apr-07 10:53:04.656 [main] DEBUG nextflow.Session - Session uuid: c9cc6bb9-e1ff-44b2-a947-31a9c0c98840
Apr-07 10:53:04.657 [main] DEBUG nextflow.Session - Run name: adoring_cuvier
Apr-07 10:53:04.659 [main] DEBUG nextflow.Session - Executor pool size: 64
Apr-07 10:53:04.743 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 19.01.0 build 5050
  Modified: 22-01-2019 11:19 UTC (12:19 CEST)
  System: Linux 4.9.0-8-amd64
  Runtime: Groovy 2.5.5 on OpenJDK 64-Bit Server VM 10.0.2+13
  Encoding: UTF-8 (UTF-8)
  Process: 10062@KI-V0290 [172.16.12.72]
  CPUs: 64 - Mem: 62.9 GB (18.2 GB) - Swap: 4 GB (4 GB)
Apr-07 10:53:04.821 [main] DEBUG nextflow.Session - Work-dir: /mnt/agnerds/mette.bentsen/masterJLU2018/work [cifs]
Apr-07 10:53:05.206 [main] DEBUG nextflow.Session - Session start invoked
Apr-07 10:53:05.216 [main] DEBUG nextflow.processor.TaskDispatcher - Dispatcher > start
Apr-07 10:53:05.217 [main] DEBUG nextflow.script.ScriptRunner - > Script parsing
Apr-07 10:53:06.454 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Apr-07 10:53:06.482 [main] INFO  nextflow.Nextflow - 
	Usage: nextflow run pipeline.nf --bigwig [BigWig-file] --bed [BED-file] --genome_fasta [FASTA-file] --motif_db [MEME-file] --config [UROPA-config-file]

	Required arguments:
		--bigwig		 Path to BigWig-file
		--bed			 Path to BED-file

@renewiegandt
Collaborator Author

> Have you tried to run the test data? :-)

No... I added a new file to the demo folder and added the new required parameter to the demo run/call. It should be working now.
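
For anyone hitting the same usage message: the demo call quoted above omits two of the flags that the help text lists under "Required arguments". The sketch below is only an illustration of that comparison. It is not how pipeline.nf validates its parameters, which this thread does not show; `REQUIRED_FLAGS`, `missing_required`, and `demo_call` are hypothetical names, and the flag list is copied from the help output.

```python
import shlex

# Flags listed under "Required arguments" in the usage text quoted above.
REQUIRED_FLAGS = [
    "--bigwig", "--bed", "--genome_fasta", "--motif_db",
    "--config", "--gtf_annotation", "--organism", "--out",
]


def missing_required(command, required=REQUIRED_FLAGS):
    """Return the required flags that do not appear in the command line."""
    tokens = set(shlex.split(command))
    return [flag for flag in required if flag not in tokens]


# The demo call from the comment above, split across lines for readability.
demo_call = (
    "nextflow run pipeline.nf "
    "--bigwig ./demo/buenrostro50k_chr1_fp.bw "
    "--bed ./demo/buenrostro50k_chr1_peaks.bed "
    "--genome_fasta ./demo/hg38_chr1.fa "
    "--motif_db ./demo/jaspar_vertebrates.meme "
    "--out ./demo/buenrostro50k_chr1_out/ "
    "--organism hg38"
)

print(missing_required(demo_call))  # ['--config', '--gtf_annotation']
```

The check only looks at the command line. The help text also says all arguments can be set in the configuration files, so a flag reported here may still be supplied through nextflow.config.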
