Merged
merged 12 commits into from
Dec 14, 2018
Update README.md
renewiegandt committed Dec 14, 2018
commit aed0e60532a1e35b0d7f61f11b40e6c70a62d4b6
14 changes: 12 additions & 2 deletions README.md
@@ -53,38 +53,48 @@ Required arguments:
--motif_db Path to motif-database in MEME-format
--config Path to UROPA configuration file
--create_known_tfbs_path Path to directory where output from tfbsscan (known motifs) are stored.
Path can be set as tfbs_path in next run. (Default: './')
Optional arguments:

--tfbs_path Path to directory with output from tfbsscan. If given, tfbsscan will not be run.

Footprint extraction:
--window_length INT (Default: 200)
--step INT (Default: 100)
--percentage INT (Default: 0)

Filter unknown motifs:
--min_size_fp INT (Default: 10)
--max_size_fp INT (Default: 100)

Sequence preparation/reduction:
--kmer INT Kmer length (Default: 10)
--aprox_motif_len INT Motif length (Default: 10)
--motif_occurence FLOAT Percentage of motifs over all sequences. Use 1 (Default) to assume every sequence contains a motif.
--min_seq_length INT Remove all sequences shorter than this value. (Default: 10)

Clustering:
--global INT Global (=1) or local (=0) alignment. (Default: 0)
--identity FLOAT Identity threshold. (Default: 0.8)
--sequence_coverage INT Minimum aligned nucleotides on both sequences. (Default: 8)
--memory INT Memory limit in MB. 0 for unlimited. (Default: 800)
--throw_away_seq INT Remove all sequences of this length or shorter before clustering. (Default: 9)
--strand INT Align +/+ & +/- (= 1). Or align only +/+ (= 0). (Default: 0)

Motif estimation:
--min_seq INT Sets the minimum number of sequences required for the FASTA-files given to GLAM2. (Default: 100)
--motif_min_len INT Minimum length of Motif (Default: 8)
--motif_max_len INT Maximum length of Motif (Default: 20)
--iteration INT Number of iterations done by GLAM2. More iterations: better results, higher runtime. (Default: 10000)
--tomtom_treshold FLOAT Threshold for similarity score. (Default: 0.01)
--best_motif INT Get the best X motifs per cluster. (Default: 3)

Motif clustering:
--cluster_motif BOOLEAN If 1, the pipeline clusters motifs; if 0, it does not. (Default: 0)
--edge_weight INT Minimum weight of edges in motif-cluster-graph (Default: 5)
--motif_similarity_thresh FLOAT Threshold for motif similarity score (Default: 0.00001)

Creating GTF:
--organism [hg38 | hg19 | mm9 | mm10]
--tissues