Update HELP.md
Dependencies and workflow.
jbayer committed Jul 7, 2015
1 parent 3314677 commit a7fcb2c
Showing 1 changed file with 28 additions and 77 deletions.
105 changes: 28 additions & 77 deletions HELP.md
@@ -23,12 +23,12 @@ With a list of UniProt accessions and their ranking values (e.g. expression prof

### Dependencies

Linux
Python 2.7.8
matplotlib
numpy
R 3.1.1
VennDiagram
- Linux
- Python 2.7.8
- matplotlib
- numpy
- R 3.1.1
- VennDiagram

### Database background

@@ -43,82 +43,33 @@ The LimiTT workflow is shown below:

<img width="600" src="https://bioinformatics.mpi-bn.mpg.de/static/images/limitt-workflow.jpg"/>

The input (grey) is composed of an optional list of miRNAs and an optional annotation file with a transcriptome or proteome mapped to the UniProt Knowledgebase. If an annotation file was submitted, the **black** path represents the processing steps of miRNAim, otherwise, the process is described by the **red path**.
The input (grey) is composed of an optional list of miRNAs and an optional annotation file with a transcriptome or proteome mapped to the UniProt Knowledgebase. If an annotation file was submitted, the **black path** represents the processing steps of miRNAim; otherwise, the process is described by the **red path**.

**A)** The workflow starts with the selection of MTIs from the four MTI DBs in consideration of the miRNAs given in the list, if submitted. At this, the user additionally has the opportunity to choose the DBs of interest, and to filter MTIs by several of their properties, like their occurrence over the DBs, the species they belong to, the experimental methods they were validated with or their stringency in case of starBase.

B) Next, all target gene symbols of the selected MTIs are mapped to UniProtAccs, while

C) all UniProtAccs are filtered from the annotation file simultaneously.

D) Subsequently, both lists are overlapped, resulting in those MTIs which can be linked to the submitted data.
- **A** The workflow starts with the selection of MTIs from the four MTI DBs, restricted to the miRNAs given in the list if one was submitted. At this step, the user can also choose the DBs of interest and filter MTIs by several of their properties, such as their occurrence across the DBs, the species they belong to, the experimental methods they were validated with, or, in the case of starBase, their stringency.
- **B** Next, all target gene symbols of the selected MTIs are mapped to UniProtAccs, while
- **C** all UniProtAccs are filtered from the annotation file simultaneously.
- **D** Subsequently, both lists are overlapped, resulting in those MTIs which can be linked to the submitted data.
In case of a missing annotation file, steps (C) and (D) are ignored, and the resulting MTIs rely on the miRNA list or just on the adjustable properties.

E) As an option, an enrichment analysis of the identified MTI sets is realized by submitting a ranked list with UniProtAccs. The analysis is based on the R implementation of GSEA (Subramanian, et al., 2005).
- **E** Optionally, an enrichment analysis of the identified MTI sets can be performed by submitting a ranked list of UniProtAccs. The analysis is based on the R implementation of GSEA (Subramanian et al., 2005).
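
Steps B to D can be pictured as a set intersection. The following is only a minimal sketch under stated assumptions, not LimiTT's actual code: the function name `link_mtis_to_annotation`, the data layout (pairs of miRNA and gene symbol, a dictionary from gene symbol to accessions) and all placeholder identifiers are hypothetical and chosen for illustration.

```python
def link_mtis_to_annotation(mtis, symbol_to_accs, annotation_accs=None):
    """Hypothetical illustration of workflow steps B to D.

    mtis            -- list of (mirna, target_gene_symbol) pairs from step A
    symbol_to_accs  -- dict: gene symbol -> set of UniProt accessions (step B)
    annotation_accs -- set of accessions taken from the annotation file
                       (step C), or None if no annotation file was given
    """
    linked = []
    for mirna, symbol in mtis:
        accs = symbol_to_accs.get(symbol, set())
        if annotation_accs is None:
            # No annotation file: steps C and D are skipped, keep every MTI.
            linked.append((mirna, symbol, sorted(accs)))
            continue
        shared = accs & annotation_accs  # step D: overlap of both lists
        if shared:
            linked.append((mirna, symbol, sorted(shared)))
    return linked


# Placeholder data, not real miRNAs, genes or accessions.
example_mtis = [("miR-a", "GENE1"), ("miR-a", "GENE2"), ("miR-b", "GENE3")]
example_map = {"GENE1": {"ACC1"}, "GENE2": {"ACC2"}, "GENE3": {"ACC3"}}
example_linked = link_mtis_to_annotation(
    example_mtis, example_map, {"ACC1", "ACC3"}
)
```

With the placeholder data, only the MTIs whose targets map to `ACC1` or `ACC3` remain; passing `None` instead of an accession set keeps all three, mirroring the case without an annotation file.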

### Parameters

-cl
Use the cluster parameter if you have no miRNA input but want the miRNAs to be
clustered over species and hairpin arm information (hsa-miR-123a-5p -> miR-123a).
Otherwise miRNAs are distinguished by their full nomenclature.

-base
Expects the MTI DBs (abbreviated with numbers) of interest separated by space.
Default is "-base 1 2 3 4" with 1: TarBase, 2: miRTarBase, 3: miRecords,
4: starBase.

-occ
If more than one DB was selected, the occurrence parameter can be used to define
the minimum number of DBs the MTIs have to occur in. Due to four possible MTI
DBs, the value range is between 1 and 4 with a default value of 2 DBs. If the
manually set value is higher than the number of selected DBs, it is
automatically changed to the number of DBs.

-exp
Experimental methods parameter to select the methods of interest.
The following categories are existent:
Western blot
Reporter assay
qPCR
Microarray
NGS
Other
Additionally it is possible to select experiments by distinctive substrings like
"race" or "chip". Separate categories/substrings with space and surround phrases
with a space in it by quotes (e.g. "Western blot"). The comparison is not case
sensitive.

-spec
Species parameter to select the species and/or species category of interest,
delimited by space. Single species need to be named by their miRNA specific
abbreviation (e.g. hsa for Homo Sapiens). Additionally it is possible to select
from the following species categories by passing the single letter abbreviation:
a - animals (14 species)
p - plants ( 6 species)
v - viruses ( 4 species)
f - fungi ( 1 species)
e - excavata ( 1 species)
To ignore species information, pass the letter i. By ignoring species, target to
UniProt accession mapping will be done without species consideration.

-nspec
Parameter for ignoring species or species categories.\nSee above for more information.

-str
Expects a number between 1 and 3 or 5 which describes the number of CLIP-Seq
experiments supporting the MTIs within starBase.

-perm
Integer describing the number of permutations to calculate the Normalized
Enrichment Score (NES) and the False Discovery Rate (FDR) q-value for the MTI
set enrichment analysis. If the number of permutations is too small, NES and
FDR q-value of sets might result in NaN values.

-p
Integer describing the weighting of the ranking values to calculate the
Enrichment Score (ES) for MTI set enrichment analysis. The original GSEA paper
recommends to use a weighting of 0 if the ranking values are not normalized.
A list of parameters can be obtained by calling 'LimiTT -h'

Parameter | Explanation
----------|------------
-ia | Tab separated annotation file.
-im | Tab separated miRNA file.
-ir | Tab separated ranking file.
-cl | Use the cluster parameter if you have no miRNA input but want the miRNAs to be clustered over species and hairpin arm information (hsa-miR-123a-5p -> miR-123a). Otherwise miRNAs are distinguished by their full nomenclature.
-base | MTI DBs (abbreviated with numbers) of interest separated by space. Default is "-base 1 2 3 4" with 1: TarBase, 2: miRTarBase, 3: miRecords, 4: starBase.
-occ | If more than one DB was selected, the occurrence parameter defines the minimum number of DBs the MTIs have to occur in. Since there are four possible MTI DBs, the value ranges from 1 to 4, with a default of 2 DBs. If the manually set value is higher than the number of selected DBs, it is automatically changed to the number of selected DBs.
-exp | Experimental methods parameter to select the methods of interest. The following categories exist: Western blot, Reporter assay, qPCR, Microarray, NGS and Other. Additionally, it is possible to select experiments by distinctive substrings like "race" or "chip". Separate categories/substrings with spaces and surround phrases that contain a space with quotes (e.g. "Western blot"). The comparison is case-insensitive.
-spec | Species parameter to select the species and/or species categories of interest, delimited by space. Single species need to be named by their miRNA-specific abbreviation (e.g. hsa for Homo sapiens). Additionally, it is possible to select from the following species categories by passing the single-letter abbreviation: a - animals (14 species), p - plants (6 species), v - viruses (4 species), f - fungi (1 species) and e - excavata (1 species). To ignore species information, pass the letter i. If species are ignored, target to UniProt accession mapping is done without species consideration.
-nspec | Parameter for ignoring species or species categories. See -spec above for more information.
-str | Expects a number from 1 to 3, or 5, which describes the number of CLIP-Seq experiments supporting the MTIs within starBase.
-perm | Integer describing the number of permutations to calculate the Normalized Enrichment Score (NES) and the False Discovery Rate (FDR) q-value for the MTI set enrichment analysis. If the number of permutations is too small, NES and FDR q-value of sets might result in NaN values.
-p | Integer describing the weighting of the ranking values used to calculate the Enrichment Score (ES) for the MTI set enrichment analysis. The original GSEA paper recommends using a weighting of 0 if the ranking values are not normalized.
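
Two of the rules in the table are easy to misread, so here is a small sketch of how they could behave. This is an illustration under assumptions, not the tool's implementation: the function names `clamp_occurrence` and `cluster_name`, the lower bound of 1 for `-occ`, and the regular expressions used to recognise species prefixes and arm suffixes are all hypothetical.

```python
import re

# Assumed patterns: a 3-4 letter lowercase species prefix in front of a
# miR/mir/let/lin name, and a trailing -5p or -3p hairpin arm suffix.
SPECIES_PREFIX = re.compile(r"^[a-z]{3,4}-(?=(?:miR|mir|let|lin)-)")
ARM_SUFFIX = re.compile(r"-[35]p$")


def clamp_occurrence(occ, selected_dbs):
    """Limit the -occ value to the number of selected DBs (at least 1)."""
    n_selected = len(set(selected_dbs))
    return max(1, min(occ, n_selected))


def cluster_name(mirna):
    """Drop species and arm information, e.g. hsa-miR-123a-5p -> miR-123a."""
    without_species = SPECIES_PREFIX.sub("", mirna)
    return ARM_SUFFIX.sub("", without_species)


# -occ 4 with only two selected DBs is reduced to 2.
effective_occ = clamp_occurrence(4, [1, 2])
# The example name from the -cl description.
clustered = cluster_name("hsa-miR-123a-5p")
```

Under these assumptions, `effective_occ` is 2 and `clustered` is `miR-123a`. Names that already lack a species prefix or arm suffix pass through unchanged.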

### Input
