added dependency and adjusted usage
afust committed Sep 5, 2017
1 parent 2ba4a87 commit 3380739
Showing 2 changed files with 22 additions and 8 deletions.
28 changes: 21 additions & 7 deletions docs_rst/custom.rst
@@ -1,7 +1,7 @@
UROPA to GTF utility
====================
The GTF file is a common format used for annotation. UROPA accepts all GTF files downloaded from any online databases,
such as UCSC, ensembl, or gencode. Additionally, custom GTF files can also be used.
such as UCSC, ensembl, or gencode. Additionally, custom GTF files can be used.


The gencode v19 annotation GTF is illustrated in Table 7.1.
@@ -14,7 +14,7 @@ The gencode v19 annotation GTF is illustrated in Table 7.1.

**Table 7.1:** First row of gencode v19 GTF file

The UROPAtoGTF-tool transforms annotation files that are not in GTF format.
The uropa2gtf-tool transforms annotation files that are not in GTF format.
Files for annotation can be generated for instance by the `UCSC Table Browser`_ , or many other data bases.
For the conversion, the input annotation file needs to have a header, and there need to be columns with information about the
location: 'chr', 'start', and 'end' . Additionally, the file should be tab separated.
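
The input requirements stated above (a header line, tab separation, and the location columns 'chr', 'start', and 'end') can be checked before running the conversion. The following is a minimal Python sketch, not part of uropa2gtf.R; the function name ``missing_location_columns`` is hypothetical, and exact, case-sensitive header matching is an assumption.

```python
import csv
import io

# Location columns the text above says the input header must contain.
REQUIRED_COLUMNS = ("chr", "start", "end")


def missing_location_columns(text):
    """Return the required column names that are absent from the header.

    `text` is the content of a tab-separated annotation file whose first
    line is the header. Matching is exact and case-sensitive (assumption).
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader, [])  # empty input yields an empty header
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Hypothetical two-line input: header plus one data row.
sample = "chr\tstart\tend\tscore\nchr1\t1310465\t1310835\t244\n"
print(missing_location_columns(sample))
```

An empty list means all three location columns were found. A file separated by spaces instead of tabs is read as a single header field, so all three names are reported as missing.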
@@ -28,21 +28,35 @@ There are two variations of the GTF file generator:

1. One file should be converted and used for annotation. The GTF file keeps the same as the base file name.
2. Several files should be used for annotation. In this case the input should be a folder with all annotation files included (but no others).
The files will be converted one by one; additionally one merged GTF file (called UROPAtoGTF_merged.GTF) will be created.
The files will be converted one by one; additionally one merged GTF file (called uropa2gtf_'basename of the input dir'.gtf) will be created.
For the merged file, the explicit file names are important for distinguishing the annotated features.

The generated files will be stored in the same directory as the input file is located.

Furthermore, there are three optional arguments that can be given for the transformation. Those are source, feature and threads.
If an optional argument will be used, it should be used with ``source=yourSource``, ``feature=yourFeature``, and ``threads=#threads``, e.g. ``source=UCSC feature=tfbs threads=5``.
Beside the mandatory input, there are three optional arguments that can be given for the transformation. Those are the source, feature and number of threads.
The usage of this utility is very simple:

.. code:: bash
uropa2gtf.R -i input
This is the basic usage, by using further parameters, further features can be specified with

.. code:: bash
uropa2gtf.R -i input -s yourSource -f yourFeature -t number-threads
e.g.
uropa2gtf.R -i wgEncodeAwgTfbsBroadHuvecCtcfUniPk.txt -s ucsc -f tfbs -t 5
The two arguments source and feature are used for the GTF reformatting itself. The argument threads can be used if multiprocessing should be used.
There should be no spaces between the character and the equal sign when using the parameters in the command line call.

If optional arguments are given, they will overwrite information from the input file(s).
Within the transformation, the input file is checked for information that is necessary for the GTF file format, like chr, start, end, strand, and others.
If the information is present in the input file, it will be adopted to the new GTF file.
The optional arguments are used in the GTF file for the corresponding columns, if one optional argument is not given and this information is also not present in the input file,
the column will be filled with UNDEFINED. For other information that is not present in the input file, the column will be filled with dots.
the column will be filled with *undefined*. For other information that is not present in the input file, the column will be filled with dots.
All additional columns presented in the input file will be merged in the attributes column of the new GTF file. All that information can be shown as annotation specification using the ``show.attribute`` key using UROPA.
Furthermore, these are the attributes which can be filtered for specific values with the two linked keys ``filter.attribute`` and ``attribute.value``.
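
The filling rules described in this hunk can be illustrated with a short Python sketch. It is not the R implementation of uropa2gtf.R: the function ``row_to_gtf``, the input column names other than 'chr', 'start', and 'end', and the sample row are hypothetical. The ``key value; key value`` attribute layout follows the row shown in Table 7.3.

```python
# The eight fixed GTF columns that precede the attributes column.
GTF_FIELDS = ("chr", "source", "feature", "start", "end",
              "score", "strand", "frame")


def row_to_gtf(row, source=None, feature=None):
    """Build one tab-separated GTF line from a dict of input columns."""
    values = dict(row)
    for name in ("chr", "start", "end"):
        if name not in values:
            raise ValueError(f"missing required column: {name}")
    # Optional arguments overwrite information from the input file.
    if source is not None:
        values["source"] = source
    if feature is not None:
        values["feature"] = feature

    fields = []
    for name in GTF_FIELDS:
        if name in values:
            fields.append(str(values[name]))
        elif name in ("source", "feature"):
            # Neither given as an argument nor present in the input.
            fields.append("undefined")
        else:
            # Any other missing GTF column is filled with a dot.
            fields.append(".")

    # All additional input columns are merged into the attributes column.
    extras = [f"{key} {value}" for key, value in values.items()
              if key not in GTF_FIELDS]
    fields.append("; ".join(extras) if extras else ".")
    return "\t".join(fields)


# Hypothetical input row with one additional column ("peak").
row = {"chr": "chr1", "start": 1310465, "end": 1310835,
       "score": 244, "peak": 185}
print(row_to_gtf(row, source="ucsc", feature="tfbs"))
print(row_to_gtf(row))
```

With both optional arguments, the second and third fields become ``ucsc`` and ``tfbs``. Without them, and with no such columns in the input, both fields are filled with ``undefined``, while strand and frame are filled with dots in either case.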

@@ -61,7 +75,7 @@ For instance, this is handy for an ATAC-seq peak annotation.

**Table 7.2:** Downloaded table from UCSC Table Browser (wgEncodeAwgTfbsBroadHuvecCtcfUniPk) for CTCF transcription factor from Uniform TFBS track.

After transformation with ``feature=tfbs`` and ``source=ucsc``, the GTF format annotation file will look as displayed in Table 7.3.
After transformation with ``-f tfbs`` and ``-s ucsc``, the GTF format annotation file will look as displayed in Table 7.3.

+------+------+------+---------+---------+------+---+---+------------------------------------------------------------------------------------------------------------+
| chr1 | ucsc | tfbs | 1310465 | 1310835 | 244 | . | . | bin 74; signalvalue 372.141; pvalue -1; qvalue 482.217; peak 185; table wgEncodeAwgTfbsBroadHuvecCtcfUniPk |
2 changes: 1 addition & 1 deletion docs_rst/install.rst
@@ -14,7 +14,7 @@ For running UROPA locally, the following prerequisites have to be met:
- `R/Rscript`_, v3.3.0 or higher (follow the instructions on url)
Install packages:

- ``install.packages(c("ggplot2", "devtools", "gplots", "gridExtra", "jsonlite", "VennDiagram", "snow"))``
- ``install.packages(c("ggplot2", "devtools", "gplots", "gridExtra", "jsonlite", "VennDiagram", "snow", "getopt"))``

## choose mirrow
