Merge branch 'dev' into tfbsscan-shebang
HendrikSchultheis committed Jan 10, 2019
2 parents e3dc513 + 13a610f commit 0c22a5e
Showing 48 changed files with 2,099 additions and 1,014 deletions.
206 changes: 206 additions & 0 deletions .gitignore
@@ -0,0 +1,206 @@
# Created by .ignore support plugin (hsz.mobi)
### JetBrains template
# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm
# Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839

# User-specific stuff
.idea
.idea/**/tasks.xml
.idea/**/usage.statistics.xml
.idea/**/dictionaries
.idea/**/shelf

# Sensitive or high-churn files
.idea/**/dataSources/
.idea/**/dataSources.ids
.idea/**/dataSources.local.xml
.idea/**/sqlDataSources.xml
.idea/**/dynamic.xml
.idea/**/uiDesigner.xml
.idea/**/dbnavigator.xml

# Gradle
.idea/**/gradle.xml
.idea/**/libraries

# Gradle and Maven with auto-import
# When using Gradle or Maven with auto-import, you should exclude module files,
# since they will be recreated, and may cause churn. Uncomment if using
# auto-import.
# .idea/modules.xml
# .idea/*.iml
# .idea/modules

# CMake
cmake-build-*/

# Mongo Explorer plugin
.idea/**/mongoSettings.xml

# File-based project format
*.iws

# IntelliJ
out/

# mpeltonen/sbt-idea plugin
.idea_modules/

# JIRA plugin
atlassian-ide-plugin.xml

# Cursive Clojure plugin
.idea/replstate.xml

# Crashlytics plugin (for Android Studio and IntelliJ)
com_crashlytics_export_strings.xml
crashlytics.properties
crashlytics-build.properties
fabric.properties

# Editor-based Rest Client
.idea/httpRequests
### R template
# History files
.Rhistory
.Rapp.history

# Session Data files
.RData

# Example code in package build process
*-Ex.R

# Output files from R CMD build
/*.tar.gz

# Output files from R CMD check
/*.Rcheck/

# RStudio files
.Rproj.user/

# produced vignettes
vignettes/*.html
vignettes/*.pdf

# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth

# knitr and R markdown default cache directories
/*_cache/
/cache/

# Temporary files created by R markdown
*.utf8.md
*.knit.md

# Shiny token, see https://shiny.rstudio.com/articles/shinyapps.html
rsconnect/
### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/


# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
/bin/3.1_create_gtf/data/
63 changes: 33 additions & 30 deletions README.md
@@ -1,44 +1,29 @@
# masterJLU2018

De novo motif discovery and evaluation based on footprints identified by TOBIAS
De novo motif discovery and evaluation based on footprints identified by TOBIAS.

For further information read the [documentation](https://github.molgen.mpg.de/loosolab/masterJLU2018/wiki)
For further information read the [documentation](https://github.molgen.mpg.de/loosolab/masterJLU2018/wiki).

## Dependencies
* [conda](https://conda.io/docs/user-guide/install/linux.html)
* [Nextflow](https://www.nextflow.io/)
* [MEME-Suite](http://meme-suite.org/doc/install.html?man_type=web)

## Installation
Start by installing all dependencies listed above. It is required to set the [environment paths for meme-suite](http://meme-suite.org/doc/install.html?man_type=web#installingtar).
Start by installing all dependencies listed above (Nextflow, conda, MEME-Suite) and downloading all files from the [GitHub repository](https://github.molgen.mpg.de/loosolab/masterJLU2018).
It is required to set the [environment paths for meme-suite](http://meme-suite.org/doc/install.html?man_type=web#installingtar).
This can be done with the following commands:
```
export PATH=[meme-suite installation path]/libexec/meme-[meme-suite version]:$PATH
export PATH=[meme-suite installation path]/bin:$PATH
```
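
As a filled-in sketch of the two commands above: the installation path `$HOME/meme` and the version `5.0.3` are hypothetical placeholders, not values prescribed by this repository, so adjust both to your own MEME-Suite installation. The final check only reports whether a `meme` executable can be found; it does not verify the installation itself.

```shell
# Hypothetical values: replace with your own installation path and version.
MEME_HOME="$HOME/meme"
MEME_VERSION="5.0.3"

# Same two exports as above, with the placeholders filled in.
export PATH="$MEME_HOME/libexec/meme-$MEME_VERSION:$PATH"
export PATH="$MEME_HOME/bin:$PATH"

# Report whether the meme binary can now be resolved via PATH.
if command -v meme >/dev/null 2>&1; then
    meme_found="yes"
else
    meme_found="no"
fi
echo "meme on PATH: $meme_found"
```

To make the setting permanent, the two `export` lines could be appended to your shell start-up file (for example `~/.bashrc`).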


Download all files from the [GitHub repository](https://github.molgen.mpg.de/loosolab/masterJLU2018).
The Nextflow-script needs a conda environment to run. Nextflow can create the needed environment from the given yaml-file.
On some systems Nextflow exits the run with the following error:
```
Caused by:
Failed to create Conda environment
command: conda env create --prefix --file env.yml
status : 143
message:
```
If this error occurs you have to create the environment before starting the pipeline.
To create this environment you need the yml-file from the repository.
Run the following commands to create the environment:
```console
path=[Path to given masterenv.yml file]
conda env create --name masterenv -f=$path
```
When the environment is created, set the variable 'path_env' in the configuration file as the path to it.
Every other dependency will be automatically installed by Nextflow using conda. For that a new conda environment will be created, which can be found in the work directory created by Nextflow after the first pipeline run.
It is **not** required to create and activate the environment from the yaml-file beforehand.

**Important Note:** For conda the channel bioconda needs to be set as highest priority! This is required due to two different packages with the same name in different channels. For the pipeline the package jellyfish from the channel bioconda is needed and **NOT** the jellyfish package from the channel conda-forge!
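
A minimal sketch of how the channel priority could be set, assuming conda is already installed and available on your `PATH`. `conda config --add channels` prepends a channel to the channel list, which gives it the highest priority. The `status` variable is only illustrative and not part of the pipeline.

```shell
# Put bioconda at the top of the channel list so that its jellyfish
# package takes precedence over the conda-forge package of the same name.
if command -v conda >/dev/null 2>&1; then
    conda config --add channels bioconda   # --add prepends: highest priority
    conda config --show channels           # bioconda should be listed first
    status="configured"
else
    echo "conda not found on PATH; install it first" >&2
    status="conda-missing"
fi
```

Note that this changes your user-level conda configuration (`~/.condarc`), so it affects other environments as well.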


## Quick Start
```console
nextflow run pipeline.nf --bigwig [BigWig-file] --bed [BED-file] --genome_fasta [FASTA-file] --motif_db [MEME-file] --config [UROPA-config-file]
@@ -52,14 +37,16 @@ Required arguments:
--genome_fasta Path to genome in FASTA-format
--motif_db Path to motif-database in MEME-format
--config Path to UROPA configuration file
--create_known_tfbs_path Path to directory where output from tfbsscan (known motifs) are stored.
Path can be set as tfbs_path in next run. (Default: './')
--out Output Directory (Default: './out/')
--organism Input organism [hg38 | hg19 | mm9 | mm10]
--out Output Directory (Default: './out/')
Optional arguments:
--help [0|1] 1 to show this help message. (Default: 0)
--tfbs_path Path to directory with output from tfbsscan. If given tfbsscan will not be run.
--create_known_tfbs_path Path to directory where output from tfbsscan (known motifs) are stored.
Path can be set as tfbs_path in next run. (Default: './')
--gtf_path Path to gtf-file. If path is set the process which creates a gtf-file is skipped.
Footprint extraction:
--window_length INT This parameter sets the length of a sliding window. (Default: 200)
@@ -99,12 +86,28 @@ Optional arguments:
--motif_similarity_thresh FLOAT Threshold for motif similarity score (Default: 0.00001)
Creating GTF:
--organism [hg38 | hg19 | mm9 | mm10] Input organism
--tissues List/String List of one or more keywords for tissue-/category-activity, categories must be specified as in JSON
config
All arguments can be set in the configuration files
```

For further information read the [documentation](https://github.molgen.mpg.de/loosolab/masterJLU2018/wiki).


For further information read the [documentation](https://github.molgen.mpg.de/loosolab/masterJLU2018/wiki)
## Known issues
The Nextflow-script needs a conda environment to run. Nextflow creates the needed environment from the given yaml-file.
On some systems Nextflow exits the run with the following error:
```
Caused by:
Failed to create Conda environment
command: conda env create --prefix --file env.yml
status : 143
message:
```
If this error occurs you have to create the environment before starting the pipeline.
To create this environment you need the yml-file from the repository.
Run the following commands to create the environment:
```console
path=[Path to given masterenv.yml file]
conda env create --name masterenv -f $path
```
When the environment is created, set the variable 'path_env' in the configuration file as the path to it.