update docs
proost committed Jul 25, 2017
1 parent a35b62c commit e7f8ddc
Showing 2 changed files with 11 additions and 40 deletions.
46 changes: 6 additions & 40 deletions README.md
@@ -57,51 +57,17 @@ Furthermore, steps can be skipped (to avoid re-running steps unnecessarily). Use

./run.py -h

## Obtaining and preparing data
## Further reading

Scripts to download and prepare data from the [Sequence Read Archive](https://www.ncbi.nlm.nih.gov/sra) are included in
LSTrAP in the folder **helper**. Furthermore, it is recommended to remove splice variants from the GFF3 files, a script
to do that is included there as well. Detailed instructions for each script provided to obtain and prepare data can be
found [here](docs/helper.md)
* [LSTrAP output](docs/example_output.md)
* [Quality statistics](docs/quality.md): How to check the quality of samples and remove problematic samples
* [Helper Scripts](docs/helper.md): To acquire data from the [Sequence Read Archive](https://www.ncbi.nlm.nih.gov/sra)
and process results.

## Quality report

After running LSTrAP a log file (*lstrap.log*) is written, in which samples that failed a quality measure
are reported. Note that __no samples are excluded from the final network__. If certain samples need to be excluded
from the final network, remove the htseq file for each sample you wish to exclude and re-run the pipeline, skipping all
steps prior to building the network.

./run.py config.ini data.ini --skip-interpro --skip-orthology --skip-bowtie-build --skip-trim-fastq --skip-tophat --skip-htseq --skip-qc

More information on how the quality of samples is determined can be found [here](docs/quality.md).
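
The exclusion procedure described above can be sketched in Python. This is only an illustration, not part of LSTrAP: the function names, the directory layout and the `.htseq` file suffix are assumptions, so adjust them to match where your htseq output is actually written. Only the `run.py` skip flags are taken from the command shown above.

```python
from pathlib import Path

# Skip flags copied from the command above: every step before network construction.
SKIP_FLAGS = [
    "--skip-interpro", "--skip-orthology", "--skip-bowtie-build",
    "--skip-trim-fastq", "--skip-tophat", "--skip-htseq", "--skip-qc",
]


def exclude_samples(htseq_dir, samples, suffix=".htseq"):
    """Delete the htseq file of each listed sample and return the removed paths.

    Assumption: one file per sample named <sample><suffix> inside htseq_dir.
    """
    removed = []
    for sample in samples:
        path = Path(htseq_dir) / f"{sample}{suffix}"
        if path.exists():
            path.unlink()
            removed.append(path)
    return removed


def rerun_command(config="config.ini", data="data.ini"):
    """Build the run.py call that skips all steps prior to building the network."""
    return ["./run.py", config, data, *SKIP_FLAGS]
```

After calling `exclude_samples(...)` with the samples flagged in *lstrap.log*, the list returned by `rerun_command()` can be passed to `subprocess.run(..., check=True)` to rebuild the network. Consider backing up the htseq files first, since the deletion is not reversible.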

## Output

Apart from the output generated by the included tools, LSTrAP will generate raw and normalized expression matrices, a
co‑expression network and co‑expression clusters.

A detailed overview of the files produced, including examples, can be found [here](docs/example_output.md).

## Helper Scripts

LSTrAP comes with a few additional scripts to assist users to download and process data from the [Sequence Read Archive](http://www.ncbi.nlm.nih.gov/sra),
repeat analyses and the case study reported in the manuscript (Proost et al., *under preparation*).

Details for each script can be found [here](docs/helper.md)

## Running LSTrAP on transcriptome data

To use LSTrAP on a *de novo* assembled transcriptome a little pre-processing is required. Instead of the genome a fasta
file containing **coding** sequences can be used (remove UTRs). Using the helper script fasta_to_gff.py a gff file suited
for LSTrAP can be generated.

python3 fasta_to_gff.py /path/to/transcript.cds.fasta > output.gff



## Contact

LSTrAP was developed by [Sebastian Proost](mailto:proost@mpimp-golm.mpg.de) and [Marek Mutwil](mailto:mutwil@mpimp-golm.mpg.de) at the [Max-Planck Institute for Molecular Plant Physiology](http://www.mpimp-golm.mpg.de/2168/en)
LSTrAP was developed by [Sebastian Proost](mailto:proost@mpimp-golm.mpg.de) and [Marek Mutwil](mailto:mutwil@gmail.com) at the [Max-Planck Institute for Molecular Plant Physiology](http://www.mpimp-golm.mpg.de/2168/en)

## Acknowledgements and Funding

5 changes: 5 additions & 0 deletions docs/helper.md
@@ -18,6 +18,11 @@ Script to convert sra files into fastq. Sratools is required.

python3 sra_to_fastq.py /sra/files/directory /fastq/output/directory

## Running LSTrAP on transcriptome data

To use LSTrAP on a *de novo* assembled transcriptome a little pre-processing is required. Instead of the genome a fasta
file containing **coding** sequences can be used (remove UTRs). Using the helper script fasta_to_gff.py a gff file suited
for LSTrAP can be generated.
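
To make the idea concrete, here is a minimal sketch of how a GFF3 file could be derived from a coding-sequence FASTA file. It is not the actual implementation of fasta_to_gff.py: it assumes each sequence is treated as its own "chromosome" carrying a single gene that spans the full sequence on the forward strand, and the function names, the `source` column value and the ID scheme are hypothetical.

```python
import sys


def read_fasta_lengths(lines):
    """Yield (sequence_id, length) for each record in FASTA-formatted lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Use the first whitespace-delimited token of the header as the ID.
            name, length = line[1:].split()[0], 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


def cds_fasta_to_gff(lines, source="sketch"):
    """Return GFF3 lines in which every sequence is one single-exon gene."""
    out = ["##gff-version 3"]
    for name, length in read_fasta_lengths(lines):
        features = (
            ("gene", f"ID={name}"),
            ("mRNA", f"ID={name}.mrna;Parent={name}"),
            ("exon", f"ID={name}.exon;Parent={name}.mrna"),
            ("CDS", f"ID={name}.cds;Parent={name}.mrna"),
        )
        for feature, attrs in features:
            # Only CDS features carry a phase; the sequence starts in frame.
            phase = "0" if feature == "CDS" else "."
            out.append("\t".join(
                [name, source, feature, "1", str(length), ".", "+", phase, attrs]
            ))
    return out


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        print("\n".join(cds_fasta_to_gff(handle)))
```

Because the input contains only coding sequences, every feature covers positions 1 to the sequence length, which is why removing UTRs beforehand matters. Check the output of the real helper script before relying on any particular feature types or attribute names.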

### parse_gff.py

