# Code Club: Gentle Introduction to Snakemake
## Exercises
### Exercise ex1:
Write a Snakemake rule __generate_data__ that uses Python to generate two vectors of 100 random numbers each and save each vector to a separate text file: _x.txt_ and _y.txt_.
**Hint 1.** Import the NumPy library:
```{py}
import numpy as np
```
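As a rough sketch of what the body of such a rule might do, the snippet below draws 100 random numbers and writes them one per line. The helper name `write_random_vector`, the optional seed, and the uniform [0, 1) distribution are assumptions made for illustration; the exercise does not prescribe them.

```python
import numpy as np

def write_random_vector(path, size=100, seed=None):
    """Hypothetical helper: save `size` uniform random numbers to `path`, one per line."""
    rng = np.random.default_rng(seed)  # seed is optional, useful for reproducible runs
    values = rng.random(size)          # floats drawn uniformly from [0, 1)
    np.savetxt(path, values)           # plain text, one value per line
    return values
```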
**Hint 2.** Define the names of the output files before the rules:
```{py}
file_ids = ['x', 'y']
```
**Hint 3.** Use the Snakemake function `expand` to create a list of output filenames from a pattern (for example, in the rule `all`):
```{}
expand("data/{file}.txt", file = file_ids)
```
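To make the effect of `expand` concrete, here is a plain-Python sketch that builds the same list for this single-wildcard pattern. The function `expand_single` is a hypothetical stand-in written for illustration, not part of Snakemake; the real `expand` also handles several wildcards at once.

```python
def expand_single(pattern, **wildcard):
    """Hypothetical stand-in for Snakemake's expand() with exactly one wildcard."""
    (name, values), = wildcard.items()  # expects a single keyword argument
    # Fill the pattern once per value of the wildcard
    return [pattern.format(**{name: value}) for value in values]

file_ids = ['x', 'y']
targets = expand_single("data/{file}.txt", file=file_ids)
print(targets)  # ['data/x.txt', 'data/y.txt']
```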
___
### Exercise ex2:
Write a Snakemake rule **subset_data** that uses the shell to create a subset of each generated file, so that each new file contains only the values less than or equal to **0.5**. Save each subset to a separate text file: _x_subset.txt_ and _y_subset.txt_.
**Hint 1.** Use the ```awk``` shell command to subset the data and ```>``` to redirect the output:
```{sh}
awk '$1 <= 0.5' {input} > {output}
```
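For readers less familiar with `awk`, the filter above keeps every line whose first whitespace-separated field is at most 0.5. A rough Python equivalent, assuming one number per line, is sketched below; `subset_values` is a hypothetical name, and the rule itself should still use the shell command.

```python
def subset_values(lines, threshold=0.5):
    """Keep lines whose first field is <= threshold (hypothetical awk-like filter)."""
    kept = []
    for line in lines:
        fields = line.split()
        # Empty lines are skipped here; this sketch assumes numeric input
        if fields and float(fields[0]) <= threshold:
            kept.append(line)
    return kept

sample = ["0.12", "0.75", "0.50", "0.99"]
print(subset_values(sample))  # ['0.12', '0.50']
```

Note that the comparison is inclusive, so a value of exactly 0.5 is kept.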
____
### Exercise ex3:
Write a Snakemake rule that uses the custom R script (scripts/plot.R) to plot the two vectors generated in **ex2**.
____
### Exercise ex4:
Combine all rules into one pipeline, delete all previously generated files, and run the whole pipeline again, specifying the final output.