ANGIE is an analytic workflow for investigating CRISPR-Cas-induced genomic excisions. This repository provides a set of scripts for analyzing multiplexed CRISPR-Cas9 plasmid libraries that aim to excise genomic regions (preferably exons) of a specified gene. The excisions are identified from mRNA: the mRNA is extracted, converted to cDNA, and sequenced using Oxford NanoPore. In addition to the NanoPore sequencing, the guides present in the cells need to be sequenced using NGS.
The repository provides a YAML file with all dependencies, from which a conda environment can be created:
conda env create -f conda_env.yml
The following input data are needed to run the pipeline:
- A FASTQ-file containing all NanoPore reads
- The reference gene in FASTA format
- Exon gene annotation file in BED3 format
- All guide sequences in FASTA format
- A TSV-file with guide pair NGS counts
- Example file:
    GuideA	GuideB	Rb_Tiling_ctrl_A	Rb_tiling_lib	Rb_Tiling_ctrl_C	Rb_Tiling_end_B	Rb_Tiling_end_D
    exon1_1234_0.60	exon2_1234_0.60	100	50	112	345	367
    exon1_1234_0.60	exon3_1234_0.60	20	25	10	490	510
    exon1_1234_0.60	exon4_1234_0.60	101	120	39	1123	1000
Note: The guide names need to be in the following format: exon[X]_[locus]_[doench score]
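To check a counts file against this naming convention before running the pipeline, a small validation step can help. The sketch below is not part of ANGIE: `parse_guide_name` and `read_guide_pair_counts` are hypothetical helpers, and the regular expression assumes that the exon number and locus are integers and the Doench score is a decimal number.

```python
import csv
import io
import re
from typing import NamedTuple

# Assumed pattern for exon[X]_[locus]_[doench score]; adjust it if the locus
# or the score is written differently in your library.
GUIDE_NAME = re.compile(r"^exon(?P<exon>\d+)_(?P<locus>\d+)_(?P<score>\d+(?:\.\d+)?)$")


class Guide(NamedTuple):
    exon: int
    locus: int
    doench_score: float


def parse_guide_name(name: str) -> Guide:
    """Split a guide name into exon number, locus and Doench score."""
    match = GUIDE_NAME.match(name)
    if match is None:
        raise ValueError(
            f"guide name does not follow exon[X]_[locus]_[doench score]: {name!r}"
        )
    return Guide(int(match["exon"]), int(match["locus"]), float(match["score"]))


def read_guide_pair_counts(handle):
    """Return one dict per guide pair: both parsed guides plus counts per sample."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        guide_a = parse_guide_name(record.pop("GuideA"))
        guide_b = parse_guide_name(record.pop("GuideB"))
        # Every remaining column is treated as a sample with an integer count.
        counts = {sample: int(value) for sample, value in record.items()}
        rows.append({"guide_a": guide_a, "guide_b": guide_b, "counts": counts})
    return rows


# Two sample columns from the example file above, kept short for illustration.
example = (
    "GuideA\tGuideB\tRb_Tiling_ctrl_A\tRb_tiling_lib\n"
    "exon1_1234_0.60\texon2_1234_0.60\t100\t50\n"
)
pairs = read_guide_pair_counts(io.StringIO(example))
print(pairs[0]["guide_a"], pairs[0]["guide_b"], pairs[0]["counts"])
```

A malformed name raises a `ValueError` that names the offending guide, which tends to be easier to debug than a failure later in the analysis.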
1. Scatterplot of guide exon-exon pairs. Dot size indicates the number of pairs. Color indicates the log2 fold change from library to end state.
2. Scatterplot of excision pairs found in NanoPore reads. Size indicates the normalized count. Color indicates the mean excision length.
3. Scatterplot of excision pairs not found in NanoPore reads. Color indicates the mean excision length.
Quadrants are indicated by named boxes A, B and C.
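For orientation, the log2 fold change shown in the first panel can be illustrated with the counts from the example file. This is only a sketch under stated assumptions: the pseudocount of 1, the averaging of the end-state columns, and the function name `log2_fold_change` are illustrative choices, and ANGIE's own normalization may differ.

```python
import math
from typing import Sequence


def log2_fold_change(
    library_count: float,
    end_counts: Sequence[float],
    pseudocount: float = 1.0,  # assumed; avoids division by zero for unseen pairs
) -> float:
    """log2 of the mean end-state count over the library count."""
    end_mean = sum(end_counts) / len(end_counts)
    return math.log2((end_mean + pseudocount) / (library_count + pseudocount))


# First row of the example file: Rb_tiling_lib = 50,
# Rb_Tiling_end_B = 345, Rb_Tiling_end_D = 367.
lfc = log2_fold_change(50, [345, 367])
print(round(lfc, 3))
```

A positive value means the guide pair is enriched in the end state relative to the library, and a negative value means it is depleted.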
Percentage of found excisions compared to possible excisions per quadrant.
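The percentage per quadrant can be thought of as the number of excision pairs observed in the reads divided by the number of pairs that the guide library makes possible. The sketch below assumes each possible exon pair has already been assigned a quadrant label. How ANGIE draws the quadrants is not described here, so the labels and the helper `found_percentage_per_quadrant` are hypothetical.

```python
from collections import Counter
from typing import Dict, Set, Tuple

ExonPair = Tuple[int, int]


def found_percentage_per_quadrant(
    possible: Dict[ExonPair, str], found: Set[ExonPair]
) -> Dict[str, float]:
    """Percentage of possible excision pairs per quadrant that were found."""
    total = Counter(possible.values())
    hits = Counter(quadrant for pair, quadrant in possible.items() if pair in found)
    return {quadrant: 100.0 * hits[quadrant] / total[quadrant] for quadrant in total}


# Made-up illustration data: exon pairs mapped to assumed quadrant labels.
possible_pairs = {(1, 2): "A", (1, 3): "A", (1, 4): "B", (2, 4): "C"}
found_pairs = {(1, 2), (2, 4)}
percentages = found_percentage_per_quadrant(possible_pairs, found_pairs)
print(percentages)
```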
Boxplots of mean excision length and normalized NanoPore count per quadrant.