RNA-Seq

Gene Expression

Gene- and transcript-level expression analysis forms the backbone of most omics research.

Challenges

High-throughput RNA-Seq analysis of the cellular transcriptome has become largely standardized over the years. Nonetheless, each step offers many options, and these choices can have a dramatic impact on the interpretation and reliability of the results.

  • Quality control (contamination with other organisms, rRNA depletion, mtRNA depletion, removal of sequencing adapters/primers)
  • Optimal mapping of spliced reads (multi-mapping, PCR duplicates)
  • Optional batch correction (count matrix, contrast)
  • Incremental annotation including multiple databases (Ensembl/GENCODE, UniProt, pathways/ontologies)
  • Production of abstract visualizations to assess data quality and depict results (PCA, GSEA, locus plots, Venn/UpSetR)
  • Assembly of tables/figures/methods into concise and user-friendly collections (Excel, PowerPoint)
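
As a rough illustration of the adapter-removal step listed under quality control, the sketch below trims a 3' adapter from a read using exact matching only. It is not the pipeline's own implementation: the function name `trim_adapter`, the `min_overlap` parameter, and the example adapter sequence are assumptions made for this example, and dedicated trimming tools additionally tolerate mismatches and use base qualities.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter from a read (exact matches only, hypothetical helper)."""
    # Case 1: the complete adapter occurs inside the read;
    # keep only the bases that precede it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends with a prefix of the adapter, i.e. the
    # adapter was only partially sequenced. Try the longest overlap first.
    max_len = min(len(read), len(adapter) - 1)
    for k in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: return the read unchanged.
    return read


# Example adapter sequence, chosen for illustration only (assumption).
ADAPTER = "AGATCGGAAGAGC"

example_reads = [
    "ACGTACGTAGATCGGAAGAGCTTT",  # full adapter followed by extra bases
    "ACGTACGTAGATC",             # partial adapter at the 3' end
    "ACGTACGT",                  # no adapter
]

for read in example_reads:
    print(read, "->", trim_adapter(read, ADAPTER))
```

The `min_overlap` threshold is a trade-off: a small value removes more partial adapters but also clips genuine bases that happen to match the start of the adapter.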

Pipeline

Results

Typical plots produced by the pipeline: A) Scatter plot of counts for all expressed genes, including the Spearman correlation, B) Hierarchical clustering of all rlog-transformed gene counts per sample, C) PCA, D) Volcano plot for each contrast, E) Coverage of differentially expressed genes, F) Gene set enrichment analyses.
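
The Spearman correlation shown in panel A can be sketched as the Pearson correlation of rank-transformed counts. The following is a minimal standard-library example, not the code used by the pipeline; the function names and the toy count vectors are assumptions for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy gene counts for two samples (made-up numbers, illustration only).
sample_a = [5, 120, 33, 0, 870, 33]
sample_b = [9, 150, 20, 1, 600, 41]
print(round(spearman(sample_a, sample_b), 3))
```

Because only ranks enter the calculation, the result is insensitive to the large dynamic range of raw counts, which is one reason rank-based correlation is a common choice for sample-to-sample comparisons.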