ReporteR.scRNAseq/inst/content/03-normalization-B-TMM.Rmd
```{r parameters-and-defaults, include = FALSE}
module <- "scRNAseq"
section <- "normalization"
```

```{r parameter-merge, include = FALSE}
local_params <- module %>%
  options() %>%
  magrittr::extract2(module) %>%
  magrittr::extract2(section) %>%
  ReporteR.base::validate_params(parameters_and_defaults)
```

```{r scRNAseq-normalization-B-TMM-checks, include = FALSE}
num_figure_rows <- 3
assertive.sets::assert_is_subset(local_params$features, colnames(SummarizedExperiment::colData(object_filtered)))
if (assertive.properties::is_empty(local_params$features)) {
  local_params$features <- c(NULL)
  num_figure_rows <- 2
}
```

### TMM normalization

Scaling to library size is an intuitive form of normalization: sequencing a sample to half the depth is expected to yield, on average, half the number of reads mapping to each gene. We believe this is appropriate for normalizing between replicate samples of an RNA population. However, library size scaling is too simple for many biological applications. The number of reads expected to map to a gene depends not only on the expression level and length of the gene, but also on the composition of the RNA population being sampled. Thus, if a large number of genes are unique to, or highly expressed in, one experimental condition, the sequencing 'real estate' available for the remaining genes in that sample is decreased. If not adjusted for, this sampling artifact can skew the differential expression (DE) analysis towards one experimental condition.

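
The composition effect described above can be illustrated with a small toy calculation. The gene names, expression values and sequencing depth below are hypothetical numbers chosen for illustration only; they are not taken from the data in this report.

```r
# Toy example (hypothetical numbers): genes g1..g4 have identical true
# expression in both samples; sample B additionally expresses g5 highly.
true_a <- c(g1 = 100, g2 = 100, g3 = 100, g4 = 100, g5 = 0)
true_b <- c(g1 = 100, g2 = 100, g3 = 100, g4 = 100, g5 = 400)

# Both libraries are sequenced to the same depth
depth <- 8000

# Expected read counts are proportional to each gene's share of the RNA pool
counts_a <- depth * true_a / sum(true_a)
counts_b <- depth * true_b / sum(true_b)

# Library size scaling (counts per million)
cpm_a <- counts_a / sum(counts_a) * 1e6
cpm_b <- counts_b / sum(counts_b) * 1e6

# Ratio B / A for the four genes whose true expression did not change
fold_change <- cpm_b[1:4] / cpm_a[1:4]
```

In this sketch, g5 takes up half of the reads in sample B, so the four unchanged genes appear two-fold lower in B after library size scaling, although their true expression is identical.
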
TMM normalization [@robinson_tmm_2010] estimates a global fold change between the relative RNA production of two samples. The authors propose an empirical strategy that equates the overall expression levels of genes between samples, under the assumption that the majority of genes are not differentially expressed. Their simple yet robust estimate of the ratio of RNA production is a weighted trimmed mean of the log expression ratios, the trimmed mean of M values (TMM). Figure \@ref(fig:scRNAseq-normalization-B-TMM-figure) shows the effects of the normalization strategy.

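
The weighted trimmed mean described above can be sketched in a few lines of base R. This is a simplified illustration and not the implementation used to produce this report: the function name `tmm_factor`, the default trimming fractions and the rank-based trimming rule are assumptions made for the example, and reference sample selection is omitted.

```r
# Simplified sketch of a TMM scaling factor for sample `obs` relative to a
# reference sample `ref` (both are vectors of raw counts for the same genes).
# `tmm_factor`, `trim_m` and `trim_a` are hypothetical names for this example.
tmm_factor <- function(obs, ref, trim_m = 0.30, trim_a = 0.05) {
  n_obs <- sum(obs)
  n_ref <- sum(ref)

  # Only genes observed in both samples have finite log ratios
  keep <- obs > 0 & ref > 0
  obs <- obs[keep]
  ref <- ref[keep]

  # M: log2 ratio of relative abundances; A: mean log2 relative abundance
  m <- log2((obs / n_obs) / (ref / n_ref))
  a <- 0.5 * log2((obs / n_obs) * (ref / n_ref))

  # Weights: inverse of the approximate variance of each M value
  w <- 1 / ((n_obs - obs) / (n_obs * obs) + (n_ref - ref) / (n_ref * ref))

  # Trim the tails of M and A by rank (assumption: simple rank cut-offs)
  n <- length(m)
  lo_m <- floor(n * trim_m) + 1
  lo_a <- floor(n * trim_a) + 1
  use <- rank(m) >= lo_m & rank(m) <= n + 1 - lo_m &
    rank(a) >= lo_a & rank(a) <= n + 1 - lo_a

  # Weighted mean of the remaining M values, back-transformed from log2
  2^(sum(w[use] * m[use]) / sum(w[use]))
}

# Hypothetical data: 90 unchanged genes, 10 genes ten-fold higher in `obs`
ref <- rep(100, 100)
obs <- c(rep(100, 90), rep(1000, 10))
scale_obs <- tmm_factor(obs, ref)
```

Because the ten strongly changed genes fall into the trimmed tails, the factor is determined by the 90 unchanged genes only. For this toy data it equals 10000 / 19000 (about 0.53), the ratio by which the relative abundance of the unchanged genes is deflated in `obs`.
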
```{r scRNAseq-normalization-B-TMM-processing, include = FALSE, echo = FALSE}
# Compute TMM-normalized expression values
object_filtered <- scater::normalizeExprs(object_filtered, method = "TMM")
# Keep the normalized values (taken from "logcounts") under a dedicated assay name
SummarizedExperiment::assay(object_filtered, "norm_TMM") <- SummarizedExperiment::assay(object_filtered, "logcounts")
# Reset logcounts to log2-transformed raw counts
SummarizedExperiment::assay(object_filtered, "logcounts") <- log2(SummarizedExperiment::assay(object_filtered, "counts") + 1)
```

```{r scRNAseq-normalization-B-TMM-figure-params, include = FALSE, echo = FALSE}
fig_height <- ReporteR.base::estimate_figure_height(
  height_in_panels = num_figure_rows,
  panel_height_in_in = params$formatting_defaults$figures$panel_height_in,
  axis_space_in_in = params$formatting_defaults$figures$axis_space_in,
  mpf_row_space = as.numeric(grid::convertUnit(grid::unit(5, 'mm'), 'in')),
  max_height_in_in = params$formatting_defaults$figures$max_height_in)
```

```{r scRNAseq-normalization-B-TMM-figure, echo = FALSE, message=FALSE, warning=FALSE, fig.height = fig_height$global, fig.cap = paste("Results of TMM normalization.", caption_norm_pca, ifelse(num_figure_rows == 3, caption_norm_pca_extra, ""))}
figure_normalization_TMM <- multipanelfigure::multi_panel_figure(height = fig_height$sub, columns = 3, rows = num_figure_rows, unit = "in")

# Row 1: based on log2-transformed raw counts (not normalized)
figure_normalization_TMM <- multipanelfigure::fill_panel(figure_normalization_TMM,
  scater::plotPCASCE(object_filtered, ntop = 10, exprs_values = "logcounts", colour_by = local_params$features[1], add_ticks = FALSE) +
    theme_norm_pca)
figure_normalization_TMM <- multipanelfigure::fill_panel(figure_normalization_TMM,
  scater::plotExplanatoryVariables(object_filtered, exprs_values = "logcounts", variables = local_params$features) +
    theme_norm_pca,
  column = 2:3)

# Row 2: based on TMM-normalized values
figure_normalization_TMM <- multipanelfigure::fill_panel(figure_normalization_TMM,
  scater::plotPCASCE(object_filtered, ntop = 10, exprs_values = "norm_TMM", colour_by = local_params$features[1], add_ticks = FALSE) +
    ggplot2::guides(colour = FALSE) +
    theme_norm_pca)
figure_normalization_TMM <- multipanelfigure::fill_panel(figure_normalization_TMM,
  scater::plotExplanatoryVariables(object_filtered, exprs_values = "norm_TMM", variables = local_params$features) +
    theme_norm_pca,
  column = 2:3)

# Row 3: additional PCA panels, one for each of the first three variables
if (num_figure_rows == 3) {
  for (i in seq_len(min(3, length(local_params$features)))) {
    figure_normalization_TMM <- multipanelfigure::fill_panel(figure_normalization_TMM,
      scater::plotPCASCE(object_filtered, ntop = 10, exprs_values = "norm_TMM", colour_by = local_params$features[i], add_ticks = FALSE) +
        ggplot2::guides(colour = FALSE) +
        theme_norm_pca)
  }
}

figure_normalization_TMM
```