```{r parameters-and-defaults, include = FALSE}
module <- "scRNAseq"
section <- "normalization"
```
```{r parameter-merge, include = FALSE}
local_params <- module %>%
  options() %>%
  magrittr::extract2(module) %>%
  magrittr::extract2(section) %>%
  ReporteR.base::validate_params(parameters_and_defaults)
```
```{r scRNAseq-normalization-Z-compare-checks, include = FALSE}
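# Colour the RLE plot by the configured batch variable; otherwise fall back
# to the first entry of `features`, or to no colouring if neither is set.
if (assertive.properties::is_non_empty(local_params$batch)) {
  batch <- local_params$batch
} else if (assertive.properties::is_empty(local_params$features)) {
  batch <- NULL
} else {
  batch <- local_params$features[1]
}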
```
### Comparison
A gene expression normalization procedure aims to remove unwanted variation introduced by technical confounders such as sequencing depth, or by processing samples in different laboratories, on different days, or on different machines (so-called *batch effects*). To assess whether a particular normalization strategy has removed this unwanted variation, **R**elative **L**og **E**xpression (RLE) plots [@gandolfo_rle_2018] can be used.
The RLE value of a feature in a sample is the deviation of its log expression from the median log expression of that feature across all samples of the experiment. Assuming that $Y_{i,j}$ represents the (potentially normalized) log expression of gene $i$ in cell $j$, the RLE values are calculated for each gene and cell as:
$RLE_{i,j} = Y_{i,j} - \operatorname{median}_{k}(Y_{i,k})$
An RLE plot shows the deviations for each sample as a boxplot (Figure \@ref(fig:scRNAseq-normalization-Z-compare-figure)). Assuming that most features do not change across samples and that no unwanted variation remains, the deviations should be centered around 0 and have a similar spread (random variation) in every sample. Any other pattern indicates large between-sample heterogeneity and/or failed removal of unwanted variation.
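
The RLE computation described above can be sketched in a few lines of base R. This is only an illustration of the formula, not the code that `scater::plotRLE` runs internally; the matrix `log_expr`, its values, and the shifted `cell4` are made up for the example.

```r
# Hypothetical toy data: log expression of 5 genes (rows) in 4 cells (columns).
# cell4 is deliberately shifted upwards to mimic unwanted variation.
log_expr <- matrix(
  c(5.0, 5.2, 4.9,  7.1,
    3.1, 2.9, 3.0,  5.0,
    8.0, 8.1, 7.9, 10.2,
    1.0, 1.2, 0.9,  3.1,
    6.5, 6.4, 6.6,  8.4),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)

# Median log expression of each gene across all cells.
gene_medians <- apply(log_expr, 1, median)

# RLE values: subtract each gene's median from its row.
rle <- sweep(log_expr, 1, gene_medians, FUN = "-")

# Per-cell summary of the values that an RLE boxplot displays:
# the centre (median) and the spread (interquartile range).
rle_summary <- data.frame(
  median = apply(rle, 2, median),
  iqr    = apply(rle, 2, IQR)
)
print(rle_summary)
```

In this toy example the per-cell medians of the first three cells stay close to 0, whereas `cell4` is clearly shifted away from 0, which is the kind of pattern an RLE plot is meant to expose.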
```{r scRNAseq-normalization-Z-compare-figure, echo=FALSE, message=FALSE, warning=FALSE, fig.cap="Relative log expression (RLE) values for different normalizations. Boxplots represent the log deviations from the median and should be centered near zero with a similar spread. Watch out for other behavior, which can be a sign of failed or inappropriate normalization."}
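# Compare the unnormalized values (TPM, counts) with each normalization method.
methods <- c("sumfactor", "TMM")
exprs_mat_list <- setNames(as.list(paste0("norm_", methods)), methods)
scater::plotRLE(
  object_filtered,
  exprs_mats = c(list(tpm = "abundance", counts = "counts"), exprs_mat_list),
  # tpm and counts are flagged as not logged; the normalized values as logged.
  exprs_logged = c(FALSE, FALSE, rep(TRUE, times = length(exprs_mat_list))),
  colour_by = batch
)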
```