ReporteR.scRNAseq/inst/content/03-normalization-Z-compare.Rmd
```{r parameters-and-defaults, include = FALSE}
module <- "scRNAseq"
section <- "normalization"
```
```{r parameter-merge, include = FALSE}
local_params <- module %>%
  options() %>%
  magrittr::extract2(module) %>%
  magrittr::extract2(section) %>%
  ReporteR.base::validate_params(parameters_and_defaults)
```
```{r scRNAseq-normalization-Z-compare-checks, include = FALSE}
# Variable used to colour the RLE plot: the configured batch if set,
# otherwise the first configured feature, otherwise no colouring.
if (assertive.properties::is_non_empty(local_params$batch)) {
  batch <- local_params$batch
} else {
  if (assertive.properties::is_empty(local_params$features)) {
    batch <- NULL
  } else {
    batch <- local_params$features[1]
  }
}
```
### Comparison
Essentially, a gene expression normalization procedure tries to remove unwanted variation that is introduced by technical confounders, e.g. sequencing depth, or by processing in different laboratories, on different days, using different machines (so-called *batch effects*). In order to assess whether a particular normalization strategy has been successful in removing unwanted variation, **R**elative **L**og **E**xpression (RLE) plots [@gandolfo_rle_2018] can be used.
RLE values are the deviations of a feature's expression from the median expression of that feature across all samples of the experiment. Assuming that $Y_{i,j}$ represents the (potentially normalized) log expression of gene $i$ in cell $j$, and $Y_{i}$ the log expression of gene $i$ across all cells, the RLE value of gene $i$ in cell $j$ is calculated as:
$RLE_{i,j} = Y_{i,j} - \mathrm{median}(Y_{i})$
An RLE plot shows the deviations for each sample in a boxplot (Figure \@ref(fig:scRNAseq-normalization-Z-compare-figure)). Under the basic assumption that most features do not change across samples and that no unwanted variation remains, the deviations should be centered around 0 and have a similar spread (random variation). Any other behavior is a sign of large between-sample heterogeneity and/or failed removal of unwanted variation.
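The calculation described above can be sketched in base R. This is a minimal illustration on a made-up matrix, not the code that `scater::plotRLE` runs internally; the names `log_expr`, `gene_medians`, `rle` and `rle_medians` are hypothetical, and the upward shift applied to `cell4` only stands in for a technical confounder such as sequencing depth.

```r
# Toy log-expression matrix: 5 genes (rows) x 4 cells (columns).
# All values are simulated for illustration only.
set.seed(1)
log_expr <- matrix(
  rnorm(20, mean = 5, sd = 0.2),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)

# Simulate unwanted variation: shift every gene in cell4 upwards.
log_expr[, "cell4"] <- log_expr[, "cell4"] + 1

# Median log expression of each gene across all cells.
gene_medians <- apply(log_expr, 1, median)

# RLE values: subtract each gene's median from that gene's row.
rle <- sweep(log_expr, 1, gene_medians, FUN = "-")

# Per-cell median of the RLE values; each box in an RLE plot is
# centered on the corresponding value.
rle_medians <- apply(rle, 2, median)
print(round(rle_medians, 2))
```

Passing `rle` to `boxplot()` would draw one box per cell. In this toy setup the box for `cell4` sits well above zero while the other cells stay close to it, which is the pattern that points to unwanted variation that has not been removed.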
```{r scRNAseq-normalization-Z-compare-figure, echo=FALSE, message=FALSE, warning=FALSE, fig.cap="Relative log expression (RLE) values for different normalizations. Boxplots represent the log deviations from the median and should be centered near zero with a similar spread. Watch out for other behavior, which can be a sign of failed or inappropriate normalization."}
methods <- c("sumfactor", "TMM")
# Map each method to its assay name, e.g. sumfactor -> "norm_sumfactor"
exprs_mat_list <- setNames(as.list(paste0("norm_", methods)), methods)
# tpm and counts are flagged as not logged; the normalized assays are flagged as already logged
scater::plotRLE(object_filtered, exprs_mats = c(list(tpm = "abundance", counts = "counts"), exprs_mat_list), exprs_logged = c(FALSE, FALSE, rep(TRUE, times = length(exprs_mat_list))), colour_by = batch)
```