% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/TCGAFunctions.R
\name{getFilteredData}
\alias{getFilteredData}
\title{getFilteredData
Processes one cancer group with filtering steps}
\usage{
getFilteredData(oneCancerGroupExpressionmatrix, maxModality = 3,
algorithm = "mclust", minClusterSize = 50, minSamples = NULL,
pathToClinicalData = "pathToClinicalData",
pathToExpressionmatrix = "pathToExpressionmatrix",
minExpressedPercentage = 2, SilvermanP = 0.05, minPercentage = 10,
minMean = 2, minFoldChange = 2, coresToUse = 6, parallel = FALSE,
updateToEnsembleGeneNames = FALSE, verbose = TRUE)
}
\arguments{
\item{oneCancerGroupExpressionmatrix}{TCGA cancer expression data of one cancer group}
\item{maxModality}{Integer; the highest modality for which models are fitted with mclust or flexmix}
\item{algorithm}{"mclust", "hdbscan" or "flexmix". Defines which algorithm should be used to process the cancer data}
\item{minClusterSize}{Integer; The minimum number of samples
in a group for that group to be considered a cluster in hdbscan}
\item{minSamples}{Integer for hdbscan; the number of samples
in a neighborhood for a point to be considered a core point.
This includes the point itself. If NULL, defaults to minClusterSize.}
\item{pathToClinicalData}{path to the clinical data}
\item{pathToExpressionmatrix}{path to the expression matrix}
\item{minExpressedPercentage}{Integer; the minimum percentage of patients in which a gene must be expressed (expression value greater than 0) for the gene to be analysed}
\item{SilvermanP}{The p-value threshold used to reject unimodality in Silverman's test
(given by k=1 using the Hall/York adjustments)}
\item{minPercentage}{Minimum percentage size of a group.
Groups with a lower percentage are sorted into the other groups}
\item{minMean}{Minimum value that at least one of the group means of a multimodal gene must reach}
\item{minFoldChange}{minimum fold change that has to exist between any adjacent means of multimodal genes}
\item{coresToUse}{The number of cores to use for parallel computing of cleanOutput() if parallel is TRUE}
\item{parallel}{logical. Whether cleanOutput() shall be computed in parallel}
\item{updateToEnsembleGeneNames}{logical. Whether to convert Ensembl gene ids to gene names}
\item{verbose}{logical. Whether to print progress messages}
}
\description{
getFilteredData
Processes one cancer group with filtering steps
}
\details{
Processes one cancer group by
a) filtering out rows that sum to 0,
b) applying Silverman's test for unimodality,
c) filtering out genes for which fewer than 2\% of patients (see minExpressedPercentage) have expression values greater than 0,
d) applying useMclust to the cancer data,
e) cleaning the output of useMclust,
f) filtering for multimodal genes,
g) filtering for the required minimum mean among the means of the multimodal genes,
h) filtering for the required minimum fold change between any adjacent means of the multimodal genes,
i) getting gene names from Ensembl, and
j) returning a list of the filtered output and the filtered expression matrix.
}
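\examples{
\dontrun{
# A minimal usage sketch based only on the signature documented above;
# it is not taken from the package sources or tests.
# Assumptions (hypothetical, adjust to your setup):
#   - `exprMatrix` is an already loaded expression matrix for one TCGA
#     cancer group, assumed here to have genes in rows and patients in columns
#   - both path arguments are placeholder values
filtered <- getFilteredData(
  oneCancerGroupExpressionmatrix = exprMatrix,
  algorithm = "mclust",
  maxModality = 3,
  minExpressedPercentage = 2,
  SilvermanP = 0.05,
  minPercentage = 10,
  minMean = 2,
  minFoldChange = 2,
  pathToClinicalData = "path/to/clinicalData",            # placeholder
  pathToExpressionmatrix = "path/to/expressionmatrix",    # placeholder
  parallel = FALSE,
  verbose = TRUE
)

# According to the details section, the result is a list holding the
# filtered output and the filtered expression matrix; the element names
# are not documented here, so inspect the top-level structure first.
str(filtered, max.level = 1)
}
}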