% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/TCGAFunctions.R
\name{getFilteredData}
\alias{getFilteredData}
\title{getFilteredData
Processes one cancer group with filtering steps}
\usage{
getFilteredData(oneCancerGroupExpressionmatrix, maxModality = 3,
  algorithm = "mclust", minClusterSize = 50, minSamples = NULL,
  pathToClinicalData = "pathToClinicalData",
  pathToExpressionmatrix = "pathToExpressionmatrix",
  minExpressedPercentage = 2, SilvermanP = 0.05, minPercentage = 10,
  minMean = 2, minFoldChange = 2, coresToUse = 6, parallel = FALSE,
  updateToEnsembleGeneNames = FALSE, verbose = TRUE)
}
\arguments{
\item{oneCancerGroupExpressionmatrix}{TCGA cancer expression data of one cancer group}

\item{maxModality}{An integer specifying the highest modality for which models are calculated with mclust and flexmix}

\item{algorithm}{"mclust", "hdbscan" or "flexmix". Defines which algorithm is used to process the cancer data}

\item{minClusterSize}{Integer for hdbscan; the minimum number of samples
in a group for that group to be considered a cluster}

\item{minSamples}{Integer for hdbscan; the number of samples
in a neighborhood for a point to be considered a core point.
This includes the point itself. If NULL, defaults to minClusterSize.}

\item{pathToClinicalData}{Path to the clinical data}

\item{pathToExpressionmatrix}{Path to the expression matrix}

\item{minExpressedPercentage}{Integer; the minimum percentage of patients that must have an expression value greater than 0 for the gene to be analysed}

\item{SilvermanP}{The p-value threshold used to reject unimodality in Silverman's test
(given by k = 1, using the Hall/York adjustments)}

\item{minPercentage}{Minimum percentage of samples that a group must contain.
Groups with lower percentages are sorted into the other groups}

\item{minMean}{Minimum value that at least one of the means of a multimodal gene has to reach}

\item{minFoldChange}{Minimum fold change that has to exist between any adjacent means of a multimodal gene}

\item{coresToUse}{The number of cores to use for parallel computing of cleanOutput() if parallel is TRUE}

\item{parallel}{Logical. Whether cleanOutput() is computed in parallel}

\item{updateToEnsembleGeneNames}{Logical. Whether to convert Ensembl gene IDs to gene names}

\item{verbose}{Logical. Whether to print progress messages}
}
\description{
getFilteredData
Processes one cancer group with filtering steps
}
\details{
Processes one cancer group by
a) filtering out rows that sum to 0,
b) using Silverman's test for unimodality,
c) filtering out genes for which fewer than minExpressedPercentage (default 2\%) of patients have expression values greater than 0,
d) running useMclust on the cancer data,
e) cleaning the output of useMclust,
f) filtering for multimodal genes,
g) filtering for the minimum mean (minMean) in any of the means of the multimodal genes,
h) filtering for the minimum fold change (minFoldChange) between any adjacent means of the multimodal genes,
i) getting gene names from Ensembl,
j) returning a list of the filtered output and the filtered expression matrix.
}
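\examples{
# Illustrative sketch only, not the package's own implementation.
# It re-creates filtering steps a) and c) from the Details section in base R
# on a made-up toy matrix. Assumptions: genes are in rows and patients in
# columns, and the names toyExpression, nonZero and kept are hypothetical.
toyExpression <- matrix(
  c(0, 0, 0, 0,
    5, 0, 0, 0,
    1, 2, 3, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("patient", 1:4))
)

# a) drop genes whose expression values sum to 0 across all patients
nonZero <- toyExpression[rowSums(toyExpression) != 0, , drop = FALSE]

# c) keep genes with an expression value > 0 in at least the given
#    percentage of patients (50 is chosen here only to suit the toy data)
minExpressedPercentage <- 50
expressedPercentage <- 100 * rowMeans(nonZero > 0)
kept <- nonZero[expressedPercentage >= minExpressedPercentage, , drop = FALSE]
rownames(kept)

\dontrun{
# Assumed call on a real TCGA expression matrix of one cancer group.
# expressionmatrix is a hypothetical object and both paths are placeholders.
filtered <- getFilteredData(
  oneCancerGroupExpressionmatrix = expressionmatrix,
  algorithm = "mclust",
  maxModality = 3,
  pathToClinicalData = "pathToClinicalData",
  pathToExpressionmatrix = "pathToExpressionmatrix",
  minExpressedPercentage = 2,
  minMean = 2,
  minFoldChange = 2,
  parallel = FALSE,
  verbose = TRUE
)
}
}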