| Title: | EWAS using Reference-Free DNA Methylation Mixture Deconvolution |
|---|---|
| Description: | Reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. The older method (Houseman et al., 2014, <doi:10.1093/bioinformatics/btu029>) is similar to surrogate variable analysis (SVA and ISVA), except that it makes additional use of a biological mixture assumption. The newer method (Houseman et al., 2016, <doi:10.1186/s12859-016-1140-4>) is similar to non-negative matrix factorization, with additional constraints and additional utilities. |
| Authors: | E. Andres Houseman, Sc.D. |
| Maintainer: | E. Andres Houseman <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 2.2 |
| Built: | 2026-05-16 06:57:31 UTC |
| Source: | https://github.com/cran/RefFreeEWAS |
Bootstrap generation procedure for reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types.
BootOneRefFreeEwasModel(mod)

| Argument | Description |
|---|---|
| mod | Model object of class RefFreeEwasModel (generated with smallOutput=FALSE). |

Generates one bootstrapped data set for the reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. Typically not run by user.
A matrix representing a bootstrap sample of a DNA methylation assay matrix.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
Bootstrap procedure for reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types.
BootRefFreeEwasModel(mod, nboot)

| Argument | Description |
|---|---|
| mod | Model object of class RefFreeEwasModel (generated with smallOutput=FALSE). |
| nboot | Number of bootstrap samples to generate. |

Generates the bootstrap samples for the reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types.
An array object of class “BootRefFreeEwasModel”. Bootstraps are generated for both Beta and Bstar.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
    data(RefFreeEWAS)
    ## Not run:
    tmpDesign <- cbind(1, rfEwasExampleCovariate)
    # Unadjusted least-squares coefficients (expression truncated in the source; reconstructed here)
    tmpBstar <- rfEwasExampleBetaValues %*% tmpDesign %*% solve(t(tmpDesign) %*% tmpDesign)
    # Estimate the latent dimension from the residual matrix
    EstDimRMT(rfEwasExampleBetaValues - tmpBstar %*% t(tmpDesign))
    ## End(Not run)
    test <- RefFreeEwasModel(
      rfEwasExampleBetaValues,
      cbind(1, rfEwasExampleCovariate),
      4)
    testBoot <- BootRefFreeEwasModel(test, 10)
    summary(testBoot)
Bootstrap generation procedure for sampling paired data (e.g. twin data)
bootstrapPairs(obs, pairID)

| Argument | Description |
|---|---|
| obs | Observation ids (numeric vector). |
| pairID | Pair IDs (one unique value per pair). |

Generates one bootstrapped set of ids corresponding to pairs for the method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. Typically not run by user.
A vector of IDs corresponding to bootstrapped pairs
E. Andres Houseman
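Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
See also: BootRefFreeEwasModel, PairsBootRefFreeEwasModel.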
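Resampling whole pairs, rather than individual observations, can be sketched in base R as follows. This is only an illustration of the idea and not the package's implementation; the helper name `resamplePairsSketch` and the toy IDs are assumptions introduced here.

```r
# Hypothetical helper (not part of RefFreeEWAS): resample pairs with
# replacement and return the observation ids of the sampled pairs.
resamplePairsSketch <- function(obs, pairID) {
  stopifnot(length(obs) == length(pairID))
  pairs <- unique(pairID)
  # Draw as many pairs as exist, with replacement
  drawn <- pairs[sample.int(length(pairs), replace = TRUE)]
  # Keep every member of each drawn pair, in draw order
  unlist(lapply(drawn, function(p) obs[pairID == p]), use.names = FALSE)
}

set.seed(1)
obs <- 1:6
pairID <- c("a", "a", "b", "b", "c", "c")  # toy twin pairs (made up)
bootIds <- resamplePairsSketch(obs, pairID)
bootIds
```

Because sampling happens at the pair level, both members of a pair always enter or leave the bootstrap sample together, which preserves the within-pair correlation.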
Deviance method for objects of type RefFreeCellMix.
## S3 method for class 'RefFreeCellMix'
deviance(object, Y, Y.oob=NULL, EPSILON=1E-9, bootstrapIterations=0, bootstrapIndices=NULL, ...)

| Argument | Description |
|---|---|
| object | RefFreeCellMix object for which to compute the deviance |
| Y | Methylation matrix on which object was based |
| Y.oob | Alternate ("out-of-bag") methylation matrix for which to calculate deviance, based on object |
| EPSILON | Minimum value of variance (zero variances will be reset to this value) |
| bootstrapIterations | Number of RefFreeCellMix iterations to use in bootstrap (see details) |
| bootstrapIndices | Bootstrap indices (see details) |
| ... | (Unused). |

Deviance is based on a normal distribution applied to the errors of Y after accounting for the cell mixture effect, Mu Omega^T. Because RefFreeCellMix does not save the original data Y in the resulting object, Y must be supplied here. However, the deviance may instead be calculated for an alternative "out-of-bag" methylation matrix, Y.oob; if bootstrapIterations=0, this is what is done. If bootstrapIterations>0, then object$Mu is used to initialize a new fit via RefFreeCellMix, executed on a bootstrap sample of Y with the indicated number of iterations. If bootstrapIndices is provided, the bootstrap will be based on these indices; otherwise the indices will be sampled randomly with replacement from 1:ncol(Y).
See RefFreeCellMix for example.
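A normal-theory deviance of the residuals Y - Mu Omega^T can be sketched as below. The exact formula used by the package is not given in this manual, so the sketch assumes one error variance per CpG (row), floored at EPSILON, and a deviance equal to minus twice the Gaussian log-likelihood; the function name and toy data are hypothetical.

```r
# Hypothetical sketch (not the package code): Gaussian deviance of the
# residuals Y - Mu %*% t(Omega), assuming one variance per CpG (row).
normalDevianceSketch <- function(Y, Mu, Omega, EPSILON = 1e-9) {
  R <- Y - Mu %*% t(Omega)                 # residuals after the mixture effect
  sigma2 <- rowMeans(R^2, na.rm = TRUE)    # per-CpG variance estimate
  sigma2 <- pmax(sigma2, EPSILON)          # reset (near-)zero variances
  sdMat <- matrix(sqrt(sigma2), nrow(R), ncol(R))
  -2 * sum(dnorm(R, mean = 0, sd = sdMat, log = TRUE), na.rm = TRUE)
}

set.seed(1)
Mu <- matrix(runif(20 * 2), 20, 2)         # toy methylomes: 20 CpGs x 2 types
w <- runif(8)
Omega <- cbind(w, 1 - w)                   # toy mixing weights: 8 subjects x 2
Y <- Mu %*% t(Omega) + matrix(rnorm(20 * 8, sd = 0.05), 20, 8)

devGood <- normalDevianceSketch(Y, Mu, Omega)
devBad  <- normalDevianceSketch(Y, Mu[, 2:1], Omega)  # mismatched methylomes
c(devGood, devBad)
```

A lower deviance indicates that the decomposition explains the supplied matrix better, which is why the deviance is useful for comparing fits with different numbers of cell types.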
Method for estimating latent dimension by AIC and BIC.
EstDimIC(Rmat,Krange=0:25)

| Argument | Description |
|---|---|
| Rmat | Residual matrix for which to estimate latent dimension. |
| Krange | Vector of integers representing candidate dimensions to consider. |

Method for estimating latent dimension by AIC and BIC. Inferior to the RMT method in the isva package, but it appears here because it's mentioned in our paper.
A list containing AIC and BIC for candidate dimensions, as well as the best dimension for each.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, vol. 30, no. 10, p. 1431-1439.
    data(RefFreeEWAS)
    ## Not run:
    tmpDesign <- cbind(1, rfEwasExampleCovariate)
    # Unadjusted least-squares coefficients (expression truncated in the source; reconstructed here)
    tmpBstar <- rfEwasExampleBetaValues %*% tmpDesign %*% solve(t(tmpDesign) %*% tmpDesign)
    EstDimIC(rfEwasExampleBetaValues - tmpBstar %*% t(tmpDesign))
    ## End(Not run)
Method for estimating latent dimension by Random Matrix Theory.
EstDimRMT(Rmat)

| Argument | Description |
|---|---|
| Rmat | Residual matrix for which to estimate latent dimension. |

Method for estimating latent dimension by Random Matrix Theory. This function originated in the package isva, authored by A. Teschendorff. Previous versions of RefFreeEWAS used the isva version of the function. However, because of dependency issues in that package, the present version of RefFreeEWAS simply reproduces the function found in version 1.9 of isva and removes the dependency on the isva package. Documentation from isva: Given a data matrix, it estimates the number of significant components of variation by comparing the observed distribution of spectral eigenvalues to the theoretical one under a Gaussian Orthogonal Ensemble (GOE). Specifically, a spectral decomposition of the data covariance matrix is performed and the number of eigenvalues larger than the theoretical maximum predicted by the GOE is taken as an estimate of the number of significant components.
A list with the following components:

| Component | Description |
|---|---|
| cor | Data covariance matrix. |
| dim | Estimated intrinsic dimensionality of data. |
| estdens | Empirical density of eigenvalues. |
| thdens | Theoretical density of eigenvalues. |

E. Andres Houseman
Random matrix approach to cross correlations in financial data. Plerou et al. Physical Review E (2002), Vol.65.
Independent Surrogate Variable Analysis to deconvolve confounding factors in large-scale microarray profiling studies. Teschendorff AE, Zhuang JJ, Widschwendter M. Bioinformatics. 2011 Jun 1;27(11):1496-505.
    data(RefFreeEWAS)
    ## Not run:
    tmpDesign <- cbind(1, rfEwasExampleCovariate)
    # Unadjusted least-squares coefficients (expression truncated in the source; reconstructed here)
    tmpBstar <- rfEwasExampleBetaValues %*% tmpDesign %*% solve(t(tmpDesign) %*% tmpDesign)
    EstDimRMT(rfEwasExampleBetaValues - tmpBstar %*% t(tmpDesign))
    ## End(Not run)
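The eigenvalue comparison described above can be sketched in base R. This is a simplified illustration, not the isva or RefFreeEWAS code: it assumes the theoretical maximum takes the form sigma^2 (1 + 1/Q + 2 sqrt(1/Q)) with Q the ratio of rows to columns, and it omits the density estimates. The function name and toy data are hypothetical.

```r
# Hypothetical sketch: count eigenvalues of the column-standardized
# covariance matrix that exceed an assumed random-matrix upper bound.
estDimRmtSketch <- function(Rmat) {
  M <- scale(Rmat)                        # center and scale each column
  sigma2 <- var(as.vector(M))
  Q <- nrow(M) / ncol(M)                  # rows per column
  lambdaMax <- sigma2 * (1 + 1 / Q + 2 * sqrt(1 / Q))
  C <- crossprod(M) / nrow(M)             # columns x columns covariance
  ev <- eigen(C, symmetric = TRUE, only.values = TRUE)$values
  list(dim = sum(ev > lambdaMax), lambdaMax = lambdaMax, values = ev)
}

set.seed(1)
u <- rnorm(400)                           # one shared latent factor (made up)
Rmat <- 3 * outer(u, rep(1, 20)) + matrix(rnorm(400 * 20), 400, 20)
rmt <- estDimRmtSketch(Rmat)
rmt$dim
```

Eigenvalues above the bound are treated as signal, and the remainder as noise that a purely random matrix of the same shape could have produced.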
Simple method for imputing missing values by row-mean
ImputeByMean(Y)

| Argument | Description |
|---|---|
| Y | Matrix to impute. |

Matrix with missing values replaced by imputed values
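Row-mean imputation is simple enough to sketch directly in base R. The function below is an illustration of the idea, not the package's code; its name and the toy matrix are assumptions.

```r
# Hypothetical sketch (not the package code): replace each NA by the mean
# of the observed values in the same row.
imputeByRowMeanSketch <- function(Y) {
  rowMean <- rowMeans(Y, na.rm = TRUE)
  missing <- which(is.na(Y), arr.ind = TRUE)   # (row, col) of each NA
  Y[missing] <- rowMean[missing[, "row"]]
  Y
}

Y <- matrix(c(0.2, NA, 0.4,
              0.9, 0.7, NA), nrow = 2, byrow = TRUE)
imputed <- imputeByRowMeanSketch(Y)
imputed
```

A row with no observed values has no defined mean, so it would remain unresolved under this sketch.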
Support for bootstrap-based omnibus test of significance accounting for correlation.
omnibusBoot(est, boots, denDegFree)

| Argument | Description |
|---|---|
| est | Vector of m estimates, one for each of m features. |
| boots | Matrix (m x R) of bootstrap samples corresponding to the estimates. |
| denDegFree | Single number representing the denominator degrees-of-freedom for computing p-values. |

Returns one omnibus p-value based on the Kolmogorov-Smirnov distance from a uniform distribution.
A single number representing the p-value for the omnibus test over all features.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
    data(RefFreeEWAS)
    test <- RefFreeEwasModel(
      rfEwasExampleBetaValues,
      cbind(1, rfEwasExampleCovariate),
      4)
    testBoot <- BootRefFreeEwasModel(test, 10)
    summary(testBoot)
    omnibusBoot(test$Beta[,2], testBoot[,2,"B",], -diff(dim(test$X)))
    omnibusBoot(test$Bstar[,2], testBoot[,2,"B*",], -diff(dim(test$X)))
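One plausible way to build an omnibus test from the Kolmogorov-Smirnov distance is sketched below. The manual does not spell out the algorithm, so the details are assumptions: per-feature p-values come from a t reference distribution with bootstrap standard errors, and the null distribution of the distance comes from bootstrap replicates centered at the estimates. The function name and simulated inputs are hypothetical.

```r
# Hypothetical sketch (not the package code): omnibus p-value from the
# Kolmogorov-Smirnov distance between per-feature p-values and Uniform(0, 1),
# calibrated against bootstrap replicates centered at the estimates.
omnibusBootSketch <- function(est, boots, denDegFree) {
  m <- length(est)
  se <- apply(boots, 1, sd)                     # bootstrap standard errors
  ksDist <- function(p) max(abs(sort(p) - (seq_len(m) - 0.5) / m))
  pObs  <- 2 * pt(-abs(est / se), df = denDegFree)
  pNull <- 2 * pt(-abs((boots - est) / se), df = denDegFree)
  ksObs  <- ksDist(pObs)
  ksNull <- apply(pNull, 2, ksDist)             # one distance per replicate
  mean(ksNull >= ksObs)
}

set.seed(1)
m <- 200; R <- 50
est <- rnorm(m, mean = 0.5, sd = 0.1)           # toy estimates (made up)
boots <- est + matrix(rnorm(m * R, sd = 0.1), m, R)
pOmnibus <- omnibusBootSketch(est, boots, denDegFree = 20)
pOmnibus
```

Because every replicate uses all features jointly, correlation among features is carried into the null distribution of the distance rather than being assumed away.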
Bootstrap generation procedure for reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. This version accounts for paired data (e.g. twin data).
PairsBootOneRefFreeEwasModel(mod, pairID)

| Argument | Description |
|---|---|
| mod | Model object of class RefFreeEwasModel (generated with smallOutput=FALSE). |
| pairID | Pair IDs (one unique value per pair). |

Generates one bootstrapped data set for the reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. This version facilitates the estimation of robust standard errors to account for paired data (e.g. twin data) using a strategy similar to that employed by Generalized Estimating Equations (GEEs). Specifically, in bootstrapping the errors, the pairs are sampled rather than individual arrays. Typically not run by user.
A matrix representing a bootstrap sample of a DNA methylation assay matrix.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
See also: BootRefFreeEwasModel, BootOneRefFreeEwasModel.
Bootstrap procedure for reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. This version accounts for paired data (e.g. twin data).
PairsBootRefFreeEwasModel(mod, nboot, pairID)

| Argument | Description |
|---|---|
| mod | Model object of class RefFreeEwasModel (generated with smallOutput=FALSE). |
| nboot | Number of bootstrap samples to generate. |
| pairID | Pair IDs (one unique value per pair). |

Generates the bootstrap samples for the reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. This paired version facilitates the estimation of robust standard errors to account for paired data (e.g. twin data) using a strategy similar to that employed by Generalized Estimating Equations (GEEs). Specifically, in bootstrapping the errors, the pairs are sampled rather than individual arrays. An error will be generated unless each cluster has exactly two members (i.e. exactly two observations correspond to the same unique ID given in pairID).
An array object of class “BootRefFreeEwasModel”. Bootstraps are generated for both Beta and Bstar.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
See also: RefFreeEwasModel, BootRefFreeEwasModel.
    data(RefFreeEWAS)
    ## Not run:
    tmpDesign <- cbind(1, rfEwasExampleCovariate)
    # Unadjusted least-squares coefficients (expression truncated in the source; reconstructed here)
    tmpBstar <- rfEwasExampleBetaValues %*% tmpDesign %*% solve(t(tmpDesign) %*% tmpDesign)
    EstDimRMT(rfEwasExampleBetaValues - tmpBstar %*% t(tmpDesign))
    ## End(Not run)
    test <- RefFreeEwasModel(
      rfEwasExampleBetaValues,
      cbind(1, rfEwasExampleCovariate),
      4)
    testBoot <- BootRefFreeEwasModel(test, 10)
    summary(testBoot)
Print method for objects of type BootRefFreeEwasModel
## S3 method for class 'BootRefFreeEwasModel'
print(x,...)

| Argument | Description |
|---|---|
| x | BootRefFreeEwasModel object to print |
| ... | (Unused). |

See RefFreeEwasModel for example.
Print method for objects of type RefFreeCellMix
## S3 method for class 'RefFreeCellMix'
print(x,...)

| Argument | Description |
|---|---|
| x | RefFreeCellMix object to print |
| ... | (Unused). |

See RefFreeCellMix for example.
Print method for objects of type RefFreeEwasModel
## S3 method for class 'RefFreeEwasModel'
print(x,...)

| Argument | Description |
|---|---|
| x | RefFreeEwasModel object to print |
| ... | (Unused). |

See RefFreeEwasModel for example.
Print method for objects of type summaryBootRefFreeEwasModel
## S3 method for class 'summaryBootRefFreeEwasModel'
print(x,...)

| Argument | Description |
|---|---|
| x | summaryBootRefFreeEwasModel object to print |
| ... | (Unused). |

See RefFreeEwasModel for example.
Constrained linear projection for estimating cell mixture or related coefficients.
projectMix(Y, Xmat, nonnegative=TRUE, sumLessThanOne=TRUE, lessThanOne=!sumLessThanOne)

| Argument | Description |
|---|---|
| Y | Matrix (m CpGs x n Subjects) of DNA methylation beta values |
| Xmat | Matrix (m CpGs x K cell types) of cell-type specific methylomes |
| nonnegative | All coefficients >=0? |
| sumLessThanOne | Coefficient rows should sum to less than one? |
| lessThanOne | Every value should be less than one (but possibly sum to value greater than one)? |

Function for projecting methylation values (Y) onto the space of methylomes (Xmat), with various constraints. This is the reference-based method described in Houseman et al. (2012) and also appearing in the minfi package.
Projection coefficients resulting from constrained projection
E. Andres Houseman
Houseman EA, Accomando WP et al. DNA methylation arrays as surrogate measures of cell mixture distribution, BMC Bioinformatics, 2012.
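A constrained projection for a single sample can be sketched with base R's `optim`. The sketch covers only the simplest case, in which every coefficient is constrained to lie in [0, 1] (nonnegative=TRUE with lessThanOne=TRUE); the sum-to-less-than-one constraint needs a general quadratic programming solver and is not shown. This is not the package's implementation, and the function name and toy data are hypothetical.

```r
# Hypothetical sketch (not the package code): least-squares projection of one
# sample onto known methylomes with each coefficient boxed into [0, 1].
projectBoxSketch <- function(y, Xmat) {
  rss  <- function(w) sum((y - Xmat %*% w)^2)
  grad <- function(w) as.vector(-2 * crossprod(Xmat, y - Xmat %*% w))
  fit <- optim(rep(1 / ncol(Xmat), ncol(Xmat)), rss, grad,
               method = "L-BFGS-B", lower = 0, upper = 1)
  fit$par
}

set.seed(1)
Xmat <- matrix(runif(100 * 3), 100, 3)   # toy methylomes: 100 CpGs x 3 types
wTrue <- c(0.6, 0.3, 0.1)                # toy mixing weights (made up)
y <- as.vector(Xmat %*% wTrue) + rnorm(100, sd = 0.01)
wHat <- projectBoxSketch(y, Xmat)
round(wHat, 2)
```

Applying the same projection to every column of Y would give one row of coefficients per subject.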
Reference-free cell-mixture decomposition of DNA methylation data set
RefFreeCellMix(Y,mu0=NULL,K=NULL,iters=10,Yfinal=NULL,verbose=TRUE)

| Argument | Description |
|---|---|
| Y | Matrix (m CpGs x n Subjects) of DNA methylation beta values |
| mu0 | Matrix (m CpGs x K cell types) of *initial* cell-type specific methylomes |
| K | Number of cell types (ignored if mu0 is provided) |
| iters | Number of iterations to execute |
| Yfinal | Matrix (m* CpGs x n Subjects) of DNA methylation beta values on which to base final methylomes |
| verbose | Report summary of errors after each iteration? |

Reference-free decomposition of DNA methylation matrix into cell-type distributions and cell-type methylomes, Y = Mu Omega^T. Either an initial estimate of Mu must be provided, or else the number of cell types K, in which case RefFreeCellMixInitialize will be used to initialize. Note that the decomposition will be based on Y, but Yfinal (=Y by default) will be used to determine the final value of Mu based on the last iterated value of Omega.
Object of S3 class RefFreeCellMix, containing the last iteration of Mu and Omega.
E. Andres Houseman
Houseman, E. Andres, Kile, Molly L., Christiani, David C., et al. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC bioinformatics, 2016, vol. 17, no 1, p. 259.
    data(HNSCC)
    # Typical use
    Y.shortTest <- Y.HNSCC.averageBetas[1:500,]
    Y.shortTest.final <- Y.HNSCC.averageBetas[1:1000,]
    testArray1 <- RefFreeCellMixArray(Y.shortTest, Klist=1:3, iters=5, Yfinal=Y.shortTest.final)
    testArray1
    lapply(testArray1, summary)
    sapply(testArray1, deviance, Y=Y.shortTest.final)
    # Example with explicit initialization
    testKeq2 <- RefFreeCellMix(Y.shortTest, mu0=RefFreeCellMixInitialize(Y.shortTest, K=2))
    testKeq2
    head(testKeq2$Mu)
    head(testKeq2$Omega)
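The alternating structure behind a decomposition Y = Mu Omega^T can be illustrated in base R. This sketch is not the package's algorithm: it imposes the constraints crudely by clipping unconstrained least-squares solutions to [0, 1], whereas a faithful version would solve a constrained least-squares problem at each step. All names and the simulated data are hypothetical.

```r
# Hypothetical sketch (not the package code): alternate between updating the
# mixing weights Omega and the methylomes Mu in Y ~ Mu %*% t(Omega).
clip01 <- function(A) pmin(pmax(A, 0), 1)

altDecompSketch <- function(Y, mu0, iters = 10) {
  Mu <- mu0
  for (i in seq_len(iters)) {
    Omega <- clip01(t(qr.solve(Mu, Y)))        # n x K weights given Mu
    Mu    <- clip01(t(qr.solve(Omega, t(Y))))  # m x K methylomes given Omega
  }
  list(Mu = Mu, Omega = Omega)
}

set.seed(1)
m <- 200; n <- 30
MuTrue <- matrix(runif(m * 2), m, 2)           # toy methylomes (made up)
w <- runif(n)
OmegaTrue <- cbind(w, 1 - w)                   # toy mixing weights (made up)
Y <- clip01(MuTrue %*% t(OmegaTrue) + matrix(rnorm(m * n, sd = 0.02), m, n))
mu0 <- clip01(MuTrue + matrix(rnorm(m * 2, sd = 0.1), m, 2))  # perturbed start

fit <- altDecompSketch(Y, mu0, iters = 5)
rmse <- sqrt(mean((Y - fit$Mu %*% t(fit$Omega))^2))  # root-mean-square residual
rmse
```

Each half-step holds one factor fixed and refits the other, which is why a sensible starting value for Mu matters.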
Array of reference-free cell-mixture decompositions of a DNA methylation data set
RefFreeCellMixArray(Y,Klist=1:5,iters=10,Yfinal=NULL,verbose=FALSE, dist.method = "euclidean",...)

| Argument | Description |
|---|---|
| Y | Matrix (m CpGs x n Subjects) of DNA methylation beta values |
| Klist | List of K values (each K = assumed number of cell types) |
| iters | Number of iterations to execute for each value of K |
| Yfinal | Matrix (m* CpGs x n Subjects) of DNA methylation beta values on which to base final methylomes |
| verbose | Report summary of errors after each iteration for each fit? |
| dist.method | Method for calculating distance matrix for methylome initialization |
| ... | Additional parameters for hclust function for methylome initialization |

List of reference-free decompositions for a range of K values. For each value of K, the decomposition is initialized by hierarchical clustering as specified by the parameters dist.method, etc. Note that for each K, the decomposition will be based on Y, but Yfinal (=Y by default) will be used to determine the final value of Mu based on the last iterated value of Omega.
List, each element is an object of S3 class RefFreeCellMix, containing the last iteration of Mu and Omega.
E. Andres Houseman
Houseman, E. Andres, Kile, Molly L., Christiani, David C., et al. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC bioinformatics, 2016, vol. 17, no 1, p. 259.
    data(HNSCC)
    Y.shortTest <- Y.HNSCC.averageBetas[1:500,]
    testArray2 <- RefFreeCellMixArray(Y.shortTest, Klist=1:5, iters=5)
    sapply(testArray2, deviance, Y=Y.shortTest)
    ## Not run:
    testBootDevs <- RefFreeCellMixArrayDevianceBoots(testArray2, Y.shortTest, R=10)
    testBootDevs
    apply(testBootDevs[-1,], 2, mean, trim=0.25)
    which.min(apply(testBootDevs[-1,], 2, mean, trim=0.25))
    ## End(Not run)
Vector of bootstrapped deviances corresponding to an array of reference-free cell-mixture decompositions
RefFreeCellMixArrayDevianceBoot(rfArray, Y, EPSILON=1E-9, bootstrapIterations=5)

| Argument | Description |
|---|---|
| rfArray | List of RefFreeCellMix objects (e.g. from RefFreeCellMixArray) |
| Y | Methylation matrix on which the decompositions in rfArray were based |
| EPSILON | Minimum value of variance (zero variances will be reset to this value) |
| bootstrapIterations | Number of RefFreeCellMix iterations to use in bootstrap |

Vector of bootstrapped deviances corresponding to an array of reference-free cell-mixture decompositions, used to determine the optimal number of cell types. This function returns one bootstrapped vector. See RefFreeCellMixArrayDevianceBoots for more than one bootstrapped vector.
The bootstrapped deviance is based on a normal distribution applied to the errors of Y after accounting for the cell mixture effect, Mu Omega^T.
See RefFreeCellMixArray for example.
Matrix of bootstrapped deviances corresponding to an array of reference-free cell-mixture decompositions
RefFreeCellMixArrayDevianceBoots(rfArray, Y, R=5, EPSILON=1E-9, bootstrapIterations=5)

| Argument | Description |
|---|---|
| rfArray | List of RefFreeCellMix objects (e.g. from RefFreeCellMixArray) |
| Y | Methylation matrix on which the decompositions in rfArray were based |
| R | Number of bootstrapped vectors to return |
| EPSILON | Minimum value of variance (zero variances will be reset to this value) |
| bootstrapIterations | Number of RefFreeCellMix iterations to use in bootstrap |

Matrix (multiple vectors) of bootstrapped deviances corresponding to an array of reference-free cell-mixture decompositions, used to determine the optimal number of cell types. This function returns R bootstrapped vectors; see RefFreeCellMixArrayDevianceBoot for a single bootstrapped vector.
The bootstrapped deviance is based on a normal distribution applied to the errors of Y after accounting for the cell mixture effect, Mu Omega^T.
See RefFreeCellMixArray for example.
Array of reference-free cell-mixture decompositions of a DNA methylation data set, with custom initialization
RefFreeCellMixArrayWithCustomStart(Y,mu.start,Klist=1:5,iters=10, Yfinal=NULL,verbose=FALSE)

| Argument | Description |
|---|---|
| Y | Matrix (m CpGs x n Subjects) of DNA methylation beta values |
| mu.start | Matrix of starting values for Mu: number of columns must be at least the maximum in Klist |
| Klist | List of K values (each K = assumed number of cell types) |
| iters | Number of iterations to execute for each value of K |
| Yfinal | Matrix (m* CpGs x n Subjects) of DNA methylation beta values on which to base final methylomes |
| verbose | Report summary of errors after each iteration for each fit? |

List of Reference-free decompositions for a range of K values. For each value of K, the decomposition is initialized by using the first K columns of mu.start. Note that for each K, the decomposition will be based on Y, but Yfinal (=Y by default) will be used to determine the final value of Mu based on the last iterated value of Omega.
List, each element is an object of S3 class RefFreeCellMix, containing the last iteration of Mu and Omega.
E. Andres Houseman
Houseman, E. Andres, Kile, Molly L., Christiani, David C., et al. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC bioinformatics, 2016, vol. 17, no 1, p. 259.
See also: RefFreeCellMix, RefFreeCellMixInitializeBySVD.
Initializes the methylome matrix "Mu" for RefFreeCellMix
RefFreeCellMixInitialize(Y,K=2,Y.Distance=NULL, Y.Cluster=NULL, largeOK=FALSE, dist.method = "euclidean", ...)

| Argument | Description |
|---|---|
| Y | Matrix (m CpGs x n Subjects) of DNA methylation beta values |
| K | Number of cell types |
| Y.Distance | Distance matrix (object of class "dist") to use for clustering. |
| Y.Cluster | Hierarchical clustering object (from hclust function) |
| largeOK | OK to calculate distance matrix for large number of subjects? (See details.) |
| dist.method | Method for calculating distance matrix |
| ... | Additional parameters for hclust function |

Initializes the methylome matrix "Mu" for RefFreeCellMix by computing the mean methylation (from Y) over K clusters of Y, determined by the Y.Cluster object. If Y.Cluster object does not exist, it will be created from Y.Distance (using additional clustering parameters if supplied). If Y.Distance does not exist, it will be created from t(Y). As a protection against attempting to fit a very large distance matrix, the program will stop if the number of columns of Y is > 2500, unless largeOK is explicitly set to TRUE.
An m x K matrix of mean methylation values.
E. Andres Houseman
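The cluster-mean initialization described above can be sketched with base R's `dist`, `hclust` and `cutree`. This is an illustration rather than the package's code: it omits the Y.Distance and Y.Cluster shortcuts and the largeOK safeguard, and the function name and simulated data are hypothetical.

```r
# Hypothetical sketch (not the package code): initialize Mu as the mean
# methylation profile within each of K clusters of subjects.
initMuSketch <- function(Y, K = 2, dist.method = "euclidean", ...) {
  d  <- dist(t(Y), method = dist.method)   # distances between subjects
  hc <- hclust(d, ...)                     # extra arguments go to hclust
  cl <- cutree(hc, k = K)                  # cluster label per subject
  sapply(seq_len(K), function(k) rowMeans(Y[, cl == k, drop = FALSE]))
}

set.seed(1)
Y <- cbind(matrix(rbeta(50 * 5, 2, 8), 50, 5),   # toy group, low methylation
           matrix(rbeta(50 * 5, 8, 2), 50, 5))   # toy group, high methylation
mu0 <- initMuSketch(Y, K = 2)
dim(mu0)
```

The distance matrix grows with the square of the number of subjects, which is the reason for the size safeguard mentioned in the details.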
Initialize Reference-Free Cell Mixture Projection by SVD
RefFreeCellMixInitializeBySVD(Y, type=1)

| Argument | Description |
|---|---|
| Y | Matrix (m CpGs x n Subjects) of DNA methylation beta values |
| type | See details |

This method initializes the reference-free cell mixture deconvolution using an ad-hoc method based on singular value decomposition. Type=1 will attempt to discretize Mu to 0/1, Type=2 will attempt to find a continuous range using column ranks. However, neither of these strategies is guaranteed to result in stable starting values for K larger than the "true" value of K.
Matrix of starting values for Mu.
E. Andres Houseman
See also: RefFreeCellMix, RefFreeCellMixArrayWithCustomStart.
    data(HNSCC)
    Y.shortTest <- Y.HNSCC.averageBetas[1:500,]
    mu.start.svd <- RefFreeCellMixInitializeBySVD(Y.shortTest)
    testArray2 <- RefFreeCellMixArrayWithCustomStart(Y.shortTest, mu.start=mu.start.svd, Klist=1:3, iters=5)
    sapply(testArray2, deviance, Y=Y.shortTest)
    ## Not run:
    testBootDevs <- RefFreeCellMixArrayDevianceBoots(testArray2, Y.shortTest, R=10)
    testBootDevs
    apply(testBootDevs[-1,], 2, mean, trim=0.25)
    which.min(apply(testBootDevs[-1,], 2, mean, trim=0.25))
    ## End(Not run)
Reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types.
RefFreeEwasModel(Y, X, K, smallOutput=FALSE)

| Argument | Description |
|---|---|
| Y | Matrix of DNA methylation beta values (CpGs x subjects). Missing values *are* supported. |
| X | Design matrix (subjects x covariates). |
| K | Latent variable dimension (d in Houseman et al., 2013, technical report) |
| smallOutput | Smaller output? (Should be FALSE if you intend to run bootstraps.) |

Reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. This method is similar to surrogate variable analysis (SVA and ISVA), except that it makes additional use of a biological mixture assumption. Returns mixture-adjusted Beta and unadjusted Bstar, as well as estimates of various latent quantities.
A list object of class “RefFreeEwasModel”. The most important elements are Beta and Bstar.
E. Andres Houseman
Houseman EA, Molitor J, and Marsit CJ (2014), Reference-Free Cell Mixture Adjustments in Analysis of DNA Methylation Data. Bioinformatics, doi: 10.1093/bioinformatics/btu029.
    data(RefFreeEWAS)
    ## Not run:
    tmpDesign <- cbind(1, rfEwasExampleCovariate)
    # Unadjusted least-squares coefficients (expression truncated in the source; reconstructed here)
    tmpBstar <- rfEwasExampleBetaValues %*% tmpDesign %*% solve(t(tmpDesign) %*% tmpDesign)
    # Estimate the latent dimension from the residual matrix
    EstDimRMT(rfEwasExampleBetaValues - tmpBstar %*% t(tmpDesign))
    ## End(Not run)
    test <- RefFreeEwasModel(
      rfEwasExampleBetaValues,
      cbind(1, rfEwasExampleCovariate),
      4)
    testBoot <- BootRefFreeEwasModel(test, 10)
    summary(testBoot)
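The unadjusted fit and the residual matrix used for dimension estimation can be illustrated on simulated data. The sketch assumes that the unadjusted coefficients are ordinary least-squares estimates computed CpG by CpG; it does not show the mixture adjustment itself, and all object names and data are hypothetical.

```r
# Sketch of the unadjusted fit only (assumed to be ordinary least squares);
# the mixture adjustment performed by the package is not shown.
set.seed(1)
n <- 40; m <- 5
x <- rnorm(n)
X <- cbind(1, x)                                      # design: intercept + covariate
Y <- matrix(rnorm(m * n), m, n) + outer(1:m / 10, x)  # toy data, CpGs x subjects

Bstar <- Y %*% X %*% solve(crossprod(X))  # m x 2 least-squares coefficients
Rmat  <- Y - Bstar %*% t(X)               # residuals, the kind of input EstDimRMT takes

# The same coefficients from lm() for the first CpG
coef(lm(Y[1, ] ~ x))
Bstar[1, ]
```

Writing the fit as a single matrix product handles all CpGs at once, which matters when Y has hundreds of thousands of rows.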
1000 CpG sites x 250 subjects. First 250 CpGs are DMRs for the cell types, although the idea is that this would not be known in practice.
rfEwasExampleBetaValues
1000 CpG sites x 250 subjects.
Vector of covariates corresponding to 250 subjects.
rfEwasExampleCovariate
Numeric vector of length 250
1000 intercept values; these may not match exactly due to cell mixtures.
rfEwasExampleTRUEAlpha
1000 intercept values.
1000 coefficient values
rfEwasExampleTRUEBeta
1000 coefficient values.
1000 x 4 matrix of beta values
rfEwasExampleTRUEMethDMR
1000 x 4 matrix
250 x 4 matrix of mixing weights
rfEwasExampleTRUEOmega
250 x 4 matrix
Summary method for objects of type BootRefFreeEwasModel; calculates bootstrap mean and standard deviation.
## S3 method for class 'BootRefFreeEwasModel'
summary(object,...)

| Argument | Description |
|---|---|
| object | BootRefFreeEwasModel object to summarize |
| ... | (Unused). |

See RefFreeEwasModel for example.
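Bootstrap means and standard deviations over an array of replicates can be computed with `apply`. The sketch below uses a made-up array whose layout (features x covariates x "B"/"B*" x replicates) is inferred from the indexing used in the package examples; it is not the package's summary method.

```r
# Toy array shaped like the bootstrap output indexed in the examples:
# features x covariates x c("B", "B*") x bootstrap replicates (values made up).
set.seed(1)
boot <- array(rnorm(6 * 2 * 2 * 10), dim = c(6, 2, 2, 10),
              dimnames = list(NULL, c("Intercept", "x"), c("B", "B*"), NULL))

bootMean <- apply(boot, 1:3, mean)   # average over replicates
bootSd   <- apply(boot, 1:3, sd)     # bootstrap standard error

dim(bootMean)
bootSd[1:2, "x", "B"]
```

The standard deviation across replicates serves as the standard error of each coefficient.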
Summary method for objects of type RefFreeCellMix.
## S3 method for class 'RefFreeCellMix'
summary(object,...)

| Argument | Description |
|---|---|
| object | RefFreeCellMix object to summarize |
| ... | (Unused). |

See RefFreeCellMix for example.
SVD that traps errors and switches to QR when necessary
svdSafe(X)

| Argument | Description |
|---|---|
| X | Matrix to decompose |

This function traps errors in the svd function due to numerically zero singular values, and replaces the operation with a QR decomposition. Technically, the R component of the decomposition fails the orthogonality constraint required for the SVD, but this function exists to save bootstraps from rudely failing; since the critical component of the SVD (in this application) is the left orthogonal matrix, this is a reasonable approximation for bootstrap purposes. If there are too many svd failures (which will be reported by the function), then it is worth looking into the design matrix.
A list as in what svd produces: U and V matrices as well as the d vector of singular values.
E. Andres Houseman
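The error-trapping pattern can be sketched with `tryCatch`. This is not the package's code: the way the QR factors are mapped onto the d, u and v components is an assumption, and the `svdFun` argument is a hypothetical addition that lets the fallback be exercised by passing a function that always fails.

```r
# Hypothetical sketch (not the package code): try svd() and fall back to a
# QR decomposition if it fails.
svdSafeSketch <- function(X, svdFun = svd) {
  tryCatch(svdFun(X), error = function(e) {
    message("svd failed; using QR instead")
    q <- qr(X)
    # Q plays the role of the left orthogonal matrix; t(R) stands in for V
    # even though it is not orthogonal.
    list(d = rep(1, ncol(X)), u = qr.Q(q), v = t(qr.R(q)))
  })
}

set.seed(1)
X <- matrix(rnorm(12), 4, 3)
regular  <- svdSafeSketch(X)
fallback <- svdSafeSketch(X, svdFun = function(x) stop("forced failure"))
crossprod(fallback$u)   # close to the identity matrix
```

Since the left matrix remains orthonormal under the fallback, code that relies only on that component can continue without interruption.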
Compute singular value decomposition on a matrix with missing values, using a naive/simple method for imputing missing values by row-mean
SVDwithMissing(Y)

| Argument | Description |
|---|---|
| Y | Matrix for which to compute SVD. |

Computes singular value decomposition on a matrix with missing values, using a naive/simple method for imputing missing values by row-mean. Not recommended for matrices with very large numbers of missing values.
Singular value decomposition (as returned by the svd function).
Case status (0=control, 1=case) and age (as Z-score) for HNSCC data set
X.HNSCC.caseStatusAge
Numeric matrix of dimension 182 x 3
Peripheral blood from 92 head and neck squamous cell carcinoma (HNSCC) patients and 92 controls. GEO Accession #GSE32393 with 2 outlier cases removed.
Y.HNSCC.averageBetas
Numeric matrix of dimension 26486 by 182