Bayesian Multiple Regression with Mixture-of-Normals Prior (RSS)
Source: R/mr.ash.rss.R
mr.ash.rss.Rd

This function performs Bayesian multiple regression with a mixture-of-normals prior using summary statistics (RSS: Regression with Summary Statistics). It uses a C++ implementation for efficient computation.
Arguments
- bhat
Numeric vector of observed effect sizes (standardized).
- shat
Numeric vector of standard errors of effect sizes.
- R
Numeric correlation matrix of the predictors.
- var_y
Numeric value of the variance of the outcome. If NULL, it is set to Inf (effects on standardized scale).
- n
Integer value of the sample size.
- s0
Numeric vector of prior variances for the mixture components.
- w0
Numeric vector of prior weights for the mixture components.
- sigma2_e
Numeric value of the initial error variance estimate. If NULL (default), initialized to var_y (matching the mr.ash behavior of using the residual variance with zero initialization), or to 1 when var_y = Inf.
- mu1_init
Numeric vector of initial values for the posterior mean of the coefficients. Default is numeric(0) (initialize to zero).
- tol
Numeric value of the convergence tolerance. Default is 1e-8.
- max_iter
Integer value of the maximum number of iterations. Default is 1e5.
- z
Numeric vector of Z-scores. If not provided, computed as bhat / shat.
- update_w0
Logical value indicating whether to update the mixture weights. Default is TRUE.
- update_sigma
Logical value indicating whether to update the error variance. Default is TRUE.
- compute_ELBO
Logical value indicating whether to compute the Evidence Lower Bound (ELBO). Default is TRUE.
- standardize
Logical value indicating whether to standardize the input data. Default is FALSE.
- ncpu
An integer specifying the number of CPU cores to use for parallel computation. Default is 1.
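For orientation, a minimal call might look as follows. This is a sketch, not a canonical invocation: it assumes bhat, shat, R, and n are already in the workspace, and the s0 and w0 grids shown here are illustrative choices rather than package defaults. All other arguments take the defaults described above, with z computed internally as bhat / shat.

```r
# Minimal sketch (assumes bhat, shat, R, n exist);
# s0 and w0 below are illustrative mixture grids, not defaults.
fit <- mr.ash.rss(bhat, shat, R = R, n = n,
                  s0 = c(0.01, 0.1, 1),
                  w0 = rep(1 / 3, 3))
```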
Value
A list containing the following components:
- beta
Numeric vector of posterior mean coefficients (same as mu1).
- sigma2
Numeric value of the residual variance (same as sigma2_e).
- pi
Numeric vector of mixture weights (same as w0).
- iter
Integer, number of iterations performed.
- varobj
Numeric vector of ELBO values per iteration.
- mu1
Numeric vector of the posterior mean of the coefficients.
- sigma2_1
Numeric vector of the posterior variance of the coefficients.
- w1
Numeric matrix of the posterior assignment probabilities.
- sigma2_e
Numeric value of the error variance.
- w0
Numeric vector of the mixture weights.
- ELBO
Numeric value of the Evidence Lower Bound (if
compute_ELBO = TRUE).
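The returned object is a plain list, so its components can be inspected with the usual accessors. The snippet below assumes `out` holds the result of a prior mr.ash.rss() call:

```r
# Inspect a fitted object `out` returned by mr.ash.rss()
out$beta             # posterior mean coefficients (same as out$mu1)
out$sigma2_e         # final error variance estimate
out$w1[1, ]          # posterior assignment probabilities for predictor 1
tail(out$varobj, 1)  # ELBO at the last iteration (if compute_ELBO = TRUE)
```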
Examples
# Generate example data
set.seed(985115)
n <- 350
p <- 16
sigmasq_error <- 0.5
zeroes <- rbinom(p, 1, 0.6) == 1
beta.true <- rnorm(p, 1, sd = 4)
beta.true[zeroes] <- 0
X <- matrix(rnorm(n * p), nrow = n)
X <- scale(X, center = TRUE, scale = FALSE)
y <- X %*% matrix(beta.true, ncol = 1) + rnorm(n, 0, sqrt(sigmasq_error))
y <- scale(y, center = TRUE, scale = FALSE)
# Set the prior
K <- 9
sigma0 <- c(0.001, .1, .5, 1, 5, 10, 20, 30, .005)
omega0 <- rep(1 / K, K)
# Calculate summary statistics
b.hat <- sapply(1:p, function(j) {
summary(lm(y ~ X[, j]))$coefficients[-1, 1]
})
s.hat <- sapply(1:p, function(j) {
summary(lm(y ~ X[, j]))$coefficients[-1, 2]
})
R.hat <- cor(X)
var_y <- as.numeric(var(y))
sigmasq_init <- 1.5
# Run mr.ash.rss
out <- mr.ash.rss(b.hat, s.hat,
R = R.hat, var_y = var_y, n = n,
sigma2_e = sigmasq_init, s0 = sigma0, w0 = omega0,
mu1_init = rep(0, ncol(X)), tol = 1e-8, max_iter = 1e5,
update_w0 = TRUE, update_sigma = TRUE, compute_ELBO = TRUE,
standardize = FALSE
)
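As a quick sanity check on the fit above, the posterior means in out$beta can be compared against the simulated coefficients (a visual check, assuming the run converged):

```r
# Compare posterior mean estimates with the true simulated effects
print(cbind(true = beta.true, estimate = out$beta))
plot(beta.true, out$beta,
     xlab = "true coefficient", ylab = "posterior mean")
abline(0, 1, lty = 2)  # reference line: perfect recovery
```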