Skip to contents

Performs a Bayesian multiple linear regression of Y on X. That is, this function fits the regression model $$Y = \sum_l X b_l + e,$$ where the elements of \(e\) are i.i.d. normal with zero mean and variance residual_variance, and the sum \(\sum_l b_l\) is a vector of p effects to be estimated. The SuSiE assumption is that each \(b_l\) has exactly one non-zero element.

Usage

mvsusie(
  X,
  Y,
  L = 10,
  prior_variance = 0.2,
  residual_variance = NULL,
  prior_weights = NULL,
  standardize = TRUE,
  intercept = TRUE,
  estimate_residual_variance = TRUE,
  estimate_prior_variance = TRUE,
  estimate_prior_method = "optim",
  estimate_prior_mixture_weights = TRUE,
  mixture_weight_method = "mixsqp",
  check_null_threshold = 0,
  prior_tol = 1e-09,
  model_init = NULL,
  missing_y_method = "approximate",
  coverage = 0.95,
  min_abs_corr = 0.5,
  compute_univariate_zscore = FALSE,
  precompute_cache = TRUE,
  n_thread = 1,
  max_iter = 100,
  tol = 1e-04,
  verbose = TRUE,
  convergence_method = NULL,
  track_fit = FALSE,
  min_outcome_lbf = 0,
  L_greedy = NULL,
  greedy_lbf_cutoff = 0.1
)

mvsusie_rss(
  Z,
  R,
  N,
  Bhat,
  Shat,
  varY,
  prior_variance = 0.2,
  residual_variance = NULL,
  estimate_residual_variance = FALSE,
  ...
)

mvsusie_ss(
  XtX,
  XtY,
  YtY,
  N,
  L = 10,
  X_colmeans = NULL,
  Y_colmeans = NULL,
  prior_variance = 0.2,
  residual_variance = NULL,
  prior_weights = NULL,
  standardize = TRUE,
  estimate_residual_variance = TRUE,
  estimate_prior_variance = TRUE,
  estimate_prior_method = "optim",
  estimate_prior_mixture_weights = TRUE,
  mixture_weight_method = "mixsqp",
  check_null_threshold = 0,
  prior_tol = 1e-09,
  precompute_cache = TRUE,
  model_init = NULL,
  coverage = 0.95,
  min_abs_corr = 0.5,
  n_thread = 1,
  max_iter = 100,
  tol = 1e-04,
  verbose = TRUE,
  track_fit = FALSE,
  min_outcome_lbf = 0,
  L_greedy = NULL,
  greedy_lbf_cutoff = 0.1
)

mvsusie_core(
  X,
  Y,
  L = 10,
  prior_variance = 0.2,
  residual_variance = NULL,
  prior_weights = NULL,
  standardize = TRUE,
  intercept = TRUE,
  estimate_residual_variance = TRUE,
  estimate_prior_variance = TRUE,
  estimate_prior_method = "optim",
  estimate_prior_mixture_weights = TRUE,
  mixture_weight_method = "mixsqp",
  check_null_threshold = 0,
  prior_tol = 1e-09,
  model_init = NULL,
  missing_y_method = "approximate",
  coverage = 0.95,
  min_abs_corr = 0.5,
  compute_univariate_zscore = FALSE,
  precompute_cache = TRUE,
  n_thread = 1,
  max_iter = 100,
  tol = 1e-04,
  verbose = TRUE,
  convergence_method = NULL,
  track_fit = FALSE,
  min_outcome_lbf = 0,
  L_greedy = NULL,
  greedy_lbf_cutoff = 0.1,
  attach_lbf_variable_outcome = TRUE
)

mvsusie_ss_core(
  XtX,
  XtY,
  YtY,
  N,
  L = 10,
  X_colmeans = NULL,
  Y_colmeans = NULL,
  prior_variance = 0.2,
  residual_variance = NULL,
  prior_weights = NULL,
  standardize = TRUE,
  estimate_residual_variance = TRUE,
  estimate_prior_variance = TRUE,
  estimate_prior_method = "optim",
  estimate_prior_mixture_weights = TRUE,
  mixture_weight_method = "mixsqp",
  check_null_threshold = 0,
  prior_tol = 1e-09,
  model_init = NULL,
  coverage = 0.95,
  min_abs_corr = 0.5,
  precompute_cache = TRUE,
  n_thread = 1,
  max_iter = 100,
  tol = 1e-04,
  verbose = TRUE,
  track_fit = FALSE,
  min_outcome_lbf = 0,
  L_greedy = NULL,
  greedy_lbf_cutoff = 0.1
)

Arguments

X

N by J matrix of covariates.

Y

Vector of length N, or N by R matrix of response variables.

L

Maximum number of non-zero effects.

prior_variance

Can be either (1) a vector of length L, or a scalar, for scaled prior variance when Y is univariate (which should then be equivalent to susie); or (2) a matrix for a simple multivariate regression; or (3) a mixture prior from create_mixture_prior.

residual_variance

The residual variance

prior_weights

A vector of length p giving the prior probability that each element is non-zero. Note that the prior weights need to be non-negative but do not need to sum to 1; they will automatically be normalized to sum to 1 so that they represent probabilities. The default setting is that the prior weights are the same for all variables.

standardize

Logical flag specifying whether to standardize columns of X to unit variance prior to fitting. If you do not standardize you may need to think more carefully about specifying the scale of the prior variance. Whatever the value of standardize, the coefficients (returned by coef) are for X on the original input scale. Note that any column of X with zero variance is not standardized, but left as is.

intercept

Should intercept be fitted or set to zero. Setting intercept = FALSE is generally not recommended.

estimate_residual_variance

When estimate_residual_variance = TRUE, the residual variance is estimated at each iteration using \(E_q[R'R] / n\); otherwise it is fixed. For multivariate Y the estimate is a full \(r \times r\) covariance matrix. Supported for all missing data methods: the update formula uses expected sufficient statistics (not the ELBO value), and the impute method includes a Y_cov correction for imputation uncertainty. Defaults to TRUE for mvsusie(), and FALSE for mvsusie_ss() and mvsusie_rss().

estimate_prior_variance

When estimate_prior_variance = TRUE, the prior variance is estimated; otherwise it is fixed. Currently estimate_prior_variance = TRUE only works for univariate Y, or for multivariate Y when the prior variance is a matrix.

estimate_prior_method

The method used for estimating the prior variance; valid choices are "optim", "EM", or "uniroot".

estimate_prior_mixture_weights

When TRUE and prior_variance is a mixture prior, the mixture weights are updated at each iteration. Components with near-zero weight are pruned.

mixture_weight_method

Method for updating mixture weights; "mixsqp" or "EM".

check_null_threshold

When the prior variance is estimated, the estimate is compared against the null, and the prior variance is set to zero unless the log-likelihood using the estimate is larger than that of null by this threshold. For example, setting check_null_threshold = 0.1 will “nudge” the estimate towards zero. When used with estimate_prior_method = "EM", setting check_null_threshold = NA will skip this check.

prior_tol

When the prior variance is estimated, compare the estimated value to this value at the end of the analysis and exclude a single effect from PIP computation if the estimated prior variance is smaller than it.

model_init

A previous model fit with which to initialize.

missing_y_method

Method for handling missing values in Y; "approximate" or "exact".

coverage

Coverage of credible sets.

min_abs_corr

Minimum of absolute value of correlation allowed in a credible set. The setting min_abs_corr = 0.5 corresponds to squared correlation of 0.25, which is a commonly used threshold for genotype data in genetics studies.

compute_univariate_zscore

When compute_univariate_zscore = TRUE, the z-scores from the per-variable univariate regressions are returned. (Note that these z-scores are not actually used to fit the multivariate susie model.)

precompute_cache

If precompute_cache = TRUE, precomputes eigendecomposition and covariance quantities to speed up computations at the cost of increased memory usage.

n_thread

Maximum number of threads to use for parallel computation (only applicable when a mixture prior is used).

max_iter

Maximum number of iterations to perform.

tol

The model fitting will terminate when the increase in ELBOs between two successive iterations is less than tol.

verbose

If TRUE, print progress messages during model fitting. Default is TRUE.

convergence_method

Method for checking convergence; "elbo" uses the objective and "pip" uses PIP/alpha stability.

track_fit

Add attribute trace to the return value which records the algorithm's progress at each iteration.

L_greedy

Integer or NULL. When non-NULL, run a greedy outer loop that grows the effect count from L_greedy up to L in linear steps of size L_greedy until the fit saturates (min(lbf) < greedy_lbf_cutoff) or reaches L. NULL (the default) runs a single fixed-L IBSS, output bit-identical to non-greedy susieR. Passes through to susieR::susie_workhorse.

greedy_lbf_cutoff

Numeric saturation threshold for the greedy outer loop. The fit is considered saturated as soon as any effect's log Bayes factor falls below this value. Default 0.1. Ignored when L_greedy = NULL.

Z

J x R matrix of z-scores.

R

J x J LD matrix.

N

The sample size.

Bhat

Alternative summary data giving the estimated effects (J x R matrix). This, together with Shat, may be provided instead of Z.

Shat

Alternative summary data giving the standard errors of the estimated effects (J x R matrix). This, together with Bhat, may be provided instead of Z.

varY

The sample covariance of Y, defined as \(Y'Y/(N-1)\). When the sample covariance is not provided, the coefficients (returned from coef) are computed on the “standardized” X, y scale.

...

Additional arguments passed to mvsusie_ss.

XtX

A J x J matrix \(X^TX\) in which the columns of \(X\) are centered to have mean zero.

XtY

A J x R matrix \(X^TY\) in which the columns of \(X\) and \(Y\) are centered to have mean zero.

YtY

An R x R matrix \(Y^TY\) in which the columns of \(Y\) are centered to have mean zero.

X_colmeans

A vector of length J giving the column means of \(X\). If it is provided with Y_colmeans, the intercept is estimated; otherwise, the intercept is NA.

Y_colmeans

A vector of length R giving the column means of \(Y\). If it is provided with X_colmeans, the intercept is estimated; otherwise, the intercept is NA.

Value

A multivariate susie fit, which is a list with some or all of the following elements:

alpha

L by p matrix of posterior inclusion probabilities.

mu

L by p (R=1) or L by p by R (R>1) array of posterior means. Per-effect coefficients can be computed as alpha * mu / X_column_scale_factors.

mu2_diag

L by p (R=1) or L by p by R (R>1) array of the diagonal of the posterior second moment matrix.

pi

Prior inclusion probabilities (length p vector).

Xr

N by R matrix of fitted values on the standardized scale.

KL

Vector of single-effect KL divergences.

lbf

Vector of single-effect log-Bayes factors.

sigma2

Residual variance (R by R matrix for R > 1).

V

Prior variance scalar (per effect).

converged

Logical indicating whether the algorithm converged.

elbo

Vector of ELBO values at each iteration.

niter

Number of iterations performed.

sets

Estimated credible sets.

pip

Vector of posterior inclusion probabilities.

z

Matrix of univariate z-scores (when requested).

single_effect_lfsr

L by R matrix of per-effect lfsr.

lfsr

J by R matrix of per-variable lfsr.

conditional_lfsr

L by J by R array of conditional lfsr (given variable j is the single effect).

lbf_outcome

L by R matrix of per-outcome conditional log Bayes factors. Measures per-outcome evidence from the conditional (residualized) data, without cross-outcome borrowing. Used to filter single_effect_lfsr: outcomes with lbf_outcome < 0 (BF < 1) have their lfsr set to 1.

prior_mixture_weights

Vector of estimated prior mixture weights across the K covariance components (only with mixture prior).

posterior_mixture_weights

L by J by K array of posterior mixture component assignments (L by J by (K+1) when null_weight > 0, with the first slice being the null component).

V_structure

List of prior covariance matrices (only with mixture prior).

Examples

# Example with one response.
set.seed(1)
n <- 2000
p <- 1000
beta <- rep(0, p)
beta[1:4] <- 1
X <- matrix(rnorm(n * p), n, p)
Y <- X %*% beta + rnorm(n)
fit <- mvsusie(X, Y, L = 10)
#> mvsusie: N=2000, J=1000, R=1, L=10 [mem: 0.18 GB]
#> Residual variance set, common_cov=TRUE [mem: 0.19 GB]
#> Eigendecomposition cache: K=1, common_cov=TRUE [mem: 0.20 GB]
#> Model initialized: J=1000, R=1, L=10, K=1 [mem: 0.20 GB]
#> iter          ELBO       delta   sigma2      mem      V
#>    1    -3738.6158           -   diag[5.02,5.02]   0.20 GB  [1.11e+00, 9.88e-01, 8.62e-01, 7.94e-01, 0 x 6]
#>    2    -2918.8700    8.20e+02   diag[1.25,1.25]   0.20 GB  [1.12e+00, 1.15e+00, 9.06e-01, 9.95e-01, 0 x 6]
#>    3    -2899.3934    1.95e+01   diag[1.02,1.02]   0.20 GB  [1.11e+00, 1.15e+00, 9.08e-01, 9.95e-01, 2.35e-03, 8.95e-04, 7.43e-04, 6.35e-04, 5.52e-04, 4.85e-04]
#>    4    -2899.3382    5.53e-02   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 1.50e-03, 9.89e-04, 8.81e-04, 7.99e-04, 7.31e-04, 6.73e-04]
#>    5    -2899.3254    1.28e-02   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 1.04e-03, 1.00e-03, 9.37e-04, 8.76e-04, 8.21e-04, 7.72e-04]
#>    6    -2899.3232    2.21e-03   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 8.78e-04, 9.34e-04, 9.35e-04, 9.08e-04, 8.71e-04, 8.34e-04]
#>    7    -2899.3229    3.03e-04   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 8.50e-04, 8.84e-04, 9.06e-04, 9.07e-04, 8.91e-04, 8.68e-04]
#>    8    -2899.3228    6.15e-05   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 8.60e-04, 8.70e-04, 8.85e-04, 8.94e-04, 8.93e-04, 8.82e-04]  converged

# Sufficient statistics example with one response.
X_colmeans <- colMeans(X)
Y_colmeans <- colMeans(Y)
X <- scale(X, center = TRUE, scale = FALSE)
Y <- scale(Y, center = TRUE, scale = FALSE)
XtX <- crossprod(X)
XtY <- crossprod(X, Y)
YtY <- crossprod(Y)
res <- mvsusie_ss(XtX, XtY, YtY, n, L = 10, X_colmeans, Y_colmeans)
#> Eigendecomposition cache: K=1, common_cov=TRUE [mem: 0.20 GB]
#> Model initialized: J=1000, R=1, L=10, K=1 [mem: 0.20 GB]
#> iter          ELBO       delta   sigma2      mem      V
#>    1    -3738.6158           -   diag[5.02,5.02]   0.20 GB  [1.11e+00, 9.88e-01, 8.62e-01, 7.94e-01, 0 x 6]
#>    2    -2918.8700    8.20e+02   diag[1.25,1.25]   0.20 GB  [1.12e+00, 1.15e+00, 9.06e-01, 9.95e-01, 0 x 6]
#>    3    -2899.3934    1.95e+01   diag[1.02,1.02]   0.20 GB  [1.11e+00, 1.15e+00, 9.08e-01, 9.95e-01, 2.35e-03, 8.95e-04, 7.43e-04, 6.35e-04, 5.52e-04, 4.85e-04]
#>    4    -2899.3382    5.53e-02   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 1.50e-03, 9.89e-04, 8.81e-04, 7.99e-04, 7.31e-04, 6.73e-04]
#>    5    -2899.3254    1.28e-02   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 1.04e-03, 1.00e-03, 9.37e-04, 8.76e-04, 8.21e-04, 7.72e-04]
#>    6    -2899.3232    2.21e-03   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 8.78e-04, 9.34e-04, 9.35e-04, 9.08e-04, 8.71e-04, 8.34e-04]
#>    7    -2899.3229    3.03e-04   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 8.50e-04, 8.84e-04, 9.06e-04, 9.07e-04, 8.91e-04, 8.68e-04]
#>    8    -2899.3228    6.15e-05   diag[1.01,1.01]   0.20 GB  [1.11e+00, 1.15e+00, 9.09e-01, 9.95e-01, 8.60e-04, 8.70e-04, 8.85e-04, 8.94e-04, 8.93e-04, 8.82e-04]  converged
#> WARNING: Xcorr is not symmetric; using (Xcorr + t(Xcorr))/2.

# RSS example with one response.
R <- crossprod(X)
z <- susieR::calc_z(X, Y)
res <- mvsusie_rss(z, R, N = n, L = 10)
#> Eigendecomposition cache: K=1, common_cov=TRUE [mem: 0.22 GB]
#> Model initialized: J=1000, R=1, L=10, K=1 [mem: 0.22 GB]
#> iter          ELBO       delta   sigma2      mem      V
#>    1    -2837.3771           -   diag[1,1]   0.22 GB  [0 x 10]
#>    2    -2837.3771    0.00e+00   diag[1,1]   0.22 GB  [0 x 10]  converged
#> WARNING: Xcorr is not symmetric; using (Xcorr + t(Xcorr))/2.

# Example with three responses.
set.seed(1)
n <- 500
p <- 1000
true_eff <- 2
X <- sample(c(0, 1, 2), size = n * p, replace = TRUE)
X <- matrix(X, n, p)
beta1 <- rep(0, p)
beta2 <- rep(0, p)
beta3 <- rep(0, p)
beta1[1:true_eff] <- runif(true_eff)
beta2[1:true_eff] <- runif(true_eff)
beta3[1:true_eff] <- runif(true_eff)
y1 <- X %*% beta1 + rnorm(n)
y2 <- X %*% beta2 + rnorm(n)
y3 <- X %*% beta3 + rnorm(n)
Y <- cbind(y1, y2, y3)
prior <- create_mixture_prior(R = 3)
fit <- mvsusie(X, Y, prior_variance = prior)
#> mvsusie: N=500, J=1000, R=3, L=10 [mem: 0.19 GB]
#> Residual variance set, common_cov=TRUE [mem: 0.19 GB]
#> Prior: K=8 mixture components [mem: 0.19 GB]
#> Eigendecomposition cache: K=8, common_cov=TRUE [mem: 0.19 GB]
#> Model initialized: J=1000, R=3, L=10, K=8 [mem: 0.19 GB]
#> iter          ELBO       delta   sigma2      mem      V
#>    1    -2176.3664           -   diag[0.958,1.42]   0.19 GB  [1.63e-01, 0 x 9]
#>    2    -2160.4622    1.59e+01   diag[0.954,1.12]   0.19 GB  [2.05e-01, 0 x 9]
#>    3    -2160.4621    1.14e-04   diag[0.954,1.12]   0.19 GB  [2.05e-01, 0 x 9]
#>    4    -2160.4621    5.07e-10   diag[0.954,1.12]   0.19 GB  [2.05e-01, 0 x 9]  converged

# Sufficient statistics example with three responses.
X_colmeans <- colMeans(X)
Y_colmeans <- colMeans(Y)
X <- scale(X, center = TRUE, scale = FALSE)
Y <- scale(Y, center = TRUE, scale = FALSE)
XtX <- crossprod(X)
XtY <- crossprod(X, Y)
YtY <- crossprod(Y)
res <- mvsusie_ss(XtX, XtY, YtY, n,
  L = 10, X_colmeans, Y_colmeans,
  prior_variance = prior
)
#> Eigendecomposition cache: K=8, common_cov=TRUE [mem: 0.20 GB]
#> Model initialized: J=1000, R=3, L=10, K=8 [mem: 0.20 GB]
#> iter          ELBO       delta   sigma2      mem      V
#>    1    -2176.3664           -   diag[0.958,1.42]   0.20 GB  [1.63e-01, 0 x 9]
#>    2    -2160.4622    1.59e+01   diag[0.954,1.12]   0.20 GB  [2.05e-01, 0 x 9]
#>    3    -2160.4621    1.14e-04   diag[0.954,1.12]   0.20 GB  [2.05e-01, 0 x 9]
#>    4    -2160.4621    5.07e-10   diag[0.954,1.12]   0.20 GB  [2.05e-01, 0 x 9]  converged
#> WARNING: Xcorr is not symmetric; using (Xcorr + t(Xcorr))/2.

# RSS example with three responses.
R <- crossprod(X)
Z <- calc_z(X, Y)
res <- mvsusie_rss(Z, R, N = n, L = 10, prior_variance = prior)
#> Eigendecomposition cache: K=8, common_cov=TRUE [mem: 0.21 GB]
#> Model initialized: J=1000, R=3, L=10, K=8 [mem: 0.21 GB]
#> iter          ELBO       delta   sigma2      mem      V
#>    1    -2119.8682           -   diag[1,1]   0.21 GB  [0 x 10]
#>    2    -2119.8682    0.00e+00   diag[1,1]   0.21 GB  [0 x 10]  converged
#> WARNING: Xcorr is not symmetric; using (Xcorr + t(Xcorr))/2.