Performs SuSiE regression using z-scores and correlation matrix.
This is the sufficient-statistics RSS interface. For the specialized
regularized eigendecomposition likelihood with lambda > 0, use
susie_rss_lambda.
Usage
susie_rss(
z = NULL,
R = NULL,
n = NULL,
X = NULL,
bhat = NULL,
shat = NULL,
var_y = NULL,
L = min(10, if (is.list(R) && !is.matrix(R)) ncol(R[[1]]) else if (!is.null(R)) ncol(R)
else if (is.list(X) && !is.matrix(X)) ncol(X[[1]]) else ncol(X)),
maf = NULL,
maf_thresh = 0,
scaled_prior_variance = 0.2,
residual_variance = NULL,
prior_weights = NULL,
null_weight = 0,
standardize = TRUE,
estimate_residual_variance = FALSE,
estimate_residual_method = c("MoM", "MLE", "NIG"),
estimate_prior_variance = TRUE,
estimate_prior_method = c("optim", "EM", "simple"),
prior_variance_grid = NULL,
mixture_weights = NULL,
unmappable_effects = c("none", "inf", "ash", "ash_filter_archived"),
check_null_threshold = 0,
prior_tol = 1e-09,
residual_variance_lowerbound = 0,
residual_variance_upperbound = Inf,
model_init = NULL,
s_init = NULL,
coverage = 0.95,
min_abs_corr = 0.5,
max_iter = NULL,
L_greedy = NULL,
greedy_lbf_cutoff = 0.1,
tol = NULL,
convergence_method = c("elbo", "pip"),
verbose = FALSE,
track_fit = FALSE,
check_input = FALSE,
check_prior = TRUE,
n_purity = 100,
r_tol = 1e-08,
refine = FALSE,
R_finite = NULL,
R_mismatch = c("none", "eb"),
R_mismatch_method = c("mle", "map"),
eig_delta_rel = 0.001,
eig_delta_abs = 0,
artifact_threshold = 0.1,
R_sensitivity_threshold = log(20),
alpha0 = NULL,
beta0 = NULL,
init_only = FALSE,
slot_prior = NULL
)Arguments
- z
A p-vector of z-scores.
- R
A p by p correlation matrix. Exactly one of
RorXmust be provided.- n
The sample size, not required but recommended.
- X
A factor matrix (B x p) such that
R = crossprod(X) / nrow(X)approximates the R (correlation) matrix. Whennrow(X) >= ncol(X), the correlation matrixRis formed explicitly and the standard path is used. Whennrow(X) < ncol(X), a low-rank path is used that avoids forming the p x p matrix, reducing per-iteration cost from O(Lp^2) to O(LBp). Columns ofXare standardized internally.- bhat
Alternative summary data giving the estimated effects (a vector of length p). This, together with
shat, may be provided instead ofz.- shat
Alternative summary data giving the standard errors of the estimated effects (a vector of length p). This, together with
bhat, may be provided instead ofz.- var_y
The sample variance of y, defined as \(y'y/(n-1)\). When the sample variance is not provided, the coefficients (returned from
coef) are computed on the “standardized” X, y scale.- L
Maximum number of non-zero effects in the model. If L is larger than the number of covariates, p, L is set to p.
- maf
A p-vector of minor allele frequencies; to be used along with
maf_threshto filter input summary statistics.- maf_thresh
Variants with MAF smaller than this threshold are not used.
- scaled_prior_variance
The prior variance, divided by
var(y)(or by(1/(n-1))ytyforsusie_ss); that is, the prior variance of each non-zero element of b isvar(y) * scaled_prior_variance. The value provided should be either a scalar or a vector of lengthL. Ifestimate_prior_variance = TRUE, this provides initial estimates of the prior variances.- residual_variance
Variance of the residual. If
estimate_residual_variance = TRUE, this value provides the initial estimate of the residual variance. By default, it is set tovar(y)insusieand(1/(n-1))ytyinsusie_ss.- prior_weights
A vector of length p, in which each entry gives the prior probability that corresponding column of X has a nonzero effect on the outcome, y. The weights are internally normalized to sum to 1. When
NULL(the default), uniform prior weights are used (each variable is assigned probability1/p).- null_weight
Prior probability of no effect (a number between 0 and 1, and cannot be exactly 1).
- standardize
If
standardize = TRUE, standardize the columns of X to unit variance prior to fitting (or equivalently standardize XtX and Xty to have the same effect). Note thatscaled_prior_variancespecifies the prior on the coefficients of X after standardization (if it is performed). If you do not standardize, you may need to think more carefully about specifyingscaled_prior_variance. Whatever your choice, the coefficients returned bycoefare given forXon the original input scale. Any column ofXthat has zero variance is not standardized.- estimate_residual_variance
The default is FALSE, the residual variance is fixed to 1 or variance of y. If the in-sample R matrix is provided, we recommend setting
estimate_residual_variance = TRUE.- estimate_residual_method
The method used for estimating residual variance. For the original SuSiE model, "MLE" and "MoM" estimation is equivalent, but for the infinitesimal model, "MoM" is more stable. We recommend using "NIG" when n < 80 for improved coverage, although it is currently only implemented for individual-level data.
- estimate_prior_variance
If
estimate_prior_variance = TRUE, the prior variance is estimated (this is a separate parameter for each of the L effects). If provided,scaled_prior_varianceis then used as an initial value for the optimization. Whenestimate_prior_variance = FALSE, the prior variance for each of the L effects is determined by the value supplied toscaled_prior_variance.- estimate_prior_method
The method used for estimating prior variance. When
estimate_prior_method = "simple"is used, the likelihood at the specified prior variance is compared to the likelihood at a variance of zero, and the setting with the larger likelihood is retained. Whenprior_variance_gridis provided, this is automatically set to"fixed_mixture".- prior_variance_grid
Numeric vector of K prior variances defining a mixture-of-normals prior on effect sizes. When provided, the SER evaluates Bayes factors at each grid point and forms a mixture BF weighted by
mixture_weights. This bypasses the scalar prior variance optimization. Default isNULL(standard scalar V path).- mixture_weights
Numeric vector of K non-negative weights summing to 1, giving the mixture proportions for the variance grid. Default is
NULL, which uses uniform weights whenprior_variance_gridis provided.- unmappable_effects
The method for modeling unmappable effects: "none", "inf", "ash".
- check_null_threshold
When the prior variance is estimated, compare the estimate with the null, and set the prior variance to zero unless the log-likelihood using the estimate is larger by this threshold amount. For example, if you set
check_null_threshold = 0.1, this will "nudge" the estimate towards zero when the difference in log-likelihoods is small. A note of caution that setting this to a value greater than zero may lead the IBSS fitting procedure to occasionally decrease the ELBO. This setting is disabled when usingunmappable_effects = "inf"orunmappable_effects = "ash".- prior_tol
When the prior variance is estimated, compare the estimated value to
prior_tolat the end of the computation, and exclude a single effect from PIP computation if the estimated prior variance is smaller than this tolerance value.- residual_variance_lowerbound
Lower limit on the estimated residual variance. It is only relevant when
estimate_residual_variance = TRUE.- residual_variance_upperbound
Upper limit on the estimated residual variance. It is only relevant when
estimate_residual_variance = TRUE.- model_init
A previous susie fit with which to initialize.
- s_init
Deprecated alias for
model_init.- coverage
A number between 0 and 1 specifying the “coverage” of the estimated confidence sets.
- min_abs_corr
Minimum absolute correlation allowed in a credible set. The default, 0.5, corresponds to a squared correlation of 0.25, which is a commonly used threshold for genotype data in genetic studies. This "purity" filter is applied to the CSs reported in the fit object, so the CS list returned here may be a subset of the one produced by calling
susie_get_cson the same fit without passingXorXcorr(in which case the purity filter is skipped).- max_iter
Maximum number of IBSS iterations to perform. For
susie_rss()andsusie_rss_lambda(),NULLuses50with a hint; other interfaces use100.- L_greedy
Integer or
NULL. When non-NULL, run a greedy outer loop that grows the number of effects fromL_greedyup toLin linear steps until the fit saturates. The defaultNULLruns the usual fixed-Lfit.- greedy_lbf_cutoff
Numeric saturation threshold for the
L_greedyouter loop. Default is 0.1.- tol
A small, non-negative number specifying the convergence tolerance for the IBSS fitting procedure. When
NULL, the default is1e-4; forestimate_residual_method = "NIG", the default is tightened to1e-6.- convergence_method
When
converge_method = "elbo"the fitting procedure halts when the difference in the variational lower bound, or “ELBO” (the objective function to be maximized), is less thantol. Whenconverge_method = "pip"the fitting procedure halts when the maximum absolute difference inalphais less thantol.- verbose
If
verbose = TRUE, the algorithm's progress, a summary of the optimization settings, and refinement progress (ifrefine = TRUE) are printed to the console.- track_fit
If
track_fit = TRUE, a compactsusie_trackobject is returned intrace, containing alpha history, effect summaries and available diagnostics at each iteration of the IBSS fitting procedure.- check_input
If
check_input = TRUE,susie_ssperforms additional checks onXtXandXty. The checks are: (1) check thatXtXis positive semidefinite; (2) check thatXtyis in the space spanned by the non-zero eigenvectors ofXtX.- check_prior
If
check_prior = TRUE, it checks if the estimated prior variance becomes unreasonably large (comparing with 10 * max(abs(z))^2).- n_purity
Passed as argument
n_puritytosusie_get_cs.- r_tol
Tolerance level for eigenvalue check of positive semidefinite matrix
XtX.- refine
If
refine = TRUE, then an additional iterative refinement procedure is used, after the IBSS algorithm, to check and escape from local optima (see details).- R_finite
Controls variance inflation to account for estimating the R matrix from a finite reference panel. Accepts four types of input:
NULL(default)The R matrix is treated as trusted, and no finite-reference variance inflation is applied. With
estimate_residual_variance = TRUE, this keeps the in-sample R warning active.FALSENo finite-reference variance inflation is applied, and the in-sample R warning for
estimate_residual_variance = TRUEis silenced. Use this only whenRis the in-sample correlation matrix.TRUEInfer the reference sample size B from the input
X. SetsB = nrow(X)for single-panel input, or one B per panel for multi-panel input. RequiresXto be provided (errors if onlyRis given, since B cannot be inferred).- Number
Explicit reference sample size B.
When active, this dynamically inflates the null variance of each variable's score statistic at every IBSS iteration to account for finite-reference uncertainty in the Single Effect Regression (SER). When provided, the output includes a
R_finite_diagnosticselement with per-region and per-variable quality metrics.- R_mismatch
R-mismatch correction mode.
"none"(default) is off."eb"is the recommended empirical Bayes procedure for mismatch correction described in Sun et al. (2026+). It updates a region-level variance component during the IBSS iterations and reports a QC score (Q_art) that extends the Zou et al. (2022) column-space check from the input summary vector to the fitted residual after correction. It warns when that residual still projects onto near-null directions of the suppliedR, and auto-disablesestimate_residual_variancewith a warning.- R_mismatch_method
Estimator for the region-level
lambda_biasvariance component whenR_mismatch != "none"."mle"(default) maximizes the working Gaussian likelihood."map"uses a half-Cauchy MAP estimator.- eig_delta_rel, eig_delta_abs
Cutoffs for "low-eigenvalue" directions of
Rused by the QC diagnostic whenR_mismatch != "none". Defaulteig_delta_rel = 1e-3,eig_delta_abs = 0; the threshold ismax(eig_delta_abs, eig_delta_rel * max_eigenvalue(R)). Tighter (smaller) values flag fewer regions.- artifact_threshold
Flag threshold on the QC score
Q_art(a fraction in [0, 1]). Default0.1; flag fires whenQ_art > artifact_threshold. Heuristic, not a calibrated test.- R_sensitivity_threshold
Flag threshold for the credible-set Bayes-factor attenuation diagnostic. Default
log(20); flag fires when a credible set contains a variable whose nominal log BF exceeds its R-adjusted log BF by at least this amount.- alpha0
Numerical parameter for the NIG prior when using
estimate_residual_method = "NIG". Defaults to1/sqrt(n), wherenis the sample size. When callingsusie_rsswith NIG,nmust be supplied; otherwise validation errors.- beta0
Numerical parameter for the NIG prior when using
estimate_residual_method = "NIG". Defaults to1/sqrt(n), wherenis the sample size. When callingsusie_rsswith NIG,nmust be supplied; otherwise validation errors.- init_only
Logical. If
TRUE, return a list withdataandparamsobjects without running the IBSS algorithm. Default isFALSE.- slot_prior
Optional slot activity prior created by
slot_prior_betabinomorslot_prior_poisson. Useslot_prior_betabinom(a_beta, b_beta)for the usual single-locus setting; it places a Beta-Binomial prior on the number of active effects and gives an adaptive multiplicity correction. Useslot_prior_poisson(C, nu)when you want a Gamma-Poisson prior centered on an expected numberCof active effects. When supplied, each single-effect slot has an estimated activity probabilityc_hat; fitted values and PIPs are weighted by these activity probabilities, and convergence is checked usingconvergence_method = "pip".
Value
In addition to the standard "susie" output (see
susie), the returned object may contain:
- R_finite_diagnostics
A list of diagnostics for the R-uncertainty correction (only present when
R_finiteis provided orR_mismatch != "none"), containing:B(the reference sample size);p(number of variables);effective_rank(debiased \(\tilde{r} = p^2 / \|R\|_F^2\));r_over_B(\(\tilde{r}/B\), one number per region; values \(\le 0.2\) indicate the reference panel is adequate);Rhat_diag_deviation(\(|\hat{R}_{jj} - 1|\), one number per variable);lambda_bias(region-level scalar on the defaultlambda = 0sufficient-statistics path whenR_mismatch != "none");B_corrected(effective reference sample size after the R-mismatch correction, \(1/(1/B + \lambda_{\mathrm{bias}})\); substantially smaller than the inputBflags a dominant population mismatch component);per_variable_penalty(final-iteration \(v_j / \sigma^2 = \tau_j^2 / \sigma^2 - 1\), one number per variable; values \(\le 0.2\) indicate minimal power loss, values \(\gg 1\) flag variables where the correction is doing heavy lifting).