Solves the empirical Bayes normal means (EBNM) problem for observations belonging to distinct groups.
ebnm_group(
x,
s = 1,
group,
prior_family = "point_normal",
mode = 0,
scale = "estimate",
g_init = NULL,
fix_g = FALSE,
output = ebnm_output_default(),
...
)
A vector of observations. Missing observations (NA
s) are
not allowed.
A vector of standard errors (or a scalar if all are equal).
Standard errors may not be exactly zero, and
missing standard errors are not allowed. Two prior families have
additional restrictions: when horseshoe priors are used, errors
must be homoskedastic; and since function
deconv
in package deconvolveR
takes
\(z\)-scores, the "deconvolver" family requires that all standard errors
be equal to 1.
A vector of character strings that gives the group to which each
observation belongs. It must have the same length as argument x
. For
an example of usage, see Examples below.
A named vector that specifies the prior family \(G\) for each group. If the same prior family is to be used for all groups, then a character string may be used instead.
A named list that specifies, for each group, the mode of the
respective prior \(g\), or "estimate"
if the mode is to be
estimated from the data. If the mode is the same across groups, then a
scalar may be used instead. If all modes are to be estimated, then
mode = "estimate"
may be used.
A named list that specifies, for each group, the scale
parameter(s) of the respective prior, or "estimate"
if the scale
parameters are to be estimated from the data. If the scale parameter is the
same across groups, then a scalar may be used instead. If all scales are to
be estimated, then scale = "estimate"
may be used.
The prior distributions \(g\). Usually this is left
unspecified (NULL
) and estimated from the data. However, it can be
used in conjuction with fix_g = TRUE
to fix the prior (useful, for
example, to do computations with the "true" \(g\) in simulations). If
g_init
is specified but fix_g = FALSE
, g_init
specifies the initial value of \(g\) used during optimization. If
g_init
is supplied, it should be a named list that specifies, for
each group, a prior of the appropriate class (normalmix
for normal, point-normal,
scale mixture of normals, and deconvolveR
prior families, as well as
for the NPMLE; class laplacemix
for
point-Laplace families; class gammamix
for point-exponential
families; class horseshoe
for horseshoe families; and class
unimix
for unimodal_
families).
If TRUE
, fix the prior \(g\) at g_init
instead
of estimating it.
A character vector indicating which values are to be returned.
Function ebnm_output_default()
provides the default return values, while
ebnm_output_all()
lists all possible return values. See Value
below.
Additional parameters. When a unimodal_
prior family is used,
these parameters are passed to function ash
in package
ashr
. Although it
does not call into ashr
, the scale mixture of normals family accepts
parameter gridmult
for purposes of comparison. When gridmult
is set, an ashr
-style grid will be used instead of the default
ebnm
grid. When the "deconvolver" family is used, additional
parameters are passed to function deconv
in
package deconvolveR
. Families of generalized binary priors take several
additional parameters; see ebnm_generalized_binary
. In all
other cases, additional parameters are ignored.
An ebnm
object. Depending on the argument to output
, the
object is a list containing elements:
data
A data frame containing the observations x
and standard errors s
.
posterior
A data frame of summary results (posterior means, standard deviations, second moments, and local false sign rates).
fitted_g
The fitted prior \(\hat{g}\) (an object of
class normalmix
, laplacemix
,
gammamix
, unimix
,
tnormalmix
, or horseshoe
).
log_likelihood
The optimal log likelihood attained, \(L(\hat{g})\).
posterior_sampler
A function that can be used to
produce samples from the posterior. For all prior families other
than the horseshoe, the sampler takes a single parameter
nsamp
, the number of posterior samples to return per
observation. Since ebnm_horseshoe
returns an MCMC sampler,
it additionally takes parameter burn
, the number of burn-in
samples to discard.
S3 methods coef
, confint
, fitted
, logLik
,
nobs
, plot
, predict
, print
, quantile
,
residuals
, simulate
, summary
, and vcov
have been implemented for ebnm
objects. For details, see the
respective help pages, linked below under See Also.
The EBNM model for grouped data, with observations \(x_j\) belonging to groups \(k = 1, ..., K\), is $$x_j | \theta_j, s_j \sim N(\theta_j, s_j^2)$$ $$\theta_j \sim g_{k(j)} \in G_{k(j)}.$$
Solving the EBNM problem for grouped data is equivalent to solving a separate EBNM problem for each group \(k = 1, ..., K\), with the optimal log likelihood equal to the sum of the optimal log likelihoods for each separate problem.
group <- c(rep("small_sd", 100), rep("large_sd", 100))
theta <- c(rnorm(100, sd = 1), rnorm(100, sd = 10))
s <- 1
x <- theta + rnorm(200, 0, s)
ebnm.group.res <- ebnm_group(x, s, group)
# Use different prior families for each group:
ebnm.group.res <- ebnm_group(
x, s, group,
prior_family = list(small_sd = "normal", large_sd = "normal_scale_mixture")
)
# Different modes and scales can be set similarly:
ebnm.group.res <- ebnm_group(
x, s, group,
mode = list(small_sd = 0, large_sd = "estimate"),
scale = list(small_sd = 1, large_sd = "estimate")
)