R/grid_selection.R
ebnm_scale_npmle.Rd
The default method for setting the scale parameter for functions ebnm_npmle and ebnm_deconvolver.
ebnm_scale_npmle(
  x,
  s,
  min_K = 3,
  max_K = 300,
  KLdiv_target = 1/length(x),
  pointmass = TRUE
)
A vector of observations. Missing observations (NAs) are not allowed.
A vector of standard errors (or a scalar if all are equal). Standard errors may not be exactly zero, and missing standard errors are not allowed.
The minimum number of components \(K\) to include in the mixture of point masses used to approximate the nonparametric family of all distributions.
The maximum number of components \(K\) to include in the approximating mixture of point masses.
The desired bound \(\kappa\) on the KL-divergence from the solution obtained using the approximating mixture to the exact solution. More precisely, the scale parameter is set such that given the exact MLE $$\hat{g} := \mathrm{argmax}_{g \in G} L(g),$$ where \(G\) is the full nonparametric family, and given the MLE for the approximating family \(\tilde{G}\) $$\tilde{g} := \mathrm{argmax}_{g \in \tilde{G}} L(g),$$ we have that $$\mathrm{KL}(\hat{g} \ast N(0, s^2) \mid \tilde{g} \ast N(0, s^2)) \le \kappa,$$ where \(\ast \ N(0, s^2)\) denotes convolution with the normal error distribution (the derivation of the bound assumes homoskedastic observations). For details, see References below.
When the range of the data is so large that max_K point masses cannot provide a good approximation to the family of all distributions, then ebnm will instead use a mixture of normal distributions, with the standard deviation of each component equal to scale \(/ 2\). Setting pointmass = FALSE gives the default scale for this mixture of normal distributions.
To be exact, ebnm uses a mixture of normal distributions rather than a mixture of point masses when $$\frac{\max(x) - \min(x)}{\min(s)} > 3 \ \mathrm{max}_K;$$ for a rationale, see References below. Note however that ebnm only uses a mixture of normal distributions when scale = "estimate"; if parameter scale is set manually, then a mixture of point masses will be used in all cases. To use a mixture of normal distributions with the scale set manually, an object created by the constructor function normalmix must be provided as argument to parameter g_init in function ebnm_npmle or function ebnm_deconvolver.
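The switching criterion above can be checked directly in base R. The following is a minimal sketch, not code from the package: the helper uses_normal_mixture is a hypothetical name introduced here for illustration, and it assumes that all standard errors are strictly positive. The final block calls ebnm_scale_npmle with the signature shown under Usage, and runs only if the ebnm package is installed.

```r
# Hypothetical helper (not part of ebnm): evaluates the criterion stated
# above, (max(x) - min(x)) / min(s) > 3 * max_K.
uses_normal_mixture <- function(x, s, max_K = 300) {
  # Assumption: no missing values and strictly positive standard errors.
  stopifnot(!anyNA(x), !anyNA(s), all(s > 0))
  (max(x) - min(x)) / min(s) > 3 * max_K
}

# Narrow range: (50 - (-50)) / 1 = 100, which does not exceed 3 * 300 = 900.
x_narrow <- c(-50, 0, 50)
narrow_result <- uses_normal_mixture(x_narrow, s = 1)

# Wide range: (1000 - (-1000)) / 1 = 2000, which exceeds 900.
x_wide <- c(-1000, 0, 1000)
wide_result <- uses_normal_mixture(x_wide, s = 1)

# If ebnm is available, compute the default scale for each kind of mixture.
if (requireNamespace("ebnm", quietly = TRUE)) {
  scale_pointmass <- ebnm::ebnm_scale_npmle(x_wide, s = 1, pointmass = TRUE)
  scale_normalmix <- ebnm::ebnm_scale_npmle(x_wide, s = 1, pointmass = FALSE)
}
```

Because the criterion divides by min(s), a single very small standard error can push a data set over the threshold even when the range of x is moderate.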
Jason Willwerscheid (2021). Empirical Bayes Matrix Factorization: Methods and Applications. University of Chicago, PhD dissertation.