Compute log-likelihoods and deviances for assessing fit of a topic model or a non-negative matrix factorization (NMF).
loglik_poisson_nmf(X, fit, e = 1e-08)
loglik_multinom_topic_model(X, fit, e = 1e-08)
deviance_poisson_nmf(X, fit, e = 1e-08)
cost(X, A, B, e = 1e-08, family = c("poisson", "multinom"), version)
The n x m matrix of counts or pseudocounts. It can be a sparse matrix (class "dgCMatrix") or dense matrix (class "matrix").
A Poisson NMF or multinomial topic model fit, such as an output from fit_poisson_nmf or fit_topic_model.
A small, non-negative number added to the terms inside the logarithms to avoid computing logarithms of zero. This prevents numerical problems at the cost of introducing a very small inaccuracy in the computation.
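To illustrate why this small constant matters, the base R sketch below compares the logarithm of a zero entry with and without the offset. It does not use the package; the vector lambda is a made-up example, and e simply mirrors the default shown in the usage.

```r
# Illustrative values only; `e` mirrors the default 1e-08 from the usage.
e <- 1e-8
lambda <- c(0, 0.5, 2)   # hypothetical rates, one of them exactly zero

log(lambda)       # first entry is -Inf
log(lambda + e)   # first entry is finite (about -18.42)

# For rates well away from zero, the offset barely changes the result.
max(abs(log(lambda[-1] + e) - log(lambda[-1])))
```

The trade-off described above is visible here: the zero entry no longer produces -Inf, while the remaining entries shift by a negligible amount.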
The n x k matrix of loadings. It should be a dense matrix.
The k x m matrix of factors. It should be a dense matrix.
If family = "poisson", the loss function values corresponding to the Poisson non-negative matrix factorization are computed; if family = "multinom", the multinomial topic model loss function values are returned.
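As a rough illustration of how the two families differ, the sketch below computes per-row losses in base R. It assumes the standard Poisson and multinomial negative log-likelihoods with the terms depending only on X dropped; the helper rowwise_loss is hypothetical and is not the package's implementation.

```r
# Hypothetical helper, not part of the package: per-row losses with the
# terms that depend only on X dropped.
rowwise_loss <- function (X, A, B, e = 1e-8,
                          family = c("poisson", "multinom")) {
  family <- match.arg(family)
  Y <- A %*% B                     # n x m matrix of fitted values
  if (family == "poisson")
    rowSums(Y - X * log(Y + e))    # Poisson: rate minus x * log(rate)
  else
    -rowSums(X * log(Y + e))       # multinomial: rows of Y are probabilities
}

set.seed(1)
A <- matrix(runif(8), 4, 2)        # 4 x 2 loadings
B <- matrix(runif(6), 2, 3)        # 2 x 3 factors
X <- matrix(rpois(12, A %*% B), 4, 3)
rowwise_loss(X, A, B, family = "poisson")

# For the multinomial case, rescale so that each row of A1 %*% B1 sums to 1.
d  <- rowSums(B)
B1 <- B / d                        # each row of B1 sums to 1
A1 <- A %*% diag(d)                # compensate so A1 %*% B1 equals A %*% B
A1 <- A1 / rowSums(A1)             # each row of A1 sums to 1
rowwise_loss(X, A1, B1, family = "multinom")
```

Both calls return one value per row of X; only the multinomial variant requires the fitted values in each row to form a probability vector.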
When version == "R", the computations are performed entirely in R; when version == "Rcpp", an Rcpp implementation is used. The R version is typically faster when X is a dense matrix, whereas the Rcpp version is faster and more memory-efficient when X is a large, sparse matrix. When not specified, the most suitable version is called depending on whether X is dense or sparse.
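One way such a default could be chosen is sketched below. The helper pick_version is hypothetical and only mirrors the rule stated above; the package's actual selection logic may differ.

```r
# Hypothetical helper showing one possible default; not the package's code.
pick_version <- function (X) {
  if (is.matrix(X))
    "R"       # dense base R matrix
  else if (inherits(X, "dgCMatrix"))
    "Rcpp"    # sparse matrix from the Matrix package
  else
    stop("X should be a dense \"matrix\" or a sparse \"dgCMatrix\"")
}

pick_version(matrix(0, 2, 2))
```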
A numeric vector with one entry per row of X.
Function cost computes loss functions proportional to the negative log-likelihoods, and is mainly for internal use to quickly compute log-likelihoods and deviances; it should not be used directly unless you know what you are doing. In particular, little argument checking is performed by cost.
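The relationship between such a loss and the log-likelihoods and deviances can be checked in base R. The sketch below assumes the standard Poisson log-likelihood and the standard definition of the Poisson deviance (twice the gap to the saturated model); it uses simulated rates Y in place of a fitted model, sets the offset to zero so the check is exact, and does not call the package.

```r
set.seed(1)
n <- 5; m <- 4
Y <- matrix(runif(n * m, 0.5, 3), n, m)   # hypothetical fitted Poisson rates
X <- matrix(rpois(n * m, Y), n, m)        # simulated counts

# Per-row loss with the terms depending only on X dropped.
loss <- rowSums(Y - X * log(Y))

# Adding back the dropped term recovers the per-row Poisson log-likelihood.
loglik <- -loss - rowSums(lgamma(X + 1))
all.equal(loglik, rowSums(matrix(dpois(X, Y, log = TRUE), n, m)))

# Deviance: twice the gap to the saturated model, whose rates equal X.
# Entries with X == 0 contribute zero to the saturated sum.
loglik_sat <- rowSums(ifelse(X > 0, X * log(X) - X, 0) - lgamma(X + 1))
dev <- 2 * (loglik_sat - loglik)
dev
```

Because the saturated model fits each count exactly, every entry of dev is non-negative, and smaller values indicate a closer fit for that row.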
# Generate a small counts matrix.
set.seed(1)
out <- simulate_count_data(10,20,3)
X <- out$X
fit <- out[c("F","L")]
class(fit) <- c("poisson_nmf_fit","list")
# Compute the Poisson log-likelihoods and deviances.
data.frame(loglik   = loglik_poisson_nmf(X,fit),
           deviance = deviance_poisson_nmf(X,fit))
#>        loglik deviance
#> i1  -26.70741 28.66406
#> i2  -22.38798 25.56485
#> i3  -18.50695 23.17278
#> i4  -21.45998 24.70071
#> i5  -26.31432 21.71899
#> i6  -15.33949 19.45157
#> i7  -21.03829 20.23547
#> i8  -16.05085 17.48799
#> i9  -20.74577 23.03671
#> i10 -25.25658 17.93207
# Compute multinomial log-likelihoods.
loglik_multinom_topic_model(X,fit)
#>        i1        i2        i3        i4        i5        i6        i7        i8
#> -24.15063 -19.55914 -15.95322 -19.13328 -23.80668 -11.59211 -17.17074 -12.79242
#>        i9       i10
#> -18.51231 -22.59113