Compute log-likelihoods and deviances for assessing the fit of a topic model or a non-negative matrix factorization (NMF).

loglik_poisson_nmf(X, fit, e = 1e-08)

loglik_multinom_topic_model(X, fit, e = 1e-08)

deviance_poisson_nmf(X, fit, e = 1e-08)

cost(X, A, B, e = 1e-08, family = c("poisson", "multinom"), version)

Arguments

X

The n x m matrix of counts or pseudocounts. It can be a sparse matrix (class "dgCMatrix") or dense matrix (class "matrix").

fit

A Poisson NMF or multinomial topic model fit, such as an output from fit_poisson_nmf or fit_topic_model.

e

A small, non-negative number added to the terms inside the logarithms to avoid computing logarithms of zero. This prevents numerical problems at the cost of introducing a very small inaccuracy in the computation.
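To see why the offset matters, consider a term of the form x*log(lambda) in which both the count and the fitted rate are zero. The base R sketch below only illustrates the numerical issue. It assumes that e enters the computation as log(lambda + e), and the variable names are made up for this example.

```r
x      <- 0
lambda <- 0
e      <- 1e-8

# Without the offset, 0 * log(0) is 0 * -Inf, which is NaN in R.
without_e <- x * log(lambda)

# With the offset, the logarithm is finite, so the term is zero.
with_e <- x * log(lambda + e)
```

Setting e = 0 removes the small inaccuracy but reintroduces the risk of NaN values whenever a fitted rate is exactly zero.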

A

The n x k matrix of loadings. It should be a dense matrix.

B

The k x m matrix of factors. It should be a dense matrix.

family

If family = "poisson", the loss function values corresponding to the Poisson non-negative matrix factorization are computed; if family = "multinom", the multinomial topic model loss function values are returned.
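The two families are closely related. For a single row of counts, the Poisson log-likelihood splits into a multinomial log-likelihood for the counts given their total, plus a Poisson log-likelihood for the total. The base R sketch below checks this identity on made-up rates. It assumes the multinomial probabilities are the normalized Poisson rates, and it does not call any package function.

```r
set.seed(1)
lambda <- c(0.5, 2, 1.5, 3)            # made-up Poisson rates for one row
x      <- rpois(length(lambda), lambda)

# Poisson log-likelihood of the row.
loglik_pois <- sum(dpois(x, lambda, log = TRUE))

# Multinomial log-likelihood of the counts given the row total.
loglik_multinom <- dmultinom(x, prob = lambda/sum(lambda), log = TRUE)

# Poisson log-likelihood of the row total.
loglik_size <- dpois(sum(x), sum(lambda), log = TRUE)

# The Poisson log-likelihood is the sum of the other two terms.
all.equal(loglik_pois, loglik_multinom + loglik_size)
```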

version

When version == "R", the computations are performed entirely in R; when version == "Rcpp", an Rcpp implementation is used. The R version is typically faster when X is a dense matrix, whereas the Rcpp version is faster and more memory-efficient when X is a large, sparse matrix. When not specified, the most suitable version is called depending on whether X is dense or sparse.

Value

A numeric vector with one entry per row of X.

Details

The function cost computes loss functions proportional to the negative log-likelihoods. It is intended mainly for internal use, to quickly compute log-likelihoods and deviances. Because cost performs little argument checking, call it directly only if you are sure the inputs are valid.
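As a rough illustration of the quantities involved, the sketch below computes per-row Poisson log-likelihoods and deviances in base R. It assumes the model X[i,j] ~ Poisson(lambda[i,j]) with lambda = A %*% B, and that e is added inside the logarithms. The helpers poisson_loglik_rows and poisson_deviance_rows are hypothetical names introduced for this example. They are not part of the package, whose internal computations may differ in detail.

```r
# Hypothetical helpers for illustration; not part of the package.
poisson_loglik_rows <- function (X, A, B, e = 1e-8) {
  lambda <- A %*% B
  # log p(x | lambda) = x*log(lambda) - lambda - log(x!)
  rowSums(X * log(lambda + e) - lambda - lgamma(X + 1))
}

poisson_deviance_rows <- function (X, A, B, e = 1e-8) {
  lambda <- A %*% B
  # Deviance = 2 * (saturated log-likelihood - model log-likelihood).
  # Entries with x = 0 contribute 2*lambda.
  2 * rowSums(X * log((X + e)/(lambda + e)) - X + lambda)
}

# Small made-up example: n = 4 rows, m = 6 columns, k = 2 factors.
set.seed(1)
n <- 4; m <- 6; k <- 2
A <- matrix(runif(n*k), n, k)
B <- matrix(runif(k*m), k, m)
X <- matrix(rpois(n*m, A %*% B), n, m)

ll  <- poisson_loglik_rows(X, A, B)
dev <- poisson_deviance_rows(X, A, B)
```

Both helpers return one value per row of X, matching the shape of the documented return value.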

Examples


# Generate a small counts matrix.
set.seed(1)
out <- simulate_count_data(10,20,3)
X   <- out$X
# Bundle the simulated F and L matrices as a Poisson NMF fit.
fit <- out[c("F","L")]
class(fit) <- c("poisson_nmf_fit","list")

# Compute the Poisson log-likelihoods and deviances.
data.frame(loglik   = loglik_poisson_nmf(X,fit),
           deviance = deviance_poisson_nmf(X,fit))
#>        loglik deviance
#> i1  -26.70741 28.66406
#> i2  -22.38798 25.56485
#> i3  -18.50695 23.17278
#> i4  -21.45998 24.70071
#> i5  -26.31432 21.71899
#> i6  -15.33949 19.45157
#> i7  -21.03829 20.23547
#> i8  -16.05085 17.48799
#> i9  -20.74577 23.03671
#> i10 -25.25658 17.93207

# Compute multinomial log-likelihoods.
loglik_multinom_topic_model(X,fit)
#>        i1        i2        i3        i4        i5        i6        i7        i8 
#> -24.15063 -19.55914 -15.95322 -19.13328 -23.80668 -11.59211 -17.17074 -12.79242 
#>        i9       i10 
#> -18.51231 -22.59113