t-SNE Plot — tsne_plot • fastTopics

Visualize the "structure" of the Poisson NMF loadings or multinomial topic model mixture proportions by projection onto a 2-d surface. Samples in the projection are colored according to their loadings or mixture proportions. By default, t-SNE is used to compute the 2-d embedding from the loadings or mixture proportions.

tsne_plot(
  fit,
  color = c("mixprop", "loading"),
  k,
  tsne,
  ggplot_call = tsne_plot_ggplot_call,
  plot_grid_call = function(plots) do.call(plot_grid, plots),
  ...
)

tsne_plot_ggplot_call(dat, topic.label, font.size = 9)

Arguments

fit	An object of class “poisson_nmf_fit” or “multinom_topic_model_fit”.
color	The data mapped to the color “aesthetic” in the plot. When `color = "loading"`, the estimated loadings (stored in the `L` matrix) in the Poisson NMF model are plotted; when `color = "mixprop"`, the estimated mixture proportions (which are recovered from the loadings by calling `poisson2multinom`) are shown. When `fit` is a “multinom_topic_model_fit” object, the only available option is `color = "mixprop"`. In most settings, the mixture proportions are preferred, even if the 2-d embedding is computed from the loading matrix of the Poisson NMF model.
k	The topic, or topics, selected by number or name. One plot is created per selected topic. When not specified, all topics are plotted.
tsne	A 2-d embedding of the samples (rows of X), or a subset of the samples, such as an output from `tsne_from_topics`. It should be a list object with the same structure as a `tsne_from_topics` output; see `tsne_from_topics` for details. If not provided, a 2-d t-SNE embedding will be estimated automatically by calling `tsne_from_topics`.
ggplot_call	The function used to create the plot. Replace `tsne_plot_ggplot_call` with your own function to customize the appearance of the plot.
plot_grid_call	When multiple topics are selected, this is the function used to arrange the plots into a grid using `plot_grid`. It should be a function accepting a single argument, `plots`, a list of `ggplot` objects.
...	Additional arguments passed to `tsne_from_topics`. These arguments are only used if `tsne` is not provided.
dat	A data frame passed as input to `ggplot`, containing, at a minimum, columns “d1”, “d2” (the first and second dimensions in the 2-d embedding), and “loading”.
topic.label	The name or number of the topic being plotted; it is only used to determine the plot title.
font.size	Font size used in plot.

Value

A ggplot object.

Details

This is a lightweight interface primarily intended to expedite creation of scatterplots for visualizing the loadings or mixture proportions in 2-d; most of the “heavy lifting” is done by ggplot2. The 2-d embedding itself is computed by invoking function tsne_from_topics (unless the “tsne” input is provided). For more control over the plot's appearance, the plot can be customized by modifying the ggplot_call and plot_grid_call arguments.
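As a rough illustration of the customization described above, the sketch below defines replacement functions for ggplot_call and plot_grid_call. The names my_ggplot_call and my_plot_grid_call, the color scale, the axis labels, and the toy data frame are all assumptions made for this example; only the argument names (dat, topic.label, font.size, plots) and the columns “d1”, “d2” and “loading” come from the documentation above. It is not the package's default implementation.

```r
library(ggplot2)
library(cowplot)

# Hypothetical replacement for tsne_plot_ggplot_call. It accepts the
# same arguments: a data frame with columns d1, d2 and loading, a
# topic label used for the title, and a font size.
my_ggplot_call <- function (dat, topic.label, font.size = 9)
  ggplot(dat, aes(x = d1, y = d2, color = loading)) +
    geom_point(size = 1) +
    scale_color_gradient(low = "lightgray", high = "darkblue") +
    labs(x = "t-SNE 1", y = "t-SNE 2",
         title = paste("topic", topic.label)) +
    theme_cowplot(font_size = font.size)

# Hypothetical replacement for the default plot_grid_call: arrange
# the per-topic plots in two columns.
my_plot_grid_call <- function (plots)
  do.call(plot_grid, c(plots, list(ncol = 2)))

# Toy data frame with the documented columns, used here only to try
# the two functions without fitting a model.
set.seed(1)
toy <- data.frame(d1 = rnorm(50), d2 = rnorm(50), loading = runif(50))
p <- my_ggplot_call(toy, topic.label = 1)
g <- my_plot_grid_call(list(p, p))
```

With a fitted model in hand, these functions would presumably be supplied as tsne_plot(fit, ggplot_call = my_ggplot_call, plot_grid_call = my_plot_grid_call).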

An effective 2-d visualization may also require some fine-tuning of the t-SNE settings, such as the “perplexity”, or of the number of samples included in the plot. The t-SNE settings can be controlled by the additional arguments (...) passed to tsne_from_topics; see tsne_from_topics for details. Alternatively, a 2-d embedding may be pre-computed and passed to tsne_plot as the tsne argument.
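The two routes described above might look like the following sketch. It assumes the version of fastTopics documented on this page is installed, and it uses simulate_count_data and fit_poisson_nmf only to obtain a small fit to plot. The data dimensions, the number of iterations and the perplexity value are arbitrary choices for illustration, not recommended settings.

```r
library(fastTopics)

# Simulate a small counts matrix (100 samples, 40 features) and fit
# a Poisson NMF with 3 topics. These sizes are illustrative only.
set.seed(1)
dat <- simulate_count_data(100, 40, 3)
fit <- fit_poisson_nmf(dat$X, k = 3, numiter = 50)

# Route 1: let tsne_plot compute the embedding, forwarding the t-SNE
# settings through "..." to tsne_from_topics.
p1 <- tsne_plot(fit, color = "mixprop", k = 1, perplexity = 10)

# Route 2: compute the embedding once, then reuse it in several
# plots so that the sample positions stay the same across plots.
tsne <- tsne_from_topics(fit, dims = 2, perplexity = 10)
p2 <- tsne_plot(fit, color = "mixprop", k = 1, tsne = tsne)
p3 <- tsne_plot(fit, color = "loading", k = 2, tsne = tsne)
```

Reusing a pre-computed embedding also avoids repeating the t-SNE computation, which is typically the slowest step on larger data sets.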