Create a “Structure plot” from a multinomial topic model fit or other model with “loadings” or “weights”. The Structure plot represents the estimated topic proportions of each sample in a stacked bar chart, with bars of different colors representing different topics. Consequently, samples that have similar topic proportions have similar amounts of each color.
structure_plot(
  fit,
  topics,
  grouping,
  loadings_order = "embed",
  n = 2000,
  colors,
  gap = 1,
  embed_method = structure_plot_default_embed_method,
  ggplot_call = structure_plot_ggplot_call,
  ...
)
structure_plot_default_embed_method(fit, ...)
# S3 method for poisson_nmf_fit
plot(x, ...)
# S3 method for multinom_topic_model_fit
plot(x, ...)
structure_plot_ggplot_call(dat, colors, ticks = NULL, font.size = 9)
fit: An object of class “poisson_nmf_fit” or “multinom_topic_model_fit”, or an n x k matrix of topic proportions, where k is the number of topics. (The elements in each row of this matrix should sum to 1.) If a Poisson NMF fit is provided as input, the corresponding multinomial topic model fit is automatically recovered using poisson2multinom.
topics: Top-to-bottom ordering of the topics in the Structure plot; topics[1] is shown at the top, topics[2] is shown next, and so on. If the ordering of the topics is not specified, the topics are automatically ordered so that the topics with the greatest total “mass” are shown at the bottom of the plot. The topics may be specified by number or by name. Note that not all of the topics need to be included, so this argument may also be used to plot a subset of the topics.
grouping: Optional categorical variable (a factor) with one entry for each row of the loadings matrix fit$L defining a grouping of the samples (rows). The samples (rows) are arranged along the horizontal axis according to this grouping, then within each group according to loadings_order. If grouping is not a factor, an attempt is made to convert it to a factor using as.factor. Note that if loadings_order is specified manually, grouping should give the groups for the rows of fit$L before reordering.
loadings_order: Ordering of the rows of the loadings matrix fit$L along the horizontal axis of the Structure plot (after they have been grouped). If loadings_order = "embed", the ordering is generated automatically from a 1-d embedding, separately for each group. The rows may be specified by number or by name. Note that loadings_order may include all the rows of fit$L, or a subset.
n: The maximum number of samples (rows of the loadings matrix fit$L) to include in the plot. Typically there is little to no benefit in including a large number of samples in the Structure plot due to screen resolution limits. Ignored if loadings_order is provided.
colors: Colors used to draw the topics in the Structure plot.
gap: The horizontal spacing between groups. Ignored if grouping is not provided.
embed_method: The function used to compute a 1-d embedding from a loadings matrix fit$L; only used if loadings_order = "embed". The function must accept the multinomial topic model fit as its first input (“fit”), and additional arguments may be passed (...). The output should be a named numeric vector with one entry per row of fit$L, and the names of the entries should be the same as the row names of fit$L.
ggplot_call: The function used to create the plot. Replace structure_plot_ggplot_call with your own function to customize the appearance of the plot.
...: Additional arguments passed to structure_plot (for the plot method) or to embed_method (for function structure_plot).
x: An object of class “poisson_nmf_fit” or “multinom_topic_model_fit”. If a Poisson NMF fit is provided as input, the corresponding multinomial topic model fit is automatically recovered using poisson2multinom.
dat: A data frame passed as input to ggplot, containing, at a minimum, columns “sample”, “topic” and “prop”: the “sample” column contains the positions of the samples (rows of the L matrix) along the horizontal axis; the “topic” column is a topic (a column of L); and the “prop” column is the topic proportion for the respective sample.
ticks: The placement of the group labels along the horizontal axis, and their names. For data that are not grouped, use ticks = NULL.
font.size: Font size used in the plot.
Value: A ggplot object.
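Putting the dat, colors, ticks and font.size descriptions together, a replacement for structure_plot_ggplot_call can follow the same interface. The sketch below is only illustrative: it assumes ggplot2 is installed and relies solely on the “sample”, “topic” and “prop” columns described above; consult the source of structure_plot_ggplot_call for the definitive implementation.

```r
library(ggplot2)
# Illustrative custom ggplot_call with the same interface as
# structure_plot_ggplot_call: stacked bars of topic proportions, with
# group labels placed at the "ticks" positions.
my_ggplot_call <- function (dat, colors, ticks = NULL, font.size = 9) {
  ggplot(dat, aes(x = sample, y = prop, fill = topic)) +
    geom_col(width = 1) +
    scale_fill_manual(values = colors) +
    scale_x_continuous(breaks = ticks, labels = names(ticks)) +
    labs(x = "", y = "topic proportion") +
    theme_classic(base_size = font.size)
}
```

It can then be supplied as structure_plot(fit, ggplot_call = my_ggplot_call).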
The name “Structure plot” comes from its widespread use in population genetics to visualize the results of the Structure software (Rosenberg et al., 2002).
For most uses of the Structure plot in population genetics, there is usually some grouping of the samples (e.g., assignment to pre-defined populations) that guides arrangement of the samples along the horizontal axis in the bar chart. In other applications, such as analysis of gene expression data, a pre-defined grouping may not always be available. Therefore, a “smart” arrangement of the samples is, by default, generated automatically by performing a 1-d embedding of the samples.
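For example, the default t-SNE-based embedding can be swapped out through the embed_method argument. This sketch assumes that pca_from_topics (used in the examples below) returns a matrix with one named row per row of fit$L:

```r
# Sketch: use the first principal component instead of t-SNE as the
# 1-d embedding. drop() turns the one-column matrix returned by
# pca_from_topics into the named numeric vector that embed_method
# is required to return.
pca_embed <- function (fit, ...)
  drop(pca_from_topics(fit, dims = 1))
```

Calling structure_plot(fit, embed_method = pca_embed) should then arrange the samples by this PCA embedding, separately within each group.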
Dey, K. K., Hsiao, C. J. and Stephens, M. (2017). Visualizing the structure of RNA-seq expression data using grade of membership models. PLoS Genetics 13, e1006599.
Rosenberg, N. A., Pritchard, J. K., Weber, J. L., Cann, H. M., Kidd, K. K., Zhivotovsky, L. A. and Feldman, M. W. (2002). Genetic structure of human populations. Science 298, 2381–2385.
# \donttest{
set.seed(1)
data(pbmc_facs)
# Get the multinomial topic model fitted to the
# PBMC data.
fit <- pbmc_facs$fit
# Create a Structure plot without labels. The samples (rows of L) are
# automatically arranged along the x-axis using t-SNE to highlight the
# structure in the data.
p1a <- structure_plot(fit)
#> Running tsne on 2000 x 6 matrix.
#> Read the 2000 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.47 seconds (sparsity = 0.184733)!
#> Learning embedding...
#> Iteration 50: error is 57.790679 (50 iterations in 0.25 seconds)
#> Iteration 100: error is 49.861301 (50 iterations in 0.24 seconds)
#> Iteration 150: error is 48.283680 (50 iterations in 0.24 seconds)
#> Iteration 200: error is 47.583292 (50 iterations in 0.24 seconds)
#> Iteration 250: error is 47.184974 (50 iterations in 0.24 seconds)
#> Iteration 300: error is 0.827698 (50 iterations in 0.30 seconds)
#> Iteration 350: error is 0.573090 (50 iterations in 0.24 seconds)
#> Iteration 400: error is 0.463332 (50 iterations in 0.24 seconds)
#> Iteration 450: error is 0.408558 (50 iterations in 0.30 seconds)
#> Iteration 500: error is 0.378760 (50 iterations in 0.26 seconds)
#> Iteration 550: error is 0.361389 (50 iterations in 0.29 seconds)
#> Iteration 600: error is 0.350594 (50 iterations in 0.27 seconds)
#> Iteration 650: error is 0.343521 (50 iterations in 0.24 seconds)
#> Iteration 700: error is 0.338752 (50 iterations in 0.24 seconds)
#> Iteration 750: error is 0.335426 (50 iterations in 0.24 seconds)
#> Iteration 800: error is 0.333039 (50 iterations in 0.24 seconds)
#> Iteration 850: error is 0.331320 (50 iterations in 0.24 seconds)
#> Iteration 900: error is 0.329978 (50 iterations in 0.24 seconds)
#> Iteration 950: error is 0.328926 (50 iterations in 0.24 seconds)
#> Iteration 1000: error is 0.328079 (50 iterations in 0.24 seconds)
#> Fitting performed in 5.03 seconds.
# The first argument to structure_plot may also be an "L" matrix.
# This call to structure_plot should produce the exact same plot as
# the previous call.
set.seed(1)
p1b <- structure_plot(fit$L)
#> Running tsne on 2000 x 6 matrix.
#> Read the 2000 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.47 seconds (sparsity = 0.184733)!
#> Learning embedding...
#> Iteration 50: error is 57.790679 (50 iterations in 0.25 seconds)
#> Iteration 100: error is 49.861301 (50 iterations in 0.24 seconds)
#> Iteration 150: error is 48.283680 (50 iterations in 0.24 seconds)
#> Iteration 200: error is 47.583292 (50 iterations in 0.24 seconds)
#> Iteration 250: error is 47.184974 (50 iterations in 0.24 seconds)
#> Iteration 300: error is 0.827698 (50 iterations in 0.24 seconds)
#> Iteration 350: error is 0.573090 (50 iterations in 0.24 seconds)
#> Iteration 400: error is 0.463332 (50 iterations in 0.24 seconds)
#> Iteration 450: error is 0.408558 (50 iterations in 0.24 seconds)
#> Iteration 500: error is 0.378760 (50 iterations in 0.24 seconds)
#> Iteration 550: error is 0.361389 (50 iterations in 0.24 seconds)
#> Iteration 600: error is 0.350594 (50 iterations in 0.24 seconds)
#> Iteration 650: error is 0.343521 (50 iterations in 0.24 seconds)
#> Iteration 700: error is 0.338752 (50 iterations in 0.24 seconds)
#> Iteration 750: error is 0.335426 (50 iterations in 0.24 seconds)
#> Iteration 800: error is 0.333039 (50 iterations in 0.24 seconds)
#> Iteration 850: error is 0.331320 (50 iterations in 0.24 seconds)
#> Iteration 900: error is 0.329978 (50 iterations in 0.24 seconds)
#> Iteration 950: error is 0.328926 (50 iterations in 0.24 seconds)
#> Iteration 1000: error is 0.328079 (50 iterations in 0.25 seconds)
#> Fitting performed in 4.81 seconds.
# There is no requirement that the rows of L sum to 1. To
# illustrate, in this next example we remove topic 5 from the
# Structure plot.
p2a <- structure_plot(fit$L[,-5])
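# The point that rows need not sum to 1 can be checked with base R
# alone (synthetic data; this snippet does not need fastTopics):
L0 <- matrix(runif(30),5,6)
L0 <- L0 / rowSums(L0)       # rows of L0 sum to 1
range(rowSums(L0[,-5]))      # after dropping a topic, row sums are < 1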
# This is perhaps a more elegant way to remove topic 5 from the
# structure plot:
p2b <- structure_plot(fit,topics = c(1:4,6))
#> Running tsne on 2000 x 6 matrix.
#> Read the 2000 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.50 seconds (sparsity = 0.184253)!
#> Learning embedding...
#> Iteration 50: error is 57.541106 (50 iterations in 0.25 seconds)
#> Iteration 100: error is 50.897131 (50 iterations in 0.27 seconds)
#> Iteration 150: error is 49.519105 (50 iterations in 0.25 seconds)
#> Iteration 200: error is 48.861951 (50 iterations in 0.26 seconds)
#> Iteration 250: error is 48.450472 (50 iterations in 0.25 seconds)
#> Iteration 300: error is 0.868351 (50 iterations in 0.24 seconds)
#> Iteration 350: error is 0.609357 (50 iterations in 0.24 seconds)
#> Iteration 400: error is 0.502994 (50 iterations in 0.24 seconds)
#> Iteration 450: error is 0.451161 (50 iterations in 0.24 seconds)
#> Iteration 500: error is 0.423146 (50 iterations in 0.24 seconds)
#> Iteration 550: error is 0.406233 (50 iterations in 0.24 seconds)
#> Iteration 600: error is 0.395648 (50 iterations in 0.27 seconds)
#> Iteration 650: error is 0.388617 (50 iterations in 0.25 seconds)
#> Iteration 700: error is 0.383747 (50 iterations in 0.24 seconds)
#> Iteration 750: error is 0.380477 (50 iterations in 0.24 seconds)
#> Iteration 800: error is 0.378170 (50 iterations in 0.24 seconds)
#> Iteration 850: error is 0.376304 (50 iterations in 0.24 seconds)
#> Iteration 900: error is 0.375157 (50 iterations in 0.24 seconds)
#> Iteration 950: error is 0.374175 (50 iterations in 0.26 seconds)
#> Iteration 1000: error is 0.373407 (50 iterations in 0.25 seconds)
#> Fitting performed in 4.94 seconds.
# Create a Structure plot with the FACS cell-type labels. Within each
# group (cell-type), the cells (rows of L) are automatically arranged
# using t-SNE.
subpop <- pbmc_facs$samples$subpop
p3 <- structure_plot(fit,grouping = subpop)
#> Running tsne on 412 x 6 matrix.
#> Read the 412 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.09 seconds (sparsity = 0.880844)!
#> Learning embedding...
#> Iteration 50: error is 45.600385 (50 iterations in 0.04 seconds)
#> Iteration 100: error is 45.600352 (50 iterations in 0.04 seconds)
#> Iteration 150: error is 45.599196 (50 iterations in 0.04 seconds)
#> Iteration 200: error is 45.556368 (50 iterations in 0.04 seconds)
#> Iteration 250: error is 45.345104 (50 iterations in 0.04 seconds)
#> Iteration 300: error is 0.385697 (50 iterations in 0.04 seconds)
#> Iteration 350: error is 0.382789 (50 iterations in 0.04 seconds)
#> Iteration 400: error is 0.382783 (50 iterations in 0.03 seconds)
#> Iteration 450: error is 0.382783 (50 iterations in 0.03 seconds)
#> Iteration 500: error is 0.382783 (50 iterations in 0.03 seconds)
#> Iteration 550: error is 0.382784 (50 iterations in 0.03 seconds)
#> Iteration 600: error is 0.382784 (50 iterations in 0.03 seconds)
#> Iteration 650: error is 0.382783 (50 iterations in 0.03 seconds)
#> Iteration 700: error is 0.382784 (50 iterations in 0.03 seconds)
#> Iteration 750: error is 0.382784 (50 iterations in 0.03 seconds)
#> Iteration 800: error is 0.382784 (50 iterations in 0.03 seconds)
#> Iteration 850: error is 0.382784 (50 iterations in 0.03 seconds)
#> Iteration 900: error is 0.382783 (50 iterations in 0.03 seconds)
#> Iteration 950: error is 0.382783 (50 iterations in 0.03 seconds)
#> Iteration 1000: error is 0.382783 (50 iterations in 0.03 seconds)
#> Fitting performed in 0.70 seconds.
#> Running tsne on 86 x 6 matrix.
#> Read the 86 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 27.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.00 seconds (sparsity = 0.984045)!
#> Learning embedding...
#> Iteration 50: error is 49.059834 (50 iterations in 0.00 seconds)
#> Iteration 100: error is 47.558270 (50 iterations in 0.00 seconds)
#> Iteration 150: error is 48.030581 (50 iterations in 0.00 seconds)
#> Iteration 200: error is 47.451909 (50 iterations in 0.00 seconds)
#> Iteration 250: error is 51.055748 (50 iterations in 0.00 seconds)
#> Iteration 300: error is 1.572820 (50 iterations in 0.00 seconds)
#> Iteration 350: error is 0.343784 (50 iterations in 0.00 seconds)
#> Iteration 400: error is 0.302579 (50 iterations in 0.00 seconds)
#> Iteration 450: error is 0.302574 (50 iterations in 0.00 seconds)
#> Iteration 500: error is 0.302574 (50 iterations in 0.00 seconds)
#> Iteration 550: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 600: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 650: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 700: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 750: error is 0.302574 (50 iterations in 0.00 seconds)
#> Iteration 800: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 850: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 900: error is 0.302575 (50 iterations in 0.00 seconds)
#> Iteration 950: error is 0.302610 (50 iterations in 0.00 seconds)
#> Iteration 1000: error is 0.302573 (50 iterations in 0.00 seconds)
#> Fitting performed in 0.08 seconds.
#> Running tsne on 370 x 6 matrix.
#> Read the 370 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.08 seconds (sparsity = 0.945259)!
#> Learning embedding...
#> Iteration 50: error is 44.085064 (50 iterations in 0.03 seconds)
#> Iteration 100: error is 42.420411 (50 iterations in 0.03 seconds)
#> Iteration 150: error is 42.223765 (50 iterations in 0.03 seconds)
#> Iteration 200: error is 42.223211 (50 iterations in 0.03 seconds)
#> Iteration 250: error is 42.223259 (50 iterations in 0.03 seconds)
#> Iteration 300: error is 0.268746 (50 iterations in 0.03 seconds)
#> Iteration 350: error is 0.266433 (50 iterations in 0.03 seconds)
#> Iteration 400: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 450: error is 0.266434 (50 iterations in 0.03 seconds)
#> Iteration 500: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 550: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 600: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 650: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 700: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 750: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 800: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 850: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 900: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 950: error is 0.266432 (50 iterations in 0.03 seconds)
#> Iteration 1000: error is 0.266432 (50 iterations in 0.03 seconds)
#> Fitting performed in 0.58 seconds.
#> Running tsne on 350 x 6 matrix.
#> Read the 350 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.07 seconds (sparsity = 0.961486)!
#> Learning embedding...
#> Iteration 50: error is 43.762218 (50 iterations in 0.03 seconds)
#> Iteration 100: error is 43.669706 (50 iterations in 0.03 seconds)
#> Iteration 150: error is 43.257587 (50 iterations in 0.03 seconds)
#> Iteration 200: error is 43.232919 (50 iterations in 0.03 seconds)
#> Iteration 250: error is 43.232572 (50 iterations in 0.03 seconds)
#> Iteration 300: error is 0.266331 (50 iterations in 0.03 seconds)
#> Iteration 350: error is 0.266314 (50 iterations in 0.03 seconds)
#> Iteration 400: error is 0.266316 (50 iterations in 0.03 seconds)
#> Iteration 450: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 500: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 550: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 600: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 650: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 700: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 750: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 800: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 850: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 900: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 950: error is 0.266315 (50 iterations in 0.03 seconds)
#> Iteration 1000: error is 0.266315 (50 iterations in 0.03 seconds)
#> Fitting performed in 0.54 seconds.
#> Running tsne on 782 x 6 matrix.
#> Read the 782 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.18 seconds (sparsity = 0.502387)!
#> Learning embedding...
#> Iteration 50: error is 51.090226 (50 iterations in 0.08 seconds)
#> Iteration 100: error is 48.054738 (50 iterations in 0.08 seconds)
#> Iteration 150: error is 47.992748 (50 iterations in 0.08 seconds)
#> Iteration 200: error is 47.985522 (50 iterations in 0.08 seconds)
#> Iteration 250: error is 47.984764 (50 iterations in 0.08 seconds)
#> Iteration 300: error is 0.382593 (50 iterations in 0.08 seconds)
#> Iteration 350: error is 0.337453 (50 iterations in 0.08 seconds)
#> Iteration 400: error is 0.333028 (50 iterations in 0.08 seconds)
#> Iteration 450: error is 0.332704 (50 iterations in 0.08 seconds)
#> Iteration 500: error is 0.332684 (50 iterations in 0.08 seconds)
#> Iteration 550: error is 0.332678 (50 iterations in 0.08 seconds)
#> Iteration 600: error is 0.332680 (50 iterations in 0.08 seconds)
#> Iteration 650: error is 0.332678 (50 iterations in 0.08 seconds)
#> Iteration 700: error is 0.332677 (50 iterations in 0.08 seconds)
#> Iteration 750: error is 0.332677 (50 iterations in 0.08 seconds)
#> Iteration 800: error is 0.332679 (50 iterations in 0.08 seconds)
#> Iteration 850: error is 0.332679 (50 iterations in 0.08 seconds)
#> Iteration 900: error is 0.332679 (50 iterations in 0.08 seconds)
#> Iteration 950: error is 0.332679 (50 iterations in 0.08 seconds)
#> Iteration 1000: error is 0.332678 (50 iterations in 0.08 seconds)
#> Fitting performed in 1.57 seconds.
# Next, we apply some customizations to improve the plot: (1) use the
# "topics" argument to specify the order in which the topic
# proportions are stacked on top of each other; (2) use the "gap"
# argument to increase the whitespace between the groups; (3) use "n"
# to decrease the number of rows of L included in the Structure plot;
# and (4) use "colors" to change the colors used to draw the topic
# proportions.
topic_colors <- c("skyblue","forestgreen","darkmagenta",
"dodgerblue","gold","darkorange")
p4 <- structure_plot(fit,grouping = pbmc_facs$samples$subpop,gap = 20,
n = 1500,topics = c(5,6,1,4,2,3),colors = topic_colors)
#> Running tsne on 290 x 6 matrix.
#> Read the 290 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 95.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.06 seconds (sparsity = 0.996171)!
#> Learning embedding...
#> Iteration 50: error is 41.898804 (50 iterations in 0.03 seconds)
#> Iteration 100: error is 41.892539 (50 iterations in 0.03 seconds)
#> Iteration 150: error is 41.892513 (50 iterations in 0.03 seconds)
#> Iteration 200: error is 41.892834 (50 iterations in 0.03 seconds)
#> Iteration 250: error is 41.891277 (50 iterations in 0.03 seconds)
#> Iteration 300: error is 0.267796 (50 iterations in 0.03 seconds)
#> Iteration 350: error is 0.265796 (50 iterations in 0.02 seconds)
#> Iteration 400: error is 0.265780 (50 iterations in 0.02 seconds)
#> Iteration 450: error is 0.265782 (50 iterations in 0.02 seconds)
#> Iteration 500: error is 0.265782 (50 iterations in 0.02 seconds)
#> Iteration 550: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 600: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 650: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 700: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 750: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 800: error is 0.265780 (50 iterations in 0.02 seconds)
#> Iteration 850: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 900: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 950: error is 0.265781 (50 iterations in 0.02 seconds)
#> Iteration 1000: error is 0.265779 (50 iterations in 0.02 seconds)
#> Fitting performed in 0.45 seconds.
#> Running tsne on 69 x 6 matrix.
#> Read the 69 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 21.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.00 seconds (sparsity = 0.976265)!
#> Learning embedding...
#> Iteration 50: error is 52.349206 (50 iterations in 0.00 seconds)
#> Iteration 100: error is 51.175734 (50 iterations in 0.00 seconds)
#> Iteration 150: error is 52.005192 (50 iterations in 0.00 seconds)
#> Iteration 200: error is 52.551508 (50 iterations in 0.00 seconds)
#> Iteration 250: error is 52.360228 (50 iterations in 0.00 seconds)
#> Iteration 300: error is 1.837543 (50 iterations in 0.00 seconds)
#> Iteration 350: error is 0.366747 (50 iterations in 0.00 seconds)
#> Iteration 400: error is 0.353033 (50 iterations in 0.00 seconds)
#> Iteration 450: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 500: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 550: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 600: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 650: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 700: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 750: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 800: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 850: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 900: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 950: error is 0.353022 (50 iterations in 0.00 seconds)
#> Iteration 1000: error is 0.353022 (50 iterations in 0.00 seconds)
#> Fitting performed in 0.05 seconds.
#> Running tsne on 267 x 6 matrix.
#> Read the 267 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 87.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.05 seconds (sparsity = 0.995497)!
#> Learning embedding...
#> Iteration 50: error is 41.733745 (50 iterations in 0.02 seconds)
#> Iteration 100: error is 41.730456 (50 iterations in 0.02 seconds)
#> Iteration 150: error is 41.728497 (50 iterations in 0.02 seconds)
#> Iteration 200: error is 41.730640 (50 iterations in 0.02 seconds)
#> Iteration 250: error is 41.725346 (50 iterations in 0.02 seconds)
#> Iteration 300: error is 0.173100 (50 iterations in 0.02 seconds)
#> Iteration 350: error is 0.171927 (50 iterations in 0.02 seconds)
#> Iteration 400: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 450: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 500: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 550: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 600: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 650: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 700: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 750: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 800: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 850: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 900: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 950: error is 0.171934 (50 iterations in 0.02 seconds)
#> Iteration 1000: error is 0.171934 (50 iterations in 0.02 seconds)
#> Fitting performed in 0.36 seconds.
#> Running tsne on 259 x 6 matrix.
#> Read the 259 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 85.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.04 seconds (sparsity = 0.995930)!
#> Learning embedding...
#> Iteration 50: error is 42.233657 (50 iterations in 0.02 seconds)
#> Iteration 100: error is 42.233622 (50 iterations in 0.02 seconds)
#> Iteration 150: error is 42.234610 (50 iterations in 0.02 seconds)
#> Iteration 200: error is 42.237241 (50 iterations in 0.02 seconds)
#> Iteration 250: error is 42.234999 (50 iterations in 0.02 seconds)
#> Iteration 300: error is 0.527736 (50 iterations in 0.02 seconds)
#> Iteration 350: error is 0.527123 (50 iterations in 0.02 seconds)
#> Iteration 400: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 450: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 500: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 550: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 600: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 650: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 700: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 750: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 800: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 850: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 900: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 950: error is 0.527124 (50 iterations in 0.02 seconds)
#> Iteration 1000: error is 0.527124 (50 iterations in 0.02 seconds)
#> Fitting performed in 0.35 seconds.
#> Running tsne on 615 x 6 matrix.
#> Read the 615 x 6 data matrix successfully!
#> Using no_dims = 1, perplexity = 100.000000, and theta = 0.100000
#> Computing input similarities...
#> Building tree...
#> Done in 0.14 seconds (sparsity = 0.635336)!
#> Learning embedding...
#> Iteration 50: error is 50.084961 (50 iterations in 0.06 seconds)
#> Iteration 100: error is 46.389137 (50 iterations in 0.06 seconds)
#> Iteration 150: error is 46.387429 (50 iterations in 0.06 seconds)
#> Iteration 200: error is 46.387421 (50 iterations in 0.06 seconds)
#> Iteration 250: error is 46.387428 (50 iterations in 0.06 seconds)
#> Iteration 300: error is 0.286516 (50 iterations in 0.06 seconds)
#> Iteration 350: error is 0.264003 (50 iterations in 0.06 seconds)
#> Iteration 400: error is 0.262789 (50 iterations in 0.06 seconds)
#> Iteration 450: error is 0.262776 (50 iterations in 0.06 seconds)
#> Iteration 500: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 550: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 600: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 650: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 700: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 750: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 800: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 850: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 900: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 950: error is 0.262774 (50 iterations in 0.06 seconds)
#> Iteration 1000: error is 0.262774 (50 iterations in 0.06 seconds)
#> Fitting performed in 1.16 seconds.
# In this example, we use UMAP instead of t-SNE to arrange the
# cells in the Structure plot. Note that this could also be
# accomplished by overriding the default setting of "embed_method".
y <- drop(umap_from_topics(fit,dims = 1))
#> 09:23:17 UMAP embedding parameters a = 1.896 b = 0.8006
#> 09:23:17 Read 3774 rows and found 6 numeric columns
#> 09:23:17 Using FNN for neighbor search, n_neighbors = 30
#> 09:23:17 Commencing smooth kNN distance calibration using 4 threads
#> with target n_neighbors = 30
#> 09:23:17 111 smooth knn distance failures
#> 09:23:17 Initializing from normalized Laplacian + noise (using irlba)
#> 09:23:19 Commencing optimization for 500 epochs, with 138012 positive edges
#> 09:23:23 Optimization finished
p5 <- structure_plot(fit,loadings_order = order(y),grouping = subpop,
gap = 40,colors = topic_colors)
# We can also use PCA to arrange the cells.
y <- drop(pca_from_topics(fit,dims = 1))
p6 <- structure_plot(fit,loadings_order = order(y),grouping = subpop,
gap = 40,colors = topic_colors)
# In this final example, we plot a random subset of 400 cells, and
# arrange the cells randomly along the horizontal axis of the
# Structure plot.
p7 <- structure_plot(fit,loadings_order = sample(3774,400),gap = 10,
grouping = subpop,colors = topic_colors)
# }