These data are a selection of the reference transcriptome profiles generated via single-cell RNA sequencing (RNA-seq) of 10 bead-enriched subpopulations of PBMCs (Donor A), described in Zheng et al. (2017). The data are unique molecular identifier (UMI) counts for 16,791 genes in 3,774 cells. (Genes with no expression in any of the cells were removed.) Since the majority of the UMI counts are zero, they are efficiently stored as a 3,774 x 16,791 sparse matrix. These data are used in the vignette illustrating how 'fastTopics' can be used to analyze single-cell RNA-seq data. Data for a separate set of 1,000 cells are provided as a “test set” to evaluate out-of-sample predictions.
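To illustrate why a sparse representation is efficient for mostly-zero counts, here is a minimal sketch using only the Matrix package. The matrix is simulated for illustration (the dimensions and Poisson rate are arbitrary assumptions), and it is not the pbmc_facs data.

```r
library(Matrix)

# Simulate a small, mostly-zero count matrix. These are made-up data,
# not the pbmc_facs counts.
set.seed(1)
n_cells <- 200
n_genes <- 500
dense <- matrix(rpois(n_cells * n_genes, lambda = 0.05), n_cells, n_genes)

# Convert to a compressed sparse column representation, which stores
# only the non-zero entries and their positions.
sparse <- Matrix(dense, sparse = TRUE)

# Compare the memory used by the two representations.
dense_bytes  <- as.numeric(object.size(dense))
sparse_bytes <- as.numeric(object.size(sparse))
cat(sprintf("Dense: %0.0f bytes; sparse: %0.0f bytes\n",
            dense_bytes, sparse_bytes))
cat(sprintf("Proportion of counts that are non-zero: %0.1f%%\n",
            100 * nnzero(sparse) / prod(dim(sparse))))
```

The savings grow with the proportion of zeros, so a matrix in which only a few percent of the entries are non-zero benefits substantially.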

Format

pbmc_facs is a list with the following elements:

counts

3,774 x 16,791 sparse matrix of UMI counts, with rows corresponding to samples (cells) and columns corresponding to genes. It is an object of class "dgCMatrix".

counts_test

UMI counts for an additional test set of 100 cells.

samples

Data frame containing information about the samples, including cell barcode and source FACS population (“celltype” and “facs_subpop”).

samples_test

Sample information for the additional test set of 100 cells.

genes

Data frame containing information about the genes, including gene symbol and Ensembl identifier.

fit

Poisson non-negative matrix factorization (NMF) fitted to the UMI count data (counts), with rank k = 6. See the vignette for how the Poisson NMF model fitting was performed.

de

Result of calling de_analysis(fit, counts, pseudocount = 0.1, control = list(ns = 1e4, nc = 4)) after first setting the seed with set.seed(1).

References

G. X. Y. Zheng et al. (2017). Massively parallel digital transcriptional profiling of single cells. Nature Communications 8, 14049. doi: 10.1038/ncomms14049

Examples

library(Matrix)
data(pbmc_facs)
cat(sprintf("Number of cells: %d\n",nrow(pbmc_facs$counts)))
#> Number of cells: 3774
cat(sprintf("Number of genes: %d\n",ncol(pbmc_facs$counts)))
#> Number of genes: 16791
cat(sprintf("Proportion of counts that are non-zero: %0.1f%%.\n",
            100*mean(pbmc_facs$counts > 0)))
#> Proportion of counts that are non-zero: 4.3%.
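
The remaining list elements can be inspected in the same way. The following is a sketch only: count_summary is a hypothetical helper defined here (it is not part of fastTopics), and the data are loaded only if the fastTopics package is installed.

```r
library(Matrix)

# Hypothetical helper (not part of fastTopics): summarize a
# cells x genes count matrix.
count_summary <- function (X)
  c(cells       = nrow(X),
    genes       = ncol(X),
    pct_nonzero = 100 * nnzero(X) / prod(dim(X)))

if (requireNamespace("fastTopics", quietly = TRUE)) {
  data(pbmc_facs, package = "fastTopics")

  # Compare the training counts with the held-out test set.
  print(count_summary(pbmc_facs$counts))
  print(count_summary(pbmc_facs$counts_test))

  # Look at the sample and gene annotations.
  str(pbmc_facs$samples)
  print(head(pbmc_facs$genes))
}
```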