Overview

In this script, we annotate the 6 clusters obtained from the Deng2014 data, which tracks the development of cells into an embryo. For each cluster, we extract the top 100 driving genes and look up their names and summaries.

Loading the theta matrix

library(CountClust)  # provides ExtractTopFeatures()
library(knitr)       # provides kable()
deng_topics <- get(load("../rdas/deng_topic_fit.rda"))
theta_mat <- deng_topics[[5]]$theta
top_features <- ExtractTopFeatures(theta_mat, top_features = 100, method = "poisson", options = "min")
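Before extracting features, it can help to confirm that the theta matrix has the expected shape. The sketch below uses a toy matrix rather than the real fit, and assumes (this is not stated in the script) that theta is a genes-by-clusters matrix whose columns each sum to one, as is usual for topic-model fits. The names `toy_counts` and `toy_theta` are illustrative only.

```r
# Toy stand-in for deng_topics[[5]]$theta: 5 genes (rows) x 3 clusters (columns).
set.seed(1)
toy_counts <- matrix(rpois(15, lambda = 20) + 1, nrow = 5,
                     dimnames = list(paste0("gene", 1:5),
                                     paste0("cluster", 1:3)))

# Normalise each column so it is a probability distribution over genes.
toy_theta <- sweep(toy_counts, 2, colSums(toy_counts), "/")

# Sanity checks one might run on the real theta_mat as well.
stopifnot(all(abs(colSums(toy_theta) - 1) < 1e-12))
dim(toy_theta)             # genes x clusters
head(rownames(toy_theta))  # gene identifiers used later as gene_names
```

On the real object, `ncol(theta_mat)` should equal 6 if the fifth element of `deng_topics` is indeed the six-cluster fit.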

Gene names

gene_names <- rownames(theta_mat)
# One row per cluster, holding the names of that cluster's top features
gene_list <- do.call(rbind, lapply(seq_len(nrow(top_features)), function(x) gene_names[top_features[x, ]]))
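The mapping from feature indices to gene names can be illustrated on a small example. The sketch below assumes, as the code above implies, that `top_features` is a clusters-by-features matrix of row indices into `theta_mat`. The toy inputs are hypothetical, and the vectorised form is an alternative shown for comparison, not the script's own method.

```r
# Hypothetical toy inputs: 6 gene names and a 2 x 3 matrix of indices.
gene_names <- c("Actb", "Krt18", "Sox2", "Bmp4", "Hk1", "Rps3")
top_features <- rbind(c(5, 6, 1),
                      c(3, 4, 2))

# Approach used in the script: build one row of names per cluster.
gene_list_loop <- do.call(rbind,
                          lapply(seq_len(nrow(top_features)),
                                 function(x) gene_names[top_features[x, ]]))

# Vectorised alternative: index once, then restore the matrix shape.
gene_list_vec <- matrix(gene_names[top_features],
                        nrow = nrow(top_features))

# Both give the same clusters-by-features matrix of gene names.
stopifnot(all(dim(gene_list_loop) == dim(gene_list_vec)),
          all(gene_list_loop == gene_list_vec))
gene_list_vec[1, ]
```

The vectorised form relies on R storing matrices in column-major order, so reshaping with the original number of rows puts every name back in its cluster's row.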

Cluster 1 annotations

out <- mygene::queryMany(gene_list[1, ], scopes = "symbol", fields = c("name", "summary"), species = "mouse")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
query name X_id summary notfound
Timd2 T cell immunoglobulin and mucin domain containing 2 171284 NA NA
Isyna1 myo-inositol 1-phosphate synthase A1 71780 NA NA
Alppl2 alkaline phosphatase, placental-like 2 11650 NA NA
Pramel5 preferentially expressed antigen in melanoma like 5 384077 NA NA
Hsp90ab1 heat shock protein 90 alpha (cytosolic), class B member 1 15516 NA NA
Fbxo15 F-box protein 15 50764 NA NA
Tceb1 transcription elongation factor B (SIII), polypeptide 1 67923 NA NA
Gpd1l glycerol-3-phosphate dehydrogenase 1-like 333433 NA NA
Pemt phosphatidylethanolamine N-methyltransferase 18618 NA NA
Hsp90aa1 heat shock protein 90, alpha (cytosolic), class A member 1 15519 NA NA
Stip1 stress-induced phosphoprotein 1 20867 NA NA
Larp4 La ribonucleoprotein domain family, member 4 207214 NA NA
Dnaja1 DnaJ heat shock protein family (Hsp40) member A1 15502 The protein encoded by this gene is a member of the DnaJ family, whose members act as cochaperones of heat shock protein 70. Heat shock proteins facilitate protein folding, trafficking, prevention of aggregation, and proteolytic degradation. Members of this family are characterized by a highly conserved N-terminal J domain, a glycine/phenylalanine-rich region, four CxxCxGxG zinc finger repeats, and a C-terminal substrate-binding domain. The J domain mediates the interaction with heat shock protein 70 to recruit substrates and regulate ATP hydrolysis activity. Mice deficient for this gene display reduced levels of activation‐induced deaminase, an enzyme that deaminates deoxycytidine at the immunoglobulin genes during immune responses. In addition, mice lacking this gene exhibit severe defects in spermatogenesis. Several pseudogenes of this gene are found on other chromosomes. Alternative splicing results in multiple transcript variants. NA
Cul5 cullin 5 75717 NA NA
Hdgf hepatoma-derived growth factor 15191 NA NA
Npm1 nucleophosmin 1 18148 NA NA
Psat1 phosphoserine aminotransferase 1 107272 NA NA
Sugt1 SGT1, suppressor of G2 allele of SKP1 (S. cerevisiae) 67955 NA NA
Fbxl20 F-box and leucine-rich repeat protein 20 72194 NA NA
Akr1c21 aldo-keto reductase family 1, member C21 77337 NA NA
Hspa8 heat shock protein 8 15481 NA NA
Bhmt betaine-homocysteine methyltransferase 12116 NA NA
Wdr45 WD repeat domain 45 54636 NA NA
Nt5c3l NA NA NA TRUE
Bhmt2 betaine-homocysteine methyltransferase 2 64918 NA NA
Grn granulin 14824 NA NA
Cnbp cellular nucleic acid binding protein 12785 NA NA
Set SET nuclear oncogene 56086 NA NA
Parp8 poly (ADP-ribose) polymerase family, member 8 52552 NA NA
1700021F05Rik RIKEN cDNA 1700021F05 gene 67851 NA NA
Phgdh 3-phosphoglycerate dehydrogenase 236539 NA NA
Rnf7 ring finger protein 7 19823 NA NA
Naalad2 N-acetylated alpha-linked acidic dipeptidase 2 72560 NA NA
Pa2g4 proliferation-associated 2G4 18813 NA NA
Gpd2 glycerol phosphate dehydrogenase 2, mitochondrial 14571 NA NA
Ube2c ubiquitin-conjugating enzyme E2C 68612 NA NA
Atg7 autophagy related 7 74244 This gene encodes an E1-like activating enzyme that is essential for autophagy and cytoplasmic to vacuole transport. The encoded protein is also thought to modulate p53-dependent cell cycle pathways during prolonged metabolic stress. It has been associated with multiple functions, including axon membrane trafficking, axonal homeostasis, mitophagy, adipose differentiation, and hematopoietic stem cell maintenance. Alternative splicing results in multiple transcript variants. NA
Arih2 ariadne RBR E3 ubiquitin protein ligase 2 23807 NA NA
Aprt adenine phosphoribosyl transferase 11821 NA NA
Cthrc1 collagen triple helix repeat containing 1 68588 NA NA
Vnn1 vanin 1 22361 NA NA
Cited1 Cbp/p300-interacting transactivator with Glu/Asp-rich carboxy-terminal domain 1 12705 NA NA
Fdps farnesyl diphosphate synthetase 110196 NA NA
Pmm1 phosphomannomutase 1 29858 NA NA
Gpr137b G protein-coupled receptor 137B 83924 NA NA
Ptma prothymosin alpha 19231 NA NA
Ppt2 palmitoyl-protein thioesterase 2 54397 NA NA
Ubxn1 UBX domain protein 1 225896 NA NA
Sfn stratifin 55948 NA NA
Mybpc2 myosin binding protein C, fast-type 233199 NA NA
2310040G24Rik RIKEN cDNA 2310040G24 gene 381792 NA NA
2310040G24Rik RIKEN cDNA 2310040G24 gene ENSMUSG00000101655 NA NA
Sumo2 small ubiquitin-like modifier 2 170930 NA NA
Dcaf4 DDB1 and CUL4 associated factor 4 73828 NA NA
Mlf2 myeloid leukemia factor 2 30853 NA NA
Def8 differentially expressed in FDCP 8 23854 NA NA
Tmem64 transmembrane protein 64 100201 NA NA
Hprt hypoxanthine guanine phosphoribosyl transferase 15452 The protein encoded by this gene is a transferase, which catalyzes conversion of hypoxanthine to inosine monophosphate and guanine to guanosine monophosphate via transfer of the 5-phosphoribosyl group from 5-phosphoribosyl 1-pyrophosphate. This enzyme plays a central role in the generation of purine nucleotides through the purine salvage pathway. NA
Rplp0 ribosomal protein, large, P0 11837 NA NA
Drg2 developmentally regulated GTP binding protein 2 13495 NA NA
Exosc7 exosome component 7 66446 NA NA
Parg poly (ADP-ribose) glycohydrolase 26430 NA NA
Eif3i eukaryotic translation initiation factor 3, subunit I 54709 NA NA
Aacs acetoacetyl-CoA synthetase 78894 NA NA
Pink1 PTEN induced putative kinase 1 68943 NA NA
Tmem41b transmembrane protein 41B 233724 NA NA
Rps3 ribosomal protein S3 27050 NA NA
Stag2 stromal antigen 2 20843 NA NA
Bnip3l BCL2/adenovirus E1B interacting protein 3-like 12177 NA NA
Ldb1 LIM domain binding 1 16825 NA NA
Cbr2 carbonyl reductase 2 12409 NA NA
Clcnka chloride channel, voltage-sensitive Ka 12733 This gene is a member of the CLC family of voltage-gated chloride channels. The gene is located adjacent to a highly similar chloride channel gene on chromosome 4. This gene is syntenic with human CLCNKB (geneID:1188). Multiple alternatively spliced variants, encoding the same protein, have been found for this gene. NA
Serinc1 serine incorporator 1 56442 NA NA
Azin1 antizyme inhibitor 1 54375 The protein encoded by this gene belongs to the antizyme inhibitor family, which plays a role in cell growth and proliferation by maintaining polyamine homeostasis within the cell. Antizyme inhibitors are homologs of ornithine decarboxylase (ODC, the key enzyme in polyamine biosynthesis) that have lost the ability to decarboxylase ornithine; however, retain the ability to bind to antizymes. Antizymes negatively regulate intracellular polyamine levels by binding to ODC and targeting it for degradation, as well as by inhibiting polyamine uptake. Antizyme inhibitors function as positive regulators of polyamine levels by sequestering antizymes and neutralizing their effect. This gene encodes antizyme inhibitor 1, the first member of this gene family that is ubiquitously expressed, and is localized in the nucleus and cytoplasm. Overexpression of antizyme inhibitor 1 gene has been associated with increased proliferation, cellular transformation and tumorigenesis. Gene knockout studies showed that homozygous mutant mice lacking functional antizyme inhibitor 1 gene died at birth with abnormal liver morphology. RNA editing of this gene, predominantly in the liver tissue, has been linked to the progression of hepatocellular carcinoma. Alternatively spliced transcript variants have been described for this gene. NA
Mta3 metastasis associated 3 116871 NA NA
Mrpl22 mitochondrial ribosomal protein L22 216767 NA NA
Zmpste24 zinc metallopeptidase, STE24 230709 NA NA
Mrps18b mitochondrial ribosomal protein S18B 66973 NA NA
Timm23 translocase of inner mitochondrial membrane 23 53600 NA NA
Kng1 kininogen 1 16644 NA NA
Eif5 eukaryotic translation initiation factor 5 217869 NA NA
Bcat1 branched chain aminotransferase 1, cytosolic 12035 NA NA
Dph3 diphthamine biosynthesis 3 105638 NA NA
Uap1 UDP-N-acetylglucosamine pyrophosphorylase 1 107652 NA NA
Xkr9 X-linked Kx blood group related 9 381246 NA NA
Hcrt hypocretin 15171 This gene encodes a hypothalamic neuropeptide precursor protein that gives rise to two mature neuropeptides, orexin A and orexin B, by proteolytic processing. Orexin A and orexin B, which bind to orphan G-protein coupled receptors Hcrtr1 and Hcrtr2, function in the regulation of sleep and arousal. This neuropeptide arrangement may also play a role in feeding behavior, metabolism, and homeostasis. NA
Agpat9 1-acylglycerol-3-phosphate O-acyltransferase 9 231510 NA NA
Zfp932 zinc finger protein 932 69504 NA NA
Vpreb3 pre-B lymphocyte gene 3 22364 NA NA
Nudt9 nudix (nucleoside diphosphate linked moiety X)-type motif 9 74167 NA NA
Pex16 peroxisomal biogenesis factor 16 18633 NA NA
Ppfibp2 PTPRF interacting protein, binding protein 2 (liprin beta 2) 19024 NA NA
Allc allantoicase 94041 NA NA
Rpl13a ribosomal protein L13A 22121 NA NA
Reep1 receptor accessory protein 1 52250 NA NA
Timm17a translocase of inner mitochondrial membrane 17a 21854 NA NA
Actn2 actinin alpha 2 11472 NA NA
Psmb10 proteasome (prosome, macropain) subunit, beta type 10 19171 NA NA
Ptges3 prostaglandin E synthase 3 (cytosolic) 56351 NA NA
Ctsa cathepsin A 19025 This gene encodes a glycoprotein with deamidase, esterase and carboxypeptidase activities. The encoded protein associates with and provides a protective function to the lysosomal enzymes beta-galactosidase and neuraminidase. Deficiency of the related gene in humans results in galactosialidosis. The proprotein is processed into two shorter chains. Alternatively spliced transcript variants encoding multiple isoforms have been observed for this gene. NA
Hk1 hexokinase 1 15275 NA NA
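The table above contains one symbol that was not found (Nt5c3l) and one symbol that returned two hits (2310040G24Rik). A minimal sketch of how such rows might be separated out is given below. It runs on a toy data frame whose values are copied from the table; the names `out_df`, `missing_symbols` and `resolved` are illustrative and do not appear in the original script.

```r
# Toy data frame mirroring some columns of the query result printed above.
out_df <- data.frame(
  query    = c("Timd2", "Nt5c3l", "2310040G24Rik", "2310040G24Rik"),
  name     = c("T cell immunoglobulin and mucin domain containing 2", NA,
               "RIKEN cDNA 2310040G24 gene", "RIKEN cDNA 2310040G24 gene"),
  X_id     = c("171284", NA, "381792", "ENSMUSG00000101655"),
  notfound = c(NA, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

# Symbols the service could not resolve.
missing_symbols <- out_df$query[!is.na(out_df$notfound) & out_df$notfound]

# Keep resolved rows, then only the first hit per queried symbol.
resolved <- out_df[is.na(out_df$notfound), ]
resolved <- resolved[!duplicated(resolved$query), ]

missing_symbols
resolved$query
```

Keeping the first hit per symbol is an arbitrary choice made for this sketch. Unresolved symbols may be outdated names; retrying them with a broader scope such as gene aliases is one option, though that is untested here.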

Cluster 2 annotations

out <- mygene::queryMany(gene_list[2, ], scopes = "symbol", fields = c("name", "summary"), species = "mouse")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
name query X_id summary notfound
uridine phosphorylase 1 Upp1 22271 NA NA
teratocarcinoma-derived growth factor 1 Tdgf1 21667 NA NA
aquaporin 8 Aqp8 11833 NA NA
fatty acid binding protein 5, epidermal Fabp5 16592 The protein encoded by this gene is part of the fatty acid binding protein family (FABP). FABPs are a family of small, highly conserved, cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands and participate in fatty acid uptake, transport, and metabolism. In humans this gene has been associated with psoriasis and type 2 diabetes. In mouse deficiency of this gene in combination with a deficiency in Fabp4 confers protection against atherosclerosis, diet-induced obesity, insulin resistance and experimental autoimmune encephalomyelitis (the mouse model for multiple sclerosis). Alternative splicing results in multiple transcript variants that encode different protein isoforms. The mouse genome contains many pseudogenes similar to this locus. NA
tyrosine aminotransferase Tat 234724 This gene encodes a liver-specific mitochondrial enzyme that catalyzes the conversion of L-tyrosine into p-hydroxyphenylpyruvate. Regulated by glucocorticoid and polypeptide hormones, this gene’s expression is affected by deletion of a regulatory region near the albino locus on chromosome 7. Mutations in this gene cause tyrosinemia type II in humans. NA
platelet derived growth factor receptor, alpha polypeptide Pdgfra 18595 NA NA
peptide YY Pyy 217212 NA NA
peroxiredoxin 1 Prdx1 18477 NA NA
collagen, type IV, alpha 1 Col4a1 12826 This gene encodes the alpha-1 subunit of the type IV collagens, an essential component of basement membranes. The encoded protein forms a triple helical heterotrimer comprised of two alpha-1 and one alpha-2 subunits that assembles into a type IV collagen network. This gene is located adjacent to the gene encoding alpha-2 subunit. Mice lacking both the alpha-1 and alpha-2 subunits of collagen IV die in utero due to structural deficiencies in the basement membranes and certain mutations in this gene cause perinatal cerebral hemorrhage and porencephaly. Alternative splicing of this gene results in multiple transcript variants. NA
secreted phosphoprotein 1 Spp1 20750 NA NA
Rho family GTPase 3 Rnd3 74194 NA NA
hepatic nuclear factor 4, alpha Hnf4a 15378 The protein encoded by this gene is a transcription factor involved in the development of the pancreas, liver, kidney, and intestines. The encoded protein also functions to maintain glucose homeostasis. Several transcript variants encoding different isoforms have been found for this gene. NA
serine (or cysteine) peptidase inhibitor, clade H, member 1 Serpinh1 12406 NA NA
glutaredoxin Glrx 93692 NA NA
NA 4930583H14Rik NA NA TRUE
solute carrier family 13 (sodium-dependent citrate transporter), member 5 Slc13a5 237831 NA NA
A kinase (PRKA) anchor protein (gravin) 12 Akap12 83397 NA NA
fructose bisphosphatase 2 Fbp2 14120 NA NA
fibronectin 1 Fn1 14268 NA NA
procollagen lysine, 2-oxoglutarate 5-dioxygenase 2 Plod2 26432 NA NA
ring finger protein 130 Rnf130 59044 NA NA
secreted acidic cysteine rich glycoprotein Sparc 20692 NA NA
epoxide hydrolase 2, cytoplasmic Ephx2 13850 NA NA
aconitase 1 Aco1 11428 This gene encodes a member of the aconitase/IPM isomerase protein family. The encoded protein has been identified as a moonlighting protein based on its ability to perform mechanistically distinct functions. Depending on iron levels in the cytosol, the encoded protein can function as either an aconitase enzyme or as an mRNA binding protein. When cellular iron levels are high, the encoded protein functions as an aconitase, an essential enzyme in the TCA cycle that catalyzes the conversion of citrate to isocitrate. When cellular iron levels are low, the encoded protein regulates iron uptake and utilization by binding to iron-responsive elements in the untranslated regions of mRNAs for genes involved in iron metabolism. Disruption of this gene is associated with pulmonary hypertension and polycythemia. NA
tet methylcytosine dioxygenase 1 Tet1 52463 NA NA
myristoylated alanine rich protein kinase C substrate Marcks 17118 NA NA
transformation related protein 53 Trp53 22059 This gene encodes tumor protein p53, which responds to diverse cellular stresses to regulate target genes that induce cell cycle arrest, apoptosis, senescence, DNA repair, or changes in metabolism. p53 protein is expressed at low level in normal cells and at a high level in a variety of transformed cell lines, where it’s believed to contribute to transformation and malignancy. p53 is a DNA-binding protein containing transcription activation, DNA-binding, and oligomerization domains. It is postulated to bind to a p53-binding site and activate expression of downstream genes that inhibit growth and/or invasion, and thus function as a tumor suppressor. Mice deficient for this gene are developmentally normal but are susceptible to spontaneous tumors. Evidence to date shows that this gene contains one promoter, in contrast to alternative promoters of the human gene, and transcribes a few of splice variants which encode different isoforms, although the biological validity or the full-length nature of some variants has not been determined. NA
gap junction protein, alpha 1 Gja1 14609 NA NA
solute carrier family 24, member 5 Slc24a5 317750 NA NA
SH3 domain binding glutamic acid-rich protein-like 3 Sh3bgrl3 73723 NA NA
asparagine synthetase Asns 27053 NA NA
RIKEN cDNA E130012A19 gene E130012A19Rik 103551 NA NA
T-box 15 Tbx15 21384 NA NA
ubiquitin-conjugating enzyme E2L 6 Ube2l6 56791 NA NA
mesoderm development candidate 2 Mesdc2 67943 NA NA
family with sequence similarity 25, member C Fam25c 69134 NA NA
glutamate dehydrogenase 1 Glud1 14661 NA NA
collagen, type IV, alpha 2 Col4a2 12827 This gene encodes the alpha-2 subunit of the type IV collagens, an essential component of basement membranes. The encoded protein forms a triple helical heterotrimer comprised of alpha-1 and alpha-2 subunits that assembles into a type IV collagen network. Canstatin, a peptide derived fom the C-terminus of the collagen chain, is a matrikine that has been shown to inhibit angiogenesis. Homozygous knockout mice for this gene exhibit impaired basement membrane integrity and embryonic lethality. This gene shares a bi-directional promoter with a related gene on chromosome 8. NA
junction adhesion molecule 2 Jam2 67374 NA NA
protein O-glucosyltransferase 1 Poglut1 224143 This gene encodes a protein that can catalyze transfer of either UDP-glucose or UDP-xylose to epidermal growth factor (EGF) repeats, such as those found in Notch. Loss of this gene product results in embryonic lethality. Embryos have neural plate defects, heart defects, and truncations of their posterior axis. Alternate splicing results in multiple transcript variants. NA
expressed sequence C77370 C77370 245555 NA NA
phosphoenolpyruvate carboxykinase 2 (mitochondrial) Pck2 74551 NA NA
bone morphogenetic protein 4 Bmp4 12159 This gene encodes a member of the transforming growth factor beta superfamily of regulatory proteins that plays an important role in the process of bone and cartilage development. The encoded preproprotein undergoes proteolytic processing to generate a disulfide linked homodimeric glycoprotein. Mice lacking the encoded protein die in utero. Transgenic mice overexpressing the encoded protein in a neuron-specific manner exhibit a phenotype resembling the rare hereditary connective tissue disease, fibrodysplasia ossificans progressiva. Alternative splicing results in multiple transcript variants. A pseudogene of this gene has been defined on the X chromosome. NA
Meis homeobox 1 Meis1 17268 NA NA
poly (ADP-ribose) polymerase family, member 1 Parp1 11545 NA NA
phosphoglucomutase 1 Pgm1 66681 NA NA
isocitrate dehydrogenase 1 (NADP+), soluble Idh1 15926 NA NA
gap junction protein, beta 5 Gjb5 14622 NA NA
S-adenosylhomocysteine hydrolase Ahcy 269378 NA NA
milk fat globule-EGF factor 8 protein Mfge8 17304 NA NA
carcinoembryonic antigen-related cell adhesion molecule 10 Ceacam10 26366 NA NA
zinc finger CCCH type, antiviral 1 Zc3hav1 78781 NA NA
gap junction protein, beta 4 Gjb4 14621 NA NA
spermidine/spermine N1-acetyl transferase 1 Sat1 20229 NA NA
podoplanin Pdpn 14726 NA NA
angiotensin II, type I receptor-associated protein Agtrap 11610 NA NA
NA Pkm2 NA NA TRUE
nuclear factor of activated T cells, cytoplasmic, calcineurin dependent 4 Nfatc4 73181 NA NA
ets variant 5 Etv5 104156 NA NA
interferon induced transmembrane protein 2 Ifitm2 80876 NA NA
dipeptidylpeptidase 4 Dpp4 13482 NA NA
fibroblast growth factor 10 Fgf10 14165 NA NA
SRY (sex determining region Y)-box 2 Sox2 20674 This intronless gene encodes a member of the SRY-related HMG-box (SOX) family of transcription factors involved in the regulation of embryonic development and in the determination of cell fate. The product of this gene is required for stem-cell maintenance in the central nervous system, and also regulates gene expression in the stomach. Mutations in a similar gene in human have been associated with optic nerve hypoplasia and with syndromic microphthalmia, a severe form of structural eye malformation. This gene lies within an intron of another gene called SOX2 overlapping transcript (Sox2ot). NA
peroxiredoxin 6 Prdx6 11758 This gene encodes a member of the peroxiredoxin family of peroxidases. The encoded protein is a bifunctional enzyme that has glutathione peroxidase and phospholipase activities. This protein is an antioxidant that reduces peroxidized membrane phospholipids and plays an important role in phospholipid homeostasis based on its ability to generate lysophospholipid substrate for the remodeling pathway of phospholipid synthesis. Mice lacking this gene are sensitive to oxidant stress, have altered lung phospholipid metabolism and susceptible to skin tumorigenesis. Alternate splicing of this gene results in multiple transcript variants. A pseudogene of this gene is found on chromosome 4. NA
adenylate kinase 4 Ak4 11639 NA NA
F-box protein 3 Fbxo3 57443 NA NA
v-ral simian leukemia viral oncogene B Ralb 64143 NA NA
proteasome (prosome, macropain) subunit, beta type 9 (large multifunctional peptidase 2) Psmb9 16912 NA NA
lipase, member H Liph 239759 NA NA
dehydrogenase/reductase (SDR family) member 4 Dhrs4 28200 NA NA
mutS homolog 2 Msh2 17685 NA NA
creatine kinase, brain Ckb 12709 NA NA
protein S (alpha) Pros1 19128 This gene encodes a vitamin K-dependent protein with key roles in multiple biological processes including coagulation, apoptosis and vasculogenesis. The encoded protein undergoes proteolytic processing to generate a mature protein which is secreted into the plasma. Mice lacking the encoded protein die in utero from a fulminant coagulopathy and associated hemorrhages. NA
protease, serine 35 Prss35 244954 NA NA
growth differentiation factor 3 Gdf3 14562 NA NA
thioredoxin domain containing 12 (endoplasmic reticulum) Txndc12 66073 NA NA
zinc finger protein 36, C3H type-like 1 Zfp36l1 12192 NA NA
nuclear casein kinase and cyclin-dependent kinase substrate 1 Nucks1 98415 NA NA
stearoyl-Coenzyme A desaturase 1 Scd1 20249 NA NA
insulin-like growth factor 2 mRNA binding protein 1 Igf2bp1 140486 NA NA
RNA polymerase II associated protein 1 Rpap1 68925 NA NA
TNF receptor-associated protein 1 Trap1 68015 NA NA
polymerase (DNA-directed), delta interacting protein 2 Poldip2 67811 NA NA
minichromosome maintenance complex component 5 Mcm5 17218 The protein encoded by this gene is a member of the oligameric minichromosome maintenance protein complex. During DNA replication, the complex loads onto chromatin in early G1 and is converted into an active replicative helicase during S phase. It functions to limit DNA synthesis to once per cell cycle. During embryogenesis, the encoded protein is negatively regulated through expression of paired box protein Pax 3. Alternative splicing results in multiple transcript variants. NA
protein kinase C, iota Prkci 18759 NA NA
chaperonin containing Tcp1, subunit 6a (zeta) Cct6a 12466 NA NA
chaperonin containing Tcp1, subunit 6a (zeta) Cct6a ENSMUSG00000029447 NA NA
tubulin, beta 5 class I Tubb5 22154 NA NA
LIM domain only 2 Lmo2 16909 NA NA
RAS-related C3 botulinum substrate 2 Rac2 19354 NA NA
autophagy related 13 Atg13 51897 NA NA
glycine C-acetyltransferase (2-amino-3-ketobutyrate-coenzyme A ligase) Gcat 26912 NA NA
sphingomyelin phosphodiesterase, acid-like 3B Smpdl3b 100340 NA NA
origin recognition complex, subunit 3 Orc3 50793 NA NA
Eph receptor A4 Epha4 13838 NA NA
glutathione S-transferase, mu 7 Gstm7 68312 NA NA
cyclin-dependent kinase 4 Cdk4 12567 NA NA
eukaryotic translation initiation factor 4E binding protein 1 Eif4ebp1 13685 NA NA
crystallin, gamma D Crygd 12967 NA NA
mitochondrial carrier 2 Mtch2 56428 This gene encodes a member of the SLC25 family of nuclear-encoded transporters that are localized in the inner mitochondrial membrane. Members of this superfamily are involved in many metabolic pathways and cell functions. Genome-wide association studies in human have identified single-nucleotide polymorphisms in several loci associated with obesity. This gene is one such locus, which is highly expressed in white adipose tissue and adipocytes, and thought to play a regulatory role in adipocyte differentiation and biology. Alternatively spliced transcript variants encoding different isoforms have been found for this gene. A recent study showed this gene to be an authentic stop codon readthrough target, and that its mRNA can give rise to an additional C-terminally extended isoform by use of an alternative in-frame translation termination codon. NA
NFU1 iron-sulfur cluster scaffold Nfu1 56748 NA NA
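The same query is issued once per cluster, so the repeated calls could be wrapped in a loop. The sketch below is one possible arrangement, not the script's own code: `annotate_cluster()` and `stub_query()` are hypothetical helpers, and the query function is passed in as an argument so that the example runs without network access.

```r
# annotate_cluster() is a hypothetical helper, not part of mygene.
# query_fun is injected so the sketch runs offline; in the real script it
# would be mygene::queryMany.
annotate_cluster <- function(genes, query_fun) {
  as.data.frame(query_fun(genes, scopes = "symbol",
                          fields = c("name", "summary"), species = "mouse"))
}

# Offline stub that only mimics the shape of a query result.
stub_query <- function(qterms, scopes, fields, species) {
  data.frame(query = qterms, name = NA_character_, stringsAsFactors = FALSE)
}

# Toy gene_list with one row per cluster (symbols taken from the tables above).
gene_list <- rbind(c("Timd2", "Isyna1"),
                   c("Upp1", "Tdgf1"))

# One annotation data frame per cluster, named by cluster number.
annotations <- lapply(seq_len(nrow(gene_list)),
                      function(k) annotate_cluster(gene_list[k, ], stub_query))
names(annotations) <- paste0("cluster", seq_len(nrow(gene_list)))

annotations$cluster2$query
```

With network access, passing `mygene::queryMany` in place of `stub_query` should issue the same calls as the per-cluster code in this script, and each element of `annotations` could then be printed with `kable()`.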

Cluster 3 annotations

out <- mygene::queryMany(gene_list[3, ], scopes = "symbol", fields = c("name", "summary"), species = "mouse")
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
name X_id summary query notfound
actin, beta 11461 This gene encodes a member of the actin family of proteins. Actins are highly conserved proteins that are among the most abundant proteins in eukaryotic cells and are involved in cell motility, structure, and integrity. Localization, stability, and translation of the transcribed mRNA are regulated through the binding of multiple factors to its 3’ UTR sequence. Homozygous knockout mice for this gene exhibit embryonic lethality. Numerous pseudogenes of this gene have been identified in the mouse genome. Actb NA
keratin 18 16668 NA Krt18 NA
fatty acid binding protein 3, muscle and heart 14077 NA Fabp3 NA
inhibitor of DNA binding 2 15902 NA Id2 NA
tetraspanin 8 216350 NA Tspan8 NA
GM2 ganglioside activator protein 14667 NA Gm2a NA
lectin, galactose binding, soluble 1 16852 NA Lgals1 NA
alcohol dehydrogenase 1 (class I) 11522 NA Adh1 NA
low density lipoprotein receptor-related protein 2 14725 NA Lrp2 NA
cDNA sequence BC051665 218275 NA BC051665 NA
aldo-keto reductase family 1, member B8 14187 NA Akr1b8 NA
transmembrane protease, serine 2 50528 NA Tmprss2 NA
developmental pluripotency associated 1 347708 NA Dppa1 NA
cDNA sequence BC053393 407814 NA BC053393 NA
lymphocyte cytosolic protein 1 18826 NA Lcp1 NA
disabled 2, mitogen-responsive phosphoprotein 13132 NA Dab2 NA
keratin 8 16691 NA Krt8 NA
predicted gene 4926 ENSMUSG00000086909 NA Gm4926 NA
T-cell immunoglobulin and mucin domain containing 2 pseudogene 237749 NA Gm4926 NA
prosaposin 19156 This gene encodes a multifunctional glycoprotein that plays a role in the intracellular metabolism of various sphingolipids or secreted into the plasma, milk or cerebrospinal fluid. The encoded protein undergoes proteolytic processing to generate four different polypeptides known as saposin A, B, C or D, that are required for the hydrolysis of certain sphingolipids by lysosomal hydrolases. Alternately, the encoded protein is secreted into body fluids where it exhibits neurotrophic and myelinotrophic activities. A complete lack of the encoded protein is fatal to mice either at the neonatal stage or within the first month due to severe leukodystrophy and sphingolipid accumulation. Alternative splicing results in multiple transcript variants encoding different isoforms that may undergo similar processing to generate the mature saposins. Psap NA
NA NA NA 5730469M10Rik TRUE
myosin, heavy polypeptide 10, non-muscle 77579 NA Myh10 NA
solute carrier family 2 (facilitated glucose transporter), member 3 20527 NA Slc2a3 NA
UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 5 108105 NA B3gnt5 NA
AHNAK nucleoprotein (desmoyokin) 66395 NA Ahnak NA
thymosin, beta 4, X chromosome 19241 NA Tmsb4x NA
E74-like factor 3 13710 NA Elf3 NA
annexin A2 12306 NA Anxa2 NA
transgelin 2 21346 NA Tagln2 NA
GATA binding protein 3 14462 NA Gata3 NA
solute carrier family 15 (H+/peptide transporter), member 2 57738 NA Slc15a2 NA
tropomyosin 1, alpha 22003 NA Tpm1 NA
N-myc downstream regulated gene 1 17988 NA Ndrg1 NA
epithelial cell adhesion molecule 17075 NA Epcam NA
calponin 2 12798 NA Cnn2 NA
guanine nucleotide binding protein (G protein), gamma 2 14702 NA Gng2 NA
moesin 17698 NA Msn NA
myosin, light chain 12B, regulatory 67938 NA Myl12b NA
ectonucleoside triphosphate diphosphohydrolase 1 12495 NA Entpd1 NA
claudin 4 12740 This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. The protein encoded by this gene is a high-affinity receptor for clostridium perfringens enterotoxin (CPE) produced by the bacterium Clostridium perfringens, and the interaction with CPE results in increased membrane permeability by forming small pores in plasma membrane. This protein augments alveolar epithelial barrier function and is induced in acute lung injury. It is highly expressed in pancreatic and ovarian cancers. Cldn4 NA
calmodulin 1 12313 This gene encodes a member of the EF-hand calcium-binding protein family. The encoded protein acts as a calcium sensor and is involved in relaying signals to calcium-sensitive proteins, enzymes and ion channels. The protein-calcium complex binds target proteins to regulate several cellular processes, including smooth muscle contraction, inflammation, apoptosis and the immune response. Mutations in the human gene are associated with catecholaminergic polymorphic ventricular tachycardia and long QT syndrome 14. Alternative splicing results in multiple transcript variants encoding different isoforms. Calm1 NA
destrin 56431 NA Dstn NA
GLI pathogenesis-related 1 (glioma) 73690 NA Glipr1 NA
actin related protein 2/3 complex, subunit 5 67771 NA Arpc5 NA
NA NA NA 2610019F03Rik TRUE
glutamyl aminopeptidase 13809 NA Enpep NA
cystathionine beta-synthase 12411 NA Cbs NA
S100 calcium binding protein A11 20195 NA S100a11 NA
progressive ankylosis 11732 NA Ank NA
tumor-associated calcium signal transducer 2 56753 NA Tacstd2 NA
myosin, heavy polypeptide 9, non-muscle 17886 NA Myh9 NA
IQ motif containing GTPase activating protein 1 29875 NA Iqgap1 NA
ectonucleotide pyrophosphatase/phosphodiesterase 1 18605 This gene encodes a member of the nucleoside pyrophosphatase/phosphodiesterase family of enzymes that catalyzes the hydrolysis of pyrophosphate and phosphodiester bonds in nucleotide triphosphates and oligonucleotides, respectively, to generate nucleoside 5’-monophosphates. The encoded protein is a type II transmembrane glycoprotein that negatively regulates bone mineralization. Mice harboring a nonsense mutation in this gene, termed tiptoe walking (ttw), exhibit ectopic ossification of the spinal ligaments. The encoded protein binds to the insulin receptor, inhibits downstream signaling events and induces insulin resistance and glucose tolerance. This gene is located adjacent to a paralog on chromosome 10. Alternative splicing results in multiple transcript variants encoding different isoforms. Enpp1 NA
hepatitis A virus cellular receptor 1 171283 NA Havcr1 NA
cysteine-rich protein 1 (intestinal) 12925 NA Crip1 NA
actin, gamma, cytoplasmic 1 11465 Actins are highly conserved proteins that are involved in various types of cell motility and maintenance of the cytoskeleton. In vertebrates, three main groups of actin isoforms, alpha, beta, and gamma, have been identified. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton, and as mediators of internal cell motility. Actin, gamma 1, encoded by this gene, is a cytoplasmic actin found in non-muscle cells. Alternative splicing results in multiple transcript variants. Actg1 NA
PDZK1 interacting protein 1 67182 NA Pdzk1ip1 NA
WD repeat domain 1 22388 NA Wdr1 NA
capping protein (actin filament), gelsolin-like 12332 NA Capg NA
EF hand domain containing 2 27984 NA Efhd2 NA
eomesodermin 13813 NA Eomes NA
cystatin B 13014 NA Cstb NA
crystallin, alpha B 12955 This gene encodes a member of the small heat-shock protein (HSP20) family. The encoded protein is a molecular chaperone that protects proteins against thermal denaturation and other stresses. This protein is a component of the eye lens, regulates lens differentiation and functions as a refractive element in the lens. This protein is a negative regulator of inflammation, has anti-apoptotic properties and also plays a role in the formation of muscular tissue. Mice lacking this gene exhibit worse experimental autoimmune encephalomyelitis and inflammation of the central nervous system compared to the wild type. In mouse models, this gene has a critical role in alleviating the pathology of the neurodegenerative Alexander disease. Mutations in the human gene are associated with myofibrillar myopathy 2, fatal infantile hypertonic myofibrillar myopathy, multiple types of cataract and dilated cardiomyopathy. Alternative splicing results in multiple transcript variants. Cryab NA
syndecan 4 20971 NA Sdc4 NA
thymosin, beta 10 19240 NA Tmsb10 NA
claudin 6 54419 This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. The protein encoded by this gene is essential for blastocyst formation in preimplantation mouse embryos, and is involved in and crucial for the formation and maintenance of the epidermal permeability barrier. This gene is adjacent to another family member Cldn9 on chromosome 17. Cldn6 NA
transmembrane protein 62 96957 NA Tmem62 NA
actin related protein 2/3 complex, subunit 2 76709 NA Arpc2 NA
cadherin 1 12550 This gene encodes E-cadherin, a calcium-dependent cell adhesion molecule that functions in the establishment and maintenance of epithelial cell morphology during embryogenesis and adulthood. The encoded preproprotein undergoes proteolytic processing to generate a mature protein. Targeted mutations disrupting binding of calcium to the encoded protein in mice cause death in utero due to failed blastocyst and trophectoderm formation. This gene is located adjacent to a related cadherin gene on chromosome 8. Cdh1 NA
neuraminidase 1 18010 NA Neu1 NA
claudin 7 53624 This gene encodes a member of the claudin family. Claudins are integral membrane proteins and components of tight junction strands. Tight junction strands serve as a physical barrier to prevent solutes and water from passing freely through the paracellular space between epithelial or endothelial cell sheets, and also play critical roles in maintaining cell polarity and signal transductions. This gene is expressed constitutively in the mammary epithelium throughout development, and might be involved in vesicle trafficking to the basolateral membrane. It is essential for NaCl homeostasis in distal nephrons. The knockout mice lacking this gene showed severe salt wasting, chronic dehydration, and growth retardation, and died within 12 days after birth. Alternatively spliced transcript variants have been found for this gene. Cldn7 NA
solute carrier family 20, member 1 20515 NA Slc20a1 NA
prostaglandin E synthase 64292 NA Ptges NA
chloride intracellular channel 4 (mitochondrial) 29876 NA Clic4 NA
CNDP dipeptidase 2 (metallopeptidase M20 family) 66054 NA Cndp2 NA
actinin alpha 4 60595 NA Actn4 NA
glutathione S-transferase, alpha 4 14860 NA Gsta4 NA
FXYD domain-containing ion transport regulator 4 108017 This gene encodes a member of a family of small membrane proteins that share a 35-amino acid signature sequence domain, beginning with the sequence PFXYD and containing 7 invariant and 6 highly conserved amino acids. The approved human gene nomenclature for the family is FXYD-domain containing ion transport regulator. Mouse FXYD5 has been termed RIC (Related to Ion Channel). FXYD2, also known as the gamma subunit of the Na,K-ATPase, regulates the properties of that enzyme. FXYD1 (phospholemman), FXYD2 (gamma), FXYD3 (MAT-8), FXYD4 (CHIF), and FXYD5 (RIC) have been shown to induce channel activity in experimental expression systems. Transmembrane topology has been established for two family members (FXYD1 and FXYD2), with the N-terminus extracellular and the C-terminus on the cytoplasmic side of the membrane. Fxyd4 NA
tumor rejection antigen P1A 22037 NA Trap1a NA
tropomyosin 4 326618 NA Tpm4 NA
CAP, adenylate cyclase-associated protein 1 (yeast) 12331 The product of this gene plays a role in regulating actin dynamics by binding actin monomers and promoting the turnover of actin filaments. Reduced expression of this gene causes a reduction in actin filament turnover rates, causing multiple defects, including an increase in cell size, stress-fiber alterations, and defects in endocytosis and cell motility. A pseudogene of this gene is found on chromosome 14. Alternative splicing results in multiple transcript variants, but does not affect the protein. Cap1 NA
N-acetyl galactosaminidase, alpha 17939 NA Naga NA
family with sequence similarity 96, member A 68250 NA Fam96a NA
solute carrier family 25 (mitochondrial carrier, adenine nucleotide translocator), member 4 11739 NA Slc25a4 NA
solute carrier family 44, member 4 70129 NA Slc44a4 NA
acyl-CoA thioesterase 2 171210 NA Acot2 NA
formin-like 2 71409 NA Fmnl2 NA
isocitrate dehydrogenase 3 (NAD+) alpha 67834 NA Idh3a NA
tight junction protein 2 21873 NA Tjp2 NA
dimethylarginine dimethylaminohydrolase 1 69219 NA Ddah1 NA
UDP-glucose pyrophosphorylase 2 216558 NA Ugp2 NA
prostaglandin reductase 1 67103 NA Ptgr1 NA
solute carrier family 6 (neurotransmitter transporter), member 14 56774 NA Slc6a14 NA
vestigial like 3 (Drosophila) 73569 NA Vgll3 NA
NA NA NA 2810405K02Rik TRUE
transmembrane protein 45b 235135 NA Tmem45b NA
predicted gene 12169 210535 NA Gm12169 NA
F11 receptor 16456 NA F11r NA
tubulointerstitial nephritis antigen-like 1 94242 NA Tinagl1 NA
coiled-coil domain containing 42 276920 NA Ccdc42 NA
protease, serine 8 (prostasin) 76560 NA Prss8 NA

Cluster 4 annotations

out <- mygene::queryMany(gene_list[4,],  scopes="symbol", fields=c("name", "summary"), species="mouse");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
name query X_id notfound summary
reticulon 2 (Z-band associated protein) Rtn2 20167 NA NA
EBNA1 binding protein 2 Ebna1bp2 69072 NA NA
NA Zfp259 NA TRUE NA
nuclear autoantigenic sperm protein (histone-binding) Nasp 50927 NA NA
centromere protein E Cenpe 229841 NA NA
ring finger protein 216 Rnf216 108086 NA NA
cathepsin L Ctsl 13039 NA This gene encodes a member of the peptidase C1 (papain) family of cysteine proteases. The encoded preproprotein is proteolytically processed to generate multiple protein products. These products include the activation peptide and the cathepsin L1 heavy and light chains. The mature enzyme appears to be important in embryonic development through its processing of histone H3 and may play a role in disease progression in a model of kidney disease. Homozygous knockout mice for this gene exhibit hair loss, skin thickening, bone and heart defects, and enhanced susceptibility to bacterial infection. A pseudogene of this gene has been identified in the genome.
torsin family 1, member B Tor1b 30934 NA NA
ankyrin repeat domain 10 Ankrd10 102334 NA NA
lysosomal-associated membrane protein 2 Lamp2 16784 NA NA
NA 2410001C21Rik NA TRUE NA
DEAD (Asp-Glu-Ala-Asp) box polypeptide 24 Ddx24 27225 NA NA
inositol 1,4,5-trisphosphate 3-kinase C Itpkc 233011 NA NA
RAN binding protein 2 Ranbp2 19386 NA NA
exportin 1 Xpo1 103573 NA NA
polymerase (RNA) III (DNA directed) polypeptide K Polr3k 67005 NA NA
platelet-activating factor acetylhydrolase, isoform 1b, subunit 2 Pafah1b2 18475 NA NA
kelch-like 21 Klhl21 242785 NA NA
yippee-like 2 (Drosophila) Ypel2 77864 NA NA
WD repeat domain 43 Wdr43 72515 NA NA
DPH2 homolog Dph2 67728 NA NA
son of sevenless homolog 1 (Drosophila) Sos1 20662 NA NA
NA 2810432L12Rik NA TRUE NA
ubiquitin protein ligase E3 component n-recognin 5 Ubr5 70790 NA NA
endothelin converting enzyme 2 Ece2 107522 NA NA
peroxisome proliferative activated receptor, gamma, coactivator-related 1 Pprc1 226169 NA NA
choline kinase alpha Chka 12660 NA NA
WD repeat domain 36 Wdr36 225348 NA NA
oral cancer overexpressed 1 Oraov1 72284 NA NA
arrestin domain containing 1 Arrdc1 215705 NA NA
kinesin family member 20B Kif20b 240641 NA NA
ATPase, Na+/K+ transporting, beta 3 polypeptide Atp1b3 11933 NA NA
phosphatidylserine synthase 2 Ptdss2 27388 NA NA
eukaryotic translation initiation factor 5B Eif5b 226982 NA NA
zinc finger protein 644 Zfp644 52397 NA NA
protein kinase C, delta Prkcd 18753 NA NA
eukaryotic translation initiation factor 3, subunit C Eif3c 56347 NA NA
MAD2 mitotic arrest deficient-like 2 Mad2l2 71890 NA NA
snurportin 1 Snupn 66069 NA NA
nuclear transcription factor-Y beta Nfyb 18045 NA NA
RIKEN cDNA 2010320M18 gene 2010320M18Rik 72093 NA NA
RIKEN cDNA 2010320M18 gene 2010320M18Rik ENSMUSG00000100691 NA NA
glutamine fructose-6-phosphate transaminase 1 Gfpt1 14583 NA NA
NA 2610024G14Rik NA TRUE NA
SLU7 splicing factor homolog (S. cerevisiae) Slu7 193116 NA Pre-mRNA splicing occurs in two sequential transesterification steps. The protein encoded by this gene is a splicing factor that has been found to be essential during the second catalytic step in the pre-mRNA splicing process. It associates with the spliceosome and contains a zinc knuckle motif that is found in other splicing factors and is involved in protein-nucleic acid and protein-protein interactions. Alternatively spliced transcript variants have been found for this gene.
Rac GTPase-activating protein 1 Racgap1 26934 NA NA
Crn, crooked neck-like 1 (Drosophila) Crnkl1 66877 NA NA
missing oocyte, meiosis regulator, homolog (Drosophila) Mios 252875 NA NA
solute carrier family 9 (sodium/hydrogen exchanger), member 9 Slc9a9 331004 NA NA
TNFAIP3 interacting protein 1 Tnip1 57783 NA NA
GINS complex subunit 3 (Psf3 homolog) Gins3 78833 NA NA
meningioma expressed antigen 5 (hyaluronidase) Mgea5 76055 NA NA
NIN1/RPN12 binding protein 1 homolog Nob1 67619 NA NA
transmembrane protein 229B Tmem229b 268567 NA NA
6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3 Pfkfb3 170768 NA NA
WD repeat domain 81 Wdr81 192652 NA NA
NA B230312A22Rik NA TRUE NA
ESF1 nucleolar pre-rRNA processing protein homolog Esf1 66580 NA NA
solute carrier family 25, member 36 Slc25a36 192287 NA NA
NA 2610027L16Rik NA TRUE NA
autophagy related 3 Atg3 67841 NA NA
gem (nuclear organelle) associated protein 5 Gemin5 216766 NA NA
BCL2-like 11 (apoptosis facilitator) Bcl2l11 12125 NA NA
NA 2310047M10Rik NA TRUE NA
RIKEN cDNA 4930432K21 gene 4930432K21Rik 74666 NA NA
tuftelin interacting protein 11 Tfip11 54723 NA NA
zinc finger and BTB domain containing 10 Zbtb10 229055 NA NA
BCL2-associated transcription factor 1 Bclaf1 72567 NA NA
deoxynucleotidyltransferase, terminal, interacting protein 2 Dnttip2 99480 NA NA
polymerase (RNA) I polypeptide E Polr1e 64424 NA NA
SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 1 Smarcc1 20588 NA NA
arginyl-tRNA synthetase Rars 104458 NA NA
ataxin 7 Atxn7 246103 NA NA
ubiquitin specific peptidase 33 Usp33 170822 NA NA
NA 2310008H09Rik NA TRUE NA
TBC1 domain family, member 9 Tbc1d9 71310 NA NA
catenin, beta like 1 Ctnnbl1 66642 NA NA
zinc finger and BTB domain containing 17 Zbtb17 22642 NA NA
nuclear RNA export factor 1 Nxf1 53319 NA NA
zinc finger, DHHC domain containing 7 Zdhhc7 102193 NA NA
URB2 ribosome biogenesis 2 homolog (S. cerevisiae) Urb2 382038 NA NA
lysine (K)-specific demethylase 5A Kdm5a 214899 NA NA
asparaginase Aspg 104816 NA NA
cathepsin C Ctsc 13032 NA This gene encodes a member of the peptidase C1 (papain) family of cysteine proteases. Alternative splicing results in multiple transcript variants, at least one of which encodes a preproprotein that is proteolytically processed to generate multiple protein products. These products include the dipeptidyl peptidase 1 light, heavy, and exclusion domain chains, which together comprise one subunit of the homotetrameric enzyme. This enzyme has amino dipeptidase activity and may play a role in the activation of granzymes during inflammation. Homozygous knockout mice for this gene exhibit impaired granzyme activation and enhanced survival in a sepsis model.
neurofibromatosis 2 Nf2 18016 NA NA
polymerase (RNA) mitochondrial (DNA directed) Polrmt 216151 NA NA
NA 4632411B12Rik NA TRUE NA
small nuclear RNA activating complex, polypeptide 4 Snapc4 227644 NA NA
DEAD (Asp-Glu-Ala-Asp) box polypeptide 10 Ddx10 77591 NA NA
HEAT repeat containing 1 Heatr1 217995 NA NA
NPC1-like 1 Npc1l1 237636 NA NA
HAUS augmin-like complex, subunit 6 Haus6 230376 NA NA
lectin, mannose-binding 1 like Lman1l 235416 NA NA
THAP domain containing 4 Thap4 67026 NA NA
tektin 2 Tekt2 24084 NA NA
nuclear receptor subfamily 1, group H, member 3 Nr1h3 22259 NA NA
eukaryotic translation initiation factor 4E Eif4e 13684 NA This gene encodes a component of the eukaryotic translation initiation factor 4F complex, which recognizes the 7-methylguanosine cap structure at the 5’ end of messenger RNAs. The encoded protein aids in translation initiation by recruiting ribosomes to the 5’-cap structure. Association of this protein with the 4F complex is the rate-limiting step in translation initiation. This gene acts as a proto-oncogene, and its expression and activation is associated with transformation and tumorigenesis. It has also been associated with autism spectrum disorders. Consistently, knockout of this gene results in increased translation of neuroligins, postsynaptic proteins linked to autism spectrum disorders. Pseudogenes of this gene are found on other chromosomes. Alternative splicing results in multiple transcript variants.
Spi-C transcription factor (Spi-1/PU.1 related) Spic 20728 NA NA
natural killer tumor recognition sequence Nktr 18087 NA NA
TRAF-interacting protein Traip 22036 NA NA
NA A930001N09Rik NA TRUE NA
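
Several symbols in this cluster come back with notfound set to TRUE. The following is a minimal sketch of how such rows could be separated from the resolved ones for manual follow-up. It assumes the query result has been converted to a data frame with the `query`, `X_id` and `notfound` columns shown above; `annot` is a hypothetical toy stand-in built from four rows of the table, not the actual query result.

```r
# Toy stand-in for as.data.frame(out); values copied from the Cluster 4 table.
annot <- data.frame(
  query    = c("Rtn2", "Zfp259", "Nasp", "2410001C21Rik"),
  X_id     = c("20167", NA, "50927", NA),
  notfound = c(NA, TRUE, NA, TRUE),
  stringsAsFactors = FALSE
)

# notfound is TRUE for unresolved symbols and NA otherwise, so test with
# %in% rather than == to avoid NA values in the logical index.
is_unresolved <- annot$notfound %in% TRUE

unresolved <- annot$query[is_unresolved]
resolved   <- annot[!is_unresolved, c("query", "X_id")]

print(unresolved)
print(nrow(resolved))
```

The unresolved symbols could then be looked up by hand or re-queried under a different scope; which scope would work for them is not something this sketch tries to decide.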

Cluster 5 annotations

out <- mygene::queryMany(gene_list[5,],  scopes="symbol", fields=c("name", "summary"), species="mouse");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
notfound query X_id name summary
TRUE LOC100502936 NA NA NA
NA Bcl2l10 12049 Bcl2-like 10 NA
NA Tcl1 21432 T cell lymphoma breakpoint 1 NA
NA E330034G19Rik 105418 RIKEN cDNA E330034G19 gene NA
NA Oas1d 100535 2’-5’ oligoadenylate synthetase 1D NA
NA AU022751 102991 expressed sequence AU022751 NA
NA Spin1 20729 spindlin 1 NA
NA Khdc1b 98582 KH domain containing 1B NA
NA D6Ertd527e 52372 DNA segment, Chr 6, ERATO Doi 527, expressed NA
NA Btg4 56057 B cell translocation gene 4 NA
NA Mphosph6 68533 M phase phosphoprotein 6 NA
NA Eif4e1b 218268 eukaryotic translation initiation factor 4E family member 1B NA
NA Fbxw24 382106 F-box and WD-40 domain protein 24 NA
NA Accsl 381411 1-aminocyclopropane-1-carboxylate synthase (non-functional)-like NA
NA Oog3 100012 oogenesin 3 NA
NA Rfpl4 192658 ret finger protein-like 4 NA
NA C86187 97402 expressed sequence C86187 NA
NA Oosp1 170834 oocyte secreted protein 1 NA
NA Pkd2l2 53871 polycystic kidney disease 2-like 2 NA
NA Dazl 13164 deleted in azoospermia-like This gene encodes a member of the deleted in azoospermia-like (DAZL) protein family. Members of this family contain an RNA recognition motif, interact with poly A binding proteins, and may be involved in the initiation of translation. The encoded protein is expressed in the cytoplasm of pluripotent stem cells, and in both male and female germ cells, where it is essential for gametogenesis. Disruption of this gene is associated with infertility. Alternative splicing results in multiple transcript variants.
NA Zfp57 22715 zinc finger protein 57 NA
NA C87499 381590 expressed sequence C87499 NA
NA 4933427D06Rik 232217 RIKEN cDNA 4933427D06 gene NA
NA Rgs2 19735 regulator of G-protein signaling 2 NA
NA H1foo 171506 H1 histone family, member O, oocyte-specific Histones are basic nuclear proteins that are responsible for the nucleosome structure of the chromosomal fiber in eukaryotes. Nucleosomes consist of approximately 146 bp of DNA wrapped around a histone octamer composed of pairs of each of the four core histones (H2A, H2B, H3, and H4). The chromatin fiber is further compacted through the interaction of a linker histone, H1, with the DNA between the nucleosomes to form higher order chromatin structures. The protein encoded is a replication-independent histone that is a member of the histone H1 family. This gene contains introns, unlike most histone genes and the encoded protein is expressed only in oocytes.
NA Bmp15 12155 bone morphogenetic protein 15 NA
NA Zbed3 72114 zinc finger, BED type containing 3 This gene encodes a member of the zinc finger protein superfamily. This protein may regulate the Wnt/beta-catenin signaling pathway. This protein may be involved in insulin resistance and type 2 diabetes in humans. Alternative splicing results in multiple transcript variants.
NA Nlrp4f 97895 NLR family, pyrin domain containing 4F NA
NA Oas1e 231699 2’-5’ oligoadenylate synthetase 1E NA
NA Dcp1a 75901 decapping mRNA 1A NA
NA Riok1 71340 RIO kinase 1 (yeast) This gene encodes a member of the RIO family of atypical serine protein kinases. A similar protein in humans is a component of the protein arginine methyltransferase 5 complex that specifically recruits the RNA-binding protein nucleolin as a methylation substrate.
NA Ldhb 16832 lactate dehydrogenase B This gene encodes the B subunit of lactate dehydrogenase enzyme, which catalyzes the interconversion of pyruvate and lactate with concomitant interconversion of NADH and NAD+ in a post-glycolysis process. Recent studies have shown that two protein isoforms are produced from the same mRNA by use of alternative in-frame translation termination codons via a stop codon readthrough mechanism, and that these C-terminally distinct isoforms have different subcellular localization. Alternatively spliced transcript variants have also been found for this gene. Pseudogenes have been identified on chromosomes 1 and 19.
NA Dnajb4 67035 DnaJ heat shock protein family (Hsp40) member B4 NA
NA Gdf9 14566 growth differentiation factor 9 NA
NA Tgfb2 21808 transforming growth factor, beta 2 NA
NA Fbxw15 382105 F-box and WD-40 domain protein 15 NA
NA Obox1 71468 oocyte specific homeobox 1 NA
NA Tbc1d15 66687 TBC1 domain family, member 15 NA
NA Bpgm 12183 2,3-bisphosphoglycerate mutase NA
NA A430033K04Rik 243308 RIKEN cDNA A430033K04 gene NA
NA Rnf114 81018 ring finger protein 114 NA
NA Nsf 18195 N-ethylmaleimide sensitive fusion protein NA
NA Obox2 246792 oocyte specific homeobox 2 NA
TRUE Gm97 NA NA NA
NA Nlrp4a 243880 NLR family, pyrin domain containing 4A NA
NA Meis2 17536 Meis homeobox 2 NA
NA Oas1h 246729 2’-5’ oligoadenylate synthetase 1H NA
NA Anxa7 11750 annexin A7 NA
NA Oas1c 114643 2’-5’ oligoadenylate synthetase 1C NA
NA Fbxw22 382156 F-box and WD-40 domain protein 22 NA
NA Nfya 18044 nuclear transcription factor-Y alpha NA
NA Wee2 381759 WEE1 homolog 2 (S. pombe) NA
NA Fbxw28 668758 F-box and WD-40 domain protein 28 NA
NA Ccdc69 52570 coiled-coil domain containing 69 NA
NA Tet3 194388 tet methylcytosine dioxygenase 3 NA
NA Nlrp4b 210045 NLR family, pyrin domain containing 4B NA
NA Rbm18 67889 RNA binding motif protein 18 NA
NA Obox5 252829 oocyte specific homeobox 5 NA
TRUE 9230115E21Rik NA NA NA
NA Casp8 12370 caspase 8 This gene is part of a family of caspases, aspartate-specific cysteine proteases well studied for their involvement in immune and apoptosis signaling. This protein, an initiator of apoptotic cell death, is activated by death-inducing tumor necrosis family receptors and targets downstream effectors. In mouse, deficiency of this gene can cause embryonic lethality. This protein may have a role in embryogenesis. Alternative splicing results in multiple transcript variants that encode different protein isoforms.
NA Pttg1 30939 pituitary tumor-transforming gene 1 NA
NA G6pdx 14381 glucose-6-phosphate dehydrogenase X-linked NA
NA Tcl1b1 27379 T cell leukemia/lymphoma 1B, 1 NA
NA Gm4981 245263 predicted gene 4981 NA
NA Gja4 14612 gap junction protein, alpha 4 NA
NA Slbp 20492 stem-loop binding protein NA
NA Nlrp9b 243874 NLR family, pyrin domain containing 9B NA
NA Tdrd12 71981 tudor domain containing 12 NA
NA Txnip 56338 thioredoxin interacting protein NA
NA Ccndbp1 17151 cyclin D-type binding-protein 1 NA
NA Ube2d3 66105 ubiquitin-conjugating enzyme E2D 3 NA
NA Oog4 242737 oogenesin 4 NA
NA Trim61 260296 tripartite motif-containing 61 NA
NA Slc30a3 22784 solute carrier family 30 (zinc transporter), member 3 NA
NA Rnf185 193670 ring finger protein 185 NA
NA Slc45a3 212980 solute carrier family 45, member 3 NA
NA Cenpn 72155 centromere protein N NA
NA Gm13023 194227 predicted gene 13023 NA
NA Paip1 218693 polyadenylate binding protein-interacting protein 1 NA
NA Pabpc1l 381404 poly(A) binding protein, cytoplasmic 1-like NA
NA Ehf 13661 ets homologous factor NA
NA Smagp 207818 small cell adhesion glycoprotein NA
NA Eif4e3 66892 eukaryotic translation initiation factor 4E member 3 NA
NA Fbxw18 546161 F-box and WD-40 domain protein 18 NA
NA Creb3l4 78284 cAMP responsive element binding protein 3-like 4 This gene encodes a CREB (cyclic AMP-responsive element-binding) protein with a transmembrane domain which localizes it to the ER membrane. The encoded protein may play a role in adiposity and male germ cell development. Homozygous knockout mice for this gene show increased adipogenesis, elevated testicular germ cell apoptosis and defects in spermatogenesis. Alternative splicing results in multiple transcript variants.
NA Gyg 27357 glycogenin NA
NA Znfx1 98999 zinc finger, NFX1-type containing 1 NA
NA Gm13084 381569 predicted gene 13084 NA
NA Mat2b 108645 methionine adenosyltransferase II, beta NA
NA Pnpla3 116939 patatin-like phospholipase domain containing 3 NA
NA Rbbp7 245688 retinoblastoma binding protein 7 NA
NA Cnot7 18983 CCR4-NOT transcription complex, subunit 7 NA
NA Chek1 12649 checkpoint kinase 1 NA
NA Rgs17 56533 regulator of G-protein signaling 17 NA
NA 2700029M09Rik 72612 RIKEN cDNA 2700029M09 gene NA
NA Fbxw14 50757 F-box and WD-40 domain protein 14 NA
NA Nobox 18291 NOBOX oogenesis homeobox NA
NA Trim75 333307 tripartite motif-containing 75 NA
NA Fbxw20 434440 F-box and WD-40 domain protein 20 NA
NA Nlrp9a 233001 NLR family, pyrin domain containing 9A NA
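
The column order of the printed tables differs from cluster to cluster (for example, notfound comes first here but last in the Cluster 1 table). A small helper along the following lines could fix the order before calling kable. This is only a sketch: `order_annotation_columns` and the toy data frame are hypothetical names introduced for illustration, and the chosen order simply mirrors the Cluster 1 table.

```r
# Hypothetical helper: put the annotation columns in a fixed order.
order_annotation_columns <- function(df,
                                     wanted = c("query", "name", "X_id",
                                                "summary", "notfound")) {
  present <- intersect(wanted, colnames(df))  # wanted columns that exist, in wanted order
  extra   <- setdiff(colnames(df), wanted)    # any other columns are kept at the end
  df[, c(present, extra), drop = FALSE]
}

# Toy stand-in with the column order of the Cluster 5 table.
toy <- data.frame(
  notfound = c(TRUE, NA),
  query    = c("LOC100502936", "Bcl2l10"),
  X_id     = c(NA, "12049"),
  name     = c(NA, "Bcl2-like 10"),
  summary  = c(NA, NA),
  stringsAsFactors = FALSE
)

tidy <- order_annotation_columns(toy)
print(colnames(tidy))
```

With the real result, the same call would be `kable(order_annotation_columns(as.data.frame(out)))`, assuming the data frame carries the column names shown in the tables.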

Cluster 6 annotations

out <- mygene::queryMany(gene_list[6,],  scopes="symbol", fields=c("name", "summary"), species="mouse");
## Finished
## Pass returnall=TRUE to return lists of duplicate or missing query terms.
kable(as.data.frame(out))
name query X_id notfound summary
oocyte specific homeobox 3 Obox3 246791 NA NA
zinc finger protein 352 Zfp352 236537 NA NA
predicted gene 8300 Gm8300 666806 NA NA
NA Usp17l5 NA TRUE NA
expressed sequence BB287469 BB287469 544881 NA NA
ret finger protein-like 4B Rfpl4b 215919 NA NA
predicted pseudogene 2022 Gm2022 100039052 NA NA
predicted pseudogene 2022 Gm2022 ENSMUSG00000071217 NA NA
predicted gene 5662 Gm5662 435337 NA NA
predicted gene 11544 Gm11544 ENSMUSG00000081586 NA NA
predicted gene 11544 Gm11544 432591 NA NA
predicted pseudogene 4850 Gm4850 ENSMUSG00000100205 NA NA
THO complex 4 pseudogene Gm4850 226957 NA NA
predicted gene 13078 Gm13078 277666 NA NA
predicted gene 2016 Gm2016 100039042 NA NA
solute carrier family 34 (sodium phosphate), member 2 Slc34a2 20531 NA NA
poly(A) binding protein, nuclear 1 Pabpn1 54196 NA NA
expressed sequence AU015228 AU015228 99169 NA NA
expressed sequence AU015228 AU015228 ENSMUSG00000074804 NA NA
Smad nuclear interacting protein 1 pseudogene Gm4971 244061 NA NA
zinc finger protein 54 Zfp54 22712 NA NA
GC-rich promoter binding protein 1 Gpbp1 73274 NA NA
predicted gene 5039 Gm5039 ENSMUSG00000093847 NA NA
eukaryotic translation initiation factor 1A pseudogene Gm5039 266459 NA NA
NA Gm5156 NA TRUE NA
WD repeat domain 44 Wdr44 72404 NA This gene encodes a protein containing multiple WD repeats. The encoded protein may play a role in vesicle trafficking. Alternate splicing results in multiple transcript variants.
dual oxidase maturation factor 2 Duoxa2 66811 NA NA
NFX-like protein 1700019M22Rik 69423 NA NA
RIKEN cDNA 1700019M22 gene 1700019M22Rik ENSMUSG00000059695 NA NA
predicted gene 4340 Gm4340 100043292 NA NA
NA Tmem92-ps NA TRUE NA
predicted gene 11756 Gm11756 623281 NA NA
predicted gene 11756 Gm11756 ENSMUSG00000093962 NA NA
kelch-like 11 Klhl11 217194 NA NA
NA Dub1a NA TRUE NA
zinc finger and SCAN domain containing 5B Zscan5b 170734 NA NA
predicted gene 16381 Gm16381 100042786 NA NA
transducer of ERBB2, 2 Tob2 57259 NA NA
zinc finger protein 707 Zfp707 69020 NA NA
topoisomerase I binding, arginine/serine-rich Topors 106021 NA NA
ubiquitin specific peptidase 34 Usp34 17847 NA NA
BCL2-associated athanogene 5 Bag5 70369 NA NA
solute carrier family 30 (zinc transporter), member 1 Slc30a1 22782 NA NA
Yy2 transcription factor Yy2 100073351 NA NA
zinc finger and SCAN domain containing 4C Zscan4c 245109 NA NA
proteasome (prosome, macropain) activator subunit 4 Psme4 103554 NA NA
G-protein-coupled receptor 50 Gpr50 14765 NA This gene encodes a multipass membrane protein that is thought to act as a G protein-coupled receptor. Activity of this protein may be important in neurotransmitter and glucocorticoid signalling. Mutation of this gene causes a decreased ability to maintain a constant body temperature, resulting in torpor, as well as an increased metabolic rate. Alternative splicing results in multiple transcript variants.
zinc finger and SCAN domain containing 4D Zscan4d 545913 NA NA
Sfi1 homolog, spindle assembly associated (yeast) Sfi1 78887 NA NA
predicted gene 9125 Gm9125 668359 NA NA
RIKEN cDNA 1700013H16 gene 1700013H16Rik 75514 NA NA
G2/M-phase specific E3 ubiquitin ligase G2e3 217558 NA NA
purine-nucleoside phosphorylase Pnp 18950 NA NA
predicted gene 11487 Gm11487 433719 NA This gene belongs to a family of related genes tandemly arranged in two clusters on chromosome 4. This family, which appears to be mouse-specific and composed of multiple highly similar members, is supported by limited transcript data. Members of the family maintain an intact open reading frame although the encoded protein has no known function. This gene is inferred from alignment of paralogous transcripts.
NA Wapal NA TRUE NA
arylsulfatase K Arsk 77041 NA NA
Rho-associated coiled-coil containing protein kinase 1 Rock1 19877 NA NA
THAP domain containing 11 Thap11 59016 NA NA
DNA-damage-inducible transcript 4-like Ddit4l 73284 NA NA
zinc finger and SCAN domain containing 4F Zscan4f 665902 NA NA
predicted gene 428 Gm428 242502 NA This gene belongs to a family of related genes tandemly arranged in two clusters on chromosome 4. This family, which appears to be mouse-specific and composed of multiple highly similar members, is supported by limited transcript data. Members of the family maintain an intact open reading frame although the encoded protein has no known function. This gene is supported by alignment of transcripts.
LIM homeobox transcription factor 1 alpha Lmx1a 110648 NA NA
cDNA sequence BC080695 BC080695 329986 NA NA
zinc finger protein 954 Zfp954 232853 NA NA
NA 1700023I07Rik NA TRUE NA
frizzled class receptor 7 Fzd7 14369 NA NA
transducer of ErbB-2.1 Tob1 22057 NA NA
homeodomain interacting protein kinase 1 Hipk1 15257 NA NA
activating transcription factor 2 Atf2 11909 NA NA
zinc finger protein 599 Zfp599 235048 NA NA
expressed sequence AA415398 AA415398 433752 NA NA
chemokine (C-X-C motif) ligand 16 Cxcl16 66102 NA NA
RIKEN cDNA 1600025M17 gene 1600025M17Rik ENSMUSG00000085114 NA NA
RIKEN cDNA 1600025M17 gene 1600025M17Rik 72030 NA NA
anthrax toxin receptor 1 Antxr1 69538 NA NA
glutamate-ammonia ligase (glutamine synthetase) Glul 14645 NA NA
phosphodiesterase 12 Pde12 211948 NA NA
immediate early response 5 Ier5 15939 NA NA
NA H47 NA TRUE NA
protein phosphatase 1, regulatory (inhibitor) subunit 15A Ppp1r15a 17872 NA NA
ribosomal modification protein rimK-like family member B Rimklb 108653 NA NA
polo-like kinase 2 Plk2 20620 NA NA
TD and POZ domain containing 1 Tdpoz1 207213 NA NA
AT rich interactive domain 5A (MRF1-like) Arid5a 214855 NA NA
zinc finger protein 217 Zfp217 228913 NA NA
ring finger protein, LIM domain interacting Rlim 19820 NA NA
ADP-ribosylation factor-like 4D Arl4d 80981 NA NA
snail family zinc finger 1 Snai1 20613 NA NA
UTP14, U3 small nucleolar ribonucleoprotein, homolog B (yeast) Utp14b 195434 NA NA
YTH domain containing 1 Ythdc1 231386 NA NA
zinc finger protein 791 Zfp791 244556 NA NA
DBF4 zinc finger Dbf4 27214 NA NA
NA B020018G12Rik NA TRUE NA
predicted gene 12794 Gm12794 332923 NA NA
forkhead box J3 Foxj3 230700 NA NA
cyclin M2 Cnnm2 94219 NA NA
F-box protein 11 Fbxo11 225055 NA NA
kinesin family member 2A Kif2a 16563 NA NA
RNA binding motif protein 8a Rbm8a 60365 NA NA
predicted gene 12789 Gm12789 381536 NA NA
cyclin-dependent kinase 12 Cdk12 69131 NA NA
NA Dub1 NA TRUE NA
family with sequence similarity 83, member D Fam83d 71878 NA NA
leucyl-tRNA synthetase, mitochondrial Lars2 102436 NA NA
deleted in colorectal carcinoma Dcc 13176 NA NA
matrix metallopeptidase 19 Mmp19 58223 NA This gene encodes a member of the matrix metalloproteinase family of extracellular matrix-degrading enzymes that are involved in tissue remodeling, wound repair, progression of atherosclerosis and tumor invasion. The encoded preproprotein undergoes proteolytic processing to generate a mature, zinc-dependent endopeptidase enzyme. Mice lacking the encoded protein develop a diet-induced obesity due to adipocyte hypertrophy, exhibit decreased susceptibility to chemical carcinogen-induced skin tumors and early onset of tumoral angiogenesis. Alternative splicing results in multiple transcript variants encoding different isoforms.
UDP glucuronosyltransferase 2 family, polypeptide B36 Ugt2b36 231396 NA NA
protein phosphatase 1, regulatory (inhibitor) subunit 8 Ppp1r8 100336 NA NA
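
Some query symbols in this table appear twice, once with a numeric identifier and once with an ENSMUSG identifier (for example Gm2022 and Gm11544), so the table has more rows than genes queried. One possible way to keep a single row per symbol is sketched below. It assumes that preferring the numeric identifier is acceptable, which is a choice made here for illustration and not something the analysis above states; the toy data frame copies a few rows from the table. Note that the two rows for a symbol can carry different names (as for Gm4850), so the choice of which row to keep does matter.

```r
# Toy stand-in for as.data.frame(out); values copied from the Cluster 6 table.
toy <- data.frame(
  query = c("Obox3", "Gm2022", "Gm2022", "Gm11544", "Gm11544"),
  X_id  = c("246791", "100039052", "ENSMUSG00000071217",
            "ENSMUSG00000081586", "432591"),
  stringsAsFactors = FALSE
)

# Flag identifiers that consist of digits only.
is_numeric_id <- grepl("^[0-9]+$", toy$X_id)

# Move rows with numeric identifiers to the front, then keep the first
# row seen for each query symbol.
ordered       <- toy[order(!is_numeric_id), ]
one_per_query <- ordered[!duplicated(ordered$query), ]

# Restore the order in which the symbols were originally queried.
original_rank <- match(one_per_query$query, unique(toy$query))
one_per_query <- one_per_query[order(original_rank), ]

print(one_per_query)
```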