This web page was produced as an assignment for an undergraduate course at Davidson College.

Single-cell RNA-seq



Summary and opinion

Transcription is a dynamic cellular event. In diploid organisms, the presence of two alleles at heterozygous loci means that transcription can take place on either homologous chromosome to produce distinct gene products. Deng et al. (2014) used single-cell RNA-seq to determine the relative abundance of transcripts derived from maternal versus paternal chromosomes in the early mouse embryo and in differentiated tissues. By testing samples derived from a cross between two genetically defined strains, they could use SNPs to assign the chromosome of origin for the majority of transcripts.

Their results are consistent with stochastic gene expression rather than stable expression of one allele, as in imprinting. It is unclear why they use the term “monoallelic gene expression” when they admit that their data are best explained by a non-regulated stochastic process, which is the null model. I would argue that it is a misleading (and undoubtedly attention-grabbing) use of the term. Here, for the sake of consistency, I will use “monoallelic gene expression” to refer to detection of only one allele in an experiment, which does not imply biological regulation. Despite the misleading vocabulary, the paper does contribute new understanding of the regulation of X inactivation, which (technical advances aside) is the most interesting result.

Figure 1






Panel A shows that single-cell transcriptomes cluster by developmental stage. Each shape is a particular embryo, and each color is a developmental stage. The single-cell RNA-seq data were analyzed by principal component analysis, which defines axes (principal components) that, in descending order, each explain as much of the remaining variance in transcript levels as possible. Clusters generally contain cells from several embryos at the same stage, because regulated patterns of gene expression are fundamental to development.
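To make the idea of a principal component concrete, here is a minimal sketch in plain Python. It is not the authors' analysis: the toy expression values, the function name, and the use of power iteration are all assumptions chosen for illustration. It finds only the first principal component, the single axis that explains the most variance.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit vector, fraction of variance explained) for the first PC.

    rows: one list of expression values per cell (hypothetical toy data).
    """
    n = len(rows)
    d = len(rows[0])
    # Center each gene (column) on its mean across cells.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix between genes (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The eigenvalue is the variance along the axis v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Five made-up cells, two made-up genes whose levels rise together.
toy_cells = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.9]]
axis, explained = first_principal_component(toy_cells)
print(axis, explained)
```

In this toy case the two genes are almost perfectly correlated, so one axis captures nearly all of the variance. Real single-cell data have thousands of genes, and a dedicated linear algebra library would be the practical choice.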

Panel B shows that by the 4-cell stage, maternal and paternal alleles are approximately equally represented in the transcriptome. In the zygote, all RNAs originate from maternal alleles. We can infer that the paternal pronucleus has not yet fused and become transcriptionally active. The figure also includes control cells from each parental strain alone, with no cross between the two. When testing these two controls, their SNP analysis assigned >99% of transcripts to the correct parent strain of origin. It was important that they demonstrate the accuracy of their SNP method before applying it to determine unknown patterns of gene expression.
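The quantity plotted in this kind of analysis, the fraction of allele-informative reads that come from the maternal genome, can be sketched as follows. This is an illustration only: the function name and the read counts are hypothetical, and the authors' actual read-assignment pipeline is not described here.

```python
def maternal_fraction(snp_reads):
    """Fraction of SNP-informative reads assigned to the maternal allele.

    snp_reads: list of (maternal_read_count, paternal_read_count) pairs,
    one pair per SNP position. Returns None when no informative reads exist.
    """
    maternal = sum(m for m, _ in snp_reads)
    paternal = sum(p for _, p in snp_reads)
    total = maternal + paternal
    if total == 0:
        return None
    return maternal / total

# Made-up counts: a zygote-like cell (maternal reads only) and a
# 4-cell-like cell (both alleles roughly equally represented).
zygote_like = [(10, 0), (5, 0), (8, 0)]
four_cell_like = [(6, 4), (4, 6), (5, 5)]
print(maternal_fraction(zygote_like))
print(maternal_fraction(four_cell_like))
```

The same calculation applied to a purebred control cell gives a direct check on the method: nearly all reads should be assigned to the one strain that is actually present.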

Figure 2






Panel A shows that the paternal X chromosome (Xp) becomes less transcriptionally active (‘inactivated’) during early development. A significant bias toward transcription from the maternal X chromosome appears at the 16-cell stage, and the difference grows as development progresses to the early blastocyst. The parent-of-origin bias is not present for the autosomes, shown in black and gray. Consistent with this observed transcriptional bias and the established role for the Xist transcript in X inactivation, Xist transcription is high during the 16-cell and early blastocyst stages. Additionally, Xist is female-specific and transcribed only from Xp, as the maternal X chromosome (Xm) will remain active.

Xist is transcribed from the locus Xic, but panel B shows that X inactivation does not simply spread out in either direction from Xic. In fact, loci near Xic are expressed approximately equally between maternal and paternal alleles. The observation is shown by the height of roughly 0.5 (fraction maternal expression) for the lines at position 100 Mb, where the dotted lines in the plot intersect. Although X inactivation does not spread uniformly from Xic, it is clear that as development progresses from the 4-cell to 16-cell to early blastocyst stages, fewer paternal alleles are expressed overall, shown by red lines above blue and green lines.


Figure 3






The title of the paper claims an observation of monoallelic gene expression. However, low levels of transcript could be lost in the RNA-seq protocol and lead to overestimating the fraction of genes undergoing monoallelic expression. To determine the efficiency of their protocol, the authors tested the likelihood of an allele not being represented in the RNA-seq dataset based on its expression level (panel A). They split the contents from single cells between two independent replicates and compared the results. From that, they inferred that on average 17% of genes showed monoallelic expression, with more highly expressed genes less likely to show monoallelic expression (panels B, C).
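The logic of the split-cell control can be sketched in a few lines. If the two halves of one cell disagree about which alleles of a gene are present, the disagreement must come from sampling and technical loss rather than from the biology of that cell. The gene names, read counts, and function names below are hypothetical, and this is not the authors' actual procedure.

```python
def allele_call(maternal_reads, paternal_reads):
    """Classify a gene by which alleles have at least one read."""
    if maternal_reads > 0 and paternal_reads > 0:
        return "biallelic"
    if maternal_reads > 0:
        return "maternal_only"
    if paternal_reads > 0:
        return "paternal_only"
    return "not_detected"

def discordant_fraction(half_a, half_b):
    """Fraction of detected genes whose allele call differs between halves.

    half_a, half_b: dicts mapping gene name -> (maternal_reads, paternal_reads)
    for the two halves of one split cell.
    """
    shared = [g for g in half_a if g in half_b]
    # Keep genes detected in at least one of the two halves.
    detected = [g for g in shared
                if allele_call(*half_a[g]) != "not_detected"
                or allele_call(*half_b[g]) != "not_detected"]
    if not detected:
        return None
    mismatches = sum(1 for g in detected
                     if allele_call(*half_a[g]) != allele_call(*half_b[g]))
    return mismatches / len(detected)

# Made-up example: geneB looks monoallelic in each half, but for
# different alleles, which points to sampling loss.
half_a = {"geneA": (5, 5), "geneB": (3, 0), "geneC": (0, 0)}
half_b = {"geneA": (4, 6), "geneB": (0, 2), "geneC": (0, 0)}
print(discordant_fraction(half_a, half_b))
```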

Panels D and E follow monoallelic gene expression through early development. Consistent with previous figures, maternal alleles predominate in early development. However, monoallelic expression of both maternal and paternal alleles is observed at an apparently high rate using their single-cell method. The fraction of genes undergoing monoallelic expression varies widely between replicates, particularly at the 16-cell stage and later.

While it might seem that nearly 25% of genes are expressed from a single parent during early development, panel E betrays the true conclusion. In fact, only a tiny fraction of genes in the embryo experience monoallelic expression, apart from the strong bias toward maternal alleles early in development. Panel F shows that their data from the 8-cell stage closely fit a model of stochastic gene expression. Genes expressed at low levels, where some cells might not contain transcript when the analysis is performed, are more likely to show monoallelic expression. Genes expressed at high levels, where all cells are likely to contain transcript, are much more likely to have biallelic expression. Indeed, genes with biallelic expression in the 4-cell stage are on average expressed at a 2-fold higher level than genes undergoing monoallelic expression. The observations are consistent with stochastic transcription.
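A simple null model shows why low expression alone produces apparent monoallelic expression. The sketch below assumes, purely for illustration, that the number of transcripts detected from each allele is an independent Poisson draw with the same mean. This is my simplification and not necessarily the model fitted in panel F; the function name is hypothetical.

```python
import math

def apparent_monoallelic_fraction(mean_detected_per_allele):
    """Probability a detected gene shows only one allele under the null model.

    Assumption: detected transcripts per allele are independent Poisson
    draws with the given mean, so each allele is missed with probability
    q = exp(-mean).
    """
    q = math.exp(-mean_detected_per_allele)
    one_allele_only = 2 * q * (1 - q)   # exactly one allele detected
    gene_detected = 1 - q * q           # at least one allele detected
    return one_allele_only / gene_detected

for mean in (0.5, 1, 2, 5, 10):
    print(mean, round(apparent_monoallelic_fraction(mean), 4))
```

Under these assumptions the apparent monoallelic fraction falls steadily as expression rises, with no allele-specific regulation anywhere in the model. That is the qualitative pattern described above.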

Figure 4






Finally, the authors applied their method to differentiated liver cells and fibroblasts rather than to early embryos. As before, genes expressed at low levels in the liver make up the majority of genes undergoing monoallelic expression in individual cells. Simply diluting whole-liver RNA extracts replicates the phenomenon, which is consistent with inherent stochasticity in measuring small amounts of transcript. Individual fibroblast cells showed monoallelic gene expression in proportions comparable to cells from the early embryo.


Reference:

Deng Q, Ramsköld D, Reinius B, Sandberg R. 2014. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science 343:193-196.


Eric Sawyer's Home Page



© Copyright 2014 Department of Biology, Davidson College, Davidson, NC 28035