Articles Online (Volume 19, Issue 3)


Advanced Single-cell Omics Technologies and Informatics Tools for Genomics, Proteomics, and Bioinformatics Analysis

Luonan Chen, Rong Fan, Fuchou Tang

Page 343-345

Research Article

scDPN for High-throughput Single-cell CNV Detection to Uncover Clonal Evolution During HCC Recurrence

Liang Wu, Miaomiao Jiang, Yuzhou Wang, Biaofeng Zhou, Yunfan Sun, Kaiqian Zhou, Jiarui Xie, Yu Zhong, Zhikun Zhao, Michael Dean, Yong Hou, Shiping Liu

Single-cell genomics provides substantial resources for dissecting cellular heterogeneity and cancer evolution. Unfortunately, classical DNA amplification-based methods have low throughput and introduce coverage bias during sample preamplification. We developed a single-cell DNA library preparation method without preamplification in nanolitre scale (scDPN) to address these issues. The method achieved a throughput of up to 1800 cells per run for copy number variation (CNV) detection. Also, our approach demonstrated a lower level of amplification bias and noise than the multiple displacement amplification (MDA) method and showed high sensitivity and accuracy for cell line and tumor tissue evaluation. We used this approach to profile the tumor clones in paired primary and relapsed tumor samples of hepatocellular carcinoma (HCC). We identified three clonal subpopulations with a multitude of aneuploid alterations across the genome. Furthermore, we observed that a minor clone of the primary tumor containing additional alterations in chromosomes 1q, 10q, and 14q developed into the dominant clone in the recurrent tumor, indicating clonal selection during recurrence in HCC. Overall, this approach provides a comprehensive and scalable solution to understand genome heterogeneity and evolution.
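Single-cell whole-genome sequencing is an important tool for studying tumor heterogeneity and evolution. Conventional single-cell whole-genome sequencing depends heavily on DNA amplification techniques such as multiple displacement amplification (MDA), which suffer from low throughput and strong bias. To address this, we exploited the ability of transposase to build libraries directly and developed scDPN, a microwell chip-based, high-throughput, low-cost single-cell DNA library preparation method with nanoliter-scale reactions. Its throughput reaches 1800 cells per run and, combined with low-depth whole-genome sequencing, it is well suited to detecting copy number variation (CNV) at the single-cell level. Studies of cell lines and tumor tissue showed that scDPN offers good uniformity and accuracy for single-cell CNV detection and, compared with the conventional MDA method, has lower amplification bias and background noise, allowing more accurate detection of single-cell CNV events. We further used scDPN to study single-cell CNVs in primary and recurrent tumor samples from the same hepatocellular carcinoma patient. The primary tumor was highly heterogeneous and contained two clonal subtypes with markedly different CNVs, the major clone Type 1 and the minor clone Type 2, accounting for 85% and 15% of cells, respectively. Interestingly, all recurrent tumor cells derived from the minor Type 2 clone of the primary tumor, indicating that clonal subtype selection occurred during recurrence. Closer study of the two subtypes showed that, relative to Type 1, Type 2 had acquired new CNV events on chromosomes 1q, 10q, and 14q. Notably, the CNV at 10q, a heterozygous deletion that reduces the copy number of several tumor suppressor genes including PTEN and FAS, may be one reason Type 2 was preferentially selected during recurrence; prognostic validation using public data from TCGA (The Cancer Genome Atlas) indirectly supports this inference. In summary, our newly developed single-cell library preparation method scDPN provides a comprehensive and scalable solution for single-cell genomics research.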
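The abstract does not describe how CNVs are called from the low-depth sequencing data. As a minimal sketch of the general read-depth idea only, the Python below converts per-bin read counts for one cell into integer copy numbers. It assumes equal-sized genomic bins and that the median bin sits at the baseline ploidy; the function name `copy_number_profile` and the example counts are hypothetical and are not taken from scDPN.

```python
from statistics import median


def copy_number_profile(bin_counts, ploidy=2):
    """Estimate an integer copy number per genomic bin from read counts.

    Assumption of this sketch: most bins are at the baseline ploidy, so the
    median read count corresponds to `ploidy` copies.
    """
    baseline = median(bin_counts)
    if baseline <= 0:
        raise ValueError("median read count must be positive")
    # Scale each bin relative to the baseline and round to whole copies.
    return [round(ploidy * count / baseline) for count in bin_counts]


# Hypothetical read counts for seven equal-sized bins in a single cell.
counts = [100, 100, 100, 150, 100, 50, 100]
profile = copy_number_profile(counts)
print(profile)
```

In this toy input the fourth bin reads as a single-copy gain and the sixth as a single-copy loss. Real pipelines typically also correct for GC content and mappability and segment adjacent bins, which this sketch omits.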

Page 346-357

Research Article

Mapping Human Pluripotent Stem Cell-derived Erythroid Differentiation by Single-cell Transcriptome Analysis

Zijuan Xin, Wei Zhang, Shangjin Gong, Junwei Zhu, Yanming Li, Zhaojun Zhang, Xiangdong Fang

There is an imbalance between the supply and demand of functional red blood cells (RBCs) in clinical applications. This imbalance can be addressed by regenerating RBCs using several in vitro methods. Induced pluripotent stem cells (iPSCs) can handle the low supply of cord blood and the ethical issues in embryonic stem cell research, and provide a promising strategy to eliminate immune rejection. However, no complete single-cell level differentiation pathway exists for the iPSC-derived erythroid differentiation system. In this study, we used iPSC line BC1 to establish a RBC regeneration system. The 10X Genomics single-cell transcriptome platform was used to map the cell lineage and differentiation trajectory on day 14 of the regeneration system. We observed that iPSC differentiation was not synchronized during embryoid body (EB) culture. The cells (on day 14) mainly consisted of mesodermal and various blood cells, similar to the yolk sac hematopoiesis. We identified six cell classifications and characterized the regulatory transcription factor (TF) networks and cell–cell contacts underlying the system. iPSCs undergo two transformations during the differentiation trajectory, accompanied by the dynamic expression of cell adhesion molecules and estrogen-responsive genes. We identified erythroid cells at different stages, such as burst-forming unit erythroid (BFU-E) and orthochromatic erythroblast (ortho-E) cells, and found that the regulation of TFs (e.g., TFDP1 and FOXO3) is erythroid-stage specific. Immune erythroid cells were identified in our system. This study provides systematic theoretical guidance for optimizing the iPSC-derived erythroid differentiation system, and this system is a useful model for simulating in vivo hematopoietic development and differentiation.
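We established an iPSC-based red blood cell regeneration system and used 10x scRNA-seq to map, for the first time, the cell lineages and differentiation trajectory of this system on day 14. The results show that iPSC differentiation is not synchronized during embryoid body culture; cells on day 14 consist mainly of mesodermal cells and various blood cells, resembling yolk sac hematopoiesis. iPSCs undergo two transformations along the differentiation trajectory, accompanied by dynamic expression of cell adhesion molecules and estrogen-responsive genes. Immune erythroid cells were identified for the first time in an in vitro erythroid differentiation system. This study provides theoretical guidance for optimizing red blood cell regeneration systems.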

Page 358-376

Research Article

Single-cell Long Non-coding RNA Landscape of T Cells in Human Cancer Immunity

Haitao Luo, Dechao Bu, Lijuan Shao, Yang Li, Liang Sun, Ce Wang, Jing Wang, Wei Yang, Xiaofei Yang, Jun Dong, Yi Zhao, Furong Li

The development of new biomarkers or therapeutic targets for cancer immunotherapies requires a deep understanding of T cells. To date, the complete landscape and systematic characterization of long noncoding RNAs (lncRNAs) in T cells in cancer immunity are lacking. Here, by systematically analyzing full-length single-cell RNA sequencing (scRNA-seq) data of more than 20,000 libraries of T cells across three cancer types, we provided the first comprehensive catalog and the functional repertoires of lncRNAs in human T cells. Specifically, we developed a custom pipeline for de novo transcriptome assembly and obtained a novel lncRNA catalog containing 9433 genes. This increased the size of the current human lncRNA catalog by 16% and nearly doubled the number of lncRNAs expressed in T cells. We found that a portion of expressed genes in single T cells were lncRNAs which had been overlooked by the majority of previous studies. Based on metacell maps constructed by the MetaCell algorithm, which partitions scRNA-seq datasets into disjoint and homogeneous groups of cells (metacells), 154 signature lncRNA genes were identified. They were associated with effector, exhausted, and regulatory T cell states. Moreover, 84 of them were functionally annotated based on the co-expression networks, indicating that lncRNAs might broadly participate in the regulation of T cell functions. Our findings provide a new point of view and resource for investigating the mechanisms of T cell regulation in cancer immunity as well as for novel cancer-immune biomarker development and cancer immunotherapies.

Page 377-393

Research Article

Single-cell Transcriptomes Reveal Characteristics of MicroRNAs in Gene Expression Noise Reduction

Tao Hu, Lei Wei, Shuailin Li, Tianrun Cheng, Xuegong Zhang, Xiaowo Wang

Isogenic cells growing in identical environments show cell-to-cell variations because of the stochasticity in gene expression. High levels of variation or noise can disrupt robust gene expression and result in tremendous consequences for cell behaviors. In this work, we showed evidence from single-cell RNA sequencing data analysis that microRNAs (miRNAs) can reduce gene expression noise at the mRNA level in mouse cells. We identified that the miRNA expression level, number of targets, target pool abundance, and miRNA–target interaction strength are the key features contributing to noise repression. miRNAs tend to work together in cooperative subnetworks to repress target noise synergistically in a cell type-specific manner. By building a physical model of post-transcriptional regulation and observing in synthetic gene circuits, we demonstrated that accelerated degradation with elevated transcriptional activation of the miRNA target provides resistance to extrinsic fluctuations. Together, through the integrated analysis of single-cell RNA and miRNA expression profiles, we demonstrated that miRNAs are important post-transcriptional regulators for reducing gene expression noise and conferring robustness to biological processes.
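Research question: Gene expression is inherently stochastic. Even cells with identical genotypes growing in identical environments show heterogeneous gene expression, known as gene expression noise. Excessive noise impairs accurate information transfer during gene regulation and can severely affect the robustness of biological systems. Deciphering noise reduction mechanisms in natural living systems is important both for understanding the relevant biological processes and for designing accurate and robust synthetic gene circuits. This study focuses on microRNAs (miRNAs), regulatory molecules that are widespread in living organisms, and develops methods to systematically characterize how their various properties relate to gene expression noise.
Methods: We developed a systematic noise analysis method based on single-cell transcriptome sequencing data and used it to quantify gene expression noise in multiple cell types. On this basis, we combined the corresponding miRNA-seq data and applied multiple statistical approaches to explore how miRNA properties and miRNA subnetworks affect gene expression noise. We also built a mathematical model to explain the noise reduction mechanism of miRNAs theoretically and confirmed the conclusions with synthetic biology experiments.
Main result 1: Single-cell transcriptome analysis showed that miRNAs broadly reduce the expression noise of their target genes, and that their noise reduction capacity correlates positively with the variety and number of their targets; in particular, the more weak targets a miRNA has, the stronger its noise reduction.
Main result 2: Single-cell transcriptome analysis showed that miRNAs act cooperatively in subnetworks to reduce noise, and that the noise reduction capacity of a subnetwork is cell type-specific.
Main result 3: Mathematical modeling and synthetic biology experiments demonstrated that miRNAs suppress extrinsic noise in mRNA expression by accelerating mRNA degradation, thereby enhancing the stability of gene expression.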
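The summary above speaks of quantifying gene expression noise but does not state which metric the authors used. A common choice in the field is the coefficient of variation (standard deviation divided by mean) of a gene's expression across cells. The sketch below illustrates that metric only; the function name `expression_noise` and the expression values are hypothetical.

```python
from statistics import fmean, pstdev


def expression_noise(values):
    """Coefficient of variation (CV) of one gene's expression across cells.

    Returns None when the mean is not positive, because CV is undefined there.
    """
    mean = fmean(values)
    if mean <= 0:
        return None
    return pstdev(values) / mean


# Hypothetical expression of two genes across four cells, both with mean 10.
steady = [10, 10, 10, 10]   # no cell-to-cell variation
noisy = [2, 18, 5, 15]      # same mean, much larger spread

print(expression_noise(steady), expression_noise(noisy))
```

Both genes have the same mean expression, so the difference in CV reflects cell-to-cell variability alone. That is the quantity a noise-buffering regulator would be expected to lower.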

Page 394-407

Research Article

Single-cell RNA Sequencing Reveals Sexually Dimorphic Transcriptome and Type 2 Diabetes Genes in Mouse Islet β Cells

Gang Liu, Yana Li, Tengjiao Zhang, Mushan Li, Sheng Li, Qing He, Shuxin Liu, Minglu Xu, Tinghui Xiao, Zhen Shao, Weiyang Shi, Weida Li

Type 2 diabetes (T2D) is characterized by the malfunction of pancreatic β cells. Susceptibility and pathogenesis of T2D can be affected by multiple factors, including sex differences. However, the mechanisms underlying sex differences in T2D susceptibility and pathogenesis remain unclear. Using single-cell RNA sequencing (scRNA-seq), we demonstrate the presence of sexually dimorphic transcriptomes in mouse β cells. Using a high-fat diet-induced T2D mouse model, we identified sex-dependent T2D altered genes, suggesting sex-based differences in the pathological mechanisms of T2D. Furthermore, based on islet transplantation experiments, we found that compared to mice with sex-matched islet transplants, sex-mismatched islet transplants in healthy mice showed down-regulation of genes involved in the longevity regulating pathway of β cells. Moreover, the diabetic mice with sex-mismatched islet transplants showed impaired glucose tolerance. These data suggest sexual dimorphism in T2D pathogenicity, indicating that sex should be considered when treating T2D. We hope that our findings could provide new insights for the development of precision medicine in T2D.

Page 408-422

Research Article

Single-cell RNA Sequencing Reveals Thoracolumbar Vertebra Heterogeneity and Rib-genesis in Pigs

Jianbo Li, Ligang Wang, Dawei Yu, Junfeng Hao, Longchao Zhang, Adeniyi C. Adeola, Bingyu Mao, Yun Gao, Shifang Wu, Chunling Zhu, Yongqing Zhang, Jilong Ren, Changgai Mu, David M. Irwin, Lixian Wang, Tang Hai, Haibing Xie, Yaping Zhang

Development of thoracolumbar vertebra (TLV) and rib primordium (RP) is a common evolutionary feature across vertebrates, although whole-organism analysis of the expression dynamics of TLV- and RP-related genes has been lacking. Here, we investigated the single-cell transcriptome landscape of thoracic vertebra (TV), lumbar vertebra (LV), and RP cells from a pig embryo at 27 days post-fertilization (dpf) and identified six cell types with distinct gene expression signatures. In-depth dissection of the gene expression dynamics and RNA velocity revealed a coupled process of osteogenesis and angiogenesis during TLV and RP development. Further analysis of cell type-specific and strand-specific expression uncovered the extremely high level of HOXA10 3′-UTR sequence specific to osteoblasts of LV cells, which may function as anti-HOXA10-antisense by counteracting the HOXA10-antisense effect to determine TLV transition. Thus, this work provides a valuable resource for understanding embryonic osteogenesis and angiogenesis underlying vertebrate TLV and RP development at the cell type-specific resolution, which serves as a comprehensive view on the transcriptional profile of animal embryo development.
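Research question: When the thoracolumbar vertebrae of vertebrates segment, do thoracic and lumbar vertebrae differ in cell composition? What biological processes underlie thoracolumbar segmentation and rib formation?
Methods: Using Smart-seq2 full-length single-cell transcriptome sequencing data and integrating multiple bioinformatics tools, we identified cell populations specific to pig thoracolumbar vertebrae and ribs, reconstructed the main developmental processes of thoracolumbar vertebra and rib formation with Monocle2 and RNA velocity analysis, and analyzed differential gene expression in the thoracolumbar vertebra-specific cell populations to identify genes that regulate thoracolumbar development and differentiation.
Main result 1: We constructed a cell atlas of thoracolumbar vertebrae and rib primordia at the time of thoracolumbar segmentation in pigs and found no difference in cell composition between thoracic and lumbar vertebrae.
Main result 2: Gene expression analysis of specific cell populations showed that Hoxa10 is differentially expressed in the osteoblast populations of thoracolumbar vertebrae. Hoxa10 is expressed mainly and specifically in lumbar vertebra osteoblasts, with high expression abundance in its 3′-UTR region. Strand-specific expression analysis showed that the high Hoxa10 expression derives mainly from the sense strand rather than the antisense strand, suggesting an anti-Hoxa10-antisense regulatory effect during the thoracolumbar transition.
Main result 3: Pseudotime and RNA velocity analyses showed that the main developmental processes in thoracolumbar vertebra development and rib formation are osteogenesis and angiogenesis.
Data link: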

Page 423-436

Research Article

A Single-cell Transcriptome Atlas of Cashmere Goat Hair Follicle Morphogenesis

Wei Ge, Weidong Zhang, Yuelang Zhang, Yujie Zheng, Fang Li, Shanhe Wang, Jinwang Liu, Shaojing Tan, Zihui Yan, Lu Wang, Wei Shen, Lei Qu, Xin Wang

Cashmere, also known as soft gold, is produced from the secondary hair follicles (SHFs) of cashmere goats. The number of SHFs determines the yield and quality of cashmere; therefore, it is of interest to investigate the transcriptional profiles present during cashmere goat hair follicle development. However, mechanisms underlying this development process remain largely unexplored, and studies regarding hair follicle development mostly use a murine research model. In this study, to provide a comprehensive understanding of cellular heterogeneity and cell fate decisions, single-cell RNA sequencing was performed on 19,705 single cells of the dorsal skin from cashmere goat fetuses at induction (embryonic day 60; E60), organogenesis (E90), and cytodifferentiation (E120) stages. For the first time, unsupervised clustering analysis identified 16 cell clusters, and their corresponding cell types were also characterized. Based on lineage inference, a detailed molecular landscape was revealed along the dermal and epidermal cell lineage developmental pathways. Notably, our current data also confirmed the heterogeneity of dermal papillae from different hair follicle types, which was further validated by immunofluorescence analysis. The current study identifies different biomarkers during cashmere goat hair follicle development and has implications for cashmere goat breeding in the future.
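Research question: How heterogeneous are cells during cashmere goat hair follicle morphogenesis? What are the marker molecules of the different cell types? How are cell fate decisions regulated? How does hair follicle morphogenesis compare between cashmere goats and mice?
Methods: Single-cell RNA sequencing (scRNA-seq) was performed on 19,705 single cells from the dorsal skin of cashmere goat fetuses at the induction (embryonic day 60), organogenesis (embryonic day 90), and cytodifferentiation (embryonic day 120) stages of hair follicle morphogenesis. Clustering analysis based on t-distributed stochastic neighbor embedding (tSNE) identified dermal cells, epidermal cells, dermal papilla cells, endothelial cells, keratinocytes, and pericytes, among other cell types, during hair follicle development, and their gene expression profiles were described in detail for the first time. Differential analysis between cell types revealed a series of cell type-specific marker molecules. Pseudotime trajectory analysis reconstructed the differentiation trajectories of the epidermal and dermal cell lineages across hair follicle development, describing pseudotime gene expression changes during specification of the dermal condensate and dermal papilla in the dermal lineage, and of matrix cells, hair follicle stem cell precursors, the inner root sheath, and hair shaft cells in the epidermal lineage.
Main result 1: tSNE-based clustering analysis identified dermal cells, epidermal cells, dermal papilla cells, endothelial cells, keratinocytes, and pericytes, among other cell types, during cashmere goat hair follicle development.
Main result 2: Differential analysis between cell types revealed a series of cell type-specific marker molecules.
Main result 3: Pseudotime differentiation trajectories of the dermal and epidermal cell lineages were constructed with the Monocle algorithm.
Main result 4: Comparison of the single-cell transcriptome atlases of hair follicle morphogenesis in cashmere goats and mice showed that the same cell types in the two species share many conserved marker molecules during the early stages of hair follicle morphogenesis.
Data link: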

Page 437-451

Web Server

GranatumX: A Community-engaging, Modularized, and Flexible Webtool for Single-cell Data Analysis

David G. Garmire, Xun Zhu, Aravind Mantravadi, Qianhui Huang, Breck Yunits, Yu Liu, Thomas Wolfgruber, Olivier Poirion, Tianying Zhao, Cédric Arisdakessian, Stefan Stanojevic, Lana X. Garmire

We present GranatumX, a next-generation software environment for single-cell RNA sequencing (scRNA-seq) data analysis. GranatumX is inspired by the interactive webtool Granatum. GranatumX enables biologists to access the latest scRNA-seq bioinformatics methods in a web-based graphical environment. It also offers software developers the opportunity to rapidly promote their own tools with others in customizable pipelines. The architecture of GranatumX allows for easy inclusion of plugin modules, named Gboxes, which wrap around bioinformatics tools written in various programming languages and on various platforms. GranatumX can be run on the cloud or private servers and generate reproducible results. It is a community-engaging, flexible, and evolving software ecosystem for scRNA-seq analysis, connecting developers with bench scientists. GranatumX is freely accessible at

Page 452-460


scGET: Predicting Cell Fate Transition During Early Embryonic Development by Single-cell Graph Entropy

Jiayuan Zhong, Chongyin Han, Xuhang Zhang, Pei Chen, Rui Liu

During early embryonic development, cell fate commitment represents a critical transition or “tipping point” of embryonic differentiation, at which there is a drastic and qualitative shift of the cell populations. In this study, we presented a computational approach, scGET, to explore the gene–gene associations based on single-cell RNA sequencing (scRNA-seq) data for critical transition prediction. Specifically, by transforming the gene expression data to the local network entropy, the single-cell graph entropy (SGE) value quantitatively characterizes the stability and criticality of gene regulatory networks among cell populations and thus can be employed to detect the critical signal of cell fate or lineage commitment at the single-cell level. Applied to five scRNA-seq datasets of embryonic differentiation, scGET accurately predicted all the impending cell fate transitions. After identifying the “dark genes”, which are not differentially expressed but are sensitive to the SGE value, the underlying signaling mechanisms were revealed, suggesting that the synergy of dark genes and their downstream targets may play a key role in various cell development processes. The application in all five datasets demonstrates the effectiveness of scGET in analyzing scRNA-seq data from a network perspective and its potential to track the dynamics of cell differentiation. The source code of scGET is accessible at
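The abstract does not give the SGE formula, so the following is not scGET's definition. It is only a generic illustration of what a "local network entropy" can look like: a normalized Shannon entropy computed over the weights of the edges that connect one gene to its neighbors. The function name `local_entropy` and the choice of absolute edge weights as probabilities are assumptions of this sketch.

```python
import math


def local_entropy(weights):
    """Normalized Shannon entropy (0..1) over one gene's edge weights.

    `weights` holds hypothetical association strengths (for example,
    correlations) between a gene and each of its network neighbors.
    """
    magnitudes = [abs(w) for w in weights]
    total = sum(magnitudes)
    n = len(magnitudes)
    if n < 2 or total == 0:
        return 0.0
    # Treat normalized magnitudes as a probability distribution; skip zeros
    # because the limit of p*log(p) as p approaches 0 is 0.
    probs = [m / total for m in magnitudes if m > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the maximum possible entropy so values are comparable
    # across genes with different numbers of neighbors.
    return entropy / math.log(n)


even = local_entropy([0.5, 0.5, 0.5, 0.5])     # influence spread evenly
dominated = local_entropy([1.0, 0.0, 0.0, 0.0])  # one neighbor dominates
```

Under this toy definition, a gene whose associations are spread evenly scores near 1 and a gene dominated by a single neighbor scores near 0. Tracking how such a score changes across time points is one way a network-level signal could be monitored.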

Page 461-474


scLink: Inferring Sparse Gene Co-expression Networks from Single-cell Expression Data

Wei Vivian Li, Yanzeng Li

A system-level understanding of the regulation and coordination mechanisms of gene expression is essential for studying the complexity of biological processes in health and disease. With the rapid development of single-cell RNA sequencing technologies, it is now possible to investigate gene interactions in a cell type-specific manner. Here we propose the scLink method, which uses statistical network modeling to understand the co-expression relationships among genes and construct sparse gene co-expression networks from single-cell gene expression data. We use both simulation and real data studies to demonstrate the advantages of scLink and its ability to improve single-cell gene network analysis. The scLink R package is available at
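To study the complexity of biological processes in health and disease, we must systematically investigate the mechanisms that regulate and coordinate gene expression. The rapid development of single-cell sequencing technologies has created favorable conditions for studying cell type-specific gene interactions. Here we propose a new method, scLink, which applies statistical network modeling to single-cell gene expression data to study co-expression relationships among genes and build sparse gene co-expression networks. Using multiple simulated and real single-cell datasets, we demonstrate the advantages of scLink for single-cell gene co-expression network analysis. The R package implementing scLink can be downloaded from its GitHub page: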

Page 475-492


Polar Gini Curve: A Technique to Discover Gene Expression Spatial Patterns from Single-cell RNA-seq Data

Thanh Minh Nguyen, Jacob John Jeevan, Nuo Xu, Jake Y. Chen

In this work, we describe the development of Polar Gini Curve, a method for characterizing cluster markers by analyzing single-cell RNA sequencing (scRNA-seq) data. Polar Gini Curve combines the gene expression and the 2D coordinates (“spatial”) information to detect patterns of uniformity in any clustered cells from scRNA-seq data. We demonstrate that Polar Gini Curve can help users characterize the shape and density distribution of cells in a particular cluster, which can be generated during routine scRNA-seq data analysis. To quantify the extent to which a gene is uniformly distributed in a cell cluster space, we combine two polar Gini curves (PGCs): one drawn upon the cell-points expressing the gene (the “foreground curve”) and the other drawn upon all cell-points in the cluster (the “background curve”). We show that genes with highly dissimilar foreground and background curves tend not to be uniformly distributed in the cell cluster, and thus have spatially divergent gene expression patterns within the cluster. Genes with similar foreground and background curves tend to be uniformly distributed in the cell cluster, and thus have uniform gene expression patterns within the cluster. Such quantitative attributes of PGCs can be applied to sensitively discover biomarkers across clusters from scRNA-seq data. We demonstrate the performance of the Polar Gini Curve framework in several simulation case studies. Using this framework to analyze a real-world neonatal mouse heart cell dataset, the detected biomarkers may characterize novel subtypes of cardiac muscle cells. The source code and data for Polar Gini Curve could be found at or
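The foreground/background comparison described above can be sketched in a few lines of Python. The abstract does not spell out how each curve is computed, so this sketch makes several assumptions: each curve is the Gini coefficient of the cell coordinates projected onto an axis at a series of angles, projections are shifted to be non-negative, and the two curves are compared by root-mean-square difference. The function names, the number of angles, and the toy grid of cells are all hypothetical.

```python
import math


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on sorted data with 1-based ranks.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def polar_gini_curve(points, n_angles=36):
    """Gini coefficient of the projected coordinates at each angle."""
    curve = []
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        proj = [x * math.cos(theta) + y * math.sin(theta) for x, y in points]
        lowest = min(proj)
        # Shift so the smallest projection is 0 (Gini needs values >= 0).
        curve.append(gini([p - lowest for p in proj]))
    return curve


def rmsd(curve_a, curve_b):
    """Root-mean-square difference between two equal-length curves."""
    squared = [(a - b) ** 2 for a, b in zip(curve_a, curve_b)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical cluster: a 5 x 5 grid of cells in a 2D embedding.
background = [(float(x), float(y)) for x in range(5) for y in range(5)]
# Hypothetical gene expressed only by cells on the left side of the cluster.
foreground = [(x, y) for x, y in background if x <= 1]

bg_curve = polar_gini_curve(background)
fg_curve = polar_gini_curve(foreground)
print(rmsd(fg_curve, bg_curve))
```

A gene expressed by every cell in the cluster would give a foreground curve identical to the background curve and an RMSD of zero. The one-sided gene here gives a positive RMSD, which matches the abstract's reading of dissimilar curves as non-uniform expression.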

Page 493-503


Integration of Droplet Microfluidic Tools for Single-cell Functional Metagenomics: An Engineering Head Start

David Conchouso, Amani Al-Ma'abadi, Hayedeh Behzad, Mohammed Alarawi, Masahito Hosokawa, Yohei Nishikawa, Haruko Takeyama, Katsuhiko Mineta, Takashi Gojobori

Droplet microfluidic techniques have shown promising results for studying single cells at high throughput. However, their adoption in laboratories studying “-omics” sciences is still limited due to the complex and multidisciplinary nature of the field. To facilitate their use, here we provide engineering details and organized protocols for integrating three droplet-based microfluidic technologies into the metagenomic pipeline to enable functional screening of bioproducts at high throughput. First, a device encapsulating single cells in droplets at a rate of ∼250 Hz is described, considering droplet size and cell growth. Then, we expand on previously reported fluorescence-activated droplet sorting systems to integrate the use of 4 independent fluorescence-exciting lasers (i.e., 405, 488, 561, and 637 nm) in a single platform to make it compatible with different fluorescence-emitting biosensors. For this sorter, both hardware and software are provided and optimized for effortlessly sorting droplets at 60 Hz. Then, a passive droplet merger is also integrated into our pipeline to enable adding new reagents to already-made droplets at a rate of 200 Hz. Finally, we provide an optimized recipe for manufacturing these chips using silicon dry-etching tools. Because of the overall integration and the technical details presented here, our approach allows biologists to quickly use microfluidic technologies and achieve both single-cell resolution and high-throughput capability (>50,000 cells/day) for mining and bioprospecting metagenomic data.
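The three stage rates quoted above make for a quick back-of-the-envelope check. The sketch below finds the slowest stage and the time it needs for 50,000 droplets. It assumes one cell per droplet, continuous operation at the stated rates, and no empty droplets, none of which the abstract states; the names used are hypothetical.

```python
# Stage rates in droplets per second, as quoted in the abstract.
STAGE_RATES_HZ = {"encapsulation": 250, "sorting": 60, "merging": 200}


def slowest_stage(rates):
    """Return the name of the stage with the lowest rate."""
    return min(rates, key=rates.get)


def seconds_to_process(n_droplets, rate_hz):
    """Time needed to push n_droplets through a stage running at rate_hz."""
    return n_droplets / rate_hz


bottleneck = slowest_stage(STAGE_RATES_HZ)
minutes = seconds_to_process(50_000, STAGE_RATES_HZ[bottleneck]) / 60
print(bottleneck, round(minutes, 1))
```

Under these simplifying assumptions, sorting is the slowest stage, and it would handle 50,000 droplets in roughly 14 minutes of continuous run time. The real daily throughput also depends on steps this arithmetic ignores, such as setup, incubation, and cell growth.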

Page 504-518