From Mutation Signature to Molecular Mechanism in the RNA World: A Case of SARS-CoV-2
Population Genetics of SARS-CoV-2: Disentangling Effects of Sampling Bias and Infection Clusters
Qi Liu, Shilei Zhao, Cheng-Min Shi, Shuhui Song, Sihui Zhu, Yankai Su, Wenming Zhao, Mingkun Li, Yiming Bao, Yongbiao Xue, Hua Chen
A novel RNA virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, due to extensive sampling bias and the existence of infection clusters during the epidemic spread, direct application of existing approaches can lead to biased parameter estimates and data misinterpretation. In this study, we first present a robust estimator for the time to the most recent common ancestor (TMRCA) and the mutation rate, and then apply the approach to analyze 12,909 genomic sequences of SARS-CoV-2. The mutation rate is inferred to be 8.69 × 10⁻⁴ per site per year with a 95% confidence interval (CI) of [8.61 × 10⁻⁴, 8.77 × 10⁻⁴], and the TMRCA of the samples is inferred to be Nov 28, 2019 with a 95% CI of [Oct 20, 2019, Dec 9, 2019]. The results indicate that COVID-19 might have originated earlier than, and outside of, the Wuhan Seafood Market. We further demonstrate that genetic polymorphism patterns, including the enrichment of specific haplotypes and the temporal allele frequency trajectories generated from infection clusters, are similar to those caused by evolutionary forces such as natural selection. Our results show that population genetic methods need to be developed to efficiently disentangle the effects of sampling bias and infection clusters to gain insights into the evolutionary mechanism of SARS-CoV-2. Software for implementing VirusMuT can be downloaded at https://bigd.big.ac.cn/biocode/tools/BT007081.
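To put the reported per-site rate in more familiar units, the following minimal sketch converts it to an expected number of substitutions per genome per year and measures the width of the TMRCA confidence interval. It is only illustrative arithmetic on the figures quoted in the abstract, not the VirusMuT estimator. The genome length of 29,903 nt (the Wuhan-Hu-1 reference, NC_045512.2) is an assumption that does not appear in the abstract, and the function names are hypothetical.

```python
from datetime import date

# Figures quoted in the abstract.
RATE_PER_SITE_PER_YEAR = 8.69e-4
RATE_CI = (8.61e-4, 8.77e-4)
TMRCA = date(2019, 11, 28)
TMRCA_CI = (date(2019, 10, 20), date(2019, 12, 9))

# Assumption (not from the abstract): length of the Wuhan-Hu-1
# reference genome, NC_045512.2, in nucleotides.
GENOME_LENGTH = 29_903


def per_genome_rate(rate_per_site, genome_length=GENOME_LENGTH):
    """Expected substitutions per genome per year."""
    return rate_per_site * genome_length


def expected_substitutions(rate_per_site, start, end,
                           genome_length=GENOME_LENGTH):
    """Expected substitutions accumulated on one lineage between two dates."""
    years = (end - start).days / 365.25
    return per_genome_rate(rate_per_site, genome_length) * years


if __name__ == "__main__":
    low, high = (per_genome_rate(r) for r in RATE_CI)
    point = per_genome_rate(RATE_PER_SITE_PER_YEAR)
    print(f"per-genome rate: {point:.2f} per year (CI {low:.2f} to {high:.2f})")
    print(f"TMRCA CI width: {(TMRCA_CI[1] - TMRCA_CI[0]).days} days")
```

Under the assumed genome length, the point estimate corresponds to roughly 26 substitutions per genome per year, or about two per month along a single lineage.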
Compositional Variability and Mutation Spectra of Monophyletic SARS-CoV-2 Clades
Xufei Teng, Qianpeng Li, Zhao Li, Yuansheng Zhang, Guangyi Niu, Jingfa Xiao, Jun Yu, Zhang Zhang, Shuhui Song
COVID-19 and its causative pathogen SARS-CoV-2 have rushed the world into a staggering pandemic in a few months, and a global fight against both has been intensifying. Here, we describe an analysis procedure where genome composition and its variables are related, through the genetic code, to molecular mechanisms, based on an understanding of RNA replication and its feedback loop from mutation to viral proteome sequence fraternity, including effective sites on the replicase-transcriptase complex. Our analysis starts with primary sequence information, identity-based phylogeny based on 22,051 SARS-CoV-2 sequences, and evaluation of sequence variation patterns as mutation spectra and their 12 permutations among organized clades. All are tailored to two key mechanisms: strand-biased and function-associated mutations. Our findings are listed as follows: 1) The most dominant mutation is the C-to-U permutation, whose abundant second-codon-position counts alter amino acid composition toward higher molecular weight and lower hydrophobicity, although most are assumed to be slightly deleterious. 2) The second most abundant group includes three negative-strand mutations (U-to-C, A-to-G, and G-to-A) and a positive-strand mutation (G-to-U) due to DNA repair mechanisms after cellular abasic events. 3) A clade-associated biased mutation trend is found, attributable to an elevated level of negative-sense strand synthesis. 4) Within-clade permutation variation is very informative for associating non-synonymous mutations and viral proteome changes. These findings demand a platform where emerging mutations are mapped onto mostly subtle but fast-adjusting viral proteomes and transcriptomes, to provide biological and clinical information after logical convergence for effective pharmaceutical and diagnostic applications. Such actions are in desperate need, especially in the middle of the War against COVID-19.
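The "12 permutations" are the twelve ordered single-nucleotide substitution types among A, C, G, and U. A minimal sketch of how such a mutation spectrum could be tallied from a pairwise-aligned reference and sample is shown below. It is an illustration only, not the authors' pipeline: the function name and the toy sequences are hypothetical, and positions with gaps or ambiguous bases are simply skipped.

```python
from collections import Counter
from itertools import permutations

BASES = "ACGU"
# The 12 ordered substitution types, e.g. "C>U" for a C-to-U change.
SUBSTITUTION_TYPES = [f"{a}>{b}" for a, b in permutations(BASES, 2)]


def mutation_spectrum(reference, sample):
    """Count each of the 12 substitution types between two aligned sequences.

    Both sequences must already be aligned to the same length. DNA-style
    "T" is treated as "U"; gaps and ambiguous bases are ignored.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    ref = reference.upper().replace("T", "U")
    alt = sample.upper().replace("T", "U")
    counts = Counter({key: 0 for key in SUBSTITUTION_TYPES})
    for r, s in zip(ref, alt):
        if r != s and r in BASES and s in BASES:
            counts[f"{r}>{s}"] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences for illustration; not real SARS-CoV-2 data.
    spectrum = mutation_spectrum("ACGUCCGA", "AUGUCUGG")
    for kind in SUBSTITUTION_TYPES:
        print(kind, spectrum[kind])
```

Aggregating these counts per clade, and separately by codon position, would give the kind of table the abstract's findings refer to.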
Long Non-coding RNA Derived from lncRNA–mRNA Co-expression Networks Modulates the Locust Phase Change
Ting Li, Bing Chen, Pengcheng Yang, Depin Wang, Baozhen Du, Le Kang
Long non-coding RNAs (lncRNAs) regulate various biological processes ranging from gene expression to animal behavior. Although protein-coding genes, microRNAs, and neuropeptides play important roles in the regulation of phenotypic plasticity in the migratory locust, empirical studies on the function of lncRNAs in this process remain limited. Here, we applied high-throughput RNA-seq to compare the expression patterns of lncRNAs and mRNAs over the time course of locust phase change. We found that lncRNAs responded more rapidly at the early stages of phase transition. Functional annotations demonstrated that early-changed lncRNAs employed different pathways in the isolation and crowding phases to cope with changes in population density. Two overlapping hub lncRNA loci in the crowding and isolation networks were screened for functional verification. One of them, LNC1010057, was validated as a potential regulator of locust phase change. This work offers insights into the molecular mechanism underlying locust phase change and expands the scope of lncRNA functions in animal behavior.
MicroPhenoDB Associates Metagenomic Data with Pathogenic Microbes, Microbial Core Genes, and Human Disease Phenotypes
Guocai Yao, Wenliang Zhang, Minglei Yang, Huan Yang, Jianbo Wang, Haiyue Zhang, Lai Wei, Zhi Xie, Weizhong Li
Microbes play important roles in human health and disease. The interaction between microbes and hosts is a reciprocal relationship, which remains largely under-explored. Current computational resources lack manually and consistently curated data to connect metagenomic data to pathogenic microbes, microbial core genes, and disease phenotypes. We developed the MicroPhenoDB database by manually curating and consistently integrating microbe-disease association data. MicroPhenoDB provides 5677 non-redundant associations between 1781 microbes and 542 human disease phenotypes across more than 22 human body sites. MicroPhenoDB also provides 696,934 relationships between 27,277 unique clade-specific core genes and 685 microbes. Disease phenotypes are classified and described using the Experimental Factor Ontology (EFO). A refined score model was developed to prioritize the associations based on evidential metrics. The sequence search option in MicroPhenoDB enables rapid identification of existing pathogenic microbes in samples without running the usual metagenomic data processing and assembly. MicroPhenoDB offers data browsing, searching, and visualization through user-friendly web interfaces and web service application programming interfaces. MicroPhenoDB is the first database platform to detail the relationships between pathogenic microbes, core genes, and disease phenotypes. It will accelerate metagenomic data analysis and assist studies in decoding microbes related to human diseases. MicroPhenoDB is available through http://www.liwzlab.cn/microphenodb and http://lilab2.sysu.edu.cn/microphenodb.
Database links: http://www.liwzlab.cn/microphenodb and http://lilab2.sysu.edu.cn/microphenodb
Screening of Potential Biomarkers for Gastric Cancer with Diagnostic Value Using Label-free Global Proteome Analysis
Yongxi Song, Jun Wang, Jingxu Sun, Xiaowan Chen, Jinxin Shi, Zhonghua Wu, Dehao Yu, Fei Zhang, Zhenning Wang
Gastric cancer (GC) is one of the leading malignant tumors worldwide. Despite the recent decrease in mortality rates, the prognosis remains poor. Therefore, it is necessary to find novel biomarkers with early diagnostic value for GC. In this study, we present a large-scale proteomic analysis of 30 GC tissues and 30 matched healthy tissues using label-free global proteome profiling. We identified 537 differentially expressed proteins, including 280 upregulated and 257 downregulated proteins. Ingenuity pathway analysis (IPA) indicated that the sirtuin signaling pathway was the most activated pathway in GC tissues, whereas oxidative phosphorylation was the most inhibited. Moreover, the most activated molecular function was cellular movement, including tissue invasion by tumor cell lines. Based on the IPA results, 15 hub proteins were screened. In receiver operating characteristic curve analysis, most of the hub proteins showed high diagnostic power in distinguishing tumors from healthy controls. A four-protein (ATP5B-ATP5O-NDUFB4-NDUFB8) diagnostic signature was built using a random forest model. The area under the curve (AUC) values of this model were 0.996 and 0.886 for the training and testing sets, respectively, suggesting that the four-protein signature has high diagnostic power. This signature was further tested with independent datasets using plasma enzyme-linked immunosorbent assays, resulting in an AUC value of 0.778 for distinguishing GC tissues from healthy controls, and using immunohistochemical tissue microarray analysis, resulting in an AUC value of 0.805. In conclusion, this study identifies potential biomarkers and improves our understanding of the pathogenesis, providing novel therapeutic targets for GC.
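The AUC values quoted above have a simple interpretation: the probability that a randomly chosen tumor sample receives a higher classifier score than a randomly chosen healthy sample. The sketch below computes AUC directly from that pairwise definition. It is a generic illustration, not the study's random forest pipeline, and the labels and scores in the usage example are made up.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from the pairwise-ranking definition.

    labels: iterable of 1 (case, e.g. tumor) or 0 (control).
    scores: classifier scores, higher meaning more likely a case.
    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Made-up labels and scores for illustration only.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in sample count, which is fine for cohorts of tens of samples; a rank-based formulation would scale better for large datasets.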
Classification of the Gut Microbiota of Patients in Intensive Care Units During Development of Sepsis and Septic Shock
Wanglin Liu, Mingyue Cheng, Jinman Li, Peng Zhang, Hang Fan, Qinghe Hu, Maozhen Han, Longxiang Su, Huaiwu He, Yigang Tong, Kang Ning, Yun Long
The gut microbiota of intensive care unit (ICU) patients displays extreme dysbiosis associated with increased susceptibility to organ failure, sepsis, and septic shock. However, such dysbiosis is difficult to characterize owing to the high dimensional complexity of the gut microbiota. We tested whether the concept of enterotype can be applied to the gut microbiota of ICU patients to describe the dysbiosis. We collected 131 fecal samples from 64 ICU patients diagnosed with sepsis or septic shock and performed 16S rRNA gene sequencing to dissect their gut microbiota compositions. During the development of sepsis or septic shock and during various medical treatments, the ICU patients always exhibited two dysbiotic microbiota patterns, or ICU-enterotypes, which could not be explained by host properties such as age, sex, and body mass index, or external stressors such as infection site and antibiotic use. ICU-enterotype I (ICU E1) comprised predominantly Bacteroides and an unclassified genus of Enterobacteriaceae, while ICU-enterotype II (ICU E2) comprised predominantly Enterococcus. Among more critically ill patients with Acute Physiology and Chronic Health Evaluation II (APACHE II) scores > 18, septic shock was more likely to occur with ICU E1 (P = 0.041). Additionally, ICU E1 was correlated with high serum lactate levels (P = 0.007). Therefore, different patterns of dysbiosis were correlated with different clinical outcomes, suggesting that ICU-enterotypes should be diagnosed as independent clinical indices. Thus, the microbial-based human index classifier we propose is precise and effective for timely monitoring of ICU-enterotypes of individual patients. This work is a first step toward precision medicine for septic patients based on their gut microbiota profiles.
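Dysbiosis of the gut microbiota in intensive care unit (ICU) patients increases their risk of organ failure, sepsis, and septic shock. However, because of the high-dimensional complexity of the gut microbiota, this dysbiosis is difficult to define simply. To address this problem, the authors applied the concept of enterotypes to classify ICU dysbiosis and defined the resulting classes as ICU-enterotypes. They collected 131 fecal samples from 64 ICU patients diagnosed with sepsis or septic shock and performed 16S rRNA gene sequencing to analyze gut microbiota composition. During the development of sepsis or septic shock, the gut microbiota of ICU patients consistently clustered into two ICU-enterotypes. The existence of these ICU-enterotypes could not be explained by host properties such as age, sex, and body mass index, or by external stressors such as infection site and antibiotics. ICU-enterotype I (ICU E1) was driven mainly by Bacteroides and an unclassified genus of Enterobacteriaceae, whereas ICU-enterotype II (ICU E2) was driven mainly by Enterococcus. Among critically ill patients with APACHE II scores greater than 18, those with ICU E1 were more likely to develop septic shock (P = 0.041). In addition, patients with ICU E1 had higher serum lactate levels (P = 0.007). The authors found that patients with different ICU-enterotypes corresponded to different pathological states, so ICU-enterotypes could serve as independent clinical indices. The MHI classifier that the authors propose on this basis is precise and effective for timely monitoring of the ICU-enterotype of individual patients. This study is a first step toward precision-medicine interventions for septic patients based on their gut microbiota.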
Increased Expression of Colonic Mucosal Melatonin in Patients with Irritable Bowel Syndrome Correlated with Gut Dysbiosis
Ben Wang, Shiwei Zhu, Zuojing Liu, Hui Wei, Lu Zhang, Meibo He, Fei Pei, Jindong Zhang, Qinghua Sun, Liping Duan
Dysregulation of the gut microbiota/gut hormone axis contributes to the pathogenesis of irritable bowel syndrome (IBS). Melatonin plays a beneficial role in gut motility and immunity. However, altered expression of local mucosal melatonin in IBS and its relationship with the gut microbiota remain unclear. Therefore, we aimed to detect the colonic melatonin levels and microbiota profiles in patients with diarrhea-predominant IBS (IBS-D) and explore their relationship in germ-free (GF) rats and BON-1 cells. Thirty-two IBS-D patients and twenty-eight healthy controls (HCs) were recruited. Fecal specimens from IBS-D patients and HCs were separately transplanted into GF rats by gavage. The levels of colon mucosal melatonin were assessed by immunohistochemical methods, and fecal microbiota communities were analyzed using 16S rDNA sequencing. The effect of butyrate on melatonin synthesis in BON-1 cells was evaluated by ELISA. Melatonin levels were significantly increased and negatively correlated with visceral hypersensitivity in IBS-D patients. GF rats inoculated with fecal microbiota from IBS-D patients had high colonic melatonin levels. Butyrate-producing Clostridium cluster XIVa species, such as Roseburia species and Lachnospira species, were positively related to colonic mucosal melatonin expression. Butyrate significantly increased melatonin secretion in BON-1 cells. Increased melatonin expression may be an adaptive protective mechanism in the development of IBS-D. Moreover, some Clostridium cluster XIVa species could increase melatonin expression via butyrate production. Modulation of the gut hormone/gut microbiota axis offers a promising target of interest for IBS in the future.
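Thirty-two IBS-D patients meeting the Rome III diagnostic criteria and 28 healthy volunteers were enrolled. Germ-free male Sprague-Dawley (SD) rats received fecal microbiota transplantation with feces from IBS-D patients or from controls. Colonic mucosal melatonin expression was measured by immunohistochemistry, and the gut microbiota was analyzed by 16S rDNA sequencing. In addition, BON-1 cells were treated with sodium butyrate, and melatonin levels in the cell supernatant were measured by ELISA.
Rats transplanted with microbiota from IBS-D patients had significantly higher colonic mucosal melatonin levels than rats transplanted with microbiota from healthy volunteers.
Roseburia and Lachnospira, butyrate-producing members of Clostridium cluster XIVa, were significantly positively correlated with colonic mucosal melatonin levels.
Butyrate significantly promoted melatonin release from BON-1 cells.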
GSA accession CRA001604: http://bigd.big.ac.cn/gsa
Antidiabetic Effects of Gegen Qinlian Decoction via the Gut Microbiota Are Attributable to Its Key Ingredient Berberine
Xizhan Xu, Zezheng Gao, Fuquan Yang, Yingying Yang, Liang Chen, Lin Han, Na Zhao, Jiayue Xu, Xinmiao Wang, Yue Ma, Lian Shu, Xiaoxi Hu, Na Lyu, Yuanlong Pan, Baoli Zhu, Linhua Zhao, Xiaolin Tong, Jun Wang
Gegen Qinlian Decoction (GQD), a traditional Chinese medicine (TCM) formula, has long been used for the treatment of common metabolic diseases, including type 2 diabetes mellitus. However, the main obstacle to its wider application is the ingredient complexity of the formula. Thus, it is critically important to identify the major active ingredients of GQD and to elucidate the mechanisms underlying its action. Here, we compared the effects of GQD and berberine, a hypothetical key active pharmaceutical ingredient of GQD, on a diabetic rat model by comprehensive analyses of gut microbiota, short-chain fatty acids, proinflammatory cytokines, and ileum transcriptomics. Our results show that berberine and GQD had similar effects on lowering blood glucose levels, modulating gut microbiota, inducing ileal gene expression, and relieving systemic and local inflammation. As expected, both berberine and GQD treatment significantly altered the overall gut microbiota structure and enriched many butyrate-producing bacteria, including Faecalibacterium and Roseburia, thereby attenuating intestinal inflammation and lowering glucose. Levels of short-chain fatty acids in rat feces were also significantly elevated after treatment with berberine or GQD. Moreover, concentrations of serum proinflammatory cytokines and expression of immune-related genes, including Nfkb1, Stat1, and Ifngr1, in pancreatic islets were significantly reduced after treatment. Our study demonstrates that the main effects of GQD can be attributed to berberine acting through modulation of the gut microbiota. The strategy employed would facilitate further standardization and widespread application of TCM in many diseases.
Different Gene Networks Are Disturbed by Zika Virus Infection in A Mouse Microcephaly Model
Yafei Chang, Yisheng Jiang, Cui Li, Qin Wang, Feng Zhang, Cheng-Feng Qin, Qing-Feng Wu, Jing Li, Zhiheng Xu
The association of Zika virus (ZIKV) infection with microcephaly has raised alarm worldwide. Their causal link has been confirmed in different animal models infected by ZIKV. However, the molecular mechanisms underlying ZIKV pathogenesis are far from clear. Hence, we performed global gene expression analysis of ZIKV-infected mouse brains to unveil the biological and molecular networks underpinning microcephaly. We found significant dysregulation of the sub-networks associated with brain development, immune response, cell death, microglial cell activation, and autophagy, among others. We provide a detailed analysis of the related complex gene networks and the links between them. Additionally, we analyzed the signaling pathways that were likely to be involved. This report not only provides systemic insights into the pathogenesis but also points to a path for developing prophylactic and therapeutic strategies against ZIKV infection.
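The association between Zika virus (ZIKV) infection and microcephaly has drawn worldwide attention. Their causal link has been confirmed in ZIKV-infected animal models and in clinical studies. However, the molecular mechanisms by which ZIKV causes microcephaly remain unclear. We therefore performed a detailed genome-wide expression analysis of ZIKV-infected mouse brains to reveal the biological processes and molecular networks involved in ZIKV-induced microcephaly. We found severe dysregulation of molecular networks related to brain development, immune response, cell death, microglial activation, and autophagy, among others. We dissected the related complex gene networks and the links between them in depth. In addition, we analyzed multiple signaling pathways that may be involved. This study not only provides systematic insight into the pathogenesis but will also inform strategies for developing prevention and treatment of ZIKV infection.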
The Global Landscape of SARS-CoV-2 Genomes, Variants, and Haplotypes in 2019nCoVR
Shuhui Song, Lina Ma, Dong Zou, Dongmei Tian, Cuiping Li, Junwei Zhu, Meili Chen, Anke Wang, Yingke Ma, Mengwei Li, Xufei Teng, Ying Cui, Guangya Duan, Mochen Zhang, Tong Jin, Chengmin Shi, Zhenglin Du, Yadong Zhang, Chuandong Liu, Rujiao Li, Jingyao Zeng, Lili Hao, Shuai Jiang, Hua Chen, Dali Han, Jingfa Xiao, Zhang Zhang, Wenming Zhao, Yongbiao Xue, Yiming Bao
On January 22, 2020, the China National Center for Bioinformation (CNCB) released the 2019 Novel Coronavirus Resource (2019nCoVR), an open-access information resource for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates, which are manually curated with value-added annotations and quality-evaluated by an automated in-house pipeline. Of particular note, 2019nCoVR offers systematic analyses to generate a dynamic landscape of SARS-CoV-2 genomic variations at a global scale. It provides all identified variants and their detailed statistics for each virus isolate, and aggregates the quality score, functional annotation, and population frequency for each variant. The spatiotemporal change of each variant can be visualized, and historical viral haplotype network maps for the course of the outbreak are also generated based on all complete and high-quality genomes available. Moreover, 2019nCoVR provides a full collection of SARS-CoV-2 relevant literature on the coronavirus disease 2019 (COVID-19), including published papers from PubMed as well as preprints from services such as bioRxiv and medRxiv through Europe PMC. Furthermore, by linking with relevant databases in CNCB, 2019nCoVR offers data submission services for raw sequence reads and assembled genomes, and data sharing with NCBI. Collectively, 2019nCoVR is updated daily to collect the latest information on genome sequences, variants, haplotypes, and literature for a timely reflection, making it a valuable resource for the global research community. 2019nCoVR is accessible at https://bigd.big.ac.cn/ncov/.