Articles Online (Volume 17, Issue 6)


Intriguing Origins of Protein Lysine Methylation: Influencing Cell Function Through Dynamic Methylation

Natalie Mezey, William C.S. Cho, Kyle K. Biggar

Page 551-557

Original Research

Tung Tree (Vernicia fordii) Genome Provides A Resource for Understanding Genome Evolution and Improved Oil Production

Lin Zhang, Meilan Liu, Hongxu Long, Wei Dong, Asher Pasha, Eddi Esteban, Wenying Li, Xiaoming Yang, Ze Li, Aixia Song, Duo Ran, Guang Zhao, Yanling Zeng, Hao Chen, Ming Zou, Jingjing Li, Fan Liang, Meili Xie, Jiang Hu, Depeng Wang, Heping Cao, Nicholas J. Provart, Liangsheng Zhang, Xiaofeng Tan

Tung tree (Vernicia fordii) is an economically important woody oil plant that produces tung oil rich in eleostearic acid. Here, we report a high-quality chromosome-scale genome sequence of tung tree. The genome sequence was assembled by combining Illumina short reads, Pacific Biosciences single-molecule real-time long reads, and Hi-C sequencing data. The tung tree genome is 1.12 Gb in size, with 28,422 predicted genes and over 73% repeat sequences. V. fordii underwent an ancient genome triplication event shared by core eudicots but no further whole-genome duplication in the subsequent ca. 34.55 million years of evolutionary history of the tung tree lineage. Insertion time analysis revealed that repeat-driven genome expansion might have arisen as a result of long-standing long terminal repeat retrotransposon bursts and a lack of efficient DNA deletion mechanisms. The genome harbors 88 resistance genes encoding nucleotide-binding sites; 17 of these genes may be involved in the early-infection stage of Fusarium wilt resistance. Further, 651 oil-related genes were identified, 88 of which are predicted to be directly involved in tung oil biosynthesis. A relatively small number of phosphoenolpyruvate carboxykinase genes, together with synergistic effects between transcription factors and oil biosynthesis-related genes, might contribute to the high oil content of tung seed. The tung tree genome constitutes a valuable resource for understanding genome evolution, as well as for molecular breeding and genetic improvements for oil production.

Page 558-575

Original Research

Multi-omics Analysis of Primary Cell Culture Models Reveals Genetic and Epigenetic Basis of Intratumoral Phenotypic Diversity

Sixue Liu, Zuyu Yang, Guanghao Li, Chunyan Li, Yanting Luo, Qiang Gong, Xin Wu, Tao Li, Zhiqian Zhang, Baocai Xing, Xiaolan Xu, Xuemei Lu

Uncovering the functionally essential variations related to tumorigenesis and tumor progression from cancer genomics data is still challenging due to the genetic diversity among patients, and extensive inter- and intra-tumoral heterogeneity at different levels of gene expression regulation, including but not limited to the genomic, epigenomic, and transcriptional levels. To minimize the impact of germline genetic heterogeneities, in this study, we establish multiple primary cultures from the primary and recurrent tumors of a single patient with hepatocellular carcinoma (HCC). Multi-omics sequencing is performed for these cultures, which encompass the diversity of tumor cells from the same patient. Variations in the genome sequence, epigenetic modification, and gene expression are used to infer the phylogenetic relationships of these cell cultures. We find discrepancies between the relationships revealed by single nucleotide variations (SNVs) and those revealed by transcriptional/epigenomic profiles from the cell cultures. We fail to find overlap between sample-specific mutated genes and differentially expressed genes (DEGs), suggesting that most of the heterogeneous SNVs among tumor stages or lineages of the patient are functionally insignificant. Moreover, copy number alterations (CNAs) and DNA methylation variation within gene bodies, rather than promoters, are significantly correlated with gene expression variability among these cell cultures. Pathway analysis of CNA/DNA methylation-related genes indicates that a single cell clone from the recurrent tumor exhibits distinct cellular characteristics and tumorigenicity, and this observation is further confirmed by cellular experiments both in vitro and in vivo. Our systematic analysis reveals that CNAs and epigenomic changes, rather than SNVs, are more likely to contribute to the phenotypic diversity among subpopulations in the tumor. These findings suggest that new therapeutic strategies targeting gene dosage and epigenetic modification should be considered in personalized cancer medicine. This culture model may be applied to the further identification of plausible determinants of cancer metastasis and relapse.

Page 576-589

Original Research

Meta-analysis Reveals Potential Influence of Oxidative Stress on the Airway Microbiomes of Cystic Fibrosis Patients

Xing Shi, Zhancheng Gao, Qiang Lin, Liping Zhao, Qin Ma, Yu Kang, Jun Yu

Colonization by specific CF-philic pathogens or CF microbiomes predisposes cystic fibrosis (CF) patients to lethal chronic airway infection, but the key processes and reasons for microbiome settlement in these patients are yet to be fully understood, especially the survival and metabolic dynamics of the microbiomes from normal to diseased status under treatment. Here, we report our meta-analysis results on CF airway microbiomes based on metabolic networks reconstructed from genome information at the species level. The microbiomes of CF patients appear to engage in many more redox-related activities than those of controls. By constructing a large dataset of anti-oxidative stress (anti-OS) genes, our quantitative evaluation of the anti-OS capacity of each bacterial species in the CF microbiomes confirms strong conservation of the anti-OS responses within genera and also shows that the CF pathogens have significantly higher anti-OS capacity than commensals and other typical respiratory pathogens. In addition, the anti-OS capacity of a relevant species correlates with its relative fitness for the airways of CF patients over that for the airways of controls. Moreover, the total anti-OS capacity of the respiratory microbiome of CF patients is collectively higher than that of controls, and it increases with disease progression, especially after episodes of acute exacerbation and antibiotic treatment. According to these results, we propose that the increased OS in the airways of CF patients may play an important role in reshaping airway microbiomes to a more resistant status that favors the pre-infection colonization of the CF pathogens, given their higher anti-OS capacity.
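Cystic fibrosis (CF) is a lethal inherited disease that manifests in the respiratory system as colonization and chronic infection by specific CF pathogens, together with progressive oxidative damage and loss of lung function. Studies of the CF airway microbiome to date have found that the airway pathogens of CF patients are concentrated in a few specific pathogens such as Pseudomonas aeruginosa and the Burkholderia cepacia complex, but the key processes and causes of microbiome and pathogen colonization in CF patients remain unclear. To address this question, we performed a meta-analysis of the lung microbiomes of CF patients and healthy individuals to study the changes in microbial metabolic networks between the two, and found that the airway microbiomes of CF patients carry more redox-related genes than those of controls. We therefore built a dataset of anti-oxidative stress genes through a literature survey and quantitatively evaluated the anti-oxidative stress capacity of each species. Comparative analysis showed that CF-associated pathogens have markedly higher anti-oxidative stress capacity than commensals and other common respiratory pathogens. In addition, the anti-oxidative stress capacity of a relevant species is closely associated with its fitness in the airways of CF patients, and the overall anti-oxidative stress capacity of the airway microbiome of CF patients is higher than that of healthy controls and rises as the disease progresses, increasing markedly after acute exacerbations and antibiotic treatment in particular. Based on these results, we propose that the steadily increasing oxidative stress in the airways of CF patients drives the microbiome toward communities with higher anti-oxidative stress capacity, which favors pre-infection colonization by CF pathogens and plays an important role in microbiome remodeling.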

Page 590-602

Original Research

Large-scale Identification and Time-course Quantification of Ubiquitylation Events During Maize Seedling De-etiolation

Yue-Feng Wang, Qing Chao, Zhe Li, Tian-Cong Lu, Hai-Yan Zheng, Cai-Feng Zhao, Zhuo Shen, Xiao-Hui Li, Bai-Chen Wang

The ubiquitin system is crucial for the development and fitness of higher plants. De-etiolation, during which green plants initiate photomorphogenesis and establish autotrophy, is a dramatic and complicated process that is tightly regulated by a massive number of ubiquitylation/de-ubiquitylation events. Here we present site-specific quantitative proteomic data for the ubiquitylomes of de-etiolating seedling leaves of Zea mays L. (exposed to light for 1, 6, or 12 h) achieved through immunoprecipitation-based high-resolution mass spectrometry (MS). Through the integrated analysis of multiple ubiquitylomes, we identified and quantified 1926 unique ubiquitylation sites corresponding to 1053 proteins. We analyzed these sites and found five potential ubiquitylation motifs, KA, AXK, KXG, AK, and TK. Time-course studies revealed that the ubiquitylation levels of 214 sites corresponding to 173 proteins were highly correlated across two replicate MS experiments, and significant alterations in the ubiquitylation levels of 78 sites (fold change >1.5) were detected after de-etiolation for 12 h. The majority of the ubiquitylated sites we identified corresponded to substrates involved in protein and DNA metabolism, such as ribosomes and histones. Meanwhile, multiple ubiquitylation sites were detected in proteins whose functions reflect the major physiological changes that occur during plant de-etiolation, such as hormone synthesis/signaling proteins, key C4 photosynthetic enzymes, and light signaling proteins. This study on the ubiquitylome of the maize seedling leaf is the first attempt ever to study the ubiquitylome of a C4 plant and provides the proteomic basis for elucidating the role of ubiquitylation during plant de-etiolation.
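Post-translational modification of proteins by ubiquitylation plays an extremely important role in plant development and healthy growth. De-etiolation, also called greening, is the process by which green higher plants initiate photomorphogenesis and autotrophic life. It is a complex and dramatic biological process that is tightly regulated by a large number of protein ubiquitylation and de-ubiquitylation events. In this study, we present site-specific quantitative ubiquitylation proteomic data for leaves of greening maize seedlings (exposed to light for 1, 6, or 12 h), obtained by co-immunoprecipitation coupled with high-throughput protein mass spectrometry. Through integrated analysis of multiple ubiquitylomes, we identified and quantified ubiquitylation at 1926 sites on 1041 proteins. By analyzing the amino acid sequences around these ubiquitylation sites, we found five potential ubiquitylation motifs: KA, AXK, KXG, AK, and TK. Time-course analysis showed that the ubiquitylation levels of 214 amino acid sites on 173 proteins agreed closely across two biological replicate experiments, and that the ubiquitylation levels of 78 of these sites changed significantly after the maize seedlings had been exposed to light for 12 h (using a quantified fold change greater than 1.5 as the criterion). The ubiquitylated proteins identified in this study are mainly involved in protein and DNA metabolism, such as ribosomal proteins and histones. We also found multiple ubiquitylation sites on proteins involved in the major physiological changes of plant greening, for example proteins of plant hormone synthesis and signal transduction pathways, key C4 photosynthetic enzymes, and key proteins of light signal transduction pathways. This proteomic study of ubiquitylation in maize seedlings provides a basis for further elucidating the role of protein ubiquitylation during the greening of C4 plant seedlings.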
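
The two filtering steps named in this abstract, motif matching around a modified lysine and the fold-change cutoff of 1.5, can be illustrated with a minimal sketch. This is not the authors' pipeline. The motif offsets assume that the K in each motif is the ubiquitylated lysine and that X stands for any residue. Treating decreases symmetrically with increases (ratio below 1/1.5) is also an assumption, and the function names and input layout are hypothetical.

```python
# Assumed reading of each motif: (offset from the modified K, required residue).
MOTIFS = {
    "KA": (1, "A"),
    "AXK": (-2, "A"),
    "KXG": (2, "G"),
    "AK": (-1, "A"),
    "TK": (-1, "T"),
}


def motifs_at(sequence, k_index):
    """Return the names of the motifs matched by the lysine at k_index."""
    if sequence[k_index] != "K":
        raise ValueError("k_index must point at a lysine (K)")
    hits = []
    for name, (offset, residue) in MOTIFS.items():
        pos = k_index + offset
        if 0 <= pos < len(sequence) and sequence[pos] == residue:
            hits.append(name)
    return hits


def changed_sites(level_start, level_end, threshold=1.5):
    """Return {site: ratio} for sites whose level changed by more than threshold.

    Both inputs map a site identifier to a positive quantified level.
    Decreases are kept when the ratio falls below 1 / threshold (an assumption).
    """
    changed = {}
    for site, start in level_start.items():
        end = level_end.get(site)
        if end is None or start <= 0 or end <= 0:
            continue  # site not quantified at both time points
        ratio = end / start
        if ratio > threshold or ratio < 1.0 / threshold:
            changed[site] = ratio
    return changed
```

For example, `motifs_at("GATKAG", 3)` checks the residues at positions -2, -1, +1, and +2 relative to the lysine, and `changed_sites` can be applied to levels quantified before and after 12 h of light exposure.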

Page 603-622


Schizophrenia-associated MicroRNA–Gene Interactions in the Dorsolateral Prefrontal Cortex

Danielle M. Santarelli, Adam P. Carroll, Heath M. Cairns, Paul A. Tooney, Murray J. Cairns

Schizophrenia-associated anomalies in gene expression in postmortem brain can be attributed to a combination of genetic and environmental influences. Given the small effect size of common variants, it is likely that we may only see the combined impact of some of these at the pathway level in small postmortem studies. At the gene level, however, there may be more impact from common environmental exposures mediated by influential epigenomic modifiers, such as microRNA (miRNA). We hypothesise that dysregulation of miRNAs and their alteration of gene expression have significant implications in the pathophysiology of schizophrenia. In this study, we integrate changes in cortical gene and miRNA expression to identify regulatory interactions and networks associated with the disorder. Gene expression analysis in postmortem dorsolateral prefrontal cortex (BA 46) (n = 74 matched pairs of schizophrenia, schizoaffective, and control samples) was integrated with miRNA expression in the same cohort to identify gene–miRNA regulatory networks. A significant gene–miRNA interaction network was identified, including miR-92a, miR-495, and miR-134, which converged with differentially expressed genes in pathways involved in neurodevelopment and oligodendrocyte function. The capacity for miRNA to directly regulate gene expression through respective binding sites in BCL11A, PLP1, and SYT11 was also confirmed to support the biological relevance of this integrated network model. The observations in this study support the hypothesis that miRNA dysregulation is an important factor in the complex pathophysiology of schizophrenia.

Page 623-634


A Computational Approach for Modeling the Allele Frequency Spectrum of Populations with Arbitrarily Varying Size

Hua Chen

The allele frequency spectrum (AFS), or site frequency spectrum, is commonly used to summarize the genomic polymorphism pattern of a sample, which is informative for inferring population history and detecting natural selection. In 2013, Chen and Chen developed a method for analytically deriving the AFS for populations with temporally varying size through the coalescence time-scaling function. However, their approach is only applicable to population history scenarios in which the analytical form of the time-scaling function is tractable. In this paper, we propose a computational approach that extends the method to populations with arbitrarily complex varying size by numerically approximating the time-scaling function. We demonstrate the performance of the approach by constructing the AFS for two population history scenarios: the logistic growth model and the Gompertz growth model, for which the AFS are unavailable with existing approaches. Software for implementing the algorithm can be downloaded at
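
To make the two central objects of this abstract concrete, the sketch below counts an observed AFS from per-site derived-allele counts and approximates a coalescent time-scaling function by the trapezoid rule. It is an illustration only, not the authors' software or algorithm. It assumes the common definition Λ(t) = ∫₀ᵗ ds / ν(s), where ν(s) = N(s)/N(0) is the population size s time units in the past relative to the present. The logistic history and all of its parameter values are hypothetical.

```python
import math


def observed_afs(derived_counts, n):
    """Count sites by derived-allele count i = 1..n-1 in a sample of n chromosomes."""
    afs = [0] * (n - 1)
    for count in derived_counts:
        if 0 < count < n:  # monomorphic sites are not part of the spectrum
            afs[count - 1] += 1
    return afs


def time_scaling(relative_size, t, steps=1000):
    """Trapezoid-rule approximation of Lambda(t) = integral_0^t ds / nu(s).

    relative_size(s) must return nu(s) = N(s) / N(0) > 0, with s measured
    backward in time from the present.
    """
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (1.0 / relative_size(0.0) + 1.0 / relative_size(t))
    for k in range(1, steps):
        total += 1.0 / relative_size(k * h)
    return total * h


def logistic_relative_size(s, rate=4.0, midpoint=1.0, floor=0.1):
    """Hypothetical logistic history: size declines backward in time toward floor."""

    def raw(x):
        return floor + (1.0 - floor) / (1.0 + math.exp(rate * (x - midpoint)))

    return raw(s) / raw(0.0)  # normalized so that nu(0) = 1
```

A quick sanity check is that a constant-size population gives Λ(t) = t, while any history that was smaller in the past, such as `time_scaling(logistic_relative_size, 2.0)`, gives Λ(t) > t because coalescence runs faster in smaller populations.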

Page 635-644


SPOT-Disorder2: Improved Protein Intrinsic Disorder Prediction by Ensembled Deep Learning

Jack Hanson, Kuldip K. Paliwal, Thomas Litfin, Yaoqi Zhou

Intrinsically disordered or unstructured proteins (or regions in proteins) have been found to be important in a wide range of biological functions and implicated in many diseases. Due to the high cost and low efficiency of experimental determination of intrinsic disorder and the exponential increase of unannotated protein sequences, developing complementary computational prediction methods has been an active area of research for several decades. Here, we employed an ensemble of deep Squeeze-and-Excitation residual inception and long short-term memory (LSTM) networks for predicting protein intrinsic disorder with input from evolutionary information and predicted one-dimensional structural properties. The method, called SPOT-Disorder2, offers substantial and consistent improvement not only over our previous technique based on LSTM networks alone, but also over other state-of-the-art techniques in three independent tests with different ratios of disordered to ordered amino acid residues, and for sequences with either rich or limited evolutionary information. More importantly, semi-disordered regions predicted in SPOT-Disorder2 are more accurate in identifying molecular recognition features (MoRFs) than methods directly designed for MoRFs prediction. SPOT-Disorder2 is available as a web server and as a standalone program at
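It is now known that a considerable number of proteins have no intrinsic structure. These proteins, termed intrinsically disordered, perform many important functions and play key roles in many diseases. However, there is no simple, inexpensive, and fast experimental method for determining which proteins, or which regions of a given protein, are intrinsically disordered. With the exponential growth in the number of unannotated proteins, computational prediction has become a necessary complement. Here, we employ an ensemble of deep Squeeze-and-Excitation and long short-term memory (LSTM) neural networks that uses evolutionary information and predicted one-dimensional structural properties to predict the structured and intrinsically disordered regions of proteins. The method, called SPOT-Disorder2, not only greatly improves on our previous LSTM-based SPOT-Disorder method but also outperforms other current best methods. More importantly, it can also be used directly to predict the functional regions within intrinsically disordered regions that interact with other molecules (MoRFs), and does so more accurately than other methods designed specifically for predicting such regions. SPOT-Disorder2 is available as a web server and as a standalone program at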

Page 645-656