Volume: 17, Issue: 4

Preface

Big Data and the Brain: Peeking at the Future

Hongzhu Qu, Hongxing Lei, Xiangdong Fang

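The human brain is the most complex organ of the human body. It comprises at least 100 billion neurons of various types and an intricate neural network formed by 10^15 connections, and it controls all the body's activities, from bodily functions to thought and feeling. Several brain science projects have been launched worldwide, including the European Union's Human Brain Project, the United States' Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, Japan's Brain Mapping by Integrated Neurotechnologies for Disease Studies (Brain/MINDS) project, the Human Connectome Project (HCP), and the China Brain Project (CBP). These projects investigate how the brain works in health and disease and generate large volumes of multimodal data, which shows that brain science has entered the era of big data. In the omics field, the development of single-cell sequencing has further advanced brain research by revealing fine-grained subpopulations of brain cells and identifying cellular and molecular markers associated with brain diseases. The integrative analysis of imaging data, three-dimensional structural data, and other data types with genomic data remains a major challenge for the field; it involves standardization within each data type, integration across data types, and methods for storing and analyzing the data efficiently. We believe that integrating and mining big data will lead to remarkable discoveries about normal brain activity, brain diseases, and human self-awareness. This special issue briefly introduces progress in brain science, with a focus on genomics and neuroimaging.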

Page 333-336


Perspective

Challenges of Processing and Analyzing Big Data in Mesoscopic Whole-brain Imaging

Anan Li, Yue Guan, Hui Gong, Qingming Luo

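With the emergence of high-throughput whole-brain imaging instruments with cellular resolution and the rapid accumulation of experimental data, neuroscientists now have the opportunity to study the brain comprehensively at the mesoscopic level, and such studies are becoming a research focus. Mesoscopic whole-brain imaging, however, also brings a big data problem: processing and analyzing these data effectively and quickly has rapidly become one of the most critical bottlenecks. In this article, we first introduce the current status and trends of mesoscopic neuroscience research, and then analyze the big data problems in mesoscopic whole-brain imaging and the basic research topics involved. We then describe the challenges facing big data research in brain-spatial informatics from several aspects, including information extraction and data integration, and consider possible directions and approaches for meeting these challenges.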

Page 337-343


Review

Deciphering Brain Complexity Using Single-cell Sequencing

Quanhua Mu, Yiyun Chen, Jiguang Wang

The human brain contains billions of highly differentiated and interconnected cells that form intricate neural networks and collectively control physical activities and high-level cognitive functions, such as memory, decision-making, and social behavior. Big data is required to decipher the complexity of cell types, as well as the connectivity and functions of the brain. The newly developed single-cell sequencing technology, which provides a comprehensive landscape of brain cell type diversity by profiling the transcriptome, genome, and/or epigenome of individual cells, has contributed substantially to revealing the complexity and dynamics of the brain and providing new insights into brain development and brain-related disorders. In this review, we first introduce progress in both the experimental and computational methods of single-cell sequencing technology. Applications of single-cell sequencing-based technologies in brain research, including cell type classification, brain development, and brain disease mechanisms, are then illustrated with representative studies. Lastly, we provide our perspectives on the challenges and future developments in the field of single-cell sequencing. In summary, this mini review aims to provide an overview of how big data generated from single-cell sequencing have empowered advances in neuroscience and shed light on the complex problems in understanding brain functions and diseases.
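The human brain contains billions of highly differentiated cells that connect with one another into complex neural networks and jointly control physical activities and higher cognitive functions such as memory, decision-making, and social behavior. Big data offers an unprecedented opportunity to decipher the complexity of cell types and the connectivity and functions of the brain. Single-cell sequencing technology, developed in recent years, profiles the transcriptome, genome, and epigenome of individual cells. It provides a comprehensive overview of brain cell type diversity, has contributed greatly to revealing the complexity and dynamics of the brain, and offers new understanding of the mechanisms of brain development and brain-related diseases. In this review, we first introduce progress in the experimental and computational methods of single-cell sequencing. We then use representative studies to illustrate applications of single-cell sequencing-based technologies in brain research, including cell type classification, brain development, and brain disease mechanisms. Finally, we offer our views on the challenges and future development of the single-cell sequencing field. In summary, this mini review aims to outline how the big data generated by single-cell sequencing advance neuroscience and to shed light on the complex problems in understanding brain function and disease.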

Page 344-366


Review

Application of Computational Biology to Decode Brain Transcriptomes

Jie Li, Guang-Zhong Wang

The rapid development of high-throughput sequencing technologies has generated massive valuable brain transcriptome atlases, providing great opportunities for systematically investigating gene expression characteristics across various brain regions throughout a series of developmental stages. Recent studies have revealed that the transcriptional architecture is the key to interpreting the molecular mechanisms of brain complexity. However, our knowledge of brain transcriptional characteristics remains very limited. With the immense efforts to generate high-quality brain transcriptome atlases, new computational approaches to analyze these high-dimensional multivariate data are greatly needed. In this review, we summarize some public resources for brain transcriptome atlases and discuss the general computational pipelines that are commonly used in this field, which would aid in making new discoveries in brain development and disorders.
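The brain is the most mysterious human organ and has always fascinated researchers. Over the past 10 years, the rapid development of high-throughput sequencing technologies has produced a large number of valuable brain transcriptome atlases, which offer great opportunities for systematically studying gene expression characteristics across brain regions and developmental stages. Recent studies indicate that gene transcription is the key to explaining the molecular mechanisms of brain complexity. However, our understanding of the overall rules of transcriptional regulation in the brain remains very limited. At the same time, the generation of massive brain transcriptome atlas data places higher demands on data storage and analysis methods. This article reviews the existing public brain transcriptome datasets and outlines the computational analysis methods commonly used for brain transcriptomes. The efficient combination of neuroscience big data and bioinformatic computational methods will help us make more new discoveries in brain development and neurological diseases.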

Page 367-380


Review

How Big Data and High-performance Computing Drive Brain Science

Shanyu Chen, Zhipeng He, Xinyin Han, Xiaoyu He, Ruilin Li, Haidong Zhu, Dan Zhao, Chuangchuang Dai, Yu Zhang, Zhonghua Lu, Xuebin Chi, Beifang Niu

Brain science accelerates the study of intelligence and behavior, contributes fundamental insights into human cognition, and offers prospective treatments for brain disease. Faced with the challenges posed by imaging technologies and deep learning computational models, big data and high-performance computing (HPC) play essential roles in studying brain function, brain diseases, and large-scale brain models or connectomes. We review the driving forces behind big data and HPC methods applied to brain science, including deep learning, powerful data analysis capabilities, and computational performance solutions, each of which can be used to improve diagnostic accuracy and research output. This work reinforces predictions that big data and HPC will continue to improve brain science by making ultrahigh-performance analysis possible, by improving data standardization and sharing, and by providing new neuromorphic insights.
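Brain science accelerates the study of human intelligence and behavior. It lays the foundation for understanding human cognition and continues to offer prospective treatments for brain diseases. Faced with the challenges that arise from applying imaging technologies and deep learning models to brain research, big data and high-performance computing play a major role in studying the functional mechanisms of the brain, brain diseases, and large-scale brain models or connectomes. This article reviews how big data and high-performance methods, including deep learning, big data analysis, and computational performance solutions, drive brain science research, particularly in improving diagnostic accuracy and advancing research output. In the future, to meet the challenges of ever-growing brain data and simulation workloads, these methods will continue to deepen their impact by moving toward ultrahigh-performance analysis, better data standardization and sharing, and new neuromorphic computers.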

Page 381-392


Review

Functional Neuroimaging in the New Era of Big Data

Xiang Li, Ning Guo, Quanzheng Li

The field of functional neuroimaging has substantially advanced as a big data science in the past decade, thanks to international collaborative projects and community efforts. Here we conducted a literature review on functional neuroimaging, with a focus on three general challenges in big data tasks: data collection and sharing, data infrastructure construction, and data analysis methods. The review covers a wide range of literature types including perspectives, database descriptions, methodology developments, and technical details. We show how each of the challenges was proposed and addressed, and how these solutions formed the three core foundations for functional neuroimaging as a big data science and helped to build the current data-rich and data-driven community. Furthermore, based on our review of recent literature on the upcoming challenges and opportunities toward future scientific discoveries, we envision that the functional neuroimaging community needs to advance from the current foundations to a better data integration infrastructure, methodology development toward improved learning capability, and a multi-discipline translational research framework for this new era of big data.
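In this review, we look back at the increasingly broad application and development of data science in neuroimaging over the past decade or so, especially in the analysis of functional neuroimaging, and summarize the three key tasks of neuroimaging big data research addressed in the literature: data collection, data analysis, and data informatics. Based on these three tasks, we compile the currently public functional neuroimaging databases, computational analysis algorithms, and database platforms commonly used in imaging. We further propose future directions for functional neuroimaging as a data science: the deep integration of functional neuroimaging with translational medicine, the integration and systematic organization of functional neuroimaging data, and the development of corresponding knowledge discovery strategies and learning systems.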

Page 393-401


Review

Brain Banks Spur New Frontiers in Neuropsychiatric Research and Strategies for Analysis and Validation

Le Wang, Yan Xia, Yu Chen, Rujia Dai, Wenying Qiu, Qingtuan Meng, Liz Kuney, Chao Chen

Neuropsychiatric disorders affect hundreds of millions of patients and families worldwide. To decode the molecular framework of these diseases, many studies use human postmortem brain samples. These studies reveal brain-specific genetic and epigenetic patterns via high-throughput sequencing technologies. Identifying best practices for the collection of postmortem brain samples, analyzing such large amounts of sequencing data, and interpreting these results are critical to advance neuropsychiatry. We provide an overview of human brain banks worldwide, including progress in China, highlighting some well-known projects using human postmortem brain samples to understand molecular regulation in both normal brains and those with neuropsychiatric disorders. Finally, we discuss future research strategies, as well as state-of-the-art statistical and experimental methods that are drawn upon brain bank resources to improve our understanding of the agents of neuropsychiatric disorders.
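Research on brain diseases is an important part of the China Brain Project, and neuropsychiatric disorders are an important class of brain diseases. One in four people worldwide will be affected by a mental or neurological disorder at some point in their lives. However, our understanding of neuropsychiatric disorders has seen no major breakthrough, and developing new, effective treatments remains very difficult.

The heritability of neuropsychiatric disorders offers an opportunity to understand them. Genome-wide association studies of multiple neuropsychiatric disorders have identified hundreds of disease-associated genetic loci, most of which lie in non-coding regions. Interpreting these genetic signals and analyzing their function is a current focus of neuropsychiatric research.

Genetic signals can act by regulating gene expression or changes at other omics levels. Studying neuropsychiatric disorders therefore requires studying the multi-omics regulatory effects of genetic signals in human brain tissue. Brain banks are the main providers of brain tissue and play the most fundamental role in this research.

The review by Le Wang et al. begins with brain banks. It introduces the main brain banks established in China and abroad and summarizes the considerations for obtaining high-quality postmortem brain samples, as well as how these factors affect subsequent molecular experiments such as high-throughput sequencing. The authors further review a series of research projects on brain development and brain disease in which teams around the world have used brain bank samples to study different developmental stages, brain regions, and disease states, together with the important results obtained.

Many challenges remain in making full use of brain sample resources and integrating multi-omics data to explore disease pathogenesis. Covering sample and data quality control, multi-omics integration, and experimental validation, the authors summarize in depth the main genetic analysis methods and functional experimental approaches currently used in multi-omics studies of neuropsychiatric disorders, and propose research strategies that can serve as a reference for the field.

Finally, the authors discuss the future prospects and challenges of using brain samples to study neuropsychiatric disorders. The top priorities include larger-scale collaboration and sharing across multiple brain bank centers, single-cell analysis of brain tissue, the use of fetal brains and brain organoids to reveal mechanisms of brain development, and the construction of a genetic regulatory map of the human brain specific to the Chinese population. Neuropsychiatric disorders are widespread in the population; relieving the suffering of patients, the burden on families, and the losses to society will require sustained multidisciplinary, multi-center collaboration and effort.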

Page 402-414


Review

Translational Informatics for Parkinson’s Disease: from Big Biomedical Data to Small Actionable Alterations

Bairong Shen, Yuxin Lin, Cheng Bi, Shengrong Zhou, Zhongchen Bai, Guangmin Zheng, Jing Zhou

Parkinson's disease (PD) is a common neurological disease in elderly people, and its morbidity and mortality are increasing with the advent of global ageing. The traditional paradigm of moving from small data to big data in biomedical research is shifting toward big data-based identification of small actionable alterations. To highlight the use of big data for precision PD medicine, we review PD big data and informatics for the translation of basic PD research to clinical applications. We emphasize some key findings in clinically actionable changes, such as susceptibility genetic variations for PD risk population screening, biomarkers for the diagnosis and stratification of PD patients, risk factors for PD, and lifestyles for the prevention of PD. The challenges associated with the collection, storage, and modelling of diverse big data for PD precision medicine and healthcare are also summarized. Future perspectives on systems modelling and intelligent medicine for PD monitoring, diagnosis, treatment, and healthcare are discussed in the end.
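Parkinson's disease is a common neurological disease in elderly people, and its morbidity and mortality are increasing with global ageing. Building on a review of translational research in Parkinson's disease, this article proposes, from an informatics perspective, a new scientific paradigm for such research, together with four research goals and eight challenges under that paradigm. The traditional paradigm of translational research starts from the small data of biomedical experiments and then validates the findings in large populations. This paradigm fails because of disease heterogeneity: models built on small data tend to overfit, lack generalization ability, and cannot be extended to big data applications. The new paradigm uses big data to identify key, actionable small data. We emphasize finding actionable key small data within big data and using them for disease management and prevention, such as: 1) genetic susceptibility to Parkinson's disease, for screening at-risk populations; 2) biomarker discovery, for the diagnosis and grading of Parkinson's disease; 3) non-genetic risk factors such as environmental factors, for reducing disease risk; and 4) healthy lifestyles for preventing Parkinson's disease. Finally, we discuss eight informatics challenges related to precision medicine and health management for Parkinson's disease, which point out directions for translational informatics research on the disease.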

Page 415-429


Original Research

Machine Learning to Detect Alzheimer’s Disease from Circulating Non-coding RNAs

Nicole Ludwig, Tobias Fehlmann, Fabian Kern, Manfred Gogol, Walter Maetzler, Stephanie Deutscher, Simone Gurlit, Claudia Schulte, Anna-Katharina von Thaler, Christian Deuschle, Florian Metzger, Daniela Berg, Ulrike Suenkel, Verena Keller, Christina Backes, Hans-Peter Lenhof, Eckart Meese, Andreas Keller

Blood-borne small non-coding RNAs (sncRNAs) are among the prominent candidates for blood-based diagnostic tests. Often, high-throughput approaches are applied to discover biomarker signatures. These have to be validated in larger cohorts and evaluated by adequate statistical learning approaches. Previously, we published high-throughput sequencing-based microRNA (miRNA) signatures in Alzheimer’s disease (AD) patients in the United States (US) and Germany. Here, we determined abundance levels of 21 known circulating miRNAs in 465 individuals encompassing AD patients and controls by RT-qPCR. We computed models to assess the relation between miRNA expression and phenotypes, gender, age, or disease severity (Mini-Mental State Examination; MMSE). Of the 21 miRNAs, expression levels of 20 miRNAs were consistently de-regulated in the US and German cohorts. Eighteen miRNAs were significantly correlated with neurodegeneration (Benjamini-Hochberg adjusted P < 0.05), with the highest significance for miR-532-5p (Benjamini-Hochberg adjusted P = 4.8 × 10^-30). Machine learning models reached an area under the curve (AUC) value of 87.6% in differentiating AD patients from controls. Further, ten miRNAs were significantly correlated with MMSE, in particular miR-26a/26b-5p (adjusted P = 0.0002). Interestingly, the miRNAs with lower abundance in AD were enriched in monocytes and T-helper cells, while those up-regulated in AD were enriched in serum, exosomes, cytotoxic T-cells, and B-cells. Our study represents the next important step in translational research for a miRNA-based AD test.
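Blood-borne small non-coding RNAs (sncRNAs) are leading candidate markers for blood-based tests. Disease biomarkers are usually discovered with high-throughput sequencing methods, and they must then be validated in large cohorts and evaluated with statistical learning. The team previously published a study of sequencing-based miRNA expression profiles in Alzheimer's disease (AD) patients in the United States and Germany. In the present study, the team used RT-qPCR to measure the abundance of 21 miRNAs in samples from a total of 465 AD patients and controls. Statistical models were used to assess the relation between miRNA expression and phenotypes (gender, age, and disease severity). In the US and German patient and control samples, 20 of the 21 miRNAs were de-regulated in patients. Eighteen miRNAs were significantly correlated with neurodegeneration, with the highest significance for miR-532-5p. Machine learning models that distinguish AD patients from controls reached an AUC (a measure of classification performance) of 87.6%. In addition, 10 miRNAs were significantly correlated with disease severity. Interestingly, the miRNAs down-regulated in AD patients were enriched in monocytes and T-helper cells, whereas the up-regulated miRNAs were enriched in serum, exosomes, cytotoxic T-cells, and B-cells. This study provides further reference and support for a miRNA-based AD test.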
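The abstract above reports Benjamini-Hochberg adjusted P values. As an illustration only, the following is a minimal sketch of that adjustment in plain Python; the function name `benjamini_hochberg` and the example P values are assumptions made for this sketch and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest P.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values, one per tested marker (not study data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted_p = benjamini_hochberg(example_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p)
print(significant)
```

In this made-up example, the third and fourth raw P values fall below 0.05 but their adjusted values do not, which is the kind of correction the adjusted threshold in the abstract refers to.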

Page 430-440


Original Research

Identification of Cognitive Dysfunction in Patients with T2DM Using Whole Brain Functional Connectivity

Zhenyu Liu, Jiangang Liu, Huijuan Yuan, Taiyuan Liu, Xingwei Cui, Zhenchao Tang, Yang Du, Meiyun Wang, Yusong Lin, Jie Tian

The majority of type 2 diabetes mellitus (T2DM) patients are highly susceptible to several forms of cognitive impairment, particularly dementia. However, the underlying neural mechanism of these cognitive impairments remains unclear. We aimed to investigate the correlation between whole brain resting state functional connections (RSFCs) and the cognitive status in 95 patients with T2DM. We constructed an elastic net model to estimate the Montreal Cognitive Assessment (MoCA) scores, which served as an index of the cognitive status of the patients, and to select the RSFCs for further prediction. Subsequently, we utilized a machine learning technique to evaluate the discriminative ability of the connectivity pattern associated with the selected RSFCs. The estimated and chronological MoCA scores were significantly correlated, with R = 0.81 and a mean absolute error (MAE) of 1.20. Additionally, cognitive impairments of patients with T2DM can be identified using the RSFC pattern with a classification accuracy of 90.54% and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.9737. This connectivity pattern included not only the connections between regions within the default mode network (DMN), but also the functional connectivity between the task-positive networks and the DMN, as well as those within the task-positive networks. The results suggest that an RSFC pattern could be regarded as a potential biomarker to identify the cognitive status of patients with T2DM.
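Type 2 diabetes mellitus can cause cognitive impairment and even dementia, but the underlying neural mechanism remains unclear. We used whole-brain functional connectivity analysis and machine learning to explore differences in brain functional connectivity between patients with type 2 diabetes and normal controls, and attempted to use classification based on whole-brain functional connectivity to diagnose patients with cognitive dysfunction. Using elastic net regression, we built a model relating brain functional connectivity to cognitive function scores, selected the key functional connections that were significantly correlated with cognitive impairment, and used these key features to predict patients' cognitive function. The model accurately estimated patients' cognitive function scores, and the selected key features supported an accurate prediction model for cognitive dysfunction. We also found that the selected key features included not only connections between regions within the default mode network, but also connections between the default mode network and the task-positive networks, as well as connections within the task-positive networks. This work shows that whole-brain functional connectivity may serve as a potential biomarker for revealing cognitive dysfunction in patients with type 2 diabetes.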
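The abstract above summarizes model quality with a correlation coefficient R and a mean absolute error (MAE) between estimated and measured scores. The sketch below shows how these two quantities can be computed in plain Python; the function names and the score values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x, y):
    """Average absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical measured and model-estimated scores (not study data).
measured = [20, 22, 25, 27, 30]
estimated = [21, 21, 26, 26, 29]

r = pearson_r(measured, estimated)
mae = mean_absolute_error(measured, estimated)
print(f"R = {r:.2f}, MAE = {mae:.2f}")
```

R describes how well the estimates track the measured scores, while MAE reports the typical size of the error in score units, so the two are usually read together.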

Page 441-452


Database

PsyMuKB: An Integrative De Novo Variant Knowledge Base for Developmental Disorders

Guan Ning Lin, Sijia Guo, Xian Tan, Weidi Wang, Wei Qian, Weichen Song, Jingru Wang, Shunying Yu, Zhen Wang, Donghong Cui, Han Wang

De novo variants (DNVs) are one of the most significant contributors to severe early-onset genetic disorders such as autism spectrum disorder, intellectual disability, and other developmental and neuropsychiatric (DNP) disorders. Presently, a plethora of DNVs have been identified using next-generation sequencing, and many efforts have been made to understand their impact at the gene level. However, there has been little exploration of the effects at the isoform level. The brain contains a high level of alternative splicing and regulation, and exhibits a more divergent splicing program than other tissues. Therefore, it is crucial to explore variants at the transcriptional regulation level to better interpret the mechanisms underlying DNP disorders. To facilitate a better usage and improve the isoform-level interpretation of variants, we developed NeuroPsychiatric Mutation Knowledge Base (PsyMuKB). It contains a comprehensive, carefully curated list of DNVs with transcriptional and translational annotations to enable identification of isoform-specific mutations. PsyMuKB allows a flexible search of genes or variants and provides both table-based descriptions and associated visualizations, such as expression, transcript genomic structures, protein interactions, and the mutation sites mapped on the protein structures. It also provides an easy-to-use web interface, allowing users to rapidly visualize the locations and characteristics of mutations and the expression patterns of the impacted genes and isoforms. PsyMuKB thus constitutes a valuable resource for identifying tissue-specific DNVs for further functional studies of related disorders. PsyMuKB is freely accessible at http://psymukb.net.
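De novo variants (DNVs) are one of the important causes of developmental neuropsychiatric disorders such as autism and intellectual disability. High-throughput sequencing has identified many DNVs, and many studies have reported their effects at the gene level, but their effects at the transcript level have been little studied. Brain tissue is known to have frequent alternative splicing and regulation of transcripts, and its splicing patterns differ more than those of other tissues. Exploring variants at the level of transcriptional regulation should therefore better explain the causes of neuropsychiatric disorders. To facilitate the use of variants and improve understanding of their transcript-level effects, we built PsyMuKB (NeuroPsychiatric Mutation Knowledge Base), a collection of DNVs with multi-faceted, multi-level transcriptional and translational annotations that supports the identification of DNVs affecting tissue-specific transcripts. PsyMuKB allows flexible searching of genes or variants and provides the corresponding annotations along with visualizations of related content, such as gene or protein expression, transcript structures, protein interactions, and the mapping of mutation sites onto protein structures. PsyMuKB has a simple web interface that lets users view the locations and characteristics of mutations and the expression patterns of the affected genes and transcripts. In summary, PsyMuKB is a valuable resource for both research and clinical use: it can be used to identify tissue-specific DNVs and supports further study of neuropsychiatric disorders. PsyMuKB is freely accessible at http://psymukb.net.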

Page 453-464


Database

GliomaDB: A Web Server for Integrating Glioma Omics Data and Interactive Analysis

Yadong Yang, Yang Sui, Bingbing Xie, Hongzhu Qu, Xiangdong Fang

Gliomas are one of the most common types of brain cancers. Numerous efforts have been devoted to studying the mechanisms of glioma genesis and identifying biomarkers for diagnosis and treatment. To help further investigations, we present a comprehensive database named GliomaDB. GliomaDB includes 21,086 samples from 4303 patients and integrates genomic, transcriptomic, epigenomic, clinical, and gene-drug association data regarding glioblastoma multiforme (GBM) and low-grade glioma (LGG) from The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), the Chinese Glioma Genome Atlas (CGGA), the Memorial Sloan Kettering Cancer Center Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT), the US Food and Drug Administration (FDA), and PharmGKB. GliomaDB offers a user-friendly interface for two main types of functionalities. The first comprises queries of (i) somatic mutations, (ii) gene expression, (iii) microRNA (miRNA) expression, and (iv) DNA methylation. In addition, queries can be executed at the gene, region, and base level. Second, GliomaDB allows users to perform survival analysis, coexpression network visualization, multi-omics data visualization, and targeted drug recommendations based on personalized variations. GliomaDB bridges the gap between glioma genomics big data and the delivery of integrated information for end users, thus enabling both researchers and clinicians to effectively use publicly available data and empowering the progression of precision medicine in glioma. GliomaDB is freely accessible at http://bigd.big.ac.cn/gliomaDB.
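Glioma is a common tumor of the central nervous system that is clinically refractory, highly recurrent, and highly lethal. Scientists have made many efforts to study the mechanisms of glioma development and progression and to find biomarkers for diagnosis and treatment. To further assist researchers and clinicians, this article establishes a comprehensive glioma database named GliomaDB. GliomaDB currently integrates glioma omics information and drug information, with data from The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO) of the US National Center for Biotechnology Information (NCBI), the Chinese Glioma Genome Atlas (CGGA), the Memorial Sloan Kettering Cancer Center Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT), the US Food and Drug Administration (FDA), and the pharmacogenetics and pharmacogenomics database PharmGKB. GliomaDB provides a user-friendly interface with two main functions, query and analysis. In the query interface, users can look up multi-omics information on glioma, including somatic mutations, gene expression, microRNA expression, and DNA methylation. In the analysis interface, users can perform personalized gene expression-based survival analysis, gene coexpression network analysis, clustering analysis, and targeted drug guidance. While integrating glioma multi-omics data and drug information, GliomaDB offers researchers and clinicians free, open data and personalized analysis, and thereby provides strong support for the early diagnosis, precise staging, and prognostic evaluation of glioma. GliomaDB is freely accessible at https://bigd.big.ac.cn/gliomaDB.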

Page 465-471