Volume: 23, Issue: 1

Review Article

Challenges in AI-driven Biomedical Multimodal Data Fusion and Analysis

Junwei Liu (刘俊伟), Xiaoping Cen (岑萧萍), Chenxin Yi (伊晨昕), Feng-ao Wang (王烽傲), Junxiang Ding (丁俊翔), Jinyu Cheng (程瑾瑜), Qinhua Wu (吴沁桦), Baowen Gai (盖宝文), Yiwen Zhou (周奕雯), Ruikun He (贺瑞坤), Feng Gao (高峰), Yixue Li (李亦学)

The rapid development of biological and medical examination methods has vastly expanded personal biomedical information, including molecular, cellular, image, and electronic health record datasets. Integrating this wealth of information enables precise disease diagnosis, biomarker identification, and treatment design in clinical settings. Artificial intelligence (AI) techniques, particularly deep learning models, have been extensively employed in biomedical applications, demonstrating increased precision, efficiency, and generalization. The success of large language and vision models has further extended these biomedical applications. However, challenges remain in learning from these multimodal biomedical datasets, such as data privacy, fusion, and model interpretation. In this review, we provide a comprehensive overview of various biomedical data modalities, multimodal representation learning methods, and the applications of AI in biomedical data integrative analysis. Additionally, we discuss the challenges in applying these deep learning methods and how to better integrate them into biomedical scenarios. We then propose future directions for adapting deep learning methods with model pretraining and knowledge integration to advance biomedical research and its clinical applications.
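With the rapid development of biomedical testing methods, the volume and variety of personal biomedical information have expanded greatly, encompassing genomic, transcriptomic, proteomic, and metabolomic data as well as medical images and electronic health records (EHRs). These multimodal datasets hold great potential in clinical settings for precise disease diagnosis, biomarker identification, and the development of personalized therapies. The success of artificial intelligence (AI) techniques, especially large language models (LLMs) and vision models, has further extended the application of AI in biomedicine. However, many challenges remain in effectively integrating these cross-scale multimodal biomedical datasets and in addressing data privacy, missing modalities, and model interpretability. This review provides a comprehensive overview of biomedical data modalities, multimodal representation learning methods, and applications of AI in integrative biomedical data analysis, discusses in depth the challenges of real-world clinical application, and outlines key directions for further advancing AI in biomedical research and clinical practice.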

Page qzaf011


Review Article

Long and Accurate: How HiFi Sequencing is Transforming Genomics

Bo Wang (王博), Peng Jia (贾鹏), Shenghan Gao (高胜寒), Huanhuan Zhao (赵焕焕), Gaoyang Zheng (郑高洋), Linfeng Xu (许林峰), Kai Ye (叶凯)

Recent developments in PacBio high-fidelity (HiFi) sequencing technologies have transformed genomic research, with circular consensus sequencing now achieving 99.9% accuracy for long (up to 25 kb) single-molecule reads. This method circumvents biases intrinsic to amplification-based approaches, enabling thorough analysis of complex genomic regions [including tandem repeats, segmental duplications, ribosomal DNA (rDNA) arrays, and centromeres] as well as direct detection of base modifications, furnishing both sequence and epigenetic data concurrently. This has streamlined a number of tasks including genome assembly, variant detection, and full-length transcript analysis. This review provides a comprehensive overview of the applications and challenges of HiFi sequencing across various fields, including genomics, transcriptomics, and epigenetics. By delineating the evolving landscape of HiFi sequencing in multi-omics research, we highlight its potential to deepen our understanding of genetic mechanisms and to advance precision medicine.
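In recent years, PacBio high-fidelity (HiFi) sequencing, built on circular consensus sequencing (CCS), has achieved single-molecule reads of up to 25 kb with an accuracy of 99.9%. This breakthrough not only overcomes the limitations of conventional short-read sequencing but also opens new possibilities for in-depth study of complex genomic regions such as variable number tandem repeats (VNTRs), centromeric sequences, and rDNA arrays. In this article, the authors draw on representative frontier studies from China and abroad to comprehensively review the broad applications and notable contributions of HiFi technology in genome assembly, variant detection, epigenetics, full-length transcript analysis, and single-cell sequencing. Finally, the article discusses in depth the challenges that HiFi sequencing currently faces and looks ahead to its future development. As these challenges are progressively resolved, HiFi technology is expected to give strong impetus to precision medicine and multi-omics research.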

Page qzaf003


Review Article

Mass Spectrometry-based Solutions for Single-cell Proteomics

Siqi Li, Shuwei Li, Siqi Liu, Yan Ren

Mass spectrometry-based single-cell proteomics (MS-SCP) is attracting tremendous attention because it is now technically feasible to quantify thousands of proteins in minute samples. Since protein amplification is still not possible, technological improvements in MS-SCP focus on minimizing sample loss while increasing throughput, resolution, and sensitivity, as well as achieving measurement depth, accuracy, and stability comparable to bulk samples. Major advances in MS-SCP have facilitated its application in biological and even medical research. Here, we review the key advancements in MS-SCP technology and discuss the strategies of the typical proteomics workflow to improve MS-SCP analysis from single-cell isolation, sample preparation, and liquid chromatography separation to MS data acquisition and analysis. The review will provide an overall understanding of the development and applications of MS-SCP and inspire more novel ideas regarding the innovation of MS-SCP technology.
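In recent years, mass spectrometry-based single-cell proteomics (MS-SCP) has developed rapidly and become a research hotspot in the life sciences. Thanks to continued advances in analytical technology, researchers can now accurately quantify thousands of proteins in extremely small samples. Because proteins, unlike nucleic acids, cannot be amplified, current MS-SCP research focuses on the following key issues: (1) minimizing sample loss; (2) increasing throughput; (3) enhancing sensitivity and resolution; and (4) ensuring that data depth, accuracy, and stability are comparable to those of conventional proteomics. This review systematically surveys progress in single-cell proteomics, covering optimization strategies across the full workflow from single-cell isolation, sample preparation, and liquid chromatography separation to MS data acquisition and analysis, with the aim of providing a technical reference for researchers in the field and promoting the broad application of MS-SCP in basic research and clinical translation.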

Page qzaf012


Original Research

Evaluative Methodology for HRD Testing: Development of Standard Tools for Consistency Assessment

Zheng Jia, Yaqing Liu, Shoufang Qu, Wenbin Li, Lin Gao, Lin Dong, Yun Xing, Yadi Cheng, Huan Fang, Yuting Yi, Yuxing Chu, Chao Zhang, Yanming Xie, Chunli Wang, Zhe Li, Zhihong Zhang, Zhipeng Xu, Yang Wang, Wenxin Zhang, Xiaoping Gu, Shuang Yang, Jinghua Li, Liangshen Wei, Yuanting Zheng, Guohui Ding, Leming Shi, Xin Yi, Jianming Ying, Jie Huang

Homologous recombination deficiency (HRD) has emerged as a critical prognostic and predictive biomarker in oncology. However, current testing methods, especially those reliant on targeted panels, are plagued by inconsistent results from the same samples. This highlights the urgent need for standardized benchmarks to evaluate HRD assay performance. In phases IIa and IIb of the Chinese HRD Harmonization Project, we developed ten pairs of well-characterized DNA reference materials derived from lung cancer, breast cancer, and melanoma cell lines and their matched normal cell lines, with each pair prepared at seven cancer-to-normal mass ratios. Reference datasets for allele-specific copy number variations (ASCNVs) and HRD scores were established and validated using three sequencing methods and nine analytical pipelines. The genomic instability scores (GISs) of the reference materials ranged from 11 to 96, enabling validation across various thresholds. The ASCNV reference datasets covered a genomic span of 2340 to 2749 Mb, equivalent to 81.2% to 95.4% of the autosomes in the 37d5 reference genome. These benchmarks were subsequently used to assess the accuracy and reproducibility of four HRD panel assays, revealing significant variability in both ASCNV detection and HRD scores. The concordance between panel-detected GISs and reference GISs ranged from 0.81 to 0.94, with only two assays exhibiting high overall agreement with Myriad MyChoice CDx for HRD classification. This study also identified specific challenges in ASCNV detection in HRD-related regions and the profound impact of high ploidy on consistency. The established HRD reference materials and datasets provide a robust toolkit for objective evaluation of HRD testing.

Page qzaf017


Original Research

A Single-cell Atlas of Developing Mouse Palates Reveals Cellular and Molecular Transitions in Periderm Cell Fate

Wenbin Huang (黄文斌), Zhenwei Qian (钱振伟), Jieni Zhang (张杰铌), Yi Ding (丁毅), Bin Wang (王斌), Jiuxiang Lin (林久祥), Xiannian Zhang (张先念), Huaxiang Zhao (赵华翔), Feng Chen (陈峰)

Cleft palate is one of the most common congenital craniofacial disorders and affects children’s appearance and oral functions. Investigating the transcriptomes during palatogenesis is crucial for understanding the etiology of this disorder and facilitating prenatal molecular diagnosis. However, there is limited knowledge about the single-cell differentiation dynamics during mid- and late palatogenesis, specifically regarding the subpopulations and developmental trajectories of the periderm, a rare but critical cell population. Here, we explored the single-cell landscape of developing mouse palates from embryonic day (E) 10.5 to E16.5. We systematically depicted the single-cell transcriptomes of mesenchymal and epithelial cells during palatogenesis, including subpopulations and differentiation dynamics. Additionally, we identified four subclusters of palatal periderm and constructed two distinct trajectories of cell fates for periderm cells. Our findings reveal that claudin-family coding genes and Arhgap29 play a role in the non-stick function of the periderm before the palatal shelves make contact, and that Pitx2 mediates the adhesion of the periderm during the contact of opposing palatal shelves. Furthermore, we demonstrate that epithelial–mesenchymal transition (EMT), apoptosis, and migration collectively contribute to the degeneration of periderm cells in the medial epithelial seam. Taken together, our study suggests a novel model of periderm development during palatogenesis and delineates the cellular and molecular transitions in periderm cell determination.
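Research question: Cleft lip and palate is the most common congenital birth defect of the oral and craniomaxillofacial region. It impairs patients' appearance and seriously affects stomatognathic function and mental health. The molecular etiology of cleft palate, a subtype of cleft lip and palate, has long been a research focus and is an important theoretical basis for prenatal molecular diagnosis. Periderm cells form a single cell layer covering the epithelial surface. Although rare, they perform diverse and critical physiological functions at different stages of palate development. However, the subpopulations of periderm cells and the mechanisms that determine their fate have not been fully elucidated.
Methods: Using single-cell transcriptome sequencing combined with bioinformatic analysis, followed by immunofluorescence, RNAscope in situ hybridization (ISH), and in vitro palatal shelf culture, this study systematically charted a single-cell transcriptomic atlas of mouse palate development. In particular, it identified four new subpopulations of periderm cells and constructed their developmental trajectories, comprising a fusion trajectory and a keratinization trajectory, and thereby resolved the developmental dynamics of periderm cells and their key drivers.
Main results: 1. A comprehensive single-cell transcriptomic atlas of mouse palate development was charted. 2. Four new subpopulations of periderm cells were identified, and their developmental trajectories were constructed, comprising a fusion trajectory and a keratinization trajectory. 3. Claudin-family coding genes and Arhgap29 participate in the anti-adhesive function of periderm cells before the palatal shelves contact and fuse, whereas Pitx2 mediates periderm cell adhesion when the medial edges of the palatal shelves come into contact. 4. Epithelial–mesenchymal transition, apoptosis, and migration act together in the elimination of periderm cells in the midline epithelial seam, and epithelial–mesenchymal transition and apoptosis are mutually exclusive within the same cell. 5. The mode of contact of periderm cells is key to determining their fate.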

Page qzaf013