
Unsupervised Denoising of Single-Cell Multi-Omics Data Based on a Multi-Head Autoencoder Network

Posted: 2024-09-11

Full-text download: 202404020.pdf


Article ID: 1672-6987(2024)04-0146-13    DOI: 10.16351/j.1672-6987.2024.04.020



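LI Shuangyi a,b; LIU Farong a,b; REN Sheng a,b; YU Bin b,* (Qingdao University of Science and Technology, a. College of Mathematics and Physics; b. College of Data Science, Qingdao 266061, Shandong, China)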


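Abstract: Single-cell multi-omics sequencing is widely used in biomedical research and produces large volumes of diverse omics data. Raw single-cell multi-omics data, however, contain many types of sequencing noise and redundant information, which complicates downstream biomedical analysis. Existing denoising methods rely mainly on a single data-distribution assumption and are tailored to one omics modality at a time, which severely limits a model's ability to process different omics data jointly. This study proposes a method for denoising single-cell multi-omics data, called scMAED (single-cell multi-omics data via a multi-head autoencoder network to denoising). The model adds a classification decoder to a multi-head autoencoder network in order to remove as much noise as possible in an unsupervised manner. First, two encoders independently learn the internal features of the multi-omics data, and their low-dimensional outputs are combined for joint decoding. Second, the classification decoder makes no data-distribution assumption; it uses predicted cell-cluster labels to feed back information about the data, so as to remove complex noise as far as possible. Finally, principal component analysis and t-SNE are used for visualization. The model is evaluated on simulated datasets and real mouse datasets; the results show that scMAED outperforms the comparison methods used in the experiments on denoising and can substantially improve the quality of single-cell multi-omics data.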


Keywords: single-cell multi-omics data; deep learning; multi-head autoencoder network; denoising


CLC number: Q 811.4    Document code: A


Citation: LI Shuangyi, LIU Farong, REN Sheng, et al. Unsupervised denoising of single-cell multi-omics data based on multi-head autoencoder network[J]. Journal of Qingdao University of Science and Technology (Natural Science Edition), 2024, 45(4): 146-158.


Unsupervised Denoising of Single-Cell Multi-Omics Data Based on Multi-Head Autoencoder Network


LI Shuangyi a,b; LIU Farong a,b; REN Sheng a,b; YU Bin b

(a. College of Mathematics and Physics; b. College of Data Science, Qingdao University of Science and Technology, Qingdao 266061, China)


Abstract: Singlecell multiomics sequencing is being widely used in biomedical research and generates large amounts of diverse omics data. However, raw singlecell multiomics data contains multiple types of sequencing noise and redundant information, which makes subsequent biomedical analysis difficult. Existing denoising methods mainly rely on a single data distribution assumption and process a single omics data in a targeted manner, which greatly limits the joint processing of different omics data by the model. Therefore, we design and propose an analytical method for denoising using singlecell multiomics data, called scMAED (singlecell multiomics data via a multihead autoencoder network to denoising). The model adds a classification decoder to the multihead autoencoder network to remove the maximum noise from the data in an unsupervised manner. First, two encoders are used to independently learn the internal features of the multiomics data, and jointly decode the output lowdimensional features. Second, the classification decoder does not make any data distribution assumptions, and uses the predicted cell cluster labels to feed back data information to minimize complex noise. Finally, we use principal component analysis and tSNE for visualization. In this paper, we evaluate the performance of the model based on simulated datasets and real mouse datasets. The results show that scMAED is superior to the experimental comparison method in denoising effect, and can greatly improve the quality of singlecell multiomics data.


Key words: singlecell multiomics data; deep learning; multihead autoencoder network; noise reduction


Received: 2023-10-03

Funding: National Natural Science Foundation of China (62172248); Natural Science Foundation of Shandong Province (ZR2021MF098).

About the author: LI Shuangyi (born 1998), male, master's student. *Corresponding author.


Copyright © 2011-2017 Journal of Qingdao University of Science and Technology (Natural Science Edition)