Prediction of Human ncRNA Genes Based on Support Vector Machine

Date: 2017-04-21

Full-text PDF download: 2017020112

YU Bin, CHEN Cheng, LIU Jian, LI Shan, CHEN Ruixin

(College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao, Shandong 266061)

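Abstract: A new method for predicting human ncRNA genes based on a support vector machine is proposed. First, human ncRNA and mRNA sequence data are extracted from the GENCODE and UCSC databases, and 86 features, including the occurrence frequencies of single nucleotides and dinucleotides, are selected as the raw data. Next, the discrete wavelet transform is applied to remove redundant information and noise. Finally, an ncRNA gene prediction model that combines the discrete wavelet transform with a support vector machine (DWT-SVM) is built. Experimental results show that the DWT-SVM model predicts ncRNA in the test set with an accuracy of 93.71%, better than the results of the PCA-SVM and DWT-KNN prediction models.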

Keywords: non-coding RNA; gene prediction; support vector machine; discrete wavelet transform
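The single-nucleotide and dinucleotide frequency features mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it covers only 20 of the 86 features (4 single-nucleotide plus 16 dinucleotide frequencies), and the function names `kmer_frequencies` and `sequence_features`, the A/C/G/U ordering, and the T-to-U conversion are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"  # assumed alphabet and ordering; not specified in the source


def kmer_frequencies(seq, k):
    """Return relative frequencies of all 4**k k-mers, in a fixed order."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as RNA
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are left out of the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def sequence_features(seq):
    """Concatenate 4 single-nucleotide and 16 dinucleotide frequencies."""
    return kmer_frequencies(seq, 1) + kmer_frequencies(seq, 2)


features = sequence_features("AUGGCUAAGC")
print(len(features))
```

Each sequence maps to a fixed-length numeric vector, which is the form a classifier such as a support vector machine expects as input.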

CLC number: Q 811.4  Document code: A

Citation: YU Bin, CHEN Cheng, LIU Jian, et al. Prediction of human non-coding RNA genes based on support vector machine[J]. Journal of Qingdao University of Science and Technology (Natural Science Edition), 2017, 38(2): 112-118.

Prediction of Human Non-coding RNA Genes Based on Support Vector Machine

YU Bin, CHEN Cheng, LIU Jian, LI Shan, CHEN Ruixin

  (College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China)

 Abstract: This paper presents a new method for predicting human ncRNA genes based on a support vector machine. First, ncRNA and mRNA sequence data are extracted from the GENCODE and UCSC databases, and 86 features, such as single-nucleotide and dinucleotide frequencies, are selected as the raw data. Second, the discrete wavelet transform is used to remove redundant information and noise. Finally, an ncRNA gene prediction model (DWT-SVM) that combines the discrete wavelet transform with a support vector machine is built. Experimental results show that the DWT-SVM model achieves a prediction accuracy of 93.71% for ncRNA on the test set, outperforming the PCA-SVM and DWT-KNN prediction models.

 Key words: non-coding RNA; gene prediction; support vector machine; discrete wavelet transform
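The wavelet denoising step can be illustrated with a one-level Haar transform and soft thresholding of the detail coefficients. This is a sketch under stated assumptions: the abstract does not say which wavelet, decomposition level, or threshold rule the authors used, so the Haar wavelet, the single level, the threshold `t`, and the helper names `haar_dwt`, `haar_idwt`, `soft_threshold`, and `denoise` are all hypothetical choices made for this example.

```python
import math


def haar_dwt(x):
    """One-level Haar transform: return (approximation, detail) coefficients."""
    if len(x) % 2:
        x = list(x) + [x[-1]]  # pad odd-length input by repeating the last value
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, rebuilding the (possibly padded) signal."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed threshold rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, t):
    """Threshold the detail coefficients, then reconstruct at the original length."""
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, soft_threshold(detail, t))[:len(x)]


smoothed = denoise([1.0, 3.0, 5.0, 5.0], t=10.0)
print(smoothed)
```

With a threshold of 0 the input is reconstructed unchanged, while a large threshold replaces each pair of values with its mean. In a DWT-SVM style pipeline, the denoised feature vectors would then be passed to a support vector classifier, for example `sklearn.svm.SVC` from scikit-learn, although the abstract does not name the software the authors used.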

 Received: 2015-09-01

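 Funding: National Natural Science Foundation of China (51372125); Natural Science Foundation of Shandong Province (ZR2013AM007, ZR2014FL021); Shandong Province Higher Educational Science and Technology Program (J13LI54); Undergraduate Innovation Training Program of Qingdao University of Science and Technology (201606002).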

 About the author: YU Bin (born 1977), male, associate professor.

Article ID: 1672-6987(2017)02-0112-07; DOI: 10.16351/j.1672-6987.2017.02.018
