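于彬 (YU Bin)^a, 张岩 (ZHANG Yan)^b
(Qingdao University of Science and Technology: a. College of Mathematics and Physics; b. College of Electromechanical Engineering, Qingdao, Shandong 266061)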
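Abstract: Gene expression profile data for colon cancer are analyzed, and a new method for selecting informative genes is proposed. Because gene expression profile data are high-dimensional with few samples, the Bhattacharyya distance is first used to reduce the dimensionality of the data. A genetic algorithm then generates subsets of informative genes, and a support vector machine serves as the classifier, giving a GA-SVM-based two-class classification model for colon cancer. Experimental results show that extracting only 10 informative genes is enough to reach a classification accuracy of 95.62%.
Key words: gene expression profiles; tumor classification; informative genes; genetic algorithm; support vector machine
CLC number: Q 811.4    Document code: A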
Analysis of Colon Cancer Gene Expression Profiles
Based on GA-SVM Method
YU Bin^a, ZHANG Yan^b
(a. College of Mathematics and Physics; b. College of Electromechanical Engineering,
Qingdao University of Science and Technology, Qingdao 266061, China)
Abstract: Colon cancer gene expression profile data were analyzed, and a new method for selecting informative genes was proposed. Because gene expression profile data are high-dimensional with few samples, the Bhattacharyya distance was first used to reduce the dimensionality of the data. Subsets of informative genes were then generated by a genetic algorithm (GA), with a support vector machine (SVM) as the classifier, and a two-class classification model for colon cancer based on GA-SVM was established. The results showed that a classification accuracy of 95.62% could be reached with only 10 informative genes.
Key words: gene expression profiles; tumor classification; informative genes; genetic algorithm; support vector machine
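The dimensionality-reduction step named in the abstract can be illustrated with a minimal sketch. It assumes that each gene's expression values in the two classes are modelled as univariate Gaussians, which is a common way to apply the Bhattacharyya distance to expression data but is not stated in the abstract. The function names, the use of population variance, the small `eps` guard, and the toy data are all hypothetical, and the GA and SVM stages are not shown.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_distance(x, y, eps=1e-12):
    """Bhattacharyya distance between two samples, each treated as a
    univariate Gaussian (an assumption, not taken from the paper)."""
    m1, m2 = mean(x), mean(y)
    # eps guards against zero variance; its value is an arbitrary choice.
    v1 = pvariance(x) + eps
    v2 = pvariance(y) + eps
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def rank_genes(expr, labels, top_k):
    """Return the indices of the top_k genes with the largest distance
    between the two classes.

    expr   : list of samples, each a list of expression values per gene
    labels : list of 0/1 class labels, one per sample
    """
    n_genes = len(expr[0])
    scores = []
    for g in range(n_genes):
        class1 = [row[g] for row, lab in zip(expr, labels) if lab == 1]
        class0 = [row[g] for row, lab in zip(expr, labels) if lab == 0]
        scores.append((bhattacharyya_distance(class1, class0), g))
    # Largest distance first; ties broken by gene index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [g for _, g in scores[:top_k]]


# Toy data (hypothetical): 6 samples x 3 genes; gene 0 separates the classes.
expr = [
    [5.0, 1.0, 3.0],
    [5.2, 1.1, 2.0],
    [4.8, 0.9, 4.0],
    [1.0, 1.0, 3.1],
    [1.2, 1.2, 2.1],
    [0.8, 0.8, 3.9],
]
labels = [1, 1, 1, 0, 0, 0]
top_genes = rank_genes(expr, labels, top_k=1)
```

In a pipeline like the one the abstract describes, the genes kept by such a filter would form the candidate pool from which the genetic algorithm draws subsets, with each subset scored by the SVM's classification accuracy.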
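Received: 2012-03-15
Foundation items: National Natural Science Foundation of China (30871341); Scientific Research Foundation of the Education Department of Shandong Province (J10LA57); Shandong Province Outstanding Young and Middle-aged Scientists Research Award Fund (BS2012DX009).
About the author: YU Bin (born 1977), male, lecturer.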